www.infradead.org Git - users/jedix/linux-maple.git/log
8 years ago ocfs2: fix trans extend while flush truncate log
Junxiao Bi [Thu, 8 Sep 2016 06:57:15 +0000 (14:57 +0800)]
ocfs2: fix trans extend while flush truncate log

Orabug: 24759174

Every call to ocfs2_extend_trans() included a credit for the truncate log
inode. But because that inode was already managed by the running jbd2
transaction after the first call, that credit is not consumed until
jbd2_journal_restart(). Since the total credits to extend always included
the unconsumed ones, unconsumed credits kept accumulating, and eventually
jbd2_journal_restart() failed because the credit count exceeded half of
the maximum transaction credits.
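
The accumulation described above can be pictured with a toy model. This is
only a sketch under stated assumptions: struct model_handle, extend_always()
and extensions_until_overflow() are hypothetical names, not jbd2 or ocfs2
code, and the model only counts credits.

```c
/* Toy model (assumption, not kernel code): a handle that only tracks how
 * many credits it currently holds. */
struct model_handle {
    int buffer_credits;
};

/* Buggy pattern: each extension requests nblocks plus one extra credit for
 * the truncate log inode, but only the nblocks are ever consumed, so one
 * unconsumed credit is left behind per call. */
static void extend_always(struct model_handle *h, int nblocks)
{
    h->buffer_credits += nblocks + 1; /* request, including the extra credit */
    h->buffer_credits -= nblocks;     /* work done; extra credit stays unused */
}

/* Number of extensions after which the held credits exceed the limit. */
static int extensions_until_overflow(int start, int nblocks, int limit)
{
    struct model_handle h = { .buffer_credits = start };
    int n = 0;

    while (h.buffer_credits <= limit) {
        extend_always(&h, nblocks);
        n++;
    }
    return n;
}
```

In this model, the held credits grow by one per extension no matter how many
blocks each extension covers, so a long enough truncate always crosses the
limit.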

The following error was caught when unlinking a large file with many extents.

[233096.013936] ------------[ cut here ]------------
[233096.018586] WARNING: CPU: 0 PID: 13626 at fs/jbd2/transaction.c:269 start_this_handle+0x4c3/0x510 [jbd2]()
[233096.028335] Modules linked in: ocfs2 nfsd lockd grace nfs_acl auth_rpcgss sunrpc autofs4 ocfs2_dlmfs ocfs2_stack_o2cb ocfs2_dlm ocfs2_nodemanager ocfs2_stackglue configfs sd_mod sg ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables be2iscsi iscsi_boot_sysfs bnx2i cnic uio cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr ipv6 iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ppdev xen_kbdfront xen_netfront fb_sys_fops sysimgblt sysfillrect syscopyarea parport_pc parport pcspkr i2c_piix4 i2c_core acpi_cpufreq ext4 jbd2 mbcache xen_blkfront floppy pata_acpi ata_generic ata_piix dm_mirror dm_region_hash dm_log dm_mod
[233096.081751] CPU: 0 PID: 13626 Comm: unlink Tainted: G        W       4.1.12-37.6.3.el6uek.x86_64 #2
[233096.088556] Hardware name: Xen HVM domU, BIOS 4.4.4OVM 02/11/2016
[233096.093125]  000000000000010d ffff88000018b768 ffffffff816bc5bc 000000000000010d
[233096.099082]  0000000000000000 ffff88000018b7a8 ffffffff81081475 ffff88000018b788
[233096.105038]  ffff88007a99a000 ffff88007b573390 00000000000000fb 0000000000000050
[233096.110540] Call Trace:
[233096.111893]  [<ffffffff816bc5bc>] dump_stack+0x48/0x5c
[233096.114637]  [<ffffffff81081475>] warn_slowpath_common+0x95/0xe0
[233096.117797]  [<ffffffff810814da>] warn_slowpath_null+0x1a/0x20
[233096.120984]  [<ffffffffa0080993>] start_this_handle+0x4c3/0x510 [jbd2]
[233096.124505]  [<ffffffffa0088f95>] ? __jbd2_log_start_commit+0xe5/0xf0 [jbd2]
[233096.128115]  [<ffffffff810c4eb3>] ? __wake_up+0x53/0x70
[233096.130924]  [<ffffffffa0080b41>] jbd2__journal_restart+0x161/0x1b0 [jbd2]
[233096.134523]  [<ffffffffa0080ba3>] jbd2_journal_restart+0x13/0x20 [jbd2]
[233096.137986]  [<ffffffffa06d1d94>] ocfs2_extend_trans+0x74/0x220 [ocfs2]
[233096.141407]  [<ffffffffa06d156a>] ? ocfs2_journal_dirty+0x3a/0x90 [ocfs2]
[233096.144921]  [<ffffffffa0692943>] ocfs2_replay_truncate_records+0x93/0x360 [ocfs2]
[233096.148819]  [<ffffffffa0697ace>] __ocfs2_flush_truncate_log+0x13e/0x3a0 [ocfs2]
[233096.152644]  [<ffffffffa0697304>] ? ocfs2_reserve_blocks_for_rec_trunc.clone.0+0x44/0x1f0 [ocfs2]
[233096.157310]  [<ffffffffa069f768>] ocfs2_remove_btree_range+0x458/0x7f0 [ocfs2]
[233096.161099]  [<ffffffffa0696777>] ? __ocfs2_find_path+0x187/0x2d0 [ocfs2]
[233096.164612]  [<ffffffffa06a2673>] ocfs2_commit_truncate+0x1b3/0x6f0 [ocfs2]
[233096.168204]  [<ffffffffa0744ac0>] ? ocfs2_xattr_tree_et_ops+0x60/0xfffffffffffe8c20 [ocfs2]
[233096.172539]  [<ffffffffa06d1a00>] ? ocfs2_journal_access_eb+0x20/0x20 [ocfs2]
[233096.176285]  [<ffffffff81202303>] ? __sb_end_write+0x33/0x70
[233096.179226]  [<ffffffffa06ca61d>] ocfs2_truncate_for_delete+0xbd/0x380 [ocfs2]
[233096.183009]  [<ffffffffa06ca294>] ? ocfs2_query_inode_wipe+0xf4/0x320 [ocfs2]
[233096.186738]  [<ffffffffa06caf76>] ocfs2_wipe_inode+0x136/0x6a0 [ocfs2]
[233096.190165]  [<ffffffffa06ca294>] ? ocfs2_query_inode_wipe+0xf4/0x320 [ocfs2]
[233096.193846]  [<ffffffffa06cb782>] ocfs2_delete_inode+0x2a2/0x3e0 [ocfs2]
[233096.197274]  [<ffffffff812298c9>] ? __inode_wait_for_writeback+0x69/0xc0
[233096.200736]  [<ffffffffa0732180>] ? __PRETTY_FUNCTION__.112282+0x20/0xffffffffffffb520 [ocfs2]
[233096.205146]  [<ffffffffa06cc298>] ocfs2_evict_inode+0x28/0x60 [ocfs2]
[233096.208462]  [<ffffffff8121b81b>] evict+0xab/0x1a0
[233096.211020]  [<ffffffffa0732180>] ? __PRETTY_FUNCTION__.112282+0x20/0xffffffffffffb520 [ocfs2]
[233096.215396]  [<ffffffff8121ba06>] iput_final+0xf6/0x190
[233096.218169]  [<ffffffff8121bb68>] iput+0xc8/0xe0
[233096.220586]  [<ffffffff8120f9b7>] do_unlinkat+0x1b7/0x310
[233096.223487]  [<ffffffff8106ae5b>] ? __do_page_fault+0x18b/0x480
[233096.226655]  [<ffffffff81126dbc>] ? __audit_syscall_entry+0xac/0x110
[233096.230009]  [<ffffffff810236cc>] ? do_audit_syscall_entry+0x6c/0x70
[233096.233346]  [<ffffffff81023823>] ? syscall_trace_enter_phase1+0x153/0x180
[233096.237103]  [<ffffffff8120fb26>] SyS_unlink+0x16/0x20
[233096.239800]  [<ffffffff816c122e>] system_call_fastpath+0x12/0x71
[233096.244346] ---[ end trace 28aa7410e69369cf ]---
[233096.247798] JBD2: unlink wants too many credits (251 > 128)

Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
8 years ago ocfs2: extend enough credits for freeing one truncate record while replaying truncate...
Xue jiufei [Fri, 25 Mar 2016 21:21:44 +0000 (14:21 -0700)]
ocfs2: extend enough credits for freeing one truncate record while replaying truncate records

Orabug: 24759174

Currently, ocfs2_replay_truncate_records() first modifies tl_used, then
calls ocfs2_extend_trans() to extend the transaction for the gd and alloc
inode used for freeing clusters.  jbd2_journal_restart() may be called
in the process, so tl_used in the truncate log can be decreased while the
clusters are not freed, which means these clusters are lost.  So we
should avoid extending transactions in these two operations.

Signed-off-by: joyce.xue <xuejiufei@huawei.com>
Reviewed-by: Mark Fasheh <mfasheh@suse.de>
Acked-by: Joseph Qi <joseph.qi@huawei.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit 102c2595aa193f598c0f4b1bf2037d168c80e551)

Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
8 years ago ocfs2: Fix double put of recount tree in ocfs2_lock_refcount_tree()
Ashish Samant [Thu, 15 Sep 2016 19:26:19 +0000 (12:26 -0700)]
ocfs2: Fix double put of recount tree in ocfs2_lock_refcount_tree()

Orabug: 24759721

In ocfs2_lock_refcount_tree, if ocfs2_read_refcount_block() returns error,
we do ocfs2_refcount_tree_put twice (once in ocfs2_unlock_refcount_tree
and once outside it), thereby reducing the refcount of the refcount tree
twice, but we don't delete the tree in this case. This will make refcnt
of the tree = 0 and the ocfs2_refcount_tree_put will eventually call
ocfs2_mark_lockres_freeing, setting OCFS2_LOCK_FREEING for the
refcount_tree->rf_lockres.

The error returned by ocfs2_read_refcount_block is propagated all the way
back and for next iteration of write, ocfs2_lock_refcount_tree gets the
same tree back from ocfs2_get_refcount_tree because we haven't deleted the
tree. Now we have the same tree, but OCFS2_LOCK_FREEING is set for
rf_lockres and eventually, when _ocfs2_lock_refcount_tree is called in
this iteration, BUG_ON( __ocfs2_cluster_lock:1395 ERROR: Cluster lock
called on freeing lockres T00000000000000000386019775b08d! flags 0x81) is
triggered.
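
The double put can be illustrated with a small reference-count model. This
is a sketch only: struct model_tree, model_get(), model_put() and
lock_tree_error_path() are hypothetical names that mimic the sequence
described above, not the ocfs2 functions themselves.

```c
/* Toy model (assumption): a tree with a reference count and a flag that is
 * set once the last reference is dropped. */
struct model_tree {
    int refs;
    int freeing;
};

static void model_get(struct model_tree *t)
{
    t->refs++;
}

static void model_put(struct model_tree *t)
{
    if (--t->refs == 0)
        t->freeing = 1; /* stands in for marking the lockres as freeing */
}

/* Error path of the lock function when reading the refcount block fails.
 * With double_put set, the reference taken at lookup is dropped twice. */
static int lock_tree_error_path(struct model_tree *t, int double_put)
{
    model_get(t);         /* reference taken when the tree is looked up */
    model_put(t);         /* put done inside the unlock helper */
    if (double_put)
        model_put(t);     /* second put outside the helper: the bug */
    return -5;            /* -EIO, as in the log above */
}
```

With the second put removed, the tree keeps the reference it held before the
call, so a later lookup returns a tree that is not marked as freeing.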

Call stack:

(loop16,11155,0):ocfs2_lock_refcount_tree:482 ERROR: status = -5
(loop16,11155,0):ocfs2_refcount_cow_hunk:3497 ERROR: status = -5
(loop16,11155,0):ocfs2_refcount_cow:3560 ERROR: status = -5
(loop16,11155,0):ocfs2_prepare_inode_for_refcount:2111 ERROR: status = -5
(loop16,11155,0):ocfs2_prepare_inode_for_write:2190 ERROR: status = -5
(loop16,11155,0):ocfs2_file_write_iter:2331 ERROR: status = -5
(loop16,11155,0):__ocfs2_cluster_lock:1395 ERROR: bug expression:
lockres->l_flags & OCFS2_LOCK_FREEING

(loop16,11155,0):__ocfs2_cluster_lock:1395 ERROR: Cluster lock called on
freeing lockres T00000000000000000386019775b08d! flags 0x81

------------[ cut here ]------------
kernel BUG at fs/ocfs2/dlmglue.c:1395!

invalid opcode: 0000 [#1] SMP  CPU 0
Modules linked in: tun ocfs2 jbd2 xen_blkback xen_netback xen_gntdev ..
sd_mod crc_t10dif ext3 jbd mbcache

Pid: 11155, comm: loop16 Tainted: G        W   2.6.39-400.279.1.el5uek #1
Oracle Corporation ORACLE SERVER X5-2/ASM,MOTHERBOARD,1U
RIP: e030:[<ffffffffa082137c>]  [<ffffffffa082137c>]
__ocfs2_cluster_lock+0x31c/0x740 [ocfs2]
RSP: e02b:ffff88017c0138a0  EFLAGS: 00010086
RAX: 000000000000008b RBX: ffff8801b5374300 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000004
RBP: ffff88017c013980 R08: 0000000000206db1 R09: ffff8800000bbf20
R10: 0000000000000000 R11: 0000000000000001 R12: ffff88017bd25848
R13: 1000000000000800 R14: ffff8801b5374948 R15: 0000000000000005
FS:  00007f8198d746e0(0000) GS:ffff8801d6600000(0000) knlGS:0000000000000000
CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00007f641d181110 CR3: 000000017c15b000 CR4: 0000000000002660
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process loop16 (pid: 11155, threadinfo ffff88017c010000, task
ffff8801b5374300)
Stack:
 ffff88017bd25880 0000000000000081 000000017c013920 ffff88017c013960
 000000000000001d 0000000000000001 ffff88017bd258b4 0000000000000000
 ffff880172006000 00000000a07fa410 ffff88017bd202b4 0000000000000000
Call Trace:
 [<ffffffffa08227de>] ocfs2_refcount_lock+0xae/0x130 [ocfs2]
 [<ffffffffa0846b89>] ? __ocfs2_lock_refcount_tree+0x29/0xe0 [ocfs2]
 [<ffffffff81509dde>] ? _raw_spin_lock+0xe/0x20
 [<ffffffffa0846b89>] __ocfs2_lock_refcount_tree+0x29/0xe0 [ocfs2]
 [<ffffffffa084d47d>] ocfs2_lock_refcount_tree+0xdd/0x320 [ocfs2]
 [<ffffffffa084de3b>] ocfs2_refcount_cow_hunk+0x1cb/0x440 [ocfs2]
 [<ffffffffa084e159>] ocfs2_refcount_cow+0xa9/0x1d0 [ocfs2]
 [<ffffffffa08291c7>] ? ocfs2_prepare_inode_for_refcount+0x67/0x200 [ocfs2]
 [<ffffffffa0829275>] ocfs2_prepare_inode_for_refcount+0x115/0x200 [ocfs2]
 [<ffffffffa081f394>] ? ocfs2_inode_unlock+0xd4/0x140 [ocfs2]
 [<ffffffffa082969b>] ocfs2_prepare_inode_for_write+0x33b/0x470 [ocfs2]
 [<ffffffffa0822620>] ? ocfs2_rw_lock+0x80/0x190 [ocfs2]
 [<ffffffffa082c150>] ocfs2_file_write_iter+0x220/0x8c0 [ocfs2]
 [<ffffffff81112c67>] ? mempool_free_slab+0x17/0x20
 [<ffffffff8119f2b1>] ? bio_free+0x61/0x70
 [<ffffffff811adece>] ? aio_kernel_free+0xe/0x10
 [<ffffffff811adb1e>] aio_write_iter+0x2e/0x30

Fix this by avoiding the second call to ocfs2_refcount_tree_put()

Signed-off-by: Ashish Samant <ashish.samant@oracle.com>
Reviewed-by: Eric Ren <zren@suse.com>
Acked-by: Srinivas Eeda <srinivas.eeda@oracle.com>
8 years ago ocfs2: Fix start offset to ocfs2_zero_range_for_truncate()
Ashish Samant [Wed, 3 Aug 2016 02:26:30 +0000 (19:26 -0700)]
ocfs2: Fix start offset to ocfs2_zero_range_for_truncate()

If we punch a hole on a reflink such that the following conditions are met:

1. start offset is on a cluster boundary
2. end offset is not on a cluster boundary
3. (end offset is somewhere in another extent) or
   (hole range > MAX_CONTIG_BYTES(1MB)),

we don't COW the first cluster starting at the start offset. But in this
case, we were wrongly passing this cluster to
ocfs2_zero_range_for_truncate() to zero out. This will modify the cluster
in place and zero it in the source too.

Fix this by skipping this cluster in such a scenario.

Orabug: 24516161

Reported-by: Saar Maoz <saar.maoz@oracle.com>
Signed-off-by: Ashish Samant <ashish.samant@oracle.com>
Reviewed-by: Srinivas Eeda <srinivas.eeda@oracle.com>
8 years ago ocfs2: improve recovery performance
Junxiao Bi [Mon, 18 Jul 2016 02:57:56 +0000 (10:57 +0800)]
ocfs2: improve recovery performance

Orabug: 24308229

Journal replay is run during recovery of a dead node. To avoid the
impact of stale cache, all blocks of the dead node's journal inode
were reloaded from disk, which hurts performance. Checking whether
a block is cached before reloading it improves performance a lot.
In my test env, the recovery time improved from 120s to 1s.

Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
9 years ago ocfs2: o2hb: fix hb hung time
Junxiao Bi [Fri, 27 Nov 2015 14:16:54 +0000 (22:16 +0800)]
ocfs2: o2hb: fix hb hung time

hr_last_timeout_start should be set to the last time at which hb was still OK.
When an hb write times out, the hung time is (jiffies - hr_last_timeout_start).
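
The subtraction above can be sketched as a tiny helper. hb_hung_ticks() is a
hypothetical name used only for illustration; the point is that an unsigned
subtraction of tick counters gives the elapsed ticks even if the counter has
wrapped around in between.

```c
/* Hypothetical helper: ticks elapsed since the last heartbeat that was
 * still OK. Unsigned arithmetic keeps the result correct across a wrap
 * of the tick counter. */
static unsigned long hb_hung_ticks(unsigned long now, unsigned long last_ok)
{
    return now - last_ok;
}
```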

Oracle-bug: 21862940

Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
Reviewed-by: Ryan Ding <ryan.ding@oracle.com>
9 years ago ocfs2: o2hb: don't negotiate if last hb fail
Junxiao Bi [Mon, 21 Sep 2015 07:54:06 +0000 (15:54 +0800)]
ocfs2: o2hb: don't negotiate if last hb fail

Sometimes an io error is returned when storage is down for a while.
For an iscsi device, for example, storage is taken offline when the session
times out, and this makes all io return -EIO. In this case, nodes shouldn't
negotiate the timeout but should fence themselves. So let nodes fence
themselves when o2hb_do_disk_heartbeat returns an error; this is the same
behavior as o2hb without the negotiate timer.

Oracle-bug: 21862940

Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
Reviewed-by: Ryan Ding <ryan.ding@oracle.com>
9 years ago ocfs2: o2hb: add some user/debug log
Junxiao Bi [Tue, 22 Sep 2015 08:10:25 +0000 (16:10 +0800)]
ocfs2: o2hb: add some user/debug log

Oracle-bug: 21862940

Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
Reviewed-by: Ryan Ding <ryan.ding@oracle.com>
9 years ago ocfs2: o2hb: add NEGOTIATE_APPROVE message
Junxiao Bi [Fri, 18 Sep 2015 05:57:33 +0000 (13:57 +0800)]
ocfs2: o2hb: add NEGOTIATE_APPROVE message

This message is used to re-queue the write timeout timer and the negotiate
timer when all nodes suffer a write hang to storage. This keeps nodes from
fencing themselves if storage is down.

Oracle-bug: 21862940

Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
Reviewed-by: Ryan Ding <ryan.ding@oracle.com>
9 years ago ocfs2: o2hb: add NEGO_TIMEOUT message
Junxiao Bi [Fri, 18 Sep 2015 05:47:39 +0000 (13:47 +0800)]
ocfs2: o2hb: add NEGO_TIMEOUT message

This message is sent to the master node when a non-master node's
negotiate timer expires. The master node records these nodes in
a bitmap, which is used to decide whether to re-queue the write
timeout timer.

Oracle-bug: 21862940

Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
Reviewed-by: Ryan Ding <ryan.ding@oracle.com>
9 years ago ocfs2: o2hb: add negotiate timer
Junxiao Bi [Fri, 18 Sep 2015 07:15:31 +0000 (15:15 +0800)]
ocfs2: o2hb: add negotiate timer

When storage is down, all nodes will fence themselves due to write timeout.
The negotiate timer is designed to avoid this; with it, nodes will
wait until storage is up again.

The negotiate timer works in the following way:

1. The timer expires before the write timeout timer; its timeout is currently
half of the write timeout. It is re-queued along with the write timeout timer.
If it expires, it sends a NEGO_TIMEOUT message to the master node (the node
with the lowest node number). This message does nothing but mark a bit in a
bitmap on the master node recording which nodes are negotiating a timeout.

2. If storage is down, nodes send this message to the master node. When
the master node finds that its bitmap includes all online nodes, it sends a
NEGOTIATE_APPROVE message to all nodes one by one; this message re-queues
the write timeout timer and the negotiate timer.
Any node that doesn't receive this message, or that hits some issue when
handling it, will be fenced.
If storage comes up at any time, o2hb_thread will run and re-queue all the
timers; nothing will be affected by these two steps.
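
The master-side decision in the two steps above can be sketched with a
bitmap. This is a minimal model under stated assumptions: struct nego_master
and the three functions are hypothetical names, node numbers are assumed to
be below 64, and no messaging or timers are modelled.

```c
#include <stdint.h>

/* Toy model (assumption): the master tracks online nodes and nodes that
 * have reported a negotiate timeout, one bit per node number. */
struct nego_master {
    uint64_t online; /* nodes currently online */
    uint64_t nego;   /* nodes that sent the timeout message */
};

/* The negotiate timer fires at half of the write timeout. */
static unsigned int nego_timeout_ms(unsigned int write_timeout_ms)
{
    return write_timeout_ms / 2;
}

/* Handling the timeout message only marks the sender's bit. */
static void nego_timeout_msg(struct nego_master *m, unsigned int node)
{
    m->nego |= (uint64_t)1 << node;
}

/* The master approves only once every online node has reported. */
static int nego_all_online_reported(const struct nego_master *m)
{
    return m->online != 0 && (m->nego & m->online) == m->online;
}
```

A node whose bit is missing keeps the check false, so no approval goes out
while at least one online node can still reach storage.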

Oracle-bug: 21862940

Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
Reviewed-by: Ryan Ding <ryan.ding@oracle.com>
9 years ago ocfs2: call ocfs2_abort when journal abort
Ryan Ding [Mon, 21 Dec 2015 03:01:12 +0000 (11:01 +0800)]
ocfs2: call ocfs2_abort when journal abort

orabug: 22293201

The journal cannot recover from the abort state, so we should take the
following actions to prevent file system corruption:

1. Change to a read-only filesystem on a local mount. We cannot afford
   further writes, so changing to the RO state is reasonable.

2. Panic on a cluster mount. Because we cannot release lock resources in this
   state, other nodes will hang when they require a lock owned by this node.
   So panic and remaster is a reasonable choice.

ocfs2_abort() will do all the above work.
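
The two cases above amount to a simple decision, sketched here. The enum and
journal_abort_action() are hypothetical names for illustration only; they do
not describe how ocfs2_abort() is implemented.

```c
/* Hypothetical outcome of a journal abort, per the two cases above. */
enum abort_action {
    ABORT_REMOUNT_RO, /* local mount: stop further writes */
    ABORT_PANIC       /* cluster mount: let other nodes remaster locks */
};

static enum abort_action journal_abort_action(int is_local_mount)
{
    return is_local_mount ? ABORT_REMOUNT_RO : ABORT_PANIC;
}
```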

Signed-off-by: Ryan Ding <ryan.ding@oracle.com>
9 years ago ocfs2: o2hb: increase unsteady iterations
Junxiao Bi [Thu, 14 Jan 2016 23:17:15 +0000 (15:17 -0800)]
ocfs2: o2hb: increase unsteady iterations

Oracle-bug: 21886612

When running the multiple xattr test of ocfs2-test on a three-node cluster,
mount sometimes failed with the following message.

  o2hb: Unable to stabilize heartbeart on region D18B775E758D4D80837E8CF3D086AD4A (xvdb)

Stabilizing the heartbeat depends on the order in which cluster nodes mount
ocfs2 and how fast the tcp connections are established.  So
increase unsteady iterations to leave more time for it.

Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit a84ac334dcb44c76f0b051513a6c27a2d747f883)
Reviewed-by: Ryan Ding <ryan.ding@oracle.com>
9 years ago ocfs2: return non-zero st_blocks for inline data
John Haxby [Wed, 9 Dec 2015 05:30:22 +0000 (16:30 +1100)]
ocfs2: return non-zero st_blocks for inline data

Some versions of tar assume that files with st_blocks == 0 do not contain
any data and will skip reading them entirely.  See also commit
9206c561554c ("ext4: return non-zero st_blocks for inline data").
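
One way to report a non-zero block count for inline data is to round the
file size up to 512-byte units, the unit st_blocks is counted in. This is a
sketch under that assumption: inline_data_blocks() is a hypothetical helper,
and the source does not show the exact expression the patch uses.

```c
/* Hypothetical helper: number of 512-byte blocks needed to hold `size`
 * bytes, rounded up, so any non-empty inline file reports at least 1. */
static unsigned long long inline_data_blocks(unsigned long long size)
{
    return (size + 511) >> 9;
}
```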

Signed-off-by: John Haxby <john.haxby@oracle.com>
Reviewed-by: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Acked-by: Gang He <ghe@suse.com>
Reviewed-by: Junxiao Bi <junxiao.bi@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
(cherry picked from commit ca426103429543e7a9be9017537fc3ffc37b5724)
Orabug: 22218243
Signed-off-by: John Haxby <john.haxby@oracle.com>
9 years ago ocfs2: fix SGID not inherited issue
Junxiao Bi [Mon, 7 Dec 2015 03:00:34 +0000 (11:00 +0800)]
ocfs2: fix SGID not inherited issue

Oracle-bug: 22311520

Commit 8f1eb48758aa ("ocfs2: fix umask ignored issue") introduced an issue:
the SGID of a sub dir was not inherited from its parent dir. This is because
SGID is set in "inode->i_mode" in ocfs2_get_init_inode(), but is later
overwritten by "mode", which doesn't have SGID set.
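
The overwrite can be shown with plain mode arithmetic. This is a sketch only:
the MODEL_* constants are local copies of the usual mode bits, and
init_mode(), apply_umask(), create_mode_buggy() and create_mode_fixed() are
hypothetical names that mimic the order of operations described above.

```c
/* Local copies of the usual mode bits so the sketch is self-contained. */
#define MODEL_S_IFMT  0170000u
#define MODEL_S_IFDIR 0040000u
#define MODEL_S_ISGID 0002000u

static unsigned int apply_umask(unsigned int mode, unsigned int umask)
{
    return mode & ~umask;
}

/* A new directory created in an SGID directory inherits SGID. */
static unsigned int init_mode(unsigned int parent_mode, unsigned int mode)
{
    if ((parent_mode & MODEL_S_ISGID) &&
        (mode & MODEL_S_IFMT) == MODEL_S_IFDIR)
        mode |= MODEL_S_ISGID;
    return mode;
}

/* Buggy order: the inherited bit is computed, then thrown away because the
 * final value is derived from the original "mode". */
static unsigned int create_mode_buggy(unsigned int parent_mode,
                                      unsigned int mode, unsigned int umask)
{
    unsigned int inode_mode = init_mode(parent_mode, mode);

    (void)inode_mode; /* result is never used: SGID is lost */
    return apply_umask(mode, umask);
}

/* Fixed order: the umask is applied to the mode that already has SGID. */
static unsigned int create_mode_fixed(unsigned int parent_mode,
                                      unsigned int mode, unsigned int umask)
{
    return apply_umask(init_mode(parent_mode, mode), umask);
}
```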

Fixes: 8f1eb48758aa ("ocfs2: fix umask ignored issue")
Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
Cc: <stable@vger.kernel.org>
Acked-by: Srinivas Eeda <srinivas.eeda@oracle.com>
9 years ago ocfs2: fix umask ignored issue
Junxiao Bi [Mon, 30 Nov 2015 21:44:50 +0000 (13:44 -0800)]
ocfs2: fix umask ignored issue

Oracle-bug: 22155833

A newly created file's mode is not masked with umask, so umask does not
work on an ocfs2 volume.

Fixes: 702e5bc ("ocfs2: use generic posix ACL infrastructure")
Acked-by: Srinivas Eeda <srinivas.eeda@oracle.com>
Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Gang He <ghe@suse.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit 8f1eb48758aacf6c1ffce18179295adbf3bd7640)

9 years ago Revert "ocfs2: change ip_unaligned_aio to of type mutex from atomit_t"
Ryan Ding [Fri, 16 Oct 2015 08:26:40 +0000 (16:26 +0800)]
Revert "ocfs2: change ip_unaligned_aio to of type mutex from atomit_t"

This reverts commit c18ceab01240fd4c354b78d877571b729908e4a3.

Testing shows that ip_unaligned_aio costs many cpu cycles when doing aio+dio
(in a function named mutex_spin_on_owner), and significantly affects
performance on a system with a weak cpu.

The cause is that we should not call mutex_unlock (see the comments above
mutex_unlock) in ocfs2_dio_end_io, which runs in irq context when doing
aio+dio.

Revert the patch to use wait_event/wake_up_all to do the work.

Signed-off-by: Ryan Ding <ryan.ding@oracle.com>
Reviewed-by: Junxiao Bi <junxiao.bi@oracle.com>
9 years ago ocfs2: fix a performance issue with synced buffer io
Ryan Ding [Tue, 20 Oct 2015 08:47:33 +0000 (16:47 +0800)]
ocfs2: fix a performance issue with synced buffer io

orabug: 20396205

If we flush data with WB_SYNC_ALL set in struct writeback_control, it
is translated to a bio with the WRITE_SYNC flag (this is done in
block_write_full_page()). Since multi-queue was introduced to the kernel
block layer, a bio with the SYNC flag is sent to disk without queueing. This
affects performance significantly if the disk has poor iops.

This patch is a workaround for this. Use filemap_flush() to try to flush dirty
pages with the WB_SYNC_NONE flag.
* In journal=order mode, this is safe because the following
  jbd2_journal_force_commit() will ensure data integrity.
* In journal=writeback mode, we will call filemap_write_and_wait_range() to
  meet the semantics of O_SYNC & O_DIRECT.

It should help to improve performance with direct io (in the case where direct
io falls back to buffer io), and buffer io with O_SYNC.

Signed-off-by: Ryan Ding <ryan.ding@oracle.com>
Reviewed-by: Junxiao Bi <junxiao.bi@oracle.com>
9 years ago revert commit ff8fb335221e2c446b0d4cbea26be371fd2feb64
Tariq Saeed [Wed, 16 Sep 2015 22:23:02 +0000 (15:23 -0700)]
revert commit ff8fb335221e2c446b0d4cbea26be371fd2feb64

Orabug: 21696932

CRS/RAC install fails if modinfo does not display the version string.

Signed-off-by: Tariq Saeed <tariq.x.saeed@oracle.com>
9 years ago add OCFS2_LOCK_RECURSIVE arg_flags to ocfs2_cluster_lock() to prevent hang
Tariq Saeed [Fri, 4 Sep 2015 22:39:03 +0000 (15:39 -0700)]
add OCFS2_LOCK_RECURSIVE arg_flags to ocfs2_cluster_lock() to prevent hang

Orabug: 21793017

ocfs2_setattr, called by the chmod command, holds the cluster wide inode lock
(Orabug 21685187) when calling posix_acl_chmod. The
latter function in turn calls ocfs2_iop_get_acl and ocfs2_iop_set_acl.
These two are also called directly from the vfs layer for getfacl/setfacl
commands and therefore acquire the cluster wide inode lock. If a remote
conversion request comes after the first inode lock in ocfs2_setattr,
OCFS2_LOCK_BLOCKED will be set in l_flags. This will cause the second
call to inode lock from ocfs2_iop_get|set_acl() to block indefinitely.
The new flag OCFS2_LOCK_RECURSIVE will be used to prevent this blocking.

Signed-off-by: Tariq Saeed <tariq.x.saeed@oracle.com>
Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
9 years ago ocfs2: direct write will call ocfs2_rw_unlock() twice when doing aio+dio
Ryan Ding [Mon, 7 Sep 2015 05:38:00 +0000 (13:38 +0800)]
ocfs2: direct write will call ocfs2_rw_unlock() twice when doing aio+dio

ocfs2_file_write_iter() is using the wrong return value ('written').  This
will cause ocfs2_rw_unlock() to be called both in write_iter & end_io,
triggering a BUG_ON.

This issue was introduced by commit 7da839c47589 ("ocfs2: use
__generic_file_write_iter()").

Orabug: 21612107
Fixes: 7da839c47589 ("ocfs2: use __generic_file_write_iter()")
Signed-off-by: Ryan Ding <ryan.ding@oracle.com>
Reviewed-by: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit aa1057b3dec478b20c77bad07442318ae36d893c)

Conflicts:
fs/ocfs2/file.c
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
9 years ago ocfs2_iop_set/get_acl() are also called from the VFS so we must take inode lock
Tariq Saeed [Thu, 3 Sep 2015 04:55:40 +0000 (21:55 -0700)]
ocfs2_iop_set/get_acl() are also called from the VFS so we must take inode lock

Orabug: 20189959

This bug in mainline code was pointed out by Mark Fasheh. When ocfs2_iop_set_acl
and ocfs2_iop_get_acl are entered from the VFS layer, the inode lock is not held.
This seems to be a regression from older kernels. The patch fixes that.

Signed-off-by: Tariq Saeed <tariq.x.saeed@oracle.com>
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
9 years ago BUG_ON(lockres->l_level != DLM_LOCK_EX && !checkpointed) tripped in ocfs2_ci_checkpointed
Tariq Saeed [Wed, 2 Sep 2015 21:37:41 +0000 (14:37 -0700)]
BUG_ON(lockres->l_level != DLM_LOCK_EX && !checkpointed) tripped in ocfs2_ci_checkpointed

Orabug: 20189959

PID: 614    TASK: ffff882a739da580  CPU: 3   COMMAND: "ocfs2dc"
 #0 [ffff882ecc3759b0] machine_kexec at ffffffff8103b35d
 #1 [ffff882ecc375a20] crash_kexec at ffffffff810b95b5
 #2 [ffff882ecc375af0] oops_end at ffffffff815091d8
 #3 [ffff882ecc375b20] die at ffffffff8101868b
 #4 [ffff882ecc375b50] do_trap at ffffffff81508bb0
 #5 [ffff882ecc375ba0] do_invalid_op at ffffffff810165e5
 #6 [ffff882ecc375c40] invalid_op at ffffffff815116fb
    [exception RIP: ocfs2_ci_checkpointed+208]
    RIP: ffffffffa0a7e940  RSP: ffff882ecc375cf0  RFLAGS: 00010002
    RAX: 0000000000000001  RBX: 000000000000654b  RCX: ffff8812dc83f1f8
    RDX: 00000000000017d9  RSI: ffff8812dc83f1f8  RDI: ffffffffa0b2c318
    RBP: ffff882ecc375d20   R8: ffff882ef6ecfa60   R9: ffff88301f272200
    R10: 0000000000000000  R11: 0000000000000000  R12: ffffffffffffffff
    R13: ffff8812dc83f4f0  R14: 0000000000000000  R15: ffff8812dc83f1f8
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #7 [ffff882ecc375d28] ocfs2_check_meta_downconvert at ffffffffa0a7edbd [ocfs2]
 #8 [ffff882ecc375d38] ocfs2_unblock_lock at ffffffffa0a84af8 [ocfs2]
 #9 [ffff882ecc375dc8] ocfs2_process_blocked_lock at ffffffffa0a85285 [ocfs2]
#10 [ffff882ecc375e18] ocfs2_downconvert_thread_do_work at ffffffffa0a85445 [ocfs2]
#11 [ffff882ecc375e68] ocfs2_downconvert_thread at ffffffffa0a854de [ocfs2]
#12 [ffff882ecc375ee8] kthread at ffffffff81090da7
#13 [ffff882ecc375f48] kernel_thread_helper at ffffffff81511884
The assert is tripped because the trans is not checkpointed and the lock level is PR.

Some time earlier, a chmod command had been executed. As a result, the following
call chain left the inode cluster lock in PR state, later on causing the assert.
system_call_fastpath
 -> my_chmod
  -> sys_chmod
   -> sys_fchmodat
    -> notify_change
     -> ocfs2_setattr
      -> posix_acl_chmod
       -> ocfs2_iop_set_acl
        -> ocfs2_set_acl
         -> ocfs2_acl_set_mode
Here is how.
1119 int ocfs2_setattr(struct dentry *dentry, struct iattr *attr)
1120 {
1247         ocfs2_inode_unlock(inode, 1); <<< WRONG thing to do.
..
1258         if (!status && attr->ia_valid & ATTR_MODE) {
1259                 status =  posix_acl_chmod(inode, inode->i_mode);

519 posix_acl_chmod(struct inode *inode, umode_t mode)
520 {
..
539         ret = inode->i_op->set_acl(inode, acl, ACL_TYPE_ACCESS);

287 int ocfs2_iop_set_acl(struct inode *inode, struct posix_acl *acl, ...
288 {
289         return ocfs2_set_acl(NULL, inode, NULL, type, acl, NULL, NULL);

224 int ocfs2_set_acl(handle_t *handle,
225                          struct inode *inode, ...
231 {
..
252                                 ret = ocfs2_acl_set_mode(inode, di_bh,
253                                                          handle, mode);

168 static int ocfs2_acl_set_mode(struct inode *inode, struct buffer_head ...
170 {
183         if (handle == NULL) {
                   >>> BUG: inode lock not held in ex at this point <<<
184                 handle = ocfs2_start_trans(OCFS2_SB(inode->i_sb),
185                                            OCFS2_INODE_UPDATE_CREDITS);

At ocfs2_setattr.#1247 we unlock, and at #1259 we call posix_acl_chmod. When we
reach ocfs2_acl_set_mode.#184 and do the trans, the inode cluster lock is not
held in EX mode (it should be). How could this have happened?

We are the lock master, were holding the lock EX and have released it in
ocfs2_setattr.#1247. Note that there are no holders of this lock at
this point. Another node needs the lock in PR, and we downconvert from
EX to PR. So the inode lock is PR when we do the trans in
ocfs2_acl_set_mode.#184. The trans stays in core (not flushed to disk).
Now another node wants the lock in EX; the downconvert thread gets kicked (the
one that tripped the assert above) and finds an unflushed trans, but the lock
is not EX (it is PR). If the lock were at EX, it would have flushed the trans
(ocfs2_ci_checkpointed -> ocfs2_start_checkpoint) before downconverting (to
NULL) for the request.

ocfs2_setattr must not drop the inode lock EX in this code path. If it does,
and takes it again before the trans, say in ocfs2_set_acl, another cluster node
can get in between and execute another setattr, overwriting the one in progress
on this node and resulting in a mode acl size combo that is a mix of the two.

Signed-off-by: Tariq Saeed <tariq.x.saeed@oracle.com>
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
9 years ago NFS hangs in __ocfs2_cluster_lock due to race with ocfs2_unblock_lock
Tariq Saeed [Tue, 2 Jun 2015 17:58:19 +0000 (10:58 -0700)]
NFS hangs in __ocfs2_cluster_lock due to race with ocfs2_unblock_lock

Orabug: 20933419

NFS on a 2 node ocfs2 cluster, each node exporting a dir. The lock causing
the hang is the global bit map inode lock.  Node 1 is master, has
the lock granted in PR mode; Node 2 is in the converting list (PR ->
EX). There are no holders of the lock on the master node so it should
downconvert to NL and grant EX to node 2 but that does not happen.
BLOCKED + QUEUED in lock res are set and it is on osb blocked list.
Threads are waiting in __ocfs2_cluster_lock on BLOCKED.  One thread wants
EX, rest want PR. So it is as though the downconvert thread needs to be
kicked to complete the conv.

The hang is caused by an EX req coming into  __ocfs2_cluster_lock on
the heels of a PR req after it sets BUSY (drops l_lock, releasing EX
thread), forcing the incoming EX to wait on BUSY without doing anything.
PR has called ocfs2_dlm_lock, which  sets the node 1 lock from NL ->
PR, queues ast.

At this time, upconvert (PR -> EX) arrives from node 2, finds conflict with
node 1 lock in PR, so the lock res is put on dlm thread's dirty list.

After ret from ocfs2_dlm_lock, PR thread now waits behind EX on BUSY till
awoken by ast.

Now it is dlm_thread that serially runs dlm_shuffle_lists, ast, bast,
in that order.  dlm_shuffle_lists queues a bast on behalf of node 2
(which will be run by dlm_thread right after the ast).  ast does its
part, sets UPCONVERT_FINISHING, clears BUSY and wakes its waiters. Next,
dlm_thread runs bast. It sets BLOCKED and kicks dc thread.  dc thread
runs ocfs2_unblock_lock, but since UPCONVERT_FINISHING is set, skips doing
anything and requeues.

Inside of __ocfs2_cluster_lock, since EX has been waiting on BUSY ahead
of PR, it wakes up first, finds BLOCKED set and skips doing anything
but clearing UPCONVERT_FINISHING (which was actually "meant" for the
PR thread), and this time waits on BLOCKED.  Next, the PR thread comes
out of wait but since UPCONVERT_FINISHING is not set, it skips updating
the l_ro_holders and goes straight to wait on BLOCKED. So there, we
have a hang! Threads in __ocfs2_cluster_lock wait on BLOCKED, lock
res in osb blocked list. Only when dc thread is awoken, it will run
ocfs2_unblock_lock and things will unhang.

One way to fix this is to wake the dc thread on the flag after clearing
UPCONVERT_FINISHING.

Signed-off-by: Tariq Saeed <tariq.x.saeed@oracle.com>
Reviewed-by: Wengang Wang <wen.gang.wang@oracle.com>
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
10 years ago ocfs2: call ocfs2_journal_access_di() before ocfs2_journal_dirty() in ocfs2_write_end...
yangwenfang [Fri, 30 Jan 2015 02:11:19 +0000 (13:11 +1100)]
ocfs2: call ocfs2_journal_access_di() before ocfs2_journal_dirty() in ocfs2_write_end_nolock()

After we call ocfs2_journal_access_di() in ocfs2_write_begin(),
jbd2_journal_restart() may also be called.  That call decrements
transaction A's t_updates and obtains a new transaction B.  If
jbd2_journal_commit_transaction() happens to be committing transaction A,
then once t_updates == 0 it goes on to complete the commit and unfile the
buffer.

So by the time jbd2_journal_dirty_metadata() is called, the handle points
to the new transaction B and the buffer head's journal head has already
been freed (jh->b_transaction == NULL, jh->b_next_transaction == NULL), so
it returns EINVAL, which triggers the BUG_ON(status).

thread 1:                             jbd2:
ocfs2_write_begin                     jbd2_journal_commit_transaction
ocfs2_write_begin_nolock
  ocfs2_start_trans
    jbd2__journal_start(t_updates+1,
                       transaction A)
    ocfs2_journal_access_di
    ocfs2_write_cluster_by_desc
      ocfs2_mark_extent_written
        ocfs2_change_extent_flag
          ocfs2_split_extent
            ocfs2_extend_rotate_transaction
              jbd2_journal_restart
              (t_updates-1,transaction B) t_updates==0
                                        __jbd2_journal_refile_buffer

ocfs2_write_end
ocfs2_write_end_nolock
    ocfs2_journal_dirty
        jbd2_journal_dirty_metadata(bug)
   ocfs2_commit_trans

In ext4, I found that jbd2_journal_get_write_access() is called by
ext4_write_end:

ext4_write_begin
    ext4_journal_start
        __ext4_journal_start_sb
            ext4_journal_check_start
            jbd2__journal_start

ext4_write_end
    ext4_mark_inode_dirty
        ext4_reserve_inode_write
            ext4_journal_get_write_access
                jbd2_journal_get_write_access
        ext4_mark_iloc_dirty
            ext4_do_update_inode
                ext4_handle_dirty_metadata
                    jbd2_journal_dirty_metadata

So I think we should put ocfs2_journal_access_di before
ocfs2_journal_dirty in ocfs2_write_end.  It works well after my
modification.
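
The ordering problem can be illustrated with a toy model. This is only a
sketch, not jbd2 code: ToyJournal and its methods are hypothetical names,
and restart() here always lets the old transaction commit, whereas in the
kernel that is a race.

```python
EINVAL = 22


class ToyJournal:
    """Toy model: a buffer must be filed on the handle's current
    transaction before it can be marked dirty."""

    def __init__(self):
        self.current_tid = 1
        self.buffer_tid = {}  # buffer name -> transaction id it is filed on

    def access(self, buf):
        # Stands in for ocfs2_journal_access_di(): file buf on the
        # current transaction.
        self.buffer_tid[buf] = self.current_tid

    def restart(self):
        # Stands in for jbd2_journal_restart(): the old transaction
        # commits and unfiles its buffers, the handle moves to a new one.
        self.buffer_tid.clear()
        self.current_tid += 1

    def dirty(self, buf):
        # Stands in for jbd2_journal_dirty_metadata().
        if self.buffer_tid.get(buf) != self.current_tid:
            return -EINVAL
        return 0
```

In this model, access() followed by restart() and then dirty() fails, while
calling access() again right before dirty() succeeds, which is the ordering
the patch proposes for ocfs2_write_end.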

Signed-off-by: vicky <vicky.yangwenfang@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
(cherry picked from commit 19bf7feab124221625b5c811b6192fff4e0cbb96)

10 years agoocfs2: avoid access invalid address when read o2dlm debug messages
jiangyiwen [Fri, 30 Jan 2015 02:11:19 +0000 (13:11 +1100)]
ocfs2: avoid access invalid address when read o2dlm debug messages

The following case leads to a lockres being freed while it is still in use.

cat /sys/kernel/debug/o2dlm/locking_state       dlm_thread
lockres_seq_start
    -> lock dlm->track_lock
    -> get resA
                                                resA->refs decrease to 0,
                                                call dlm_lockres_release,
                                                and wait for "cat" unlock.
Although resA->refs is already set to 0,
increase resA->refs, and then unlock
                                                lock dlm->track_lock
                                                    -> list_del_init()
                                                    -> unlock
                                                    -> free resA

In such a race, an invalid address may be accessed.  So we should
remove res->tracking from the list before resA->refs decreases to 0.
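
A toy model of that ordering, as a sketch only: Tracker, Res, res_put and
debug_first are hypothetical names, not o2dlm functions, and doing the
unlink and the final decrement in one critical section is just one possible
way to get the ordering described above.

```python
import threading


class Tracker:
    def __init__(self):
        self.lock = threading.Lock()   # stands in for dlm->track_lock
        self.tracking = []             # stands in for the tracking list


class Res:
    def __init__(self, tracker, name):
        self.name = name
        self.refs = 1
        self.freed = False
        self.tracker = tracker
        with tracker.lock:
            tracker.tracking.append(self)


def res_put(res):
    tracker = res.tracker
    with tracker.lock:
        # Unlink from the tracking list before refs can reach 0, so a
        # debug reader can no longer find the resource.
        if res.refs == 1 and res in tracker.tracking:
            tracker.tracking.remove(res)
        res.refs -= 1
        last = res.refs == 0
    if last:
        res.freed = True


def debug_first(tracker):
    # Stands in for the debugfs reader: take a reference under the lock.
    with tracker.lock:
        if not tracker.tracking:
            return None
        res = tracker.tracking[0]
        res.refs += 1
        return res
```

With this ordering the reader either finds the resource while it still has
a reference, or does not find it at all.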

Signed-off-by: jiangyiwen <jiangyiwen@huawei.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
(cherry picked from commit e87e805fe4a1cf38031ae0669e3a91c8a8251279)

10 years agoocfs2: make 'buffered' as the default coherency option
Wengang Wang [Fri, 20 Dec 2013 04:54:42 +0000 (12:54 +0800)]
ocfs2: make 'buffered' as the default coherency option

Orabug: 17988729

Customers upgrading to uek2 and above will see the default coherency option
set to 'full', which negatively impacts performance. This patch changes the
coherency option to 'buffered', which keeps the default behaviour the same
as the old one (UEK1). If an application that does direct i/o needs cache
coherency, it can use the mount option 'coherency=full'.

Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com>
Acked-by: Srinivas Eeda <srinivas.eeda@oracle.com>
(cherry picked from commit 020a20029508d2d7f36470bebd23f053de4b0dbe)

10 years agoocfs2: Suppress the error message from being printed in ocfs2_rename
Xiaowei.Hu [Fri, 24 May 2013 06:23:16 +0000 (14:23 +0800)]
ocfs2: Suppress the error message from being printed in ocfs2_rename

This does the same thing as Goldwyn Rodrigues' last patch.

While removing a non-empty directory, the kernel dumps a message:
(mv,29521,1):ocfs2_rename:1474 ERROR: status = -39

Orabug: 16790405
Signed-off-by: Xiaowei.Hu <xiaowei.hu@oracle.com>
Signed-off-by: Srinivas Eeda <srinivas.eeda@oracle.com>
(cherry picked from commit 92a8dfa5f424bf48ae39e4749c680c0cf5db4fd6)

10 years agoocfs2: Tighten free bit calculation in the global bitmap
Sunil Mushran [Tue, 8 Nov 2011 21:00:19 +0000 (13:00 -0800)]
ocfs2: Tighten free bit calculation in the global bitmap

When clearing bits in the global bitmap, we do not test the current bit value.
This patch tightens the code by considering the possibility that the bit being
cleared was already cleared.

Now this should not happen. But we are seeing stray instances in which free
bit count in the global bitmap exceeds the total bit count. In each instance
the bitmap is correct. Only the free bit count is incorrect.

This patch checks the current bit value and increments the free bit count
only if the bit was previously set. It also prints information to allow
us to debug further.
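
The check can be sketched as follows. This is an illustration on a plain
Python list, not the ocfs2 code; clear_bits and its return values are
hypothetical.

```python
def clear_bits(bitmap, free_count, start, count):
    """Clear `count` bits starting at `start`.

    Returns (new_free_count, already_clear), where already_clear is the
    number of bits that were unexpectedly clear before the call.
    """
    already_clear = 0
    for i in range(start, start + count):
        if bitmap[i]:
            bitmap[i] = 0
            free_count += 1        # only a previously set bit frees space
        else:
            already_clear += 1     # stray case: report it, do not count it
    return free_count, already_clear
```

Counting only the bits that were actually set keeps the free bit count from
exceeding the total bit count, and the second return value gives something
to log for debugging.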

Orabug: 17342255

Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Srinivas Eeda <srinivas.eeda@oracle.com>
(cherry picked from commit d1726d8617f7d27c54d12b50c9a20b248ebd0c66)

10 years agoocfs2/trivial: Limit unaligned aio+dio write messages to once per day
Sunil Mushran [Wed, 7 Sep 2011 18:39:30 +0000 (11:39 -0700)]
ocfs2/trivial: Limit unaligned aio+dio write messages to once per day

It was printing more frequently.

Orabug: 17342255

Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Srinivas Eeda <srinivas.eeda@oracle.com>
(cherry picked from commit 8a2aa282cf9e003337f6c6949b1c7ed78347d59c)

10 years agoocfs2/trivial: Print message indicating unaligned aio+dio write
Sunil Mushran [Mon, 15 Aug 2011 20:49:11 +0000 (13:49 -0700)]
ocfs2/trivial: Print message indicating unaligned aio+dio write

Print a message indicating unaligned aio+dio writes. It prints a message
once per 24 hrs.
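
A minimal sketch of such a limiter, assuming a caller-supplied timestamp in
seconds; OncePerInterval is a hypothetical name and this is not the
mechanism the kernel patch uses.

```python
DAY_SECONDS = 24 * 60 * 60


class OncePerInterval:
    """Allow a message at most once per `interval` seconds."""

    def __init__(self, interval=DAY_SECONDS):
        self.interval = interval
        self.last = None   # time of the last allowed message

    def should_print(self, now):
        if self.last is None or now - self.last >= self.interval:
            self.last = now
            return True
        return False
```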

Orabug: 17342255

Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Srinivas Eeda <srinivas.eeda@oracle.com>
(cherry picked from commit dd146cf4c67bfce3a1fe1495c2f68d149d1c6db0)

10 years agoLinux 4.1 v4.1 v4.1test
Linus Torvalds [Mon, 22 Jun 2015 05:05:43 +0000 (22:05 -0700)]
Linux 4.1

10 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending
Linus Torvalds [Sun, 21 Jun 2015 00:26:01 +0000 (17:26 -0700)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending

Pull scsi target fixes from Nicholas Bellinger:
 "Apologies for the late pull request.

  Here are the outstanding target-pending fixes for v4.1 code.

  The series contains three patches from Sagi + Co that address a few
  iser-target issues that have been uncovered during recent testing at
  Mellanox.

  Patch #1 has a v3.16+ stable tag, and #2-3 have v3.10+ stable tags"

* git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending:
  iser-target: Fix possible use-after-free
  iser-target: release stale iser connections
  iser-target: Fix variable-length response error completion

10 years agoMerge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Linus Torvalds [Sat, 20 Jun 2015 20:54:22 +0000 (13:54 -0700)]
Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux

Pull drm fixes from Dave Airlie:
 "A smattering of fixes,

  mgag200:
      don't accept modes that aren't aligned properly as hw can't do it

  i915:
      two regression fixes

  radeon:
      one query to allow userspace fixes
      one oops fixer for older hw with new options enabled"

* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
  drm/radeon: don't probe MST on hw we don't support it on
  drm/radeon: Add RADEON_INFO_VA_UNMAP_WORKING query
  drm/mgag200: Reject non-character-cell-aligned mode widths
  Revert "drm/i915: Don't skip request retirement if the active list is empty"
  drm/i915: Always reset vma->ggtt_view.pages cache on unbinding

10 years agoMerge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Fri, 19 Jun 2015 17:36:50 +0000 (07:36 -1000)]
Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux

Pull clk fixes from Michael Turquette:
 "Very late clk regression fixes for the ARM-based AT91 platform.

  These went unnoticed by me until recently, hence the late pull
  request"

* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
  clk: at91: fix h32mx prototype inclusion in pmc header
  clk: at91: trivial: typo in peripheral clock description
  clk: at91: fix PERIPHERAL_MAX_SHIFT definition
  clk: at91: pll: fix input range validity check

10 years agoMerge tag 'sound-4.1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Linus Torvalds [Fri, 19 Jun 2015 17:34:14 +0000 (07:34 -1000)]
Merge tag 'sound-4.1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound

Pull sound fixes from Takashi Iwai:
 "Nothing looks scary, just a few usual HD-audio regression fixes and
  fixup, in addition to a minor Kconfig dependency fix for the old MIPS
  drivers"

* tag 'sound-4.1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ALSA: hda - Fix unused label skip_i915
  ALSA: hda - Fix noisy outputs on Dell XPS13 (2015 model)
  ALSA: mips: let SND_SGI_O2 select SND_PCM
  ALSA: hda - Fix audio crackles on Dell Latitude E7x40
  ALSA: hda - adding a DAC/pin preference map for a HP Envy TS machine

10 years agoMerge branch 'ccf/atmel-fixes-for-4.1' of https://github.com/bbrezillon/linux-at91...
Michael Turquette [Fri, 19 Jun 2015 14:37:14 +0000 (07:37 -0700)]
Merge branch 'ccf/atmel-fixes-for-4.1' of https://github.com/bbrezillon/linux-at91 into clk-fixes

10 years agoclk: at91: fix h32mx prototype inclusion in pmc header
Nicolas Ferre [Thu, 28 May 2015 13:07:21 +0000 (15:07 +0200)]
clk: at91: fix h32mx prototype inclusion in pmc header

Trivial fix for an issue that prevents this pmc clock driver from compiling
if the h32mx clock is present but the smd clock isn't.

Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Acked-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Fixes: bcc5fd49a0fd ("clk: at91: add a driver for the h32mx clock")
Cc: <stable@vger.kernel.org> # 3.18+
10 years agoclk: at91: trivial: typo in peripheral clock description
Nicolas Ferre [Wed, 17 Jun 2015 13:22:51 +0000 (15:22 +0200)]
clk: at91: trivial: typo in peripheral clock description

Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
10 years agoclk: at91: fix PERIPHERAL_MAX_SHIFT definition
Boris Brezillon [Thu, 28 May 2015 12:01:08 +0000 (14:01 +0200)]
clk: at91: fix PERIPHERAL_MAX_SHIFT definition

Fix the PERIPHERAL_MAX_SHIFT definition (3 instead of 4) and adapt the
round_rate and set_rate logic accordingly.
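
As a rough illustration of what a maximum shift of 3 means for rate
rounding, assuming the peripheral clock divides its parent by 2^shift. The
round_rate helper below is a hypothetical sketch, not the driver's
implementation.

```python
PERIPHERAL_MAX_SHIFT = 3   # value stated in the commit message


def round_rate(parent_rate, target_rate):
    """Return (rate, shift) for the divider closest to target_rate."""
    best_rate, best_shift = None, None
    for shift in range(PERIPHERAL_MAX_SHIFT + 1):
        rate = parent_rate >> shift            # parent / 2^shift
        if best_rate is None or abs(rate - target_rate) < abs(best_rate - target_rate):
            best_rate, best_shift = rate, shift
    return best_rate, best_shift
```

With a 132 MHz parent the lowest reachable rate is 16.5 MHz (shift 3); a
request for 8.25 MHz is rounded to that, because a shift of 4 is not
allowed.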

Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Reported-by: "Wu, Songjun" <Songjun.Wu@atmel.com>
10 years agoclk: at91: pll: fix input range validity check
Boris Brezillon [Fri, 27 Mar 2015 22:53:15 +0000 (23:53 +0100)]
clk: at91: pll: fix input range validity check

The PLL imposes a certain input range to work correctly, but it appears that
this input range does not apply to the input clock (or parent clock) but
to the input clock after it has passed through the PLL divisor.
Fix the implementation accordingly.
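
A sketch of the corrected check, with hypothetical names and example
numbers that are not taken from any datasheet:

```python
def pll_input_ok(parent_rate, div, in_min, in_max):
    """Validate the PLL input range against the divided parent rate."""
    if div <= 0:
        return False
    divided = parent_rate // div   # the rate the PLL actually sees
    return in_min <= divided <= in_max
```

For example, with an assumed range of 2 MHz to 32 MHz, a 48 MHz parent with
a divisor of 4 is acceptable (12 MHz after the divisor) even though 48 MHz
itself is outside the range.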

Cc: <stable@vger.kernel.org> # v3.14+
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Reported-by: Jonas Andersson <jonas@microbit.se>
10 years agoMerge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa...
Linus Torvalds [Fri, 19 Jun 2015 03:02:27 +0000 (17:02 -1000)]
Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux

Pull i2c documentation fix from Wolfram Sang:
 "Here is a small documentation fix for I2C.

  We already had a user who unsuccessfully tried to get the new slave
  framework running with the currently broken example.  So, before this
  happens again, I'd like to have this how-to-use section fixed for 4.1
  already.  So that no more hacking time is wasted"

* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  i2c: slave: fix the example how to instantiate from userspace

10 years agorevert "cpumask: don't perform while loop in cpumask_next_and()"
Andrew Morton [Thu, 18 Jun 2015 18:01:11 +0000 (11:01 -0700)]
revert "cpumask: don't perform while loop in cpumask_next_and()"

Revert commit 534b483a86e6 ("cpumask: don't perform while loop in
cpumask_next_and()").

This was a minor optimization, but it puts a `struct cpumask' on the
stack, which consumes too much stack space.
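
To give a feel for the size involved: a struct cpumask holds one bit per
possible CPU, rounded up to whole longs. The helper below is a
back-of-the-envelope sketch (cpumask_bytes is a hypothetical name, and 8192
is only an example NR_CPUS value).

```python
def cpumask_bytes(nr_cpus, long_bytes=8):
    """Size of a bitmap with one bit per CPU, rounded up to whole longs."""
    bits_per_long = long_bytes * 8
    longs = (nr_cpus + bits_per_long - 1) // bits_per_long
    return longs * long_bytes
```

With NR_CPUS = 8192 this gives 1024 bytes for a single on-stack mask.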

Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Reported-by: Peter Zijlstra <peterz@infradead.org>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Amir Vadai <amirv@mellanox.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
10 years agoMerge tag 'drm-intel-fixes-2015-06-18' of git://anongit.freedesktop.org/drm-intel...
Dave Airlie [Fri, 19 Jun 2015 01:58:39 +0000 (11:58 +1000)]
Merge tag 'drm-intel-fixes-2015-06-18' of git://anongit.freedesktop.org/drm-intel into drm-fixes

one fix, one revert
* tag 'drm-intel-fixes-2015-06-18' of git://anongit.freedesktop.org/drm-intel:
  Revert "drm/i915: Don't skip request retirement if the active list is empty"
  drm/i915: Always reset vma->ggtt_view.pages cache on unbinding

10 years agoMerge branch 'drm-fixes-4.1' of git://people.freedesktop.org/~deathsimple/linux into...
Dave Airlie [Fri, 19 Jun 2015 01:55:29 +0000 (11:55 +1000)]
Merge branch 'drm-fixes-4.1' of git://people.freedesktop.org/~deathsimple/linux into drm-fixes

two radeon fixes
one MST fix,
one query addition, destined for stable, and to fix a regression
* 'drm-fixes-4.1' of git://people.freedesktop.org/~deathsimple/linux:
  drm/radeon: don't probe MST on hw we don't support it on
  drm/radeon: Add RADEON_INFO_VA_UNMAP_WORKING query

10 years agodrm/radeon: don't probe MST on hw we don't support it on
Dave Airlie [Thu, 18 Jun 2015 04:29:18 +0000 (14:29 +1000)]
drm/radeon: don't probe MST on hw we don't support it on

If you do radeon.mst=1 on a gpu without mst hw, and then
plug some mst hw it will oops instead of falling back.

So check we have DCE5 at least before proceeding.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
10 years agodrm/radeon: Add RADEON_INFO_VA_UNMAP_WORKING query
Michel Dänzer [Tue, 16 Jun 2015 08:28:16 +0000 (17:28 +0900)]
drm/radeon: Add RADEON_INFO_VA_UNMAP_WORKING query

This tells userspace that it's safe to use the RADEON_VA_UNMAP operation
of the DRM_RADEON_GEM_VA ioctl.

Cc: stable@vger.kernel.org
(NOTE: Backporting this commit requires at least backports of commits
26d4d129b6042197b4cbc8341c0618f99231af2f,
48afbd70ac7b6aa62e8d452091023941d8085f8a and
c29c0876ec05d51a93508a39b90b92c29ba6423d as well, otherwise using
RADEON_VA_UNMAP runs into trouble)

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
10 years agoMerge tag 'trace-fix-filter-4.1-rc8' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Thu, 18 Jun 2015 06:56:57 +0000 (20:56 -1000)]
Merge tag 'trace-fix-filter-4.1-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull tracing filter fix from Steven Rostedt:
 "Vince Weaver reported a warning when he added perf event filters into
  his fuzzer tests.  There's a missing check of balanced operations when
  parentheses are used, and this triggers a WARN_ON() and when reading
  the failure, the filter reports no failure occurred.

  The operands were not being checked if they match, this adds that"

* tag 'trace-fix-filter-4.1-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  tracing: Have filter check for balanced ops

10 years agoMerge git://git.kernel.org/pub/scm/virt/kvm/kvm
Linus Torvalds [Thu, 18 Jun 2015 06:54:47 +0000 (20:54 -1000)]
Merge git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm bugfix from Marcelo Tosatti:
 "Restore APIC migration functionality"

* git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: x86: fix lapic.timer_mode on restore

10 years agoKconfig: disable Media Controller for DVB
Mauro Carvalho Chehab [Tue, 16 Jun 2015 09:26:59 +0000 (06:26 -0300)]
Kconfig: disable Media Controller for DVB

Ever since we started discussing the use of the Media Controller for
complex hardware, one thing has become clear: the way it is, MC fails to
map anything other than capture/output/m2m video-only streaming.

The point is that MC has entities named as devnodes, but the only
devnode used (before the DVB patches) is MEDIA_ENT_T_DEVNODE_V4L.
Due to the way MC got implemented, however, this entity actually
doesn't represent the devnode, but the hardware I/O engine that
receives data via DMA.

By coincidence, such DMA is associated with the V4L device node
on webcam hardware, but this is not true even for other V4L2
devices. For example, on USB hardware, the DMA is done via the
USB controller. The data passes through an in-kernel filter that
strips off the URB headers. Other V4L2 devices like radio may not
even have DMA. When they do, the DMA is done via ALSA, and not
via the V4L devnode.

In other words, MC is broken as a whole, but tagging it as BROKEN
right now would do more harm than good.

So, instead, let's mark, for now, the DVB part as broken and
block all new changes to MC while we fix this mess, which
we hopefully will do for the next Kernel version.

Requested-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Acked-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
10 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Linus Torvalds [Thu, 18 Jun 2015 06:49:26 +0000 (20:49 -1000)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6

Pull crypto fixes from Herbert Xu:
 "This fixes the following issues:

   - Crash in caam hash due to uninitialised buffer lengths.

   - Alignment issue in caam RNG that may lead to non-random output"

* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  crypto: caam - fix RNG buffer cache alignment
  crypto: caam - improve initalization for context state saves

10 years agomm: shmem_zero_setup skip security check and lockdep conflict with XFS
Hugh Dickins [Sun, 14 Jun 2015 16:48:09 +0000 (09:48 -0700)]
mm: shmem_zero_setup skip security check and lockdep conflict with XFS

It appears that, at some point last year, XFS made directory handling
changes which bring it into lockdep conflict with shmem_zero_setup():
it is surprising that mmap() can clone an inode while holding mmap_sem,
but that has been so for many years.

Since those few lockdep traces that I've seen all implicated selinux,
I'm hoping that we can use the __shmem_file_setup(,,,S_PRIVATE) which
v3.13's commit c7277090927a ("security: shmem: implement kernel private
shmem inodes") introduced to avoid LSM checks on kernel-internal inodes:
the mmap("/dev/zero") cloned inode is indeed a kernel-internal detail.

This also covers the !CONFIG_SHMEM use of ramfs to support /dev/zero
(and MAP_SHARED|MAP_ANONYMOUS).  I thought there were also drivers
which cloned inode in mmap(), but if so, I cannot locate them now.

Reported-and-tested-by: Prarit Bhargava <prarit@redhat.com>
Reported-and-tested-by: Daniel Wagner <wagi@monom.org>
Reported-and-tested-by: Morten Stevens <mstevens@fedoraproject.org>
Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
10 years agoi2c: slave: fix the example how to instantiate from userspace
Wolfram Sang [Mon, 15 Jun 2015 17:51:46 +0000 (19:51 +0200)]
i2c: slave: fix the example how to instantiate from userspace

I copied the wrong shell code into the documentation. Sorry to all who
tried to get sense out of this current example :/ Slight rewording while
we are here.

Reported-by: Tim Bakker <bakkert@mymail.vcu.edu>
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Cc: stable@kernel.org
10 years agotracing: Have filter check for balanced ops
Steven Rostedt [Mon, 15 Jun 2015 21:50:25 +0000 (17:50 -0400)]
tracing: Have filter check for balanced ops

When the following filter is used it causes a warning to trigger:

 # cd /sys/kernel/debug/tracing
 # echo "((dev==1)blocks==2)" > events/ext4/ext4_truncate_exit/filter
-bash: echo: write error: Invalid argument
 # cat events/ext4/ext4_truncate_exit/filter
((dev==1)blocks==2)
^
parse_error: No error

 ------------[ cut here ]------------
 WARNING: CPU: 2 PID: 1223 at kernel/trace/trace_events_filter.c:1640 replace_preds+0x3c5/0x990()
 Modules linked in: bnep lockd grace bluetooth  ...
 CPU: 3 PID: 1223 Comm: bash Tainted: G        W       4.1.0-rc3-test+ #450
 Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01 v02.05 05/07/2012
  0000000000000668 ffff8800c106bc98 ffffffff816ed4f9 ffff88011ead0cf0
  0000000000000000 ffff8800c106bcd8 ffffffff8107fb07 ffffffff8136b46c
  ffff8800c7d81d48 ffff8800d4c2bc00 ffff8800d4d4f920 00000000ffffffea
 Call Trace:
  [<ffffffff816ed4f9>] dump_stack+0x4c/0x6e
  [<ffffffff8107fb07>] warn_slowpath_common+0x97/0xe0
  [<ffffffff8136b46c>] ? _kstrtoull+0x2c/0x80
  [<ffffffff8107fb6a>] warn_slowpath_null+0x1a/0x20
  [<ffffffff81159065>] replace_preds+0x3c5/0x990
  [<ffffffff811596b2>] create_filter+0x82/0xb0
  [<ffffffff81159944>] apply_event_filter+0xd4/0x180
  [<ffffffff81152bbf>] event_filter_write+0x8f/0x120
  [<ffffffff811db2a8>] __vfs_write+0x28/0xe0
  [<ffffffff811dda43>] ? __sb_start_write+0x53/0xf0
  [<ffffffff812e51e0>] ? security_file_permission+0x30/0xc0
  [<ffffffff811dc408>] vfs_write+0xb8/0x1b0
  [<ffffffff811dc72f>] SyS_write+0x4f/0xb0
  [<ffffffff816f5217>] system_call_fastpath+0x12/0x6a
 ---[ end trace e11028bd95818dcd ]---

Worse yet, reading the error message (the filter again) it says that
there was no error, when there clearly was. The issue is that the
code that checks the input does not check for balanced ops. That is,
having an op between a closed parenthesis and the next token.

This would only cause a warning, and fail out before doing any real
harm, but it should still not cause a warning, and the error reported
should work:

 # cd /sys/kernel/debug/tracing
 # echo "((dev==1)blocks==2)" > events/ext4/ext4_truncate_exit/filter
-bash: echo: write error: Invalid argument
 # cat events/ext4/ext4_truncate_exit/filter
((dev==1)blocks==2)
^
parse_error: Meaningless filter expression

And give no kernel warning.
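
The idea of a balanced-operand check can be sketched on a postfix token
list. This is a toy illustration, not the kernel's filter parser:
operands_balanced is a hypothetical name, and treating each comparison such
as "dev==1" as a single operand token is an assumption.

```python
def operands_balanced(postfix):
    """Return True if every logical op has two operands and exactly one
    result remains at the end."""
    depth = 0
    for tok in postfix:
        if tok in ("&&", "||"):
            depth -= 2          # a logical op consumes two operands
            if depth < 0:
                return False
            depth += 1          # and yields one result
        else:
            depth += 1          # a comparison predicate
    return depth == 1
```

In this model "((dev==1)blocks==2)" becomes two predicates with no logical
op between them, so two results remain and the expression is rejected.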

Link: http://lkml.kernel.org/r/20150615175025.7e809215@gandalf.local.home
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: stable@vger.kernel.org # 2.6.31+
Reported-by: Vince Weaver <vincent.weaver@maine.edu>
Tested-by: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
10 years agoALSA: hda - Fix unused label skip_i915
Takashi Iwai [Tue, 16 Jun 2015 10:23:36 +0000 (12:23 +0200)]
ALSA: hda - Fix unused label skip_i915

When CONFIG_SND_HDA_I915=n, we get a compile warning:
  sound/pci/hda/hda_intel.c: In function ‘azx_probe_continue’:
  sound/pci/hda/hda_intel.c:1882:2: warning: label ‘skip_i915’ defined but not used [-Wunused-label]

Fix it by putting the ifdef around it again.  Sigh.

Fixes: bf06848bdbe5 ('ALSA: hda - Continue probing even if i915 binding fails')
Reported-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
10 years agocrypto: caam - fix RNG buffer cache alignment
Steve Cornelius [Mon, 15 Jun 2015 23:52:59 +0000 (16:52 -0700)]
crypto: caam - fix RNG buffer cache alignment

The hwrng output buffers (2) are cast inside of a struct (caam_rng_ctx)
allocated in one DMA-tagged region. While the kernel's heap allocator
should place the overall struct on a cacheline aligned boundary, the 2
buffers contained within may not necessarily align. Consequently, the ends
of unaligned buffers may not fully flush, and if so, stale data will be left
behind, resulting in small repeating patterns.

This fix aligns the buffers inside the struct.
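
The effect of aligning the buffers can be sketched with a simple offset
calculation. The field names and sizes below are made up for illustration
and do not reflect the real caam_rng_ctx layout; natural alignment padding
is ignored.

```python
def align_up(offset, alignment):
    """Round offset up to the next multiple of alignment."""
    return (offset + alignment - 1) // alignment * alignment


def layout(fields, cacheline=None):
    """fields: list of (name, size, is_dma_buffer). Return {name: offset}.

    When cacheline is given, DMA buffers start on a cacheline boundary.
    """
    offsets, pos = {}, 0
    for name, size, is_dma in fields:
        if is_dma and cacheline:
            pos = align_up(pos, cacheline)
        offsets[name] = pos
        pos += size
    return offsets
```

With a 24-byte header followed by two 100-byte buffers, the unaligned
layout puts the buffers at offsets 24 and 124, so they share cachelines
with their neighbours; with 64-byte alignment they start at 64 and 192.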

Note that not all of the data inside caam_rng_ctx necessarily needs to be
DMA-tagged, only the buffers themselves require this. However, a fix would
incur the expense of error-handling bloat in the case of allocation failure.

Cc: stable@vger.kernel.org
Signed-off-by: Steve Cornelius <steve.cornelius@freescale.com>
Signed-off-by: Victoria Milhoan <vicki.milhoan@freescale.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
10 years agocrypto: caam - improve initalization for context state saves
Steve Cornelius [Mon, 15 Jun 2015 23:52:56 +0000 (16:52 -0700)]
crypto: caam - improve initalization for context state saves

Multiple functions in asynchronous hashing use a saved-state block,
a.k.a. struct caam_hash_state, which holds a stash of information
between requests (init/update/final). Certain values in this state
block are loaded for processing using an inline-if, and when this
is done, the potential for uninitialized data can pose conflicts.
Therefore, this patch improves initialization of state data to
prevent false assignments using uninitialized data in the state block.

This patch addresses the following traceback, originating in
ahash_final_ctx(), although a problem like this could certainly
exhibit other symptoms:

kernel BUG at arch/arm/mm/dma-mapping.c:465!
Unable to handle kernel NULL pointer dereference at virtual address 00000000
pgd = 80004000
[00000000] *pgd=00000000
Internal error: Oops: 805 [#1] PREEMPT SMP
Modules linked in:
CPU: 0    Not tainted  (3.0.15-01752-gdd441b9-dirty #40)
PC is at __bug+0x1c/0x28
LR is at __bug+0x18/0x28
pc : [<80043240>]    lr : [<8004323c>]    psr: 60000013
sp : e423fd98  ip : 60000013  fp : 0000001c
r10: e4191b84  r9 : 00000020  r8 : 00000009
r7 : 88005038  r6 : 00000001  r5 : 2d676572  r4 : e4191a60
r3 : 00000000  r2 : 00000001  r1 : 60000093  r0 : 00000033
Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment kernel
Control: 10c53c7d  Table: 1000404a  DAC: 00000015
Process cryptomgr_test (pid: 1306, stack limit = 0xe423e2f0)
Stack: (0xe423fd98 to 0xe4240000)
fd80:                                                       11807fd1 80048544
fda0: 88005000 e4191a00 e5178040 8039dda0 00000000 00000014 2d676572 e4191008
fdc0: 88005018 e4191a60 00100100 e4191a00 00000000 8039ce0c e423fea8 00000007
fde0: e4191a00 e4227000 e5178000 8039ce18 e419183c 80203808 80a94a44 00000006
fe00: 00000000 80207180 00000000 00000006 e423ff08 00000000 00000007 e5178000
fe20: e41918a4 80a949b4 8c4844e2 00000000 00000049 74227000 8c4844e2 00000e90
fe40: 0000000e 74227e90 ffff8c58 80ac29e0 e423fed4 8006a350 8c81625c e423ff5c
fe60: 00008576 e4002500 00000003 00030010 e4002500 00000003 e5180000 e4002500
fe80: e5178000 800e6d24 007fffff 00000000 00000010 e4001280 e4002500 60000013
fea0: 000000d0 804df078 00000000 00000000 00000000 00000000 00000000 00000000
fec0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
fee0: 00000000 00000000 e4227000 e4226000 e4753000 e4752000 e40a5000 e40a4000
ff00: e41e7000 e41e6000 00000000 00000000 00000000 e423ff14 e423ff14 00000000
ff20: 00000400 804f9080 e5178000 e4db0b40 00000000 e4db0b80 0000047c 00000400
ff40: 00000000 8020758c 00000400 ffffffff 0000008a 00000000 e4db0b40 80206e00
ff60: e4049dbc 00000000 00000000 00000003 e423ffa4 80062978 e41a8bfc 00000000
ff80: 00000000 e4049db4 00000013 e4049db0 00000013 00000000 00000000 00000000
ffa0: e4db0b40 e4db0b40 80204cbc 00000013 00000000 00000000 00000000 80204cfc
ffc0: e4049da0 80089544 80040a40 00000000 e4db0b40 00000000 00000000 00000000
ffe0: e423ffe0 e423ffe0 e4049da0 800894c4 80040a40 80040a40 00000000 00000000
[<80043240>] (__bug+0x1c/0x28) from [<80048544>] (___dma_single_dev_to_cpu+0x84)
[<80048544>] (___dma_single_dev_to_cpu+0x84/0x94) from [<8039dda0>] (ahash_fina)
[<8039dda0>] (ahash_final_ctx+0x180/0x428) from [<8039ce18>] (ahash_final+0xc/0)
[<8039ce18>] (ahash_final+0xc/0x10) from [<80203808>] (crypto_ahash_op+0x28/0xc)
[<80203808>] (crypto_ahash_op+0x28/0xc0) from [<80207180>] (test_hash+0x214/0x5)
[<80207180>] (test_hash+0x214/0x5b8) from [<8020758c>] (alg_test_hash+0x68/0x8c)
[<8020758c>] (alg_test_hash+0x68/0x8c) from [<80206e00>] (alg_test+0x7c/0x1b8)
[<80206e00>] (alg_test+0x7c/0x1b8) from [<80204cfc>] (cryptomgr_test+0x40/0x48)
[<80204cfc>] (cryptomgr_test+0x40/0x48) from [<80089544>] (kthread+0x80/0x88)
[<80089544>] (kthread+0x80/0x88) from [<80040a40>] (kernel_thread_exit+0x0/0x8)
Code: e59f0010 e1a01003 eb126a8d e3a03000 (e5833000)
---[ end trace d52a403a1d1eaa86 ]---

Cc: stable@vger.kernel.org
Signed-off-by: Steve Cornelius <steve.cornelius@freescale.com>
Signed-off-by: Victoria Milhoan <vicki.milhoan@freescale.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
10 years agoKVM: x86: fix lapic.timer_mode on restore
Radim Krčmář [Fri, 5 Jun 2015 18:57:41 +0000 (20:57 +0200)]
KVM: x86: fix lapic.timer_mode on restore

lapic.timer_mode was not properly initialized after migration, which
broke a few useful things, like login, by making every sleep eternal.

Fix this by calling apic_update_lvtt in kvm_apic_post_state_restore.

There are other slowpaths that update lvtt, so this patch makes sure
something similar doesn't happen again by calling apic_update_lvtt
after every modification.
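
The pattern is a cached value derived from a register, which must be
recomputed on every path that writes the register. A toy sketch follows;
ToyApic is hypothetical, and the mask assumes the timer mode lives in bits
17-18 of the LVT timer register.

```python
TIMER_MODE_MASK = 0x3 << 17   # assumption: mode bits of the LVTT register


class ToyApic:
    def __init__(self):
        self.lvtt = 0
        self.timer_mode = 0   # cached copy of the mode bits

    def update_lvtt(self):
        # Recompute the cached mode from the register value.
        self.timer_mode = self.lvtt & TIMER_MODE_MASK

    def set_lvtt(self, value):
        self.lvtt = value
        self.update_lvtt()

    def restore(self, saved_lvtt):
        # State restore writes the register directly, so the cache has
        # to be refreshed here as well.
        self.lvtt = saved_lvtt
        self.update_lvtt()
```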

Cc: stable@vger.kernel.org
Fixes: f30ebc312ca9 ("KVM: x86: optimize some accesses to LVTT and SPIV")
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
10 years agodrm/mgag200: Reject non-character-cell-aligned mode widths
Adam Jackson [Mon, 15 Jun 2015 20:16:15 +0000 (16:16 -0400)]
drm/mgag200: Reject non-character-cell-aligned mode widths

Turns out 1366x768 does not in fact work on this hardware.
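
As an illustration, assuming a character cell is 8 pixels wide
(mode_width_ok is a hypothetical helper, not the driver's mode_valid hook):

```python
CHAR_CELL = 8   # assumed character cell width in pixels


def mode_width_ok(hdisplay):
    """Accept only widths that are a whole number of character cells."""
    return hdisplay % CHAR_CELL == 0
```

1366 leaves a remainder of 6 when divided by 8, so such a mode would be
rejected, while 1360 or 1024 would pass.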

Signed-off-by: Adam Jackson <ajax@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Dave Airlie <airlied@redhat.com>
10 years agoALSA: hda - Fix noisy outputs on Dell XPS13 (2015 model)
Takashi Iwai [Mon, 15 Jun 2015 18:36:12 +0000 (20:36 +0200)]
ALSA: hda - Fix noisy outputs on Dell XPS13 (2015 model)

The new Dell XPS13 also requires a similar quirk for fixing the
noisy outputs.  (But, as the codec was changed, now the fixup for
Latitude is used instead.)

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=99851
Signed-off-by: Takashi Iwai <tiwai@suse.de>
10 years agoRevert "drm/i915: Don't skip request retirement if the active list is empty"
Jani Nikula [Mon, 15 Jun 2015 09:59:37 +0000 (12:59 +0300)]
Revert "drm/i915: Don't skip request retirement if the active list is empty"

This reverts commit 0aedb1626566efd72b369c01992ee7413c82a0c5.

I messed things up while applying [1] to drm-intel-fixes. Rectify.

[1] http://mid.gmane.org/1432827156-9605-1-git-send-email-ville.syrjala@linux.intel.com

Fixes: 0aedb1626566 ("drm/i915: Don't skip request retirement if the active list is empty")
Cc: stable@vger.kernel.org
Acked-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
10 years agoALSA: mips: let SND_SGI_O2 select SND_PCM
Nicholas Mc Guire [Sun, 14 Jun 2015 17:16:59 +0000 (19:16 +0200)]
ALSA: mips: let SND_SGI_O2 select SND_PCM

Fix the missing dependency on PCM stuff.

[Add the same fix for HAL2, too -- tiwai]

Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
10 years agoALSA: hda - Fix audio crackles on Dell Latitude E7x40
Takashi Iwai [Mon, 15 Jun 2015 09:59:32 +0000 (11:59 +0200)]
ALSA: hda - Fix audio crackles on Dell Latitude E7x40

We still got a report that the audio crackles and noises occur with
the recent 4.1 kernels on Dell machines.  These machines seem to need
similar workarounds that have been applied to the recent Dell XPS 13
models.  Since the codec of these machines (Dell Latitude E7240 and
E7440) is different from XPS 13's one, we need a new fixup entry.

Also, it was confirmed that the previous workaround to disable the
widget power-save (commit [219f47e4f964: ALSA: hda - Disable widget
power-saving for ALC292 & co]) is no longer needed after this fix.
So, this patch includes the partial revert of the commit, too.

Reported-and-tested-by: Mihai Donțu <mihai.dontu@gmail.com>
Tested-by: Jonathan McDowell <noodles@earth.li>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
10 years agoALSA: hda - adding a DAC/pin preference map for a HP Envy TS machine
Hui Wang [Mon, 15 Jun 2015 09:43:39 +0000 (17:43 +0800)]
ALSA: hda - adding a DAC/pin preference map for a HP Envy TS machine

On an HP Envy TouchSmart laptop, there are 2 speakers (main speaker
and subwoofer speaker), 1 headphone and 2 DACs. Without this fixup,
the headphone is assigned to one DAC and both speakers are assigned
to the other DAC, which makes the surround-2.1 channels invalid.

To fix it, use a DAC/pin preference map to bind the main speaker to
one DAC and the subwoofer speaker to the other DAC.

Cc: <stable@vger.kernel.org>
Signed-off-by: Hui Wang <hui.wang@canonical.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
10 years agodrm/i915: Always reset vma->ggtt_view.pages cache on unbinding
Chris Wilson [Thu, 11 Jun 2015 07:06:08 +0000 (08:06 +0100)]
drm/i915: Always reset vma->ggtt_view.pages cache on unbinding

With the introduction of multiple views of an obj in the same vm, each
vma was taught to cache its copy of the pages (so that different views
could have different page arrangements). However, this missed decoupling
those vma->ggtt_view.pages when the vma released its reference on the
obj->pages. As we don't always free the vma, this leads to a possible
scenario (e.g. execbuffer interrupted by the shrinker) where the vma
points to a stale obj->pages, and explodes.

Fixes regression from commit fe14d5f4e5468c5b80a24f1a64abcbe116143670
Author: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Date:   Wed Dec 10 17:27:58 2014 +0000

    drm/i915: Infrastructure for supporting different GGTT views per object

Tvrtko says, in case someone else is confused about how this can
happen, the key is the reservation execbuffer path. That puts the VMA
on the exec_list, which prevents i915_vma_unbind and
i915_gem_vma_destroy from fully destroying the VMA. So the VMA is left
existing as an empty object in the list - unbound and disassociated
from the backing store. Kind of a cached memory object. And then
re-using it needs to clear the cached pages pointer, which is fixed
above.

Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1227892
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Michel Thierry <michel.thierry@intel.com>
Cc: stable@vger.kernel.org
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
[Jani: Added Tvrtko's explanation to commit message.]
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
10 years agoLinux 4.1-rc8 v4.1-rc8
Linus Torvalds [Mon, 15 Jun 2015 01:51:10 +0000 (15:51 -1000)]
Linux 4.1-rc8

10 years agoMerge branch 'fixes' of git://git.infradead.org/users/vkoul/slave-dma
Linus Torvalds [Mon, 15 Jun 2015 01:48:26 +0000 (15:48 -1000)]
Merge branch 'fixes' of git://git.infradead.org/users/vkoul/slave-dma

Pull dmaengine fixes from Vinod Koul:
 "Here is hopefully the last set of fixes for 4.1. This time we have:

   - fixing pause capability reporting on both dmaengine pause & resume
     support by Krzysztof

   - locking fix for at_xdmac by Ludovic

   - slave configuration fix for at_xdmac by Ludovic"

* 'fixes' of git://git.infradead.org/users/vkoul/slave-dma:
  dmaengine: Fix choppy sound because of unimplemented resume
  dmaengine: at_xdmac: rework slave configuration part
  dmaengine: at_xdmac: lock fixes

10 years agoMerge tag 'ntb-4.1' of git://github.com/jonmason/ntb
Linus Torvalds [Mon, 15 Jun 2015 01:46:43 +0000 (15:46 -1000)]
Merge tag 'ntb-4.1' of git://github.com/jonmason/ntb

Pull NTB fixes from Jon Mason:
 "I apologize for the tardiness of this request.  Here are a couple of
  last minute NTB bug fixes for v4.1:

  NTB bug fixes to address issues in unmapping the MW reg base and
  vbase, and an uninitialized variable on Atom platforms"

* tag 'ntb-4.1' of git://github.com/jonmason/ntb:
  ntb: initialize max_mw for Atom before using it
  ntb: iounmap MW reg and vbase in error path

10 years agoMerge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus
Linus Torvalds [Mon, 15 Jun 2015 01:38:57 +0000 (15:38 -1000)]
Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus

Pull more MIPS fixes from Ralf Baechle:
 "Another round of 4.1 MIPS fixes, one fix to a MIPS-specific #if
  condition in lib/mpi, one fix to the MIPS GIC irqchip driver and one
  SSB fix.

  Details:
   - fix handling of clock in chipco SSB driver.
   - fix two MIPS-specific #if conditions to correctly work for GCC 5.1.
   - fix damage to R6 pgtable bits done by XPA support.
   - fix possible crash due to unloading modules that contain statically
     defined platform devices.
   - fix disabling of the MSA ASE on context switch to also work
     correctly when a new thread/process has the CPU for the very first
     time.

  This is part of linux-next and has been beaten to death on
  Imagination's test farm.

  While things are not looking too grim this pull request also means the
  rate of fixes for 4.1 remains nearly constant so I'd not be unhappy if
  you'd delay the release"

* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
  MPI: MIPS: Fix compilation error with GCC 5.1
  IRQCHIP: mips-gic: Don't nest calls to do_IRQ()
  MIPS: MSA: bugfix - disable MSA correctly for new threads/processes.
  MIPS: Loongson: Do not register 8250 platform device from module.
  MIPS: Cobalt: Do not build MTD platform device registration code as module.
  SSB: Fix handling of ssb_pmu_get_alp_clock()
  MIPS: pgtable-bits: Fix XPA damage to R6 definitions.

10 years agoMerge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Mon, 15 Jun 2015 00:53:02 +0000 (14:53 -1000)]
Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull irqchip fix from Thomas Gleixner:
 "A single fix for an off by one bug in the sunxi irqchip driver"

* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  irqchip: sunxi-nmi: Fix off-by-one error in irq iterator

10 years agoMerge branch 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Mon, 15 Jun 2015 00:03:11 +0000 (14:03 -1000)]
Merge branch 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull lockdep fix from Ingo Molnar:
 "A lockdep/modules unload race fix that can oops"

* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  lockdep: Fix a race between /proc/lock_stat and module unload

10 years agoMerge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Mon, 15 Jun 2015 00:00:13 +0000 (14:00 -1000)]
Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf fixes from Ingo Molnar:
 "A regression fix for a crash, and an Intel HSW uncore PMU driver fix"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  Revert "perf/x86/intel/uncore: Move uncore_box_init() out of driver initialization"
  perf/x86/intel/uncore: Fix CBOX bit wide and UBOX reg on Haswell-EP

10 years agoMerge tag 'sound-4.1-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai...
Linus Torvalds [Sun, 14 Jun 2015 23:55:24 +0000 (13:55 -1000)]
Merge tag 'sound-4.1-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound

Pull sound fixes from Takashi Iwai:
 "Most of the commits are regression fixes for HD-audio: a few corner case
  fixes for regmap transition, and i915 binding issues.

  In addition, a quirk for another USB-audio device supporting DSD"

* tag 'sound-4.1-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ALSA: hda - Abort the probe without i915 binding for HSW/BDW
  ALSA: hda - Re-add the lost fake mute support
  ALSA: hda - Continue probing even if i915 binding fails
  ALSA: hda - Don't actually write registers for caps overwrites
  ALSA: hda - fix number of devices query on hotplug
  ALSA: usb-audio: add native DSD support for JLsounds I2SoverUSB

10 years agoMPI: MIPS: Fix compilation error with GCC 5.1
Jaedon Shin [Fri, 12 Jun 2015 09:04:14 +0000 (18:04 +0900)]
MPI: MIPS: Fix compilation error with GCC 5.1

This patch fixes mips compilation error:

lib/mpi/generic_mpih-mul1.c: In function 'mpihelp_mul_1':
lib/mpi/longlong.h:651:2: error: impossible constraint in 'asm'

Signed-off-by: Jaedon Shin <jaedon.shin@gmail.com>
Cc: Linux-MIPS <linux-mips@linux-mips.org>
Patchwork: https://patchwork.linux-mips.org/patch/10546/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
10 years agoIRQCHIP: mips-gic: Don't nest calls to do_IRQ()
Rabin Vincent [Fri, 12 Jun 2015 08:01:56 +0000 (10:01 +0200)]
IRQCHIP: mips-gic: Don't nest calls to do_IRQ()

The GIC chained handlers use do_IRQ() to call the subhandlers.  This
means that irq_enter() calls get nested, which leads to preempt count
looking like we're in nested interrupts, which in turn leads to all
system time being accounted as IRQ time in account_system_time().

Fix it by using generic_handle_irq().  Since these same functions are
used in some systems (if cpu_has_veic) from a low-level vectored
interrupt handler which does not go through do_IRQ(), we need to do it
conditionally.

Signed-off-by: Rabin Vincent <rabin.vincent@axis.com>
Reviewed-by: Andrew Bresticker <abrestic@chromium.org>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-mips@linux-mips.org
Cc: tglx@linutronix.de
Cc: jason@lakedaemon.net
Patchwork: https://patchwork.linux-mips.org/patch/10545/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
10 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Linus Torvalds [Sat, 13 Jun 2015 06:54:16 +0000 (20:54 -1000)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Pull networking fixes from David Miller:

 1) Fix uninitialized struct station_info in cfg80211_wireless_stats(),
    from Johannes Berg.

 2) Revert the commit that attempted to fix ipv6 protocol resubmission,
    as it adds regressions.

 3) Endless loops can be created in bridge port lists, fix from Nikolay
    Aleksandrov.

 4) Don't WARN_ON() if sk->sk_forward_alloc is non-zero in
    sk_clear_memalloc, it is a legal situation during swap deactivation.
    Fix from Mel Gorman.

 5) Fix order of disabling interrupts and unlocking NAPI in enic driver
    to avoid a race.  From Govindarajulu Varadarajan.

 6) High and low register writes are swapped when programming the start
    of periodic output in igb driver.  From Richard Cochran.

 7) Fix device rename handling in mpls stack, from Robert Shearman.

 8) Do not trigger compaction synchronously when optimistically trying
    to allocate an order 3 page in alloc_skb_with_frags() and
    skb_page_frag_refill().  From Shaohua Li.

 9) Authentication with COOKIE_ECHO is not handled properly in SCTP, fix
    from Marcelo Ricardo Leitner.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
  Doc: networking: Fix URL for wiki.wireshark.org in udplite.txt
  sctp: allow authenticating DATA chunks that are bundled with COOKIE_ECHO
  net: don't wait for order-3 page allocation
  mpls: handle device renames for per-device sysctls
  net: igb: fix the start time for periodic output signals
  enic: fix memory leak in rq_clean
  enic: check return value for stat dump
  enic: unlock napi busy poll before unmasking intr
  net, swap: Remove a warning and clarify why sk_mem_reclaim is required when deactivating swap
  bridge: fix multicast router rlist endless loop
  tipc: disconnect socket directly after probe failure
  Revert "ipv6: Fix protocol resubmission"
  cfg80211: wext: clear sinfo struct before calling driver

10 years agoDoc: networking: Fix URL for wiki.wireshark.org in udplite.txt
Masanari Iida [Fri, 12 Jun 2015 15:23:21 +0000 (00:23 +0900)]
Doc: networking: Fix URL for wiki.wireshark.org in udplite.txt

This patch fixes the URL (http to https) for wiki.wireshark.org.

Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agosctp: allow authenticating DATA chunks that are bundled with COOKIE_ECHO
Marcelo Ricardo Leitner [Thu, 11 Jun 2015 17:49:46 +0000 (14:49 -0300)]
sctp: allow authenticating DATA chunks that are bundled with COOKIE_ECHO

Currently, we can ask to authenticate DATA chunks and we can send DATA
chunks on the same packet as COOKIE_ECHO, but if you try to combine
both, the DATA chunk will be sent unauthenticated and peer won't accept
it, leading to a communication failure.

This happens because even though the data was queued after it was
requested to authenticate DATA chunks, it was also queued before we
could know that remote peer can handle authenticating, so
sctp_auth_send_cid() returns false.

The fix is to re-check the send queue, whenever we set up an active
key, for chunks that should now be authenticated. As a result, such a
packet will now contain COOKIE_ECHO + AUTH + DATA chunks, in that order.

Reported-by: Liu Wei <weliu@redhat.com>
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoMerge branch 'for-linus' of git://git.kernel.dk/linux-block
Linus Torvalds [Fri, 12 Jun 2015 18:35:19 +0000 (11:35 -0700)]
Merge branch 'for-linus' of git://git.kernel.dk/linux-block

Pull block layer fixes from Jens Axboe:
 "Remember about a week ago when I sent the last pull request for 4.1?
  Well, I lied.  Now, I don't want to shift the blame, but Dan, Ming,
  and Richard made a liar out of me.

  Here are three small patches that should go into 4.1.  More
  specifically, this pull request contains:

   - A Kconfig dependency for the pmem block driver, so it can't be
     selected if HAS_IOMEM isn't available.  From Richard Weinberger.

   - A fix for genhd, making the ext_devt_lock softirq safe.  This makes
     lockdep happier, since we also end up grabbing this lock on release
     off the softirq path.  From Dan Williams.

   - A blk-mq software queue release fix from Ming Lei.

  Last two are headed to stable, first fixes an issue introduced in this
  cycle"

* 'for-linus' of git://git.kernel.dk/linux-block:
  block: pmem: Add dependency on HAS_IOMEM
  block: fix ext_dev_lock lockdep report
  blk-mq: free hctx->ctxs in queue's release handler

10 years agoMerge tag 'md/4.1-rc7-fixes' of git://neil.brown.name/md
Linus Torvalds [Fri, 12 Jun 2015 18:33:03 +0000 (11:33 -0700)]
Merge tag 'md/4.1-rc7-fixes' of git://neil.brown.name/md

Pull three more md fixes from Neil Brown:
 "Hasn't been a good cycle for md has it :-(

  The main issue fixed here is a rare race which can result in two
  reshape threads running at once, which doesn't end well.

  Also a minor issue with a write to a sysfs file returning the wrong
  value.  Backports to 4.0-stable are indicated"

* tag 'md/4.1-rc7-fixes' of git://neil.brown.name/md:
  md: make sure MD_RECOVERY_DONE is clear before starting recovery/resync
  md: Close race when setting 'action' to 'idle'.
  md: don't return 0 from array_state_store

10 years agoMerge git://git.infradead.org/intel-iommu
Linus Torvalds [Fri, 12 Jun 2015 18:28:57 +0000 (11:28 -0700)]
Merge git://git.infradead.org/intel-iommu

Pull VT-d hardware workarounds from David Woodhouse:
 "This contains a workaround for hardware issues which I *thought* were
  never going to be seen on production hardware.  I'm glad I checked
  that before the 4.1 release...

  Firstly, PASID support is so broken on existing chips that we're just
  going to declare the old capability bit 28 as 'reserved' and change
  the VT-d spec to move PASID support to another bit.  So any existing
  hardware doesn't support SVM; it only sets that (now) meaningless bit
  28.

  That patch *wasn't* imperative for 4.1 because we don't have PASID
  support yet.  But *even* the extended context tables are broken — if
  you just enable the wider tables and use none of the new bits in them,
  which is precisely what 4.1 does, you find that translations don't
  work.  It's this problem which I thought was caught in time to be
  fixed before production, but wasn't.

  To avoid triggering this issue, we now *only* enable the extended
  context tables on hardware which also advertises "we have PASID
  support and we actually tested it this time" with the new PASID
  feature bit.

  In addition, I've added an 'intel_iommu=ecs_off' command line
  parameter to allow us to disable it manually if we need to"

* git://git.infradead.org/intel-iommu:
  iommu/vt-d: Only enable extended context tables if PASID is supported
  iommu/vt-d: Change PASID support to bit 40 of Extended Capability Register

10 years agoiommu/vt-d: Only enable extended context tables if PASID is supported
David Woodhouse [Fri, 12 Jun 2015 09:15:49 +0000 (10:15 +0100)]
iommu/vt-d: Only enable extended context tables if PASID is supported

Although the extended tables are theoretically a completely orthogonal
feature to PASID and anything else that *uses* the newly-available bits,
some of the early hardware has problems even when all we do is enable
them and use only the same bits that were in the old context tables.

For now, there's no motivation to support extended tables unless we're
going to use PASID support to do SVM. So just don't use them unless
PASID support is advertised too. Also add a command-line bailout just in
case later chips also have issues.

The equivalent problem for PASID support has already been fixed with the
upcoming VT-d spec update and commit bd00c606a ("iommu/vt-d: Change
PASID support to bit 40 of Extended Capability Register"), because the
problematic platforms use the old definition of the PASID-capable bit,
which is now marked as reserved and meaningless.

So with this change, we'll magically start using ECS again only when we
see the new hardware advertising "hey, we have PASID support and we
actually tested it this time" on bit 40.

The VT-d hardware architect has promised that we are not going to have
any reason to support ECS *without* PASID any time soon, and he'll make
sure he checks with us before changing that.

In the future, if hypothetical new features also use new bits in the
context tables and can be seen on implementations *without* PASID support,
we might need to add their feature bits to the ecs_enabled() macro.
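
The gating described above can be sketched roughly as follows. This is
a simplified userspace model, not the driver's actual ecs_enabled()
macro: the names are hypothetical, bit 40 comes from the commit message,
and the position of the extended-context-support bit (24) is an
assumption made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 40 is the new PASID bit named in the commit message. */
#define SKETCH_ECAP_PASID  (UINT64_C(1) << 40)
/* Assumption: position of the extended-context-support bit. */
#define SKETCH_ECAP_ECS    (UINT64_C(1) << 24)

/*
 * Hypothetical model of the check: use extended context tables only
 * if they are not disabled on the command line and the hardware
 * advertises both ECS and the new PASID bit.
 */
bool sketch_ecs_enabled(uint64_t ecap, bool ecs_allowed)
{
	return ecs_allowed &&
	       (ecap & SKETCH_ECAP_ECS) != 0 &&
	       (ecap & SKETCH_ECAP_PASID) != 0;
}
```

Hardware that only sets the old, now reserved, bit 28 fails this check
even if it advertises ECS, which is the intended effect.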

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
10 years agomd: make sure MD_RECOVERY_DONE is clear before starting recovery/resync
NeilBrown [Fri, 12 Jun 2015 10:05:04 +0000 (20:05 +1000)]
md: make sure MD_RECOVERY_DONE is clear before starting recovery/resync

MD_RECOVERY_DONE is normally cleared by md_check_recovery after a
resync etc finished.  However it is possible for raid5_start_reshape
to race and start a reshape before MD_RECOVERY_DONE is cleared.  This
can lead to multiple reshapes running at the same time, which isn't
good.

So make sure it is cleared before starting a reshape, and also clear
it when reaping a thread, just to be safe.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agomd: Close race when setting 'action' to 'idle'.
NeilBrown [Fri, 12 Jun 2015 09:51:27 +0000 (19:51 +1000)]
md: Close race when setting 'action' to 'idle'.

Checking ->sync_thread without holding the mddev_lock()
isn't really safe, even after flushing the workqueue which
ensures md_start_sync() has been run.

While this code is waiting for the lock, md_check_recovery could reap
the thread itself, and then start another thread (e.g. recovery might
finish, then reshape starts).  When this thread gets the lock
md_start_sync() hasn't run so it doesn't get reaped, but
MD_RECOVERY_RUNNING gets cleared.  This allows two threads to start
which leads to confusion.

So don't bother if MD_RECOVERY_RUNNING isn't set, but if it is do
the flush and the test and the reap all under the mddev_lock to
avoid any race with md_check_recovery.

Signed-off-by: NeilBrown <neilb@suse.de>
Fixes: 6791875e2e53 ("md: make reconfig_mutex optional for writes to md sysfs files.")
Cc: stable@vger.kernel.org (v4.0+)
10 years agomd: don't return 0 from array_state_store
NeilBrown [Fri, 12 Jun 2015 09:46:44 +0000 (19:46 +1000)]
md: don't return 0 from array_state_store

Returning zero from a 'store' function is bad.
The return value should be either the length of the string ('len')
or an error.

So use 'len' if 'err' is zero.
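
The convention can be sketched in isolation like this. The helper name
is hypothetical and this is a userspace model, not the md code itself;
it only shows the "error, else length" rule stated above.

```c
#include <stddef.h>
#include <sys/types.h>

/*
 * Hypothetical helper modelling the return convention of a sysfs
 * 'store' handler: a negative errno on failure, otherwise the number
 * of bytes consumed. It never returns 0 for a non-empty write, since
 * 0 would mean that nothing was consumed.
 */
ssize_t sketch_store_result(int err, size_t len)
{
	return err ? (ssize_t)err : (ssize_t)len;
}
```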

Fixes: 6791875e2e53 ("md: make reconfig_mutex optional for writes to md sysfs files.")
Signed-off-by: NeilBrown <neilb@suse.de>
Cc: stable@vger.kernel.org (v4.0+)
10 years agodmaengine: Fix choppy sound because of unimplemented resume
Krzysztof Kozlowski [Wed, 10 Jun 2015 08:17:07 +0000 (17:17 +0900)]
dmaengine: Fix choppy sound because of unimplemented resume

Some drivers implement only the pause operation (no resuming). An example
is pl330, where pause is needed for getting the residue. pl330 does not
support the resume operation; the transfer must be stopped after pause.

However for slaves this is exposed always as "pause and resume" which
introduces subtle errors on Odroid U3 board (Exynos4412 with pl330).
After adding pause function to pl330 driver the audio playback
(utilizing DMA) gets choppy after some time (approximately 24 hours).

Fix this by exposing "cmd_pause" if and only if pause and resume are
implemented.
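
The "if and only if" rule can be sketched as below. The struct and
function names are hypothetical stand-ins, not the dmaengine API; the
sketch only models a device that may or may not provide each callback.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a DMA device's callbacks. */
struct dma_dev_sketch {
	int (*pause)(void);
	int (*resume)(void);
};

/* Placeholder callback used only to populate the struct in examples. */
int sketch_ok(void)
{
	return 0;
}

/* Report pause capability only when both operations exist. */
bool sketch_reports_cmd_pause(const struct dma_dev_sketch *dev)
{
	return dev->pause != NULL && dev->resume != NULL;
}
```

A device modelled after the pl330 case (pause without resume) would then
report no pause capability, so a client would not rely on resuming it.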

Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Reported-by: gabriel@unseen.is
Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: <stable@vger.kernel.org>
Fixes: 88987d2c7534 ("dmaengine: pl330: add DMA_PAUSE feature")
Acked-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
10 years agoALSA: hda - Abort the probe without i915 binding for HSW/BDW
Takashi Iwai [Fri, 12 Jun 2015 05:53:58 +0000 (07:53 +0200)]
ALSA: hda - Abort the probe without i915 binding for HSW/BDW

The previous patch tried to continue the probe if i915 binding fails.
For simplicity, we haven't implemented abort even for
controller chips that are dedicated for HDMI/DP on HSW and BDW.
However, Mengdong suggested that this can be dangerous; BIOS may
disable gfx power well although the PCI entry for HD-audio is left,
and this may result in the unexpected behavior, kernel errors, etc.

To avoid this situation, abort the probe at i915 binding failure
only for HSW/BDW chips selectively.  For other chips, it still
continues.

Fixes: bf06848bdbe5 ('ALSA: hda - Continue probing even if i915 binding fails')
Reported-by: Mengdong Lin <mengdong.lin@intel.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
10 years agoMerge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Linus Torvalds [Fri, 12 Jun 2015 00:35:14 +0000 (17:35 -0700)]
Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux

Pull drm fixes from Dave Airlie:
 "i915 and radeon fixes:

  i915:
      fix for connector oops regression
      DDC probing fix

  radeon:
      two radeon reverts, along with a freeze workaround and a fix"

* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
  drm/radeon: Make sure radeon_vm_bo_set_addr always unreserves the BO
  Revert "drm/radeon: adjust pll when audio is not enabled"
  Revert "drm/radeon: don't share plls if monitors differ in audio support"
  drm/radeon: fix freeze for laptop with Turks/Thames GPU.
  drm/i915: Fix DDC probe for passive adapters
  drm/i915: Properly initialize SDVO analog connectors

10 years agonet: don't wait for order-3 page allocation
Shaohua Li [Thu, 11 Jun 2015 23:50:48 +0000 (16:50 -0700)]
net: don't wait for order-3 page allocation

We saw excessive direct memory compaction triggered by skb_page_frag_refill.
This causes performance issues and adds latency. Commit 5640f7685831e0
introduced the order-3 allocation. According to its changelog, the order-3
allocation isn't a must-have but improves performance. Direct memory
compaction, however, has high overhead, and the benefit of the order-3
allocation can't compensate for it.

This patch makes the order-3 page allocation atomic. If there is no memory
pressure and memory isn't fragmented, the allocation will still succeed, so we
don't sacrifice the order-3 benefit here. If the atomic allocation fails,
direct memory compaction will not be triggered, and skb_page_frag_refill will
fall back to order-0 immediately, hence the direct memory compaction overhead is
avoided. In the allocation failure case, kswapd is woken up to do
compaction, so chances are the allocation could succeed next time.
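
The fallback order can be sketched with a mock allocator. Everything
here is hypothetical (names, the callback signature, the fake page); it
models only the control flow described above, not the kernel's page
allocator or its GFP flags.

```c
#include <stddef.h>

/* Hypothetical allocator callback: returns NULL on failure. */
typedef void *(*sketch_alloc_fn)(unsigned int order, int may_block);

static char sketch_page[1];

/* Mock allocator for fragmented memory: only order-0 succeeds. */
void *sketch_alloc_fragmented(unsigned int order, int may_block)
{
	(void)may_block;
	return order == 0 ? sketch_page : NULL;
}

/*
 * Try an order-3 allocation without blocking first; on failure fall
 * back to order-0 immediately instead of waiting for compaction.
 * The order that was actually used is reported through *order_out.
 */
void *sketch_frag_refill(sketch_alloc_fn alloc, unsigned int *order_out)
{
	void *page = alloc(3, 0);	/* non-blocking attempt */

	if (page) {
		*order_out = 3;
		return page;
	}
	*order_out = 0;
	return alloc(0, 1);		/* order-0 may block as before */
}
```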

alloc_skb_with_frags is the same.

The Mellanox driver does a similar thing; if this is accepted, we must
fix that driver too.

V3: fix the same issue in alloc_skb_with_frags as pointed out by Eric
V2: make the changelog clearer

Cc: Eric Dumazet <edumazet@google.com>
Cc: Chris Mason <clm@fb.com>
Cc: Debabrata Banerjee <dbavatar@gmail.com>
Signed-off-by: Shaohua Li <shli@fb.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoMerge tag 'drm-intel-fixes-2015-06-11' of git://anongit.freedesktop.org/drm-intel...
Dave Airlie [Fri, 12 Jun 2015 00:11:50 +0000 (10:11 +1000)]
Merge tag 'drm-intel-fixes-2015-06-11' of git://anongit.freedesktop.org/drm-intel into drm-fixes

Fix for the regression Linus called out, and another for probing
dongles.

* tag 'drm-intel-fixes-2015-06-11' of git://anongit.freedesktop.org/drm-intel:
  drm/i915: Fix DDC probe for passive adapters
  drm/i915: Properly initialize SDVO analog connectors

10 years agoMerge branch 'drm-fixes-4.1' of git://people.freedesktop.org/~agd5f/linux into drm...
Dave Airlie [Fri, 12 Jun 2015 00:11:14 +0000 (10:11 +1000)]
Merge branch 'drm-fixes-4.1' of git://people.freedesktop.org/~agd5f/linux into drm-fixes

Two regression reverts, and two fixes, one for a dpm boot freeze.

* 'drm-fixes-4.1' of git://people.freedesktop.org/~agd5f/linux:
  drm/radeon: Make sure radeon_vm_bo_set_addr always unreserves the BO
  Revert "drm/radeon: adjust pll when audio is not enabled"
  Revert "drm/radeon: don't share plls if monitors differ in audio support"
  drm/radeon: fix freeze for laptop with Turks/Thames GPU.

10 years agompls: handle device renames for per-device sysctls
Robert Shearman [Thu, 11 Jun 2015 18:58:26 +0000 (19:58 +0100)]
mpls: handle device renames for per-device sysctls

If a device is renamed and the original name is subsequently reused
for a new device, the following warning is generated:

sysctl duplicate entry: /net/mpls/conf/veth0//input
CPU: 3 PID: 1379 Comm: ip Not tainted 4.1.0-rc4+ #20
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014
 0000000000000000 0000000000000000 ffffffff81566aaf 0000000000000000
 ffffffff81236279 ffff88002f7d7f00 0000000000000000 ffff88000db336d8
 ffff88000db33698 0000000000000005 ffff88002e046000 ffff8800168c9280
Call Trace:
 [<ffffffff81566aaf>] ? dump_stack+0x40/0x50
 [<ffffffff81236279>] ? __register_sysctl_table+0x289/0x5a0
 [<ffffffffa051a24f>] ? mpls_dev_notify+0x1ff/0x300 [mpls_router]
 [<ffffffff8108db7f>] ? notifier_call_chain+0x4f/0x70
 [<ffffffff81470e72>] ? register_netdevice+0x2b2/0x480
 [<ffffffffa0524748>] ? veth_newlink+0x178/0x2d3 [veth]
 [<ffffffff8147f84c>] ? rtnl_newlink+0x73c/0x8e0
 [<ffffffff8147f27a>] ? rtnl_newlink+0x16a/0x8e0
 [<ffffffff81459ff2>] ? __kmalloc_reserve.isra.30+0x32/0x90
 [<ffffffff8147ccfd>] ? rtnetlink_rcv_msg+0x8d/0x250
 [<ffffffff8145b027>] ? __alloc_skb+0x47/0x1f0
 [<ffffffff8149badb>] ? __netlink_lookup+0xab/0xe0
 [<ffffffff8147cc70>] ? rtnetlink_rcv+0x30/0x30
 [<ffffffff8149e7a0>] ? netlink_rcv_skb+0xb0/0xd0
 [<ffffffff8147cc64>] ? rtnetlink_rcv+0x24/0x30
 [<ffffffff8149df17>] ? netlink_unicast+0x107/0x1a0
 [<ffffffff8149e4be>] ? netlink_sendmsg+0x50e/0x630
 [<ffffffff8145209c>] ? sock_sendmsg+0x3c/0x50
 [<ffffffff81452beb>] ? ___sys_sendmsg+0x27b/0x290
 [<ffffffff811bd258>] ? mem_cgroup_try_charge+0x88/0x110
 [<ffffffff811bd5b6>] ? mem_cgroup_commit_charge+0x56/0xa0
 [<ffffffff811d7700>] ? do_filp_open+0x30/0xa0
 [<ffffffff8145336e>] ? __sys_sendmsg+0x3e/0x80
 [<ffffffff8156c3f2>] ? system_call_fastpath+0x16/0x75

Fix this by unregistering the previous sysctl table (registered for
the path containing the original device name) and re-registering the
table for the path containing the new device name.

Fixes: 37bde79979c3 ("mpls: Per-device enabling of packet input")
Reported-by: Scott Feldman <sfeldma@gmail.com>
Signed-off-by: Robert Shearman <rshearma@brocade.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agonet: igb: fix the start time for periodic output signals
Richard Cochran [Thu, 11 Jun 2015 12:51:30 +0000 (14:51 +0200)]
net: igb: fix the start time for periodic output signals

When programming the start of a periodic output, the code wrongly places
the seconds value into the "low" register and the nanoseconds into the
"high" register.  Even though this is backwards, it slipped through my
testing, because the re-arming code in the interrupt service routine is
correct, and the signal does appear starting with the second edge.

This patch fixes the issue by programming the registers correctly.
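
As a rough illustration of the corrected ordering, with nanoseconds in
the "low" register and seconds in the "high" register: the types and
function below are hypothetical and do not reflect the igb driver's
register names or accessors.

```c
#include <stdint.h>

/* Hypothetical register pair for a periodic output start time. */
struct sketch_start_regs {
	uint32_t low;	/* nanoseconds part */
	uint32_t high;	/* seconds part */
};

/* Hypothetical start time, split into seconds and nanoseconds. */
struct sketch_start_time {
	uint32_t sec;
	uint32_t nsec;
};

void sketch_program_start(struct sketch_start_regs *regs,
			  const struct sketch_start_time *t)
{
	regs->low = t->nsec;	/* the bug wrote the seconds here */
	regs->high = t->sec;	/* and the nanoseconds here */
}
```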

Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 years agoblock: pmem: Add dependency on HAS_IOMEM
Richard Weinberger [Mon, 4 May 2015 18:58:57 +0000 (20:58 +0200)]
block: pmem: Add dependency on HAS_IOMEM

Not all architectures have io memory.

Fixes:
drivers/block/pmem.c: In function ‘pmem_alloc’:
drivers/block/pmem.c:146:2: error: implicit declaration of function ‘ioremap_nocache’ [-Werror=implicit-function-declaration]
  pmem->virt_addr = ioremap_nocache(pmem->phys_addr, pmem->size);
  ^
drivers/block/pmem.c:146:18: warning: assignment makes pointer from integer without a cast [enabled by default]
  pmem->virt_addr = ioremap_nocache(pmem->phys_addr, pmem->size);
                  ^
drivers/block/pmem.c:182:2: error: implicit declaration of function ‘iounmap’ [-Werror=implicit-function-declaration]
  iounmap(pmem->virt_addr);
  ^

Signed-off-by: Richard Weinberger <richard@nod.at>
Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
10 years ago: Merge tag 'trace-rb-bm-fix-4.1-rc7' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Thu, 11 Jun 2015 21:00:10 +0000 (14:00 -0700)]
Merge tag 'trace-rb-bm-fix-4.1-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull ring buffer benchmark buglet fix from Steven Rostedt:
 "Wang Long fixed a minor bug in the module parameter for the ring
  buffer benchmark, where the producer_fifo was being ignored and the
  producer thread's priority was being set with the consumer_fifo
  parameter"

* tag 'trace-rb-bm-fix-4.1-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  ring-buffer-benchmark: Fix the wrong sched_priority of producer

10 years ago: block: fix ext_dev_lock lockdep report
Dan Williams [Thu, 11 Jun 2015 03:47:14 +0000 (23:47 -0400)]
block: fix ext_dev_lock lockdep report

 =================================
 [ INFO: inconsistent lock state ]
 4.1.0-rc7+ #217 Tainted: G           O
 ---------------------------------
 inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
 swapper/6/0 [HC0[0]:SC1[1]:HE1:SE0] takes:
  (ext_devt_lock){+.?...}, at: [<ffffffff8143a60c>] blk_free_devt+0x3c/0x70
 {SOFTIRQ-ON-W} state was registered at:
   [<ffffffff810bf6b1>] __lock_acquire+0x461/0x1e70
   [<ffffffff810c1947>] lock_acquire+0xb7/0x290
   [<ffffffff818ac3a8>] _raw_spin_lock+0x38/0x50
   [<ffffffff8143a07d>] blk_alloc_devt+0x6d/0xd0  <-- take the lock in process context
[..]
  [<ffffffff810bf64e>] __lock_acquire+0x3fe/0x1e70
  [<ffffffff810c00ad>] ? __lock_acquire+0xe5d/0x1e70
  [<ffffffff810c1947>] lock_acquire+0xb7/0x290
  [<ffffffff8143a60c>] ? blk_free_devt+0x3c/0x70
  [<ffffffff818ac3a8>] _raw_spin_lock+0x38/0x50
  [<ffffffff8143a60c>] ? blk_free_devt+0x3c/0x70
  [<ffffffff8143a60c>] blk_free_devt+0x3c/0x70    <-- take the lock in softirq
  [<ffffffff8143bfec>] part_release+0x1c/0x50
  [<ffffffff8158edf6>] device_release+0x36/0xb0
  [<ffffffff8145ac2b>] kobject_cleanup+0x7b/0x1a0
  [<ffffffff8145aad0>] kobject_put+0x30/0x70
  [<ffffffff8158f147>] put_device+0x17/0x20
  [<ffffffff8143c29c>] delete_partition_rcu_cb+0x16c/0x180
  [<ffffffff8143c130>] ? read_dev_sector+0xa0/0xa0
  [<ffffffff810e0e0f>] rcu_process_callbacks+0x2ff/0xa90
  [<ffffffff810e0dcf>] ? rcu_process_callbacks+0x2bf/0xa90
  [<ffffffff81067e2e>] __do_softirq+0xde/0x600

Neil sees this in his tests and it also triggers on pmem driver unbind
for the libnvdimm tests.  This fix is on top of an initial fix by Keith
for incorrect usage of mutex_lock() in this path: 2da78092dda1 "block:
Fix dev_t minor allocation lifetime".  Both this and 2da78092dda1 are
candidates for -stable.
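
The report above boils down to one rule: a lock that is ever taken from
softirq context must not also be taken with softirqs enabled, because a
softirq could then interrupt the holder on the same CPU and spin on the lock
forever. The userspace toy below sketches that check. Everything in it
(toy_lock, toy_acquire, the enum) is a hypothetical model, not lockdep's
implementation. In the kernel, the usual remedy for this class of report is
to take the lock with bottom halves disabled, e.g. spin_lock_bh(); this
message does not spell out which change the patch makes.

```c
#include <stdbool.h>

/* Context an acquisition happens in (toy model). */
enum toy_ctx { CTX_PROCESS, CTX_SOFTIRQ };

/* Usage history of one lock, loosely modelled on lockdep's state bits. */
struct toy_lock {
	bool taken_with_softirq_on;	/* like {SOFTIRQ-ON-W} */
	bool taken_in_softirq;		/* like {IN-SOFTIRQ-W} */
};

/*
 * Record one acquisition and return true while the usage is consistent.
 * bh_disabled models taking the lock with bottom halves disabled.
 */
static bool toy_acquire(struct toy_lock *l, enum toy_ctx where,
			bool bh_disabled)
{
	if (where == CTX_SOFTIRQ)
		l->taken_in_softirq = true;
	else if (!bh_disabled)
		l->taken_with_softirq_on = true;

	/* Both bits set means a softirq could deadlock against a holder. */
	return !(l->taken_with_softirq_on && l->taken_in_softirq);
}
```

Taking the toy lock in process context with bh_disabled = false and then in
softirq context makes toy_acquire() return false, mirroring the trace above.
Taking it with bh_disabled = true in process context keeps the usage
consistent.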

Fixes: 2da78092dda1 ("block: Fix dev_t minor allocation lifetime")
Cc: <stable@vger.kernel.org>
Cc: Keith Busch <keith.busch@intel.com>
Reported-by: NeilBrown <neilb@suse.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
10 years ago: drm/radeon: Make sure radeon_vm_bo_set_addr always unreserves the BO
Michel Dänzer [Thu, 11 Jun 2015 09:38:38 +0000 (18:38 +0900)]
drm/radeon: Make sure radeon_vm_bo_set_addr always unreserves the BO

Some error paths didn't unreserve the BO. This resulted in a deadlock
down the road on the next attempt to reserve the (still reserved) BO.
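
A common way to guarantee this kind of cleanup in kernel-style C is to route
every exit through a single label. The sketch below is a userspace model with
hypothetical names (toy_bo, toy_reserve, toy_set_addr); it is not the radeon
code, and it assumes the function is entered with the BO already reserved and
is responsible for unreserving it on every path, as the subject line implies.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical buffer object with a simple reservation flag. */
struct toy_bo {
	bool reserved;
};

/* Fails with -EBUSY if the BO is still reserved. */
static int toy_reserve(struct toy_bo *bo)
{
	if (bo->reserved)
		return -EBUSY;
	bo->reserved = true;
	return 0;
}

static void toy_unreserve(struct toy_bo *bo)
{
	bo->reserved = false;
}

/*
 * Called with the BO reserved. Error paths jump to the common exit
 * instead of returning directly, so the BO is unreserved on every path.
 */
static int toy_set_addr(struct toy_bo *bo, bool address_valid)
{
	int r = 0;

	if (!address_valid) {
		r = -EINVAL;
		goto out_unreserve;
	}

	/* ... the actual mapping update would happen here ... */

out_unreserve:
	toy_unreserve(bo);
	return r;
}
```

After toy_set_addr() fails with -EINVAL, a following toy_reserve() succeeds.
With an early "return -EINVAL" in place of the goto, the BO would stay
reserved and the next toy_reserve() would return -EBUSY, which is the toy
equivalent of the deadlock described above.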

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90873
Cc: stable@vger.kernel.org
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
10 years ago: Revert "drm/radeon: adjust pll when audio is not enabled"
Alex Deucher [Wed, 10 Jun 2015 05:30:54 +0000 (01:30 -0400)]
Revert "drm/radeon: adjust pll when audio is not enabled"

This reverts commit 7fe04d6fa824ccea704535a597dc417c8687f990.

The reverted commit fixes some systems at the expense of others.  The PLL
divider selection needs to be fixed properly instead.

bug:
https://bugzilla.kernel.org/show_bug.cgi?id=99651

Cc: stable@vger.kernel.org
10 years ago: Revert "drm/radeon: don't share plls if monitors differ in audio support"
Alex Deucher [Wed, 10 Jun 2015 05:29:14 +0000 (01:29 -0400)]
Revert "drm/radeon: don't share plls if monitors differ in audio support"

This reverts commit a10f0df0615abb194968fc08147f3cdd70fd5aa5.

The reverted commit fixes some systems at the expense of others.  The PLL
divider selection needs to be fixed properly instead.

bug:
https://bugzilla.kernel.org/show_bug.cgi?id=99651

Cc: stable@vger.kernel.org
10 years ago: drm/radeon: fix freeze for laptop with Turks/Thames GPU.
Jérôme Glisse [Fri, 5 Jun 2015 17:33:57 +0000 (13:33 -0400)]
drm/radeon: fix freeze for laptop with Turks/Thames GPU.

Laptops with a Turks/Thames GPU freeze if DPM is enabled. It seems the
SMC engine relies on some state inside the CP engine: the CP needs to
process at least one packet before it is in a good state for dynamic
power management.

This patch simply disables and re-enables DPM after the ring test, which
is enough to avoid the freeze.

Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>