]> www.infradead.org Git - users/jedix/linux-maple.git/log
users/jedix/linux-maple.git
13 years agoBtrfs: remove some verbose warnings
Chris Mason [Wed, 25 Jan 2012 19:06:49 +0000 (14:06 -0500)]
Btrfs: remove some verbose warnings

Signed-off-by: Chris Mason <chris.mason@oracle.com>
13 years agoBtrfs: fix reservations in btrfs_page_mkwrite
Chris Mason [Wed, 25 Jan 2012 18:47:40 +0000 (13:47 -0500)]
Btrfs: fix reservations in btrfs_page_mkwrite

Josef fixed btrfs_page_mkwrite to properly release reserved
extents if there was an error.  But if we fail to get a reservation
and we fail to dirty the inode (for ENOSPC reasons), we'll end up
trying to release a reservation we never had.

This makes sure we only release if we were able to reserve.

Signed-off-by: Chris Mason <chris.mason@oracle.com>
13 years agoBtrfs: use larger system chunks
Chris Mason [Mon, 16 Jan 2012 13:13:11 +0000 (08:13 -0500)]
Btrfs: use larger system chunks

system chunks by default are very small.  This makes them slightly
larger and also fixes the conditional checks to make sure we don't
allocate a billion of them at once.

Signed-off-by: Chris Mason <chris.mason@oracle.com>
(cherry picked from commit 96bdc7dc61fb1b1e8e858dafb13abee8482ba064)

13 years agoBtrfs: add a delalloc mutex to inodes for delalloc reservations
Josef Bacik [Fri, 13 Jan 2012 17:09:22 +0000 (12:09 -0500)]
Btrfs: add a delalloc mutex to inodes for delalloc reservations

I was using i_mutex for this, but we're getting bogus lockdep warnings by doing
that and theres no real way to get rid of those, so just stop using i_mutex to
protect delalloc metadata reservations and use a delalloc mutex instead.  This
shouldn't be contended often at all, only if you are writing and mmap writing to
the file at the same time.  Thanks,

(cherry picked from commit f248679e86fead40cc78e724c7181d6bec1a2046)

Signed-off-by: Josef Bacik <josef@redhat.com>
13 years agoBtrfs: protect orphan block rsv with spin_lock
Josef Bacik [Fri, 2 Dec 2011 20:44:12 +0000 (15:44 -0500)]
Btrfs: protect orphan block rsv with spin_lock

We've been seeing warnings coming out of the orphan commit stuff forever from
ceph.  Turns out it's because we're racing with checking if the orphan block
reserve is set, because we clear it outside of the spin_lock.  So leave the
normal fastpath checks where they are, but take the spin_lock and _recheck_ to
make sure we haven't had an orphan block rsv added in the meantime.  Then clear
the root's orphan block rsv and release the lock.  With this patch a user said
the warnings went away and they usually showed up pretty soon after he started
ceph.  Thanks,

Signed-off-by: Josef Bacik <josef@redhat.com>
(cherry picked from commit 90290e19820e3323ce6b9c2888eeb68bf29c278b)

13 years agoBtrfs: don't call btrfs_throttle in file write
Josef Bacik [Fri, 13 Jan 2012 00:10:12 +0000 (19:10 -0500)]
Btrfs: don't call btrfs_throttle in file write

Btrfs_throttle will make us wait if there is a currently committing transaction
until we can open new transactions, which is ridiculous since we don't actually
start any transactions within the file write path anyway, so all this does is
introduce big latencies if we have a sync/fsync heavy workload going on while
somebody else is trying to do work.  Thanks,

Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
(cherry picked from commit 45a8090e626ab470c91142954431a93846030b0d)

13 years agoBtrfs: release space on error in page_mkwrite
Josef Bacik [Fri, 13 Jan 2012 00:10:12 +0000 (19:10 -0500)]
Btrfs: release space on error in page_mkwrite

If updating the inode gave us an ENOSPC we were just returning in page_mkwrite,
which is a problem since we make our reservation right before trying to update
the inode, so fix the out label so that we actually free our reservation.
Thanks,

Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
(cherry picked from commit ec39e180fd3188c983c94603634bfcd019f42ae7)

13 years agoBtrfs: fix btrfsck error 400 when truncating a compressed
Miao Xie [Fri, 13 Jan 2012 00:10:12 +0000 (19:10 -0500)]
Btrfs: fix btrfsck error 400 when truncating a compressed

Reproduce steps:
 # mkfs.btrfs /dev/sdb5
 # mount /dev/sdb5 -o compress=lzo /mnt
 # dd if=/dev/zero of=/mnt/tmpfile bs=128K count=1
 # sync
 # truncate -s 64K /mnt/tmpfile
 root 5 inode 257 errors 400

This is because of the wrong if condition, which is used to check if we should
subtract the bytes of the dropped range from i_blocks/i_bytes of i-node or not.
When we truncate a compressed extent, btrfs substracts the bytes of the whole
extent, it's wrong. We should substract the real size that we truncate, no
matter it is a compressed extent or not. Fix it.

Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
(cherry picked from commit f70a9a6b94af86fca069a7552ab672c31b457786)

13 years agoBtrfs: do not use btrfs_end_transaction_throttle everywhere
Josef Bacik [Fri, 13 Jan 2012 00:10:12 +0000 (19:10 -0500)]
Btrfs: do not use btrfs_end_transaction_throttle everywhere

A user reported a problem where things like open with O_CREAT would take up to
30 seconds when he had nfs activity on the same mount.  This is because all of
our quick metadata operations, like create, symlink etc all do
btrfs_end_transaction_throttle, which if the transaction is blocked will wait
for the commit to complete before it returns.  This adds a ridiculous amount of
latency and isn't really needed.  The normal btrfs_end_transaction will mark the
transaction as blocked and wake the transaction kthread up if it thinks the
transaction needs to end (this being in the running out of global reserve space
scenario), and this is all that is really needed since we've already done
everything we're going to do, we just need to return.  This should help people
with the latency they were seeing when using synchronous heavy workloads.
Thanks,

Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
(cherry picked from commit 7ad85bb76a61801362701b77c5cee5aa09f35369)

13 years agoBtrfs: fix possible deadlock when opening a seed device
Li Zefan [Wed, 7 Dec 2011 03:38:24 +0000 (11:38 +0800)]
Btrfs: fix possible deadlock when opening a seed device

The correct lock order is uuid_mutex -> volume_mutex -> chunk_mutex,
but when we mount a filesystem which has backing seed devices, we have
this lock chain:

    open_ctree()
        lock(chunk_mutex);
        read_chunk_tree();
            read_one_dev();
                open_seed_devices();
                    lock(uuid_mutex);

and then we hit a lockdep splat.

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
(cherry picked from commit b367e47fb3a70f5d24ebd6faf7d42436d485fb2d)

13 years agoBtrfs: update global block_rsv when creating a new block group
Li Zefan [Wed, 7 Dec 2011 02:39:22 +0000 (10:39 +0800)]
Btrfs: update global block_rsv when creating a new block group

A bug was triggered while using seed device:

    # mkfs.btrfs /dev/loop1
    # btrfstune -S 1 /dev/loop1
    # mount -o /dev/loop1 /mnt
    # btrfs dev add /dev/loop2 /mnt

btrfs: block rsv returned -28
------------[ cut here ]------------
WARNING: at fs/btrfs/extent-tree.c:5969 btrfs_alloc_free_block+0x166/0x396 [btrfs]()
...
Call Trace:
...
[<f7b7c31c>] btrfs_cow_block+0x101/0x147 [btrfs]
[<f7b7eaa6>] btrfs_search_slot+0x1b8/0x55f [btrfs]
[<f7b7f844>] btrfs_insert_empty_items+0x42/0x7f [btrfs]
[<f7b7f8c1>] btrfs_insert_item+0x40/0x7e [btrfs]
[<f7b8ac02>] btrfs_make_block_group+0x243/0x2aa [btrfs]
[<f7bb3f53>] __btrfs_alloc_chunk+0x672/0x70e [btrfs]
[<f7bb41ff>] init_first_rw_device+0x77/0x13c [btrfs]
[<f7bb5a62>] btrfs_init_new_device+0x664/0x9fd [btrfs]
[<f7bbb65a>] btrfs_ioctl+0x694/0xdbe [btrfs]
[<c04f55f7>] do_vfs_ioctl+0x496/0x4cc
[<c04f5660>] sys_ioctl+0x33/0x4f
[<c07b9edf>] sysenter_do_call+0x12/0x38
---[ end trace 906adac595facc7d ]---

Since seed device is readonly, there's no usable space in the filesystem.
Afterwards we add a sprout device to it, and the kernel creates a METADATA
block group and a SYSTEM block group where comes free space we can reserve,
but we still get revervation failure because the global block_rsv hasn't
been updated accordingly.

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
(cherry picked from commit c7c144db531fda414e532adac56e965ce332e2a5)

13 years agoBtrfs: rewrite btrfs_trim_block_group()
Li Zefan [Thu, 29 Dec 2011 06:47:27 +0000 (14:47 +0800)]
Btrfs: rewrite btrfs_trim_block_group()

There are various bugs in block group trimming:

- It may trim from offset smaller than user-specified offset.
- It may trim beyond user-specified range.
- It may leak free space for extents smaller than specified minlen.
- It may truncate the last trimmed extent thus leak free space.
- With mixed extents+bitmaps, some extents may not be trimmed.
- With mixed extents+bitmaps, some bitmaps may not be trimmed (even
none will be trimmed). Even for those trimmed, not all the free space
in the bitmaps will be trimmed.

I rewrite btrfs_trim_block_group() and break it into two functions.
One is to trim extents only, and the other is to trim bitmaps only.

Before patching:

# fstrim -v /mnt/
/mnt/: 1496465408 bytes were trimmed

After patching:

# fstrim -v /mnt/
/mnt/: 2193768448 bytes were trimmed

And this matches the total free space:

# btrfs fi df /mnt
Data: total=3.58GB, used=1.79GB
System, DUP: total=8.00MB, used=4.00KB
System: total=4.00MB, used=0.00
Metadata, DUP: total=205.12MB, used=97.14MB
Metadata: total=8.00MB, used=0.00

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
(cherry picked from commit 7fe1e641502616220437079258506196bc4d8cbf)

13 years agoBtrfs: simplfy calculation of stripe length for discard operation
Li Zefan [Thu, 1 Dec 2011 06:06:42 +0000 (14:06 +0800)]
Btrfs: simplfy calculation of stripe length for discard operation

For btrfs raid, while discarding a range of space, we'll need to know
the start offset and length to discard for each device, and it's done
in btrfs_map_block().

However the calculation is a bit complex for raid0 and raid10, so I
reimplement it based on a fact that:

        dev1          dev2           dev3    (raid0)
        -----------------------------------
        s0 s3 s6      s1 s4 s7       s2 s5

Each device has (total_stripes / nr_dev) stripes, or plus one.

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
(cherry picked from commit ec9ef7a13be4dcce964c8503e8999087945e5b9e)

13 years agoBtrfs: don't pre-allocate btrfs bio
Li Zefan [Thu, 1 Dec 2011 04:55:47 +0000 (12:55 +0800)]
Btrfs: don't pre-allocate btrfs bio

We pre-allocate a btrfs bio with fixed size, and then may re-allocate
memory if we find stripes are bigger than the fixed size. But this
pre-allocation is not necessary.

Also we don't have to calcuate the stripe number twice.

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
(cherry picked from commit de11cc12df17337979e0929d2831887432f236ca)

13 years agoBtrfs: don't pass a trans handle unnecessarily in volumes.c
Li Zefan [Thu, 8 Dec 2011 07:07:24 +0000 (15:07 +0800)]
Btrfs: don't pass a trans handle unnecessarily in volumes.c

Some functions never use the transaction handle passed to them.

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
(cherry picked from commit 125ccb0ae6806dbec31abf4a85448971df3b4e39)

13 years agoBtrfs: reserve metadata space in btrfs_ioctl_setflags()
Li Zefan [Thu, 29 Dec 2011 05:39:50 +0000 (13:39 +0800)]
Btrfs: reserve metadata space in btrfs_ioctl_setflags()

Check and reserve space for btrfs_update_inode().

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
(cherry picked from commit 4da6f1a332f6c16b6594c7892f13c31459b9b1c8)

13 years agoBtrfs: remove BUG_ON()s in btrfs_ioctl_setflags()
Li Zefan [Thu, 29 Dec 2011 05:36:45 +0000 (13:36 +0800)]
Btrfs: remove BUG_ON()s in btrfs_ioctl_setflags()

We can recover from errors and return -errno to user space.

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
(cherry picked from commit f062abf089ff705e09bbaa6fa1e2fd7688a0f2ea)

13 years agoBtrfs: check the return value of io_ctl_init()
Li Zefan [Mon, 9 Jan 2012 06:36:28 +0000 (14:36 +0800)]
Btrfs: check the return value of io_ctl_init()

It can return -ENOMEM.

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
(cherry picked from commit 706efc6630c2722602541a6a2fc5900a4e38456a)

13 years agoBtrfs: avoid possible NULL deref in io_ctl_drop_pages()
Li Zefan [Mon, 9 Jan 2012 06:27:42 +0000 (14:27 +0800)]
Btrfs: avoid possible NULL deref in io_ctl_drop_pages()

If we run into some failure path in io_ctl_prepare_pages(),
io_ctl->pages[] array may have some NULL pointers.

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
(cherry picked from commit a1ee5a45818acc7f9c13e560827cf3e8735ac919)

13 years agoBtrfs: add pinned extents to on-disk free space cache correctly
Li Zefan [Tue, 10 Jan 2012 08:41:01 +0000 (16:41 +0800)]
Btrfs: add pinned extents to on-disk free space cache correctly

I got this while running xfstests:

[24256.836098] block group 317849600 has an wrong amount of free space
[24256.836100] btrfs: failed to load free space cache for block group 317849600

We should clamp the extent returned by find_first_extent_bit(),
so the start of the extent won't smaller than the start of the
block group.

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
(cherry picked from commit db804f23a72bada58f083dfad6a65d019ddb3bd4)

13 years agoBtrfs: revamp clustered allocation logic
Alexandre Oliva [Fri, 14 Oct 2011 15:10:36 +0000 (12:10 -0300)]
Btrfs: revamp clustered allocation logic

Parameterize clusters on minimum total size, minimum chunk size and
minimum contiguous size for at least one chunk, without limits on
cluster, window or gap sizes.  Don't tolerate any fragmentation for
SSD_SPREAD; accept it for metadata, but try to keep data dense.

Signed-off-by: Alexandre Oliva <oliva@lsd.ic.unicamp.br>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
(cherry picked from commit 1bb91902dc90e25449893e693ad45605cb08fbe5)

13 years agoBtrfs: don't set up allocation result twice
Alexandre Oliva [Mon, 28 Nov 2011 14:36:17 +0000 (12:36 -0200)]
Btrfs: don't set up allocation result twice

We store the allocation start and length twice in ins, once right
after the other, but with intervening calls that may prevent the
duplicate from being optimized out by the compiler.  Remove one of the
assignments.

Signed-off-by: Alexandre Oliva <oliva@lsd.ic.unicamp.br>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
(cherry picked from commit fc7c1077ceb99c35e5f9d0ce03dc7740565bb2bf)

13 years agoBtrfs: test free space only for unclustered allocation
Alexandre Oliva [Mon, 12 Dec 2011 06:48:19 +0000 (04:48 -0200)]
Btrfs: test free space only for unclustered allocation

Since the clustered allocation may be taking extents from a different
block group, there's no point in spin-locking and testing the current
block group free space before attempting to allocate space from a
cluster, even more so when we might refrain from even trying the
cluster in the current block group because, after the cluster was set
up, not enough free space remained.  Furthermore, cluster creation
attempts fail fast when the block group doesn't have enough free
space, so the test was completely superfluous.

I've move the free space test past the cluster allocation attempt,
where it is more useful, and arranged for a cluster in the current
block group to be released before trying an unclustered allocation,
when we reach the LOOP_NO_EMPTY_SIZE stage, so that the free space in
the cluster stands a chance of being combined with additional free
space in the block group so as to succeed in the allocation attempt.

Signed-off-by: Alexandre Oliva <oliva@lsd.ic.unicamp.br>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
(cherry picked from commit a5f6f719a5cd7caeee8ed8137cf3f94c3bbebc65)

13 years agoBtrfs: use bigger metadata chunks on bigger filesystems
Chris Mason [Fri, 6 Jan 2012 20:47:38 +0000 (15:47 -0500)]
Btrfs: use bigger metadata chunks on bigger filesystems

The 256MB chunk is a little small on a huge FS.  This scales up the
chunk size.

Signed-off-by: Chris Mason <chris.mason@oracle.com>
(cherry picked from commit 1100373f8aa69e377386499350496e3d8565605f)

13 years agoBtrfs: lower the bar for chunk allocation
Chris Mason [Fri, 6 Jan 2012 20:41:34 +0000 (15:41 -0500)]
Btrfs: lower the bar for chunk allocation

The chunk allocation code has tried to keep a pretty tight lid on creating new
metadata chunks.  This is partially because in the past the reservation
code didn't give us an accurate idea of how much space was being used.

The new code is much more accurate, so we're able to get rid of some of these
checks.

Signed-off-by: Chris Mason <chris.mason@oracle.com>
(cherry picked from commit cf1d72c9ceec391d34c48724da57282e97f01122)

13 years agoBtrfs: run chunk allocations while we do delayed refs
Chris Mason [Fri, 6 Jan 2012 20:23:57 +0000 (15:23 -0500)]
Btrfs: run chunk allocations while we do delayed refs

Btrfs tries to batch extent allocation tree changes to improve performance
and reduce metadata trashing.  But it doesn't allocate new metadata chunks
while it is doing allocations for the extent allocation tree.

This commit changes the delayed refence code to do chunk allocations if we're
getting low on room.  It prevents crashes and improves performance.

Signed-off-by: Chris Mason <chris.mason@oracle.com>
(cherry picked from commit 203bf287cb01a5dc26c20bd3737cecf3aeba1d48)

13 years agoBtrfs: call d_instantiate after all ops are setup
Al Viro [Fri, 23 Dec 2011 12:58:13 +0000 (07:58 -0500)]
Btrfs: call d_instantiate after all ops are setup

This closes races where btrfs is calling d_instantiate too soon during
inode creation.  All of the callers of btrfs_add_nondir are updated to
instantiate after the inode is fully setup in memory.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
(cherry picked from commit 08c422c27f855d27b0b3d9fa30ebd938d4ae6f1f)

13 years agoBtrfs: fix worker lock misuse in find_worker
Chris Mason [Fri, 23 Dec 2011 12:53:00 +0000 (07:53 -0500)]
Btrfs: fix worker lock misuse in find_worker

Dan Carpenter noticed that we were doing a double unlock on the worker
lock, and sometimes picking a worker thread without the lock held.

This fixes both errors.

Signed-off-by: Chris Mason <chris.mason@oracle.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
(cherry picked from commit 8d532b2afb2eacc84588db709ec280a3d1219be3)

13 years agoxen/config: turn CONFIG_XEN_DEBUG_FS off.
Konrad Rzeszutek Wilk [Tue, 24 Jan 2012 21:55:29 +0000 (16:55 -0500)]
xen/config: turn CONFIG_XEN_DEBUG_FS off.

That option makes the Xen spinlock code (xen/spinlock.c) accumulate
statistics about how many locks taken, time in slowpath, etc.
Good information during debugging but not in production.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
13 years agoproc: clean up and fix /proc/<pid>/mem handling
Maxim Uvarov [Mon, 23 Jan 2012 20:08:00 +0000 (12:08 -0800)]
proc: clean up and fix /proc/<pid>/mem handling

Orabug: 13618927
CVE-2012-0056
Jüri Aedla reported that the /proc/<pid>/mem handling really isn't very
robust, and it also doesn't match the permission checking of any of the
other related files.

This changes it to do the permission checks at open time, and instead of
tracking the process, it tracks the VM at the time of the open.  That
simplifies the code a lot, but does mean that if you hold the file
descriptor open over an execve(), you'll continue to read from the _old_
VM.

That is different from our previous behavior, but much simpler.  If
somebody actually finds a load where this matters, we'll need to revert
this commit.

I suspect that nobody will ever notice - because the process mapping
addresses will also have changed as part of the execve.  So you cannot
actually usefully access the fd across a VM change simply because all
the offsets for IO would have changed too.

Reported-by: Jüri Aedla <asd@ut.ee>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Conflicts:

fs/proc/base.c

Signed-off-by: Maxim Uvarov <maxim.uvarov@oracle.com>
13 years agoset XEN_MAX_DOMAIN_MEMORY for 512
Maxim Uvarov [Sat, 21 Jan 2012 01:49:55 +0000 (17:49 -0800)]
set XEN_MAX_DOMAIN_MEMORY for 512

Signed-off-by: Maxim Uvarov <maxim.uvarov@oracle.com>
13 years agoadd __init arguments to init functions
Maxim Uvarov [Sat, 21 Jan 2012 01:45:24 +0000 (17:45 -0800)]
add __init arguments to init functions

Fix following issues:
WARNING: vmlinux.o(.text+0x3aba): Section mismatch in reference from the function xen_align_and_add_e820_region() to the function .init.text:e820_add_region()
The function xen_align_and_add_e820_region() references
the function __init e820_add_region().
This is often because xen_align_and_add_e820_region lacks a __init
annotation or the annotation of e820_add_region is wrong.

WARNING: vmlinux.o(.text+0x2e9ec): Section mismatch in reference from the function acpi_map_cpu2node() to the variable .cpuinit.data:__apicid_to_node
The function acpi_map_cpu2node() references
the variable __cpuinitdata __apicid_to_node.
This is often because acpi_map_cpu2node lacks a __cpuinitdata
annotation or the annotation of __apicid_to_node is wrong.

WARNING: vmlinux.o(.text+0x2e9f1): Section mismatch in reference from the function acpi_map_cpu2node() to the function .cpuinit.text:numa_set_node()
The function acpi_map_cpu2node() references
the function __cpuinit numa_set_node().
This is often because acpi_map_cpu2node lacks a __cpuinit
annotation or the annotation of numa_set_node is wrong.

WARNING: vmlinux.o(.text+0x3f9b4): Section mismatch in reference from the function enable_iommus() to the function .init.text:iommu_set_device_table()
The function enable_iommus() references
the function __init iommu_set_device_table().
This is often because enable_iommus lacks a __init
annotation or the annotation of iommu_set_device_table is wrong.

Signed-off-by: Maxim Uvarov <maxim.uvarov@oracle.com>
13 years agohpwdt: clean up set_memory_x call for 32 bit
Maxim Uvarov [Sun, 15 Jan 2012 20:08:20 +0000 (12:08 -0800)]
hpwdt: clean up set_memory_x call for 32 bit

1. addess has to be page aligned.
2. set_memory_x uses page size argument, not size.
Bug causes with following commit:
commit da28179b4e90dda56912ee825c7eaa62fc103797
Author: Mingarelli, Thomas <Thomas.Mingarelli@hp.com>
Date:   Mon Nov 7 10:59:00 2011 +0100

     watchdog: hpwdt: Changes to handle NX secure bit in 32bit path

    commit e67d668e147c3b4fec638c9e0ace04319f5ceccd upstream.

    This patch makes use of the set_memory_x() kernel API in order
    to make necessary BIOS calls to source NMIs.

Signed-off-by: Maxim Uvarov <maxim.uvarov@oracle.com>
13 years agoSPEC: v2.6.39-100.0.20
Maxim Uvarov [Fri, 13 Jan 2012 02:16:53 +0000 (18:16 -0800)]
SPEC: v2.6.39-100.0.20

Signed-off-by: Maxim Uvarov <maxim.uvarov@oracle.com>
13 years agoEnable Kabi Check
Guru Anbalagane [Fri, 13 Jan 2012 00:10:14 +0000 (16:10 -0800)]
Enable Kabi Check
Signed-off-by: Guru Anbalagane <guru.anbalagane@oracle.com>
13 years agonet/bna driver update from 2.3.2.3 to 3.0.2.2
Maxim Uvarov [Thu, 12 Jan 2012 21:19:34 +0000 (13:19 -0800)]
net/bna driver update from 2.3.2.3 to 3.0.2.2

Orabug: 13255
Signed-off-by: Maxim Uvarov <maxim.uvarov@oracle.com>
13 years agoscsi/bfa driver update from 2.3.2.3 to 3.0.2.2
Maxim Uvarov [Thu, 12 Jan 2012 21:14:37 +0000 (13:14 -0800)]
scsi/bfa driver update from 2.3.2.3 to 3.0.2.2

Orabug: 13254
Signed-off-by: Maxim Uvarov <maxim.uvarov@oracle.com>
13 years agoUpdated version to 5.02.00.00.06.02-uek1
Tej Parkash [Tue, 20 Dec 2011 16:52:01 +0000 (22:22 +0530)]
Updated version to 5.02.00.00.06.02-uek1

Signed-off-by: Tej Parkash <tej.parkash@qlogic.com>
13 years agoqla4xxx: Added error logging for firmware abort
Nilesh Javali [Wed, 14 Dec 2011 10:47:50 +0000 (16:17 +0530)]
qla4xxx: Added error logging for firmware abort

JIRA Key: IUEKR2ISCSI-15

Signed-off-by: Nilesh Javali <nilesh.javali@qlogic.com>
Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Signed-off-by: Tej Parkash <tej.parkash@qlogic.com>
13 years agoqla4xxx: Disable generating pause frames in case of FW hung
Giridhar Malavali [Wed, 14 Dec 2011 10:45:10 +0000 (16:15 +0530)]
qla4xxx: Disable generating pause frames in case of FW hung

JIRA Key: IUEKR2ISCSI-14

Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com>
Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Signed-off-by: Nilesh Javali <nilesh.javali@qlogic.com>
Signed-off-by: Tej Parkash <tej.parkash@qlogic.com>
13 years agoqla4xxx: Temperature monitoring for ISP82XX core.
Mike Hernandez [Wed, 14 Dec 2011 10:16:12 +0000 (15:46 +0530)]
qla4xxx: Temperature monitoring for ISP82XX core.

During watchdog, need to monitor temperature of ISP82XX core
and set device state to FAILED when temperature reaches
"Panic" level.

JIRA Key: IUEKR2ISCSI-13

Signed-off-by: Mike Hernandez <michael.hernandez@qlogic.com>
Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Signed-off-by: Nilesh Javali <nilesh.javali@qlogic.com>
Signed-off-by: Tej Parkash <tej.parkash@qlogic.com>
13 years agoqla4xxx: check for FW alive before calling chip_reset
Shyam Sunder [Fri, 2 Dec 2011 06:42:13 +0000 (22:42 -0800)]
qla4xxx: check for FW alive before calling chip_reset

Check for firmware alive and do premature completion of
mbox commands in case of FW hung before doing chip_reset

Jira Key: IUEKR2ISCSI-16

Signed-off-by: Shyam Sunder <shyam.sunder@qlogic.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: Nilesh Javali <nilesh.javali@qlogic.com>
Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Signed-off-by: Tej Parkash <tej.parkash@qlogic.com>
13 years agoqla4xxx: Remove the unused macros
Tej Parkash [Tue, 20 Dec 2011 04:47:14 +0000 (10:17 +0530)]
qla4xxx: Remove the unused macros

Signed-off-by: Tej Parkash <tej.parkash@qlogic.com>
Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Signed-off-by: Lalit Chandivade <lalit.chadnivade@qlogic.com>
13 years agoqla4xxx: cleanup, make qla4xxx_build_ddb_list short
Lalit Chandivade [Tue, 13 Dec 2011 11:21:37 +0000 (16:51 +0530)]
qla4xxx: cleanup, make qla4xxx_build_ddb_list short

Make qla4xxx_build_ddb_list shorter by adding more helper functions.

JIRA Key: IUEKR2ISCSI-12

Signed-off-by: Lalit Chandivade <lalit.chandivade@qlogic.com>
Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Signed-off-by: Nilesh Javali <nilesh.javali@qlogic.com>
Signed-off-by: Tej Parkash <tej.parkash@qlogic.com>
13 years agoqla4xxx: clear the RISC interrupt bit during firmware init
Sarang Radke [Thu, 8 Dec 2011 06:45:31 +0000 (12:15 +0530)]
qla4xxx: clear the RISC interrupt bit during firmware init

Fix for the kdump kernel panic issue with 80xx adapters.

JIRA Key: IUEKR2ISCSI-11

Signed-off-by: Nilesh Javali <nilesh.javali@qlogic.com>
Signed-off-by: Sarang Radke <sarang.radke@qlogic.com>
Signed-off-by: Tej Parkash <tej.parkash@qlogic.com>
13 years agoqla4xxx: clear the SCSI COMPLETION INTERRUPT bit during firmware init
Prasanna Mumbai [Thu, 8 Dec 2011 05:29:35 +0000 (10:59 +0530)]
qla4xxx: clear the SCSI COMPLETION INTERRUPT bit during firmware init

Fix for the kdump kernel panic issue on 40xx adapters.

JIRA Key: IUEKR2ISCSI-10

Signed-off-by: Nilesh Javali <nilesh.javali@qlogic.com>
Signed-off-by: Prasanna Mumbai <prasanna.mumbai@qlogic.com>
Signed-off-by: Tej Parkash <tej.parkash@qlogic.com>
13 years agoqla4xxx: Fixed BFS with sendtargets as boot index.
Manish Rangankar [Fri, 2 Dec 2011 08:25:03 +0000 (13:55 +0530)]
qla4xxx: Fixed BFS with sendtargets as boot index.

If ql4xdisablesysfsboot = 0 and sendtargets entry as boot index then
driver does export sendtarget entries in sysfs but iscsistart does not
do discovery. So in this case let driver do the discovery and
login to the targets.

JIRA Key: IUEKR2ISCSI-9

Signed-off-by: Manish Rangankar <manish.rangankar@qlogic.com>
Signed-off-by: Nilesh Javali <nilesh.javali@qlogic.com>
Signed-off-by: Tej Parkash <tej.parkash@qlogic.com>
13 years agoqla4xxx: Correct the default relogin timeout value
Nilesh Javali [Wed, 7 Dec 2011 08:22:31 +0000 (13:52 +0530)]
qla4xxx: Correct the default relogin timeout value

The ACB default timeout value is used to set the default
relogin timeout value. For ISP4022 adapters where
the ACB default value is set to 2560s, limit the relogin
timeout to 12s.

JIRA Key: IUEKR2ISCSI-8

Signed-off-by: Nilesh Javali <nilesh.javali@qlogic.com>
Signed-off-by: Tej Parkash <tej.parkash@qlogic.com>
13 years agoqla4xxx: Limit the ACB Default Timeout value to 12s
Nilesh Javali [Thu, 1 Dec 2011 09:06:04 +0000 (14:36 +0530)]
qla4xxx: Limit the ACB Default Timeout value to 12s

The ACB default timeout value is set to 2560s in the
ISP4022 firmware. This caused the driver to loop
for 2560s. Hence limit the default timeout at the driver
level to min 12s.

Also break out from the loop if the sendtargets list was empty.

JIRA Key: IUEKR2ISCSI-7

Signed-off-by: Nilesh Javali <nilesh.javali@qlogic.com>
Signed-off-by: Tej Parkash <tej.parkash@qlogic.com>
13 years agobond_alb: don't disable softirq under bond_alb_xmit
Maxim Uvarov [Thu, 5 Jan 2012 19:59:19 +0000 (11:59 -0800)]
bond_alb: don't disable softirq under bond_alb_xmit

Orabug: 13323879
No need to lock soft irqs under bond_alb_xmit()
which already has softirq disabled.

Changes:
1. add non-bh/bh version to tlb_clear_slave()

2. represent BH and non BH hash table locks
_lock_rx_hashtbl_bh/_unlock_rx_hashtbl_bh
_lock_rx_hashtbl/_unlock_rx_hashtbl
_lock_tx_hashtbl_bh/_unlock_tx_hashtbl_bh
_lock_tx_hashtbl/_unlock_tx_hashtbl

Signed-off-by: Maxim Uvarov <maxim.uvarov@oracle.com>
Signed-off-by: Cong Wang <amwang@redhat.com>
13 years agofix kernel version
Guru Anbalagane [Thu, 12 Jan 2012 22:34:15 +0000 (14:34 -0800)]
fix kernel version

13 years agoSPEC: v2.6.39-100.0.19
Maxim Uvarov [Wed, 11 Jan 2012 01:33:02 +0000 (17:33 -0800)]
SPEC: v2.6.39-100.0.19

Signed-off-by: Maxim Uvarov <maxim.uvarov@oracle.com>
13 years agoscripts/git-changelog: generate rpm changelog script
Maxim Uvarov [Wed, 11 Jan 2012 00:54:36 +0000 (16:54 -0800)]
scripts/git-changelog: generate rpm changelog script

Signed-off-by: Maxim Uvarov <maxim.uvarov@oracle.com>
13 years agoRevert "hpwd watchdog mark page executable"
Maxim Uvarov [Wed, 11 Jan 2012 00:04:19 +0000 (16:04 -0800)]
Revert "hpwd watchdog mark page executable"

Orabug: 13115973
This reverts commit e7494242c42201128204e50b2a707de335dd6334.
Bug fix is better covered with following upstream commit:

commit da28179b4e90dda56912ee825c7eaa62fc103797
Author: Mingarelli, Thomas <Thomas.Mingarelli@hp.com>
Date:   Mon Nov 7 10:59:00 2011 +0100

watchdog: hpwdt: Changes to handle NX secure bit in 32bit path

commit e67d668e147c3b4fec638c9e0ace04319f5ceccd upstream.
This patch makes use of the set_memory_x() kernel API in order
to make necessary BIOS calls to source NMIs.

This is needed for SLES11 SP2 and the latest upstream kernel as it appears
the NX Execute Disable has grown in its control.

Signed-off by: Thomas Mingarelli <thomas.mingarelli@hp.com>
Signed-off by: Wim Van Sebroeck <wim@iguana.be>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agoPartial revert of mainline removal of deprecated sysfs interface for 13568528
Chuck Anderson [Sat, 7 Jan 2012 00:12:49 +0000 (16:12 -0800)]
Partial revert of mainline removal of deprecated sysfs interface for 13568528

Jan. 06, 2012
Oracle bug 13568528
Patch written by Andrew Thomas
Ported by Chuck Anderson

This patch partialy reverts the removal in mainline of a deprecated sysfs
interface needed by the OVM3.0.4 UEK2 based dom0 kernel when it is used
to install OVM3.0.4

Comments from Andrew:

The problem is that in newer kernels, even with the
CONFIG_SYSFS_DEPRECATED[_V2] flags set, some nodes have been removed so to
tools looking in sysfs, pieces are missing.  This breaks anaconda (actually
kudzu) for us.  For OVM3 we use the dom0 kernel as the install kernel, so we
need UEK2 to provide the right "shape" sysfs.  This isn't an issue for OL
because you use the old RHEL kernel to install UEK1/2].  That said, this
issue affects more than us.  As Joe Jin points out, bug 13100678, required
kudzu fixes for eth devices.  Arguably the OVM3 anaconda issue can also be
fixed in kudzu, but what no one knows is if the missing sysfs nodes are
symptoms of a wider set of tools related problems and therefore whether the
correct fix is to revert sysfs changes in UEK2 so that the sysfs it presents
is isomorphic to what 2.6.18 based kernels provide.

A "better" set of tools would be from 6uX, but in order to get those
installed/upgraded on OVM3 is not a trivial task because the system customers
have already installed is 5u5 based.  We have other tools in dom0 [eg our
agent] which "know" about the old flavour of sysfs and these would need
porting.  You either change the kernel OR you change all the tools that rely
on sysfs... the problem is that customers can install there own tools on OL5.

Signed-off-by: Chuck Anderson <chuck.anderson@oracle.com>
13 years agoscsi:lpfc update to 8.3.5.58
Maxim Uvarov [Fri, 6 Jan 2012 21:30:21 +0000 (13:30 -0800)]
scsi:lpfc update to 8.3.5.58

Signed-off-by: Maxim Uvarov <maxim.uvarov@oracle.com>
13 years agoLet KERNEL_VERSION be 3.0.x, and override UTSNAME
Nelson Elhage [Tue, 10 Jan 2012 23:04:08 +0000 (15:04 -0800)]
Let KERNEL_VERSION be 3.0.x, and override UTSNAME

This will let out-of-tree modules correctly detect the kernel version
when building against it, but it will still identify as 2.6.39
everywhere in userspace.

Signed-off-by: Nelson Elhage <nelson.elhage@oracle.com>
Signed-off-by: Maxim Uvarov <maxim.uvarov@oracle.com>
13 years agoqla4xxx: Fix qla4xxx_dump_buffer to dump buffer correctly
Vikas Chaudhary [Fri, 2 Dec 2011 06:42:12 +0000 (22:42 -0800)]
qla4xxx: Fix qla4xxx_dump_buffer to dump buffer correctly

KERN_INFO in printk adding new line character that mess-up
dump print format. Remove KERN_INFO to fix dump format.

Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Signed-off-by: Tej Parkash <tej.parkash@qlogic.com>
Signed-off-by: Chuck Anderson <chuck.anderson@oracle.com>
13 years agoqla4xxx: Fix the IDC locking mechanism
Nilesh Javali [Fri, 2 Dec 2011 06:42:11 +0000 (22:42 -0800)]
qla4xxx: Fix the IDC locking mechanism

This ensures the transition of dev_state from COLD to
INITIALIZING is within lock and atomic.

Jira Key: IUEKR2ISCSI-6

Signed-off-by: Nilesh Javali <nilesh.javali@qlogic.com>
Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Signed-off-by: Tej Parkash <tej.parkash@qlogic.com>
Signed-off-by: Chuck Anderson <chuck.anderson@oracle.com>
13 years agoqla4xxx: Wait for disable_acb before doing set_acb
Vikas Chaudhary [Fri, 2 Dec 2011 06:42:10 +0000 (22:42 -0800)]
qla4xxx: Wait for disable_acb before doing set_acb

In function qla4xxx_iface_set_param wait for disable_acb to
complete so that set_acb will not fail.

Jira Key: IUEKR2ISCSI-5

Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Signed-off-by: Tej Parkash <tej.parkash@qlogic.com>
Signed-off-by: Chuck Anderson <chuck.anderson@oracle.com>
13 years agoqla4xxx: Don't recover adapter if device state is FAILED
Sarang Radke [Fri, 2 Dec 2011 06:42:09 +0000 (22:42 -0800)]
qla4xxx: Don't recover adapter if device state is FAILED

Multiple reset request don't get handled correctly as
the driver tries to recover adapter which is in FAILED state.

Jira Key: IUEKR2ISCSI-4

Signed-off-by: Sarang Radke <sarang.radke@qlogic.com>
Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Signed-off-by: Tej Parkash <tej.parkash@qlogic.com>
Signed-off-by: Chuck Anderson <chuck.anderson@oracle.com>
13 years agoqla4xxx: fix call trace on rmmod with ql4xdontresethba=1
Sarang Radke [Tue, 6 Dec 2011 10:34:10 +0000 (02:34 -0800)]
qla4xxx: fix call trace on rmmod with ql4xdontresethba=1

abort all active commands from eh_host_reset in-case
of ql4xdontresethba=1

Fix following call trace:-
Nov 21 14:50:47 172.17.140.111 qla4xxx 0000:13:00.4: qla4_8xxx_disable_msix: qla4xxx (rsp_q)
Nov 21 14:50:47 172.17.140.111 qla4xxx 0000:13:00.4: PCI INT A disabled
Nov 21 14:50:47 172.17.140.111 slab error in kmem_cache_destroy(): cache `qla4xxx_srbs': Can't free all objects
Nov 21 14:50:47 172.17.140.111 Pid: 9154, comm: rmmod Tainted: G           O 3.2.0-rc2+ #2
Nov 21 14:50:47 172.17.140.111 Call Trace:
Nov 21 14:50:47 172.17.140.111  [<c051231a>] ? kmem_cache_destroy+0x9a/0xb0
Nov 21 14:50:47 172.17.140.111  [<c0489c4a>] ? sys_delete_module+0x14a/0x210
Nov 21 14:50:47 172.17.140.111  [<c04fd552>] ? do_munmap+0x202/0x280
Nov 21 14:50:47 172.17.140.111  [<c04a6d4e>] ? audit_syscall_entry+0x1ae/0x1d0
Nov 21 14:50:47 172.17.140.111  [<c083019f>] ? sysenter_do_call+0x12/0x28
Nov 21 14:51:50 172.17.140.111 SLAB: cache with size 64 has lost its name
Nov 21 14:51:50 172.17.140.111 iscsi: registered transport (qla4xxx)
Nov 21 14:51:50 172.17.140.111 qla4xxx 0000:13:00.4: PCI INT A -> GSI 28 (level, low) -> IRQ 28

Jira Key: IUEKR2ISCSI-3

Signed-off-by: Sarang Radke <sarang.radke@qlogic.com>
Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Signed-off-by: Tej Parkash <tej.parkash@qlogic.com>
Signed-off-by: Chuck Anderson <chuck.anderson@oracle.com>
13 years agoqla4xxx: Fix CPU lockups when ql4xdontresethba set
Mike Hernandez [Fri, 2 Dec 2011 06:42:07 +0000 (22:42 -0800)]
qla4xxx: Fix CPU lockups when ql4xdontresethba set

Fix issue where CPU lockup is seen when ql4xdontresethba is set and
driver is "stuck" in NEED_RESET state handler.

Jira Key: IUEKR2ISCSI-2

Signed-off-by: Mike Hernandez <michael.hernandez@qlogic.com>
Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Signed-off-by: Tej Parkash <tej.parkash@qlogic.com>
Signed-off-by: Chuck Anderson <chuck.anderson@oracle.com>
13 years agoqla4xxx: Perform context resets in case of context failures.
Vikas Chaudhary [Fri, 2 Dec 2011 06:42:06 +0000 (22:42 -0800)]
qla4xxx: Perform context resets in case of context failures.

For 4032, context reset was the same as chip reset, and any firmware
issue was recovered by performing a chip reset.
For 82xx, the iSCSI firmware runs along with FCoE and the NIC
firmware contexts, and an error encountered doesnot essentially mean
that a chip reset is necessary.

Perform Chip resets only in the following cases:
1. Mailbox system error.
2. Mailbox command timeout.
3. fw_heartbeat_counter counter stops incrementing.

For all other cases, only perform a context reset.
1. Command Completion with an invalid srb.
2. Other mailbox failures.

Jira Key: IUEKR2ISCSI-1

Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Signed-off-by: Shyam Sunder <shyam.sunder@qlogic.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Signed-off-by: Tej Parkash <tej.parkash@qlogic.com>
Signed-off-by: Chuck Anderson <chuck.anderson@oracle.com>
13 years agodo not obsolete firmware
Maxim Uvarov [Fri, 6 Jan 2012 00:27:43 +0000 (16:27 -0800)]
do not obsolete firmware

Orabug: 13535055
Do not remove firmware for other kernels.
Signed-off-by: Maxim Uvarov <maxim.uvarov@oracle.com>
13 years agoRevert "xen/pv-on-hvm kexec: add xs_reset_watches to shutdown watches from old kernel"
Konrad Rzeszutek Wilk [Tue, 10 Jan 2012 20:43:47 +0000 (15:43 -0500)]
Revert "xen/pv-on-hvm kexec: add xs_reset_watches to shutdown watches from old kernel"

This reverts commit ddacf5ef684a655abe2bb50c4b2a5b72ae0d5e05.

We piggyback on upstream git commit 12275dd4b747f5d87fa36229774d76bca8e63068
which says:

commit 12275dd4b747f5d87fa36229774d76bca8e63068
Author: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Date:   Mon Dec 19 09:30:35 2011 -0500

    Revert "xen/pv-on-hvm kexec: add xs_reset_watches to shutdown watches from old kernel"

    This reverts commit ddacf5ef684a655abe2bb50c4b2a5b72ae0d5e05.
    As when booting the kernel under Amazon EC2 as an HVM guest it ends up
    hanging during startup. Reverting this we loose the fix for kexec
    booting to the crash kernels.

    Fixes Canonical BZ #901305 (http://bugs.launchpad.net/bugs/901305)

Tested-by: Alessandro Salvatori <sandr8@gmail.com>
Reported-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Ian Campbell <Ian.Campbell@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
13 years agoRevert "xen-blkback: convert hole punching to discard request on loop devices"
Maxim Uvarov [Tue, 10 Jan 2012 22:35:52 +0000 (14:35 -0800)]
Revert "xen-blkback: convert hole punching to discard request on loop devices"

This reverts commit 7139323c5ef7b342d770efef6ae463ff201e3d83.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Maxim Uvarov <maxim.uvarov@oracle.com>
13 years agoath9k: Fix kernel panic in AR2427 in AP mode
Mohammed Shafi Shajakhan [Mon, 26 Dec 2011 05:12:15 +0000 (10:42 +0530)]
ath9k: Fix kernel panic in AR2427 in AP mode

commit b25bfda38236f349cde0d1b28952f4eea2148d3f upstream.

don't do aggregation related stuff for 'AP mode client power save
handling' if aggregation is not enabled in the driver, otherwise it
will lead to panic because those data structures won't be never
intialized in 'ath_tx_node_init' if aggregation is disabled

EIP is at ath_tx_aggr_wakeup+0x37/0x80 [ath9k]
EAX: e8c09a20 EBX: f2a304e8 ECX: 00000001 EDX: 00000000
ESI: e8c085e0 EDI: f2a304ac EBP: f40e1ca4 ESP: f40e1c8c
DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
Process swapper/1 (pid: 0, ti=f40e0000 task=f408e860
task.ti=f40dc000)
Stack:
0001e966 e8c09a20 00000000 f2a304ac e8c085e0 f2a304ac
f40e1cb0 f8186741
f8186700 f40e1d2c f922988d f2a304ac 00000202 00000001
c0b4ba43 00000000
0000000f e8eb75c0 e8c085e0 205b0001 34383220 f2a304ac
f2a30000 00010020
Call Trace:
[<f8186741>] ath9k_sta_notify+0x41/0x50 [ath9k]
[<f8186700>] ? ath9k_get_survey+0x110/0x110 [ath9k]
[<f922988d>] ieee80211_sta_ps_deliver_wakeup+0x9d/0x350
[mac80211]
[<c018dc75>] ? __module_address+0x95/0xb0
[<f92465b3>] ap_sta_ps_end+0x63/0xa0 [mac80211]
[<f9246746>] ieee80211_rx_h_sta_process+0x156/0x2b0
[mac80211]
[<f9247d1e>] ieee80211_rx_handlers+0xce/0x510 [mac80211]
[<c018440b>] ? trace_hardirqs_on+0xb/0x10
[<c056936e>] ? skb_queue_tail+0x3e/0x50
[<f9248271>] ieee80211_prepare_and_rx_handle+0x111/0x750
[mac80211]
[<f9248bf9>] ieee80211_rx+0x349/0xb20 [mac80211]
[<f9248949>] ? ieee80211_rx+0x99/0xb20 [mac80211]
[<f818b0b8>] ath_rx_tasklet+0x818/0x1d00 [ath9k]
[<f8187a75>] ? ath9k_tasklet+0x35/0x1c0 [ath9k]
[<f8187a75>] ? ath9k_tasklet+0x35/0x1c0 [ath9k]
[<f8187b33>] ath9k_tasklet+0xf3/0x1c0 [ath9k]
[<c0151b7e>] tasklet_action+0xbe/0x180

Cc: Senthil Balasubramanian <senthilb@qca.qualcomm.com>
Cc: Rajkumar Manoharan <rmanohar@qca.qualcomm.com>
Reported-by: Ashwin Mendonca <ashwinloyal@gmail.com>
Tested-by: Ashwin Mendonca <ashwinloyal@gmail.com>
Signed-off-by: Mohammed Shafi Shajakhan <mohammed@qca.qualcomm.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agoptrace: partially fix the do_wait(WEXITED) vs EXIT_DEAD->EXIT_ZOMBIE race
Oleg Nesterov [Wed, 4 Jan 2012 16:29:02 +0000 (17:29 +0100)]
ptrace: partially fix the do_wait(WEXITED) vs EXIT_DEAD->EXIT_ZOMBIE race

commit 50b8d257486a45cba7b65ca978986ed216bbcc10 upstream.

Test-case:

int main(void)
{
int pid, status;

pid = fork();
if (!pid) {
for (;;) {
if (!fork())
return 0;
if (waitpid(-1, &status, 0) < 0) {
printf("ERR!! wait: %m\n");
return 0;
}
}
}

assert(ptrace(PTRACE_ATTACH, pid, 0,0) == 0);
assert(waitpid(-1, NULL, 0) == pid);

assert(ptrace(PTRACE_SETOPTIONS, pid, 0,
PTRACE_O_TRACEFORK) == 0);

do {
ptrace(PTRACE_CONT, pid, 0, 0);
pid = waitpid(-1, NULL, 0);
} while (pid > 0);

return 1;
}

It fails because ->real_parent sees its child in EXIT_DEAD state
while the tracer is going to change the state back to EXIT_ZOMBIE
in wait_task_zombie().

The offending commit is 823b018e which moved the EXIT_DEAD check,
but in fact we should not blame it. The original code was not
correct as well because it didn't take ptrace_reparented() into
account and because we can't really trust ->ptrace.

This patch adds the additional check to close this particular
race but it doesn't solve the whole problem. We simply can't
rely on ->ptrace in this case, it can be cleared if the tracer
is multithreaded by the exiting ->parent.

I think we should kill EXIT_DEAD altogether, we should always
remove the soon-to-be-reaped child from ->children or at least
we should never do the DEAD->ZOMBIE transition. But this is too
complex for 3.2.

Reported-and-tested-by: Denys Vlasenko <vda.linux@googlemail.com>
Tested-by: Lukasz Michalik <lmi@ift.uni.wroc.pl>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agoRevert "rtc: Disable the alarm in the hardware"
Linus Torvalds [Wed, 4 Jan 2012 01:32:13 +0000 (17:32 -0800)]
Revert "rtc: Disable the alarm in the hardware"

commit 157e8bf8b4823bfcdefa6c1548002374b61f61df upstream.

This reverts commit c0afabd3d553c521e003779c127143ffde55a16f.

It causes failures on Toshiba laptops - instead of disabling the alarm,
it actually seems to enable it on the affected laptops, resulting in
(for example) the laptop powering on automatically five minutes after
shutdown.

There's a patch for it that appears to work for at least some people,
but it's too late to play around with this, so revert for now and try
again in the next merge window.

See for example

http://bugs.debian.org/652869

Reported-and-bisected-by: Andreas Friedrich <afrie@gmx.net> (Toshiba Tecra)
Reported-by: Antonio-M. Corbi Bellot <antonio.corbi@ua.es> (Toshiba Portege R500)
Reported-by: Marco Santos <marco.santos@waynext.com> (Toshiba Portege Z830)
Reported-by: Christophe Vu-Brugier <cvubrugier@yahoo.fr> (Toshiba Portege R830)
Cc: Jonathan Nieder <jrnieder@gmail.com>
Requested-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agohung_task: fix false positive during vfork
Mandeep Singh Baines [Tue, 3 Jan 2012 22:41:13 +0000 (14:41 -0800)]
hung_task: fix false positive during vfork

commit f9fab10bbd768b0e5254e53a4a8477a94bfc4b96 upstream.

vfork parent uninterruptibly and unkillably waits for its child to
exec/exit. This wait is of unbounded length. Ignore such waits
in the hung_task detector.

Signed-off-by: Mandeep Singh Baines <msb@chromium.org>
Reported-by: Sasha Levin <levinsasha928@gmail.com>
LKML-Reference: <1325344394.28904.43.camel@lappy>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: John Kacur <jkacur@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agodrm/radeon/kms/atom: fix possible segfault in pm setup
Alexander Müller [Fri, 30 Dec 2011 17:55:48 +0000 (12:55 -0500)]
drm/radeon/kms/atom: fix possible segfault in pm setup

commit 4376eee92e5a8332b470040e672ea99cd44c826a upstream.

If we end up with no power states, don't look up
current vddc.

fixes:
https://bugs.freedesktop.org/show_bug.cgi?id=44130

agd5f: fix patch formatting

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agoxfs: log all dirty inodes in xfs_fs_sync_fs
Christoph Hellwig [Wed, 4 Jan 2012 14:48:36 +0000 (09:48 -0500)]
xfs: log all dirty inodes in xfs_fs_sync_fs

Commit be4f1ac828776bbc7868a68b465cd8eedb733cfd upstream.

Since Linux 2.6.36 the writeback code has introduces various measures for
live lock prevention during sync().  Unfortunately some of these are
actively harmful for the XFS model, where the inode gets marked dirty for
metadata from the data I/O handler.

The older_than_this checks that are now more strictly enforced since

    writeback: avoid livelocking WB_SYNC_ALL writeback

by only calling into __writeback_inodes_sb and thus only sampling the
current cut off time once.  But on a slow enough devices the previous
asynchronous sync pass might not have fully completed yet, and thus XFS
might mark metadata dirty only after that sampling of the cut off time for
the blocking pass already happened.  I have not myself reproduced this
myself on a real system, but by introducing artificial delay into the
XFS I/O completion workqueues it can be reproduced easily.

Fix this by iterating over all XFS inodes in ->sync_fs and log all that
are dirty.  This might log inode that only got redirtied after the
previous pass, but given how cheap delayed logging of inodes is it
isn't a major concern for performance.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Tested-by: Mark Tinguely <tinguely@sgi.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agoxfs: log the inode in ->write_inode calls for kupdate
Christoph Hellwig [Wed, 4 Jan 2012 14:48:35 +0000 (09:48 -0500)]
xfs: log the inode in ->write_inode calls for kupdate

Commit 0b8fd3033c308e4088760aa1d38ce77197b4e074 upstream.

If the writeback code writes back an inode because it has expired we currently
use the non-blockin ->write_inode path.  This means any inode that is pinned
is skipped.  With delayed logging and a workload that has very little log
traffic otherwise it is very likely that an inode that gets constantly
written to is always pinned, and thus we keep refusing to write it.  The VM
writeback code at that point redirties it and doesn't try to write it again
for another 30 seconds.  This means under certain scenarious time based
metadata writeback never happens.

Fix this by calling into xfs_log_inode for kupdate in addition to data
integrity syncs, and thus transfer the inode to the log ASAP.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Tested-by: Mark Tinguely <tinguely@sgi.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agomfd: Turn on the twl4030-madc MADC clock
Kyle Manna [Fri, 12 Aug 2011 03:33:13 +0000 (22:33 -0500)]
mfd: Turn on the twl4030-madc MADC clock

commit 3d6271f92e98094584fd1e609a9969cd33e61122 upstream.

Without turning the MADC clock on, no MADC conversions occur.

$ cat /sys/class/hwmon/hwmon0/device/in8_input
[   53.428436] twl4030_madc twl4030_madc: conversion timeout!
cat: read error: Resource temporarily unavailable

Signed-off-by: Kyle Manna <kyle@kylemanna.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agomfd: Check for twl4030-madc NULL pointer
Kyle Manna [Fri, 12 Aug 2011 03:33:14 +0000 (22:33 -0500)]
mfd: Check for twl4030-madc NULL pointer

commit d0e84caeb4cd535923884735906e5730329505b4 upstream.

If the twl4030-madc device wasn't registered, and another device, such
as twl4030-madc-hwmon, calls twl4030_madc_conversion() a NULL pointer is
dereferenced.

Signed-off-by: Kyle Manna <kyle@kylemanna.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agomfd: Copy the device pointer to the twl4030-madc structure
Kyle Manna [Fri, 12 Aug 2011 03:33:12 +0000 (22:33 -0500)]
mfd: Copy the device pointer to the twl4030-madc structure

commit 66cc5b8e50af87b0bbd0f179d76d2826f4549c13 upstream.

Worst case this fixes the following error:
[   72.086212] (NULL device *): conversion timeout!

Best case it prevents a crash

Signed-off-by: Kyle Manna <kyle@kylemanna.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
13 years agomfd: Fix mismatch in twl4030 mutex lock-unlock
Sanjeev Premi [Mon, 11 Jul 2011 15:20:31 +0000 (20:50 +0530)]
mfd: Fix mismatch in twl4030 mutex lock-unlock

commit e178ccb33569da17dc897a08a3865441b813bdfb upstream.

A mutex is locked on entry into twl4030_madc_conversion().
Immediate return on some error conditions leaves the
mutex locked.

This patch ensures that mutex is always unlocked before
leaving the function.

Signed-off-by: Sanjeev Premi <premi@ti.com>
Cc: Keerthy <j-keerthy@ti.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agoiwlwifi: update SCD BC table for all SCD queues
Emmanuel Grumbach [Mon, 26 Dec 2011 06:47:33 +0000 (08:47 +0200)]
iwlwifi: update SCD BC table for all SCD queues

commit 96f1f05af76b601ab21a7dc603ae0a1cea4efc3d upstream.

Since we configure all the queues as CHAINABLE, we need to update the
byte count for all the queues, not only the AGGREGATABLE ones.

Not doing so can confuse the SCD and make the fw assert.

Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Wey-Yi Guy <wey-yi.w.guy@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
13 years agoipv4: using prefetch requires including prefetch.h
Stephen Rothwell [Thu, 22 Dec 2011 06:03:29 +0000 (17:03 +1100)]
ipv4: using prefetch requires including prefetch.h

[ Upstream commit b9eda06f80b0db61a73bd87c6b0eb67d8aca55ad ]

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: David Miller <davem@davemloft.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agoipv4: reintroduce route cache garbage collector
Eric Dumazet [Wed, 21 Dec 2011 20:47:16 +0000 (15:47 -0500)]
ipv4: reintroduce route cache garbage collector

[ Upstream commit 9f28a2fc0bd77511f649c0a788c7bf9a5fd04edb ]

Commit 2c8cec5c10b (ipv4: Cache learned PMTU information in inetpeer)
removed IP route cache garbage collector a bit too soon, as this gc was
responsible for expired routes cleanup, releasing their neighbour
reference.

As pointed out by Robert Gladewitz, recent kernels can fill and exhaust
their neighbour cache.

Reintroduce the garbage collection, since we'll have to wait our
neighbour lookups become refcount-less to not depend on this stuff.

Reported-by: Robert Gladewitz <gladewitz@gmx.de>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agoipv4: flush route cache after change accept_local
Weiping Pan [Thu, 1 Dec 2011 15:47:06 +0000 (15:47 +0000)]
ipv4: flush route cache after change accept_local

[ Upstream commit d01ff0a049f749e0bf10a35bb23edd012718c8c2 ]

After reset ipv4_devconf->data[IPV4_DEVCONF_ACCEPT_LOCAL] to 0,
we should flush route cache, or it will continue receive packets with local
source address, which should be dropped.

Signed-off-by: Weiping Pan <panweiping3@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agosctp: Do not account for sizeof(struct sk_buff) in estimated rwnd
Thomas Graf [Mon, 19 Dec 2011 04:11:40 +0000 (04:11 +0000)]
sctp: Do not account for sizeof(struct sk_buff) in estimated rwnd

[ Upstream commit a76c0adf60f6ca5ff3481992e4ea0383776b24d2 ]

When checking whether a DATA chunk fits into the estimated rwnd a
full sizeof(struct sk_buff) is added to the needed chunk size. This
quickly exhausts the available rwnd space and leads to packets being
sent which are much below the PMTU limit. This can lead to much worse
performance.

The reason for this behaviour was to avoid putting too much memory
pressure on the receiver. The concept is not completely irational
because a Linux receiver does in fact clone an skb for each DATA chunk
delivered. However, Linux also reserves half the available socket
buffer space for data structures therefore usage of it is already
accounted for.

When proposing to change this the last time it was noted that this
behaviour was introduced to solve a performance issue caused by rwnd
overusage in combination with small DATA chunks.

Trying to reproduce this I found that with the sk_buff overhead removed,
the performance would improve significantly unless socket buffer limits
are increased.

The following numbers have been gathered using a patched iperf
supporting SCTP over a live 1 Gbit ethernet network. The -l option
was used to limit DATA chunk sizes. The numbers listed are based on
the average of 3 test runs each. Default values have been used for
sk_(r|w)mem.

Chunk
Size    Unpatched     No Overhead
-------------------------------------
   4    15.2 Kbit [!]   12.2 Mbit [!]
   8    35.8 Kbit [!]   26.0 Mbit [!]
  16    95.5 Kbit [!]   54.4 Mbit [!]
  32   106.7 Mbit      102.3 Mbit
  64   189.2 Mbit      188.3 Mbit
 128   331.2 Mbit      334.8 Mbit
 256   537.7 Mbit      536.0 Mbit
 512   766.9 Mbit      766.6 Mbit
1024   810.1 Mbit      808.6 Mbit

Signed-off-by: Thomas Graf <tgraf@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agosctp: fix incorrect overflow check on autoclose
Xi Wang [Fri, 16 Dec 2011 12:44:15 +0000 (12:44 +0000)]
sctp: fix incorrect overflow check on autoclose

[ Upstream commit 2692ba61a82203404abd7dd2a027bda962861f74 ]

Commit 8ffd3208 voids the previous patches f6778aab and 810c0719 for
limiting the autoclose value.  If userspace passes in -1 on 32-bit
platform, the overflow check didn't work and autoclose would be set
to 0xffffffff.

This patch defines a max_autoclose (in seconds) for limiting the value
and exposes it through sysctl, with the following intentions.

1) Avoid overflowing autoclose * HZ.

2) Keep the default autoclose bound consistent across 32- and 64-bit
   platforms (INT_MAX / HZ in this patch).

3) Keep the autoclose value consistent between setsockopt() and
   getsockopt() calls.

Suggested-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Xi Wang <xi.wang@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agosch_gred: should not use GFP_KERNEL while holding a spinlock
Eric Dumazet [Sun, 11 Dec 2011 23:42:53 +0000 (23:42 +0000)]
sch_gred: should not use GFP_KERNEL while holding a spinlock

[ Upstream commit 3f1e6d3fd37bd4f25e5b19f1c7ca21850426c33f ]

gred_change_vq() is called under sch_tree_lock(sch).

This means a spinlock is held, and we are not allowed to sleep in this
context.

We might pre-allocate memory using GFP_KERNEL before taking spinlock,
but this is not suitable for stable material.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agonet: have ipconfig not wait if no dev is available
Gerlando Falauto [Mon, 19 Dec 2011 22:58:04 +0000 (22:58 +0000)]
net: have ipconfig not wait if no dev is available

[ Upstream commit cd7816d14953c8af910af5bb92f488b0b277e29d ]

previous commit 3fb72f1e6e6165c5f495e8dc11c5bbd14c73385c
makes IP-Config wait for carrier on at least one network device.

Before waiting (predefined value 120s), check that at least one device
was successfully brought up. Otherwise (e.g. buggy bootloader
which does not set the MAC address) there is no point in waiting
for carrier.

Cc: Micha Nelissen <micha@neli.hopto.org>
Cc: Holger Brunck <holger.brunck@keymile.com>
Signed-off-by: Gerlando Falauto <gerlando.falauto@keymile.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agomqprio: Avoid panic if no options are provided
Thomas Graf [Thu, 22 Dec 2011 02:05:07 +0000 (02:05 +0000)]
mqprio: Avoid panic if no options are provided

[ Upstream commit 7838f2ce36b6ab5c13ef20b1857e3bbd567f1759 ]

Userspace may not provide TCA_OPTIONS, in fact tc currently does
so not do so if no arguments are specified on the command line.
Return EINVAL instead of panicing.

Signed-off-by: Thomas Graf <tgraf@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agollc: llc_cmsg_rcv was getting called after sk_eat_skb.
Alex Juncu [Thu, 15 Dec 2011 23:01:25 +0000 (23:01 +0000)]
llc: llc_cmsg_rcv was getting called after sk_eat_skb.

[ Upstream commit 9cef310fcdee12b49b8b4c96fd8f611c8873d284 ]

Received non stream protocol packets were calling llc_cmsg_rcv that used a
skb after that skb was released by sk_eat_skb. This caused received STP
packets to generate kernel panics.

Signed-off-by: Alexandru Juncu <ajuncu@ixiacom.com>
Signed-off-by: Kunjan Naik <knaik@ixiacom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agoppp: fix pptp double release_sock in pptp_bind()
Djalal Harouni [Tue, 6 Dec 2011 15:47:12 +0000 (15:47 +0000)]
ppp: fix pptp double release_sock in pptp_bind()

[ Upstream commit a454daceb78844a09c08b6e2d8badcb76a5d73b9 ]

Signed-off-by: Djalal Harouni <tixxdz@opendz.org>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agonet: bpf_jit: fix an off-one bug in x86_64 cond jump target
Markus Kötter [Sat, 17 Dec 2011 11:39:08 +0000 (11:39 +0000)]
net: bpf_jit: fix an off-one bug in x86_64 cond jump target

[ Upstream commit a03ffcf873fe0f2565386ca8ef832144c42e67fa ]

x86 jump instruction size is 2 or 5 bytes (near/long jump), not 2 or 6
bytes.

In case a conditional jump is followed by a long jump, conditional jump
target is one byte past the start of target instruction.

Signed-off-by: Markus Kötter <nepenthesdev@gmail.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agosparc: Fix handling of orig_i0 wrt. debugging when restarting syscalls.
David S. Miller [Mon, 26 Dec 2011 17:30:13 +0000 (12:30 -0500)]
sparc: Fix handling of orig_i0 wrt. debugging when restarting syscalls.

[ A combination of upstream commits 1d299bc7732c34d85bd43ac1a8745f5a2fed2078 and
  e88d2468718b0789b4c33da2f7e1cef2a1eee279 ]

Although we provide a proper way for a debugger to control whether
syscall restart occurs, we run into problems because orig_i0 is not
saved and restored properly.

Luckily we can solve this problem without having to make debuggers
aware of the issue.  Across system calls, several registers are
considered volatile and can be safely clobbered.

Therefore we use the pt_regs save area of one of those registers, %g6,
as a place to save and restore orig_i0.

Debuggers transparently will do the right thing because they save and
restore this register already.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agosparc64: Fix masking and shifting in VIS fpcmp emulation.
David S. Miller [Mon, 31 Oct 2011 08:05:49 +0000 (01:05 -0700)]
sparc64: Fix masking and shifting in VIS fpcmp emulation.

[ Upstream commit 2e8ecdc008a16b9a6c4b9628bb64d0d1c05f9f92 ]

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agosparc32: Correct the return value of memcpy.
David S. Miller [Wed, 19 Oct 2011 22:31:55 +0000 (15:31 -0700)]
sparc32: Correct the return value of memcpy.

[ Upstream commit a52312b88c8103e965979a79a07f6b34af82ca4b ]

Properly return the original destination buffer pointer.

Signed-off-by: David S. Miller <davem@davemloft.net>
Tested-by: Kjetil Oftedal <oftedal@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agosparc32: Remove uses of %g7 in memcpy implementation.
David S. Miller [Wed, 19 Oct 2011 22:30:14 +0000 (15:30 -0700)]
sparc32: Remove uses of %g7 in memcpy implementation.

[ Upstream commit 21f74d361dfd6a7d0e47574e315f780d8172084a ]

This is setting things up so that we can correct the return
value, so that it properly returns the original destination
buffer pointer.

Signed-off-by: David S. Miller <davem@davemloft.net>
Tested-by: Kjetil Oftedal <oftedal@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agosparc32: Remove non-kernel code from memcpy implementation.
David S. Miller [Wed, 19 Oct 2011 22:15:58 +0000 (15:15 -0700)]
sparc32: Remove non-kernel code from memcpy implementation.

[ Upstream commit 045b7de9ca0cf09f1adc3efa467f668b89238390 ]

Signed-off-by: David S. Miller <davem@davemloft.net>
Tested-by: Kjetil Oftedal <oftedal@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agosparc: Kill custom io_remap_pfn_range().
David S. Miller [Fri, 18 Nov 2011 02:17:59 +0000 (18:17 -0800)]
sparc: Kill custom io_remap_pfn_range().

[ Upstream commit 3e37fd3153ac95088a74f5e7c569f7567e9f993a ]

To handle the large physical addresses, just make a simple wrapper
around remap_pfn_range() like MIPS does.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agosparc64: Patch sun4v code sequences properly on module load.
David S. Miller [Fri, 18 Nov 2011 06:44:58 +0000 (22:44 -0800)]
sparc64: Patch sun4v code sequences properly on module load.

[ Upstream commit 0b64120cceb86e93cb1bda0dc055f13016646907 ]

Some of the sun4v code patching occurs in inline functions visible
to, and usable by, modules.

Therefore we have to patch them up during module load.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agosparc32: Be less strict in matching %lo part of relocation.
David S. Miller [Wed, 14 Dec 2011 18:05:22 +0000 (10:05 -0800)]
sparc32: Be less strict in matching %lo part of relocation.

[ Upstream commit b1f44e13a525d2ffb7d5afe2273b7169d6f2222e ]

The "(insn & 0x01800000) != 0x01800000" test matches 'restore'
but that is a legitimate place to see the %lo() part of a 32-bit
symbol relocation, particularly in tail calls.

Signed-off-by: David S. Miller <davem@davemloft.net>
Tested-by: Sergei Trofimovich <slyfox@gentoo.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agosparc64: Fix MSIQ HV call ordering in pci_sun4v_msiq_build_irq().
David S. Miller [Thu, 22 Dec 2011 21:23:59 +0000 (13:23 -0800)]
sparc64: Fix MSIQ HV call ordering in pci_sun4v_msiq_build_irq().

[ Upstream commit 7cc8583372a21d98a23b703ad96cab03180b5030 ]

This silently was working for many years and stopped working on
Niagara-T3 machines.

We need to set the MSIQ to VALID before we can set it's state to IDLE.

On Niagara-T3, setting the state to IDLE first was causing HV_EINVAL
errors.  The hypervisor documentation says, rather ambiguously, that
the MSIQ must be "initialized" before one can set the state.

I previously understood this to mean merely that a successful setconf()
operation has been performed on the MSIQ, which we have done at this
point.  But it seems to also mean that it has been set VALID too.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agomm: hugetlb: fix non-atomic enqueue of huge page
Hillf Danton [Wed, 28 Dec 2011 23:57:16 +0000 (15:57 -0800)]
mm: hugetlb: fix non-atomic enqueue of huge page

commit b0365c8d0cb6e79eb5f21418ae61ab511f31b575 upstream.

If a huge page is enqueued under the protection of hugetlb_lock, then the
operation is atomic and safe.

Signed-off-by: Hillf Danton <dhillf@gmail.com>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>