]> www.infradead.org Git - users/hch/dma-mapping.git/log
users/hch/dma-mapping.git
15 years agomm: Move ARCH_SLAB_MINALIGN and ARCH_KMALLOC_MINALIGN to <linux/slab_def.h>
David Woodhouse [Wed, 19 May 2010 11:01:42 +0000 (12:01 +0100)]
mm: Move ARCH_SLAB_MINALIGN and ARCH_KMALLOC_MINALIGN to <linux/slab_def.h>

Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
15 years agoLinus 2.6.34 v2.6.34
Linus Torvalds [Sun, 16 May 2010 21:17:36 +0000 (14:17 -0700)]
Linus 2.6.34

15 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
Linus Torvalds [Sun, 16 May 2010 18:11:53 +0000 (11:11 -0700)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
  rtnetlink: make SR-IOV VF interface symmetric
  sctp: delete active ICMP proto unreachable timer when free transport
  tcp: fix MD5 (RFC2385) support

15 years agoMerge branch 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus
Linus Torvalds [Sun, 16 May 2010 18:11:31 +0000 (11:11 -0700)]
Merge branch 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus

* 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus:
  MIPS: Oprofile: Fix Loongson irq handler
  MIPS: N32: Use compat version for sys_ppoll.
  MIPS FPU emulator: allow Cause bits of FCSR to be writeable by ctc1

15 years agortnetlink: make SR-IOV VF interface symmetric
Chris Wright [Sun, 16 May 2010 08:05:45 +0000 (01:05 -0700)]
rtnetlink: make SR-IOV VF interface symmetric

Now we have a set of nested attributes:

  IFLA_VFINFO_LIST (NESTED)
    IFLA_VF_INFO (NESTED)
      IFLA_VF_MAC
      IFLA_VF_VLAN
      IFLA_VF_TX_RATE

This allows a single set to operate on multiple attributes if desired.
Among other things, it means a dump can be replayed to set state.

The current interface has yet to be released, so this seems like
something to consider for 2.6.34.

Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agosctp: delete active ICMP proto unreachable timer when free transport
Wei Yongjun [Sun, 9 May 2010 16:56:07 +0000 (16:56 +0000)]
sctp: delete active ICMP proto unreachable timer when free transport

transport may be free before ICMP proto unreachable timer expire, so
we should delete active ICMP proto unreachable timer when transport
is going away.

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agotcp: fix MD5 (RFC2385) support
Eric Dumazet [Sun, 16 May 2010 07:34:04 +0000 (00:34 -0700)]
tcp: fix MD5 (RFC2385) support

TCP MD5 support uses percpu data for temporary storage. It currently
disables preemption so that same storage cannot be reclaimed by another
thread on same cpu.

We also have to make sure a softirq handler wont try to use also same
context. Various bug reports demonstrated corruptions.

Fix is to disable preemption and BH.

Reported-by: Bhaskar Dutta <bhaskie@gmail.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years ago MIPS: Oprofile: Fix Loongson irq handler
Wu Zhangjin [Thu, 6 May 2010 16:59:46 +0000 (00:59 +0800)]
MIPS: Oprofile: Fix Loongson irq handler

    The interrupt enable bit for the performance counters is in the Control
    Register $24, not in the counter register.
    loongson2_perfcount_handler(), we need to use

Reported-by: Xu Hengyang <hengyang@mail.ustc.edu.cn>
Signed-off-by: Wu Zhangjin <wuzhangjin@gmail.com>
Cc: linux-mips@linux-mips.org
    Patchwork: http://patchwork.linux-mips.org/patch/1198/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
---

15 years ago MIPS: N32: Use compat version for sys_ppoll.
Chandrakala Chavva [Tue, 11 May 2010 00:11:54 +0000 (17:11 -0700)]
MIPS: N32: Use compat version for sys_ppoll.

    The sys_ppoll() takes struct 'struct timespec'. This is different for the
    N32 and N64 ABIs. Use the compat version to do the proper conversions.

Signed-off-by: David Daney <ddaney@caviumnetworks.com>
    To: linux-mips@linux-mips.org
    Patchwork: http://patchwork.linux-mips.org/patch/1210/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
---

15 years ago MIPS FPU emulator: allow Cause bits of FCSR to be writeable by ctc1
Shane McDonald [Fri, 7 May 2010 05:26:57 +0000 (23:26 -0600)]
MIPS FPU emulator: allow Cause bits of FCSR to be writeable by ctc1

    In the FPU emulator code of the MIPS, the Cause bits of the FCSR register
    are not currently writeable by the ctc1 instruction.  In odd corner cases,
    this can cause problems.  For example, a case existed where a divide-by-zero
    exception was generated by the FPU, and the signal handler attempted to
    restore the FPU registers to their state before the exception occurred.  In
    this particular setup, writing the old value to the FCSR register would
    cause another divide-by-zero exception to occur immediately.  The solution
    is to change the ctc1 instruction emulator code to allow the Cause bits of
    the FCSR register to be writeable.  This is the behaviour of the hardware
    that the code is emulating.

    This problem was found by Shane McDonald, but the credit for the fix goes
    to Kevin Kissell.  In Kevin's words:

    I submit that the bug is indeed in that ctc_op:  case of the emulator.  The
    Cause bits (17:12) are supposed to be writable by that instruction, but the
    CTC1 emulation won't let them be updated by the instruction.  I think that
    actually if you just completely removed lines 387-388 [...] things would
    work a good deal better.  At least, it would be a more accurate emulation of
    the architecturally defined FPU.  If I wanted to be really, really pedantic
    (which I sometimes do), I'd also protect the reserved bits that aren't
    necessarily writable.

Signed-off-by: Shane McDonald <mcdonald.shane@gmail.com>
    To: anemo@mba.ocn.ne.jp
    To: kevink@paralogos.com
    To: sshtylyov@mvista.com
    Patchwork: http://patchwork.linux-mips.org/patch/1205/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
---

15 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable
Linus Torvalds [Sat, 15 May 2010 19:55:31 +0000 (12:55 -0700)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable

* git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable:
  Btrfs: check for read permission on src file in the clone ioctl

15 years agolib/btree: fix possible NULL pointer dereference
kirjanov@gmail.com [Sat, 15 May 2010 16:32:34 +0000 (12:32 -0400)]
lib/btree: fix possible NULL pointer dereference

mempool_alloc() can return null in atomic case.

Signed-off-by: Denis Kirjanov <kirjanov@gmail.com>
Cc: Joern Engel <joern@logfs.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agommc: at91_mci: modify cache flush routines
Nicolas Ferre [Sat, 15 May 2010 16:32:31 +0000 (12:32 -0400)]
mmc: at91_mci: modify cache flush routines

As we were using an internal dma flushing routine, this patch changes to
the DMA API flush_kernel_dcache_page().  Driver is able to compile now.

[akpm@linux-foundation.org: flush_kernel_dcache_page() comes before kunmap_atomic()]
Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoBtrfs: check for read permission on src file in the clone ioctl
Dan Rosenberg [Sat, 15 May 2010 15:27:37 +0000 (11:27 -0400)]
Btrfs: check for read permission on src file in the clone ioctl

The existing code would have allowed you to clone a file that was
only open for writing

Signed-off-by: Chris Mason <chris.mason@oracle.com>
15 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6
Linus Torvalds [Sat, 15 May 2010 16:03:15 +0000 (09:03 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
  JFS: Free sbi memory in error path
  fs/sysv: dereferencing ERR_PTR()
  Fix double-free in logfs
  Fix the regression created by "set S_DEAD on unlink()..." commit

15 years agoMerge branch 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sat, 15 May 2010 16:03:02 +0000 (09:03 -0700)]
Merge branch 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  perf record: Add a fallback to the reference relocation symbol

15 years agoJFS: Free sbi memory in error path
Jan Blunck [Mon, 12 Apr 2010 23:44:08 +0000 (16:44 -0700)]
JFS: Free sbi memory in error path

I spotted the missing kfree() while removing the BKL.

[akpm@linux-foundation.org: avoid multiple returns so it doesn't happen again]
Signed-off-by: Jan Blunck <jblunck@suse.de>
Cc: Dave Kleikamp <shaggy@austin.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
15 years agofs/sysv: dereferencing ERR_PTR()
Dan Carpenter [Wed, 21 Apr 2010 10:30:32 +0000 (12:30 +0200)]
fs/sysv: dereferencing ERR_PTR()

I moved the dir_put_page() inside the if condition so we don't dereference
"page", if it's an ERR_PTR().

Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
15 years agoFix double-free in logfs
Al Viro [Thu, 29 Apr 2010 00:57:02 +0000 (20:57 -0400)]
Fix double-free in logfs

iput() is needed *until* we'd done successful d_alloc_root()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
15 years agoFix the regression created by "set S_DEAD on unlink()..." commit
Al Viro [Fri, 30 Apr 2010 21:17:09 +0000 (17:17 -0400)]
Fix the regression created by "set S_DEAD on unlink()..." commit

1) i_flags simply doesn't work for mount/unlink race prevention;
we may have many links to file and rm on one of those obviously
shouldn't prevent bind on top of another later on.  To fix it
right way we need to mark _dentry_ as unsuitable for mounting
upon; new flag (DCACHE_CANT_MOUNT) is protected by d_flags and
i_mutex on the inode in question.  Set it (with dont_mount(dentry))
in unlink/rmdir/etc., check (with cant_mount(dentry)) in places
in namespace.c that used to check for S_DEAD.  Setting S_DEAD
is still needed in places where we used to set it (for directories
getting killed), since we rely on it for readdir/rmdir race
prevention.

2) rename()/mount() protection has another bogosity - we unhash
the target before we'd checked that it's not a mountpoint.  Fixed.

3) ancient bogosity in pivot_root() - we locked i_mutex on the
right directory, but checked S_DEAD on the different (and wrong)
one.  Noticed and fixed.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
15 years agoMerge master.kernel.org:/home/rmk/linux-2.6-arm
Linus Torvalds [Sat, 15 May 2010 04:28:42 +0000 (21:28 -0700)]
Merge master.kernel.org:/home/rmk/linux-2.6-arm

* master.kernel.org:/home/rmk/linux-2.6-arm:
  ARM: 6126/1: ARM mpcore_wdt: fix build failure and other fixes
  ARM: 6125/1: ARM TWD: move TWD registers to common header
  ARM: 6110/1: Fix Thumb-2 kernel builds when UACCESS_WITH_MEMCPY is enabled
  ARM: 6112/1: Use the Inner Shareable I-cache and BTB ops on ARMv7 SMP
  ARM: 6111/1: Implement read/write for ownership in the ARMv6 DMA cache ops
  ARM: 6106/1: Implement copy_to_user_page() for noMMU
  ARM: 6105/1: Fix the __arm_ioremap_caller() definition in nommu.c

15 years agoMerge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sat, 15 May 2010 04:28:23 +0000 (21:28 -0700)]
Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86, mrst: Don't blindly access extended config space

15 years agoprofile: fix stats and data leakage
Hugh Dickins [Sat, 15 May 2010 02:44:10 +0000 (19:44 -0700)]
profile: fix stats and data leakage

If the kernel is large or the profiling step small, /proc/profile
leaks data and readprofile shows silly stats, until readprofile -r
has reset the buffer: clear the prof_buffer when it is vmalloc()ed.

Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agohughd: update email address
Hugh Dickins [Sat, 15 May 2010 02:40:35 +0000 (19:40 -0700)]
hughd: update email address

My old address will shut down in a couple of weeks: update the tree.

Signed-off-by: Hugh Dickins <hugh.dickins@tiscali.co.uk>
Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agox86, mrst: Don't blindly access extended config space
H. Peter Anvin [Fri, 14 May 2010 20:55:57 +0000 (13:55 -0700)]
x86, mrst: Don't blindly access extended config space

Do not blindly access extended configuration space unless we actively
know we're on a Moorestown platform.  The fixed-size BAR capability
lives in the extended configuration space, and thus is not applicable
if the configuration space isn't appropriately sized.

This fixes booting certain VMware configurations with CONFIG_MRST=y.

Moorestown will add a fake PCI-X 266 capability to advertise the
presence of extended configuration space.

Reported-and-tested-by: Petr Vandrovec <petr@vandrovec.name>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Acked-by: Jacob Pan <jacob.jun.pan@intel.com>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
LKML-Reference: <AANLkTiltKUa3TrKR1M51eGw8FLNoQJSLT0k0_K5X3-OJ@mail.gmail.com>

15 years agoMerge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Fri, 14 May 2010 19:20:09 +0000 (12:20 -0700)]
Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86, cacheinfo: Turn off L3 cache index disable feature in virtualized environments
  x86, k8: Fix build error when K8_NB is disabled
  x86, amd: Check X86_FEATURE_OSVW bit before accessing OSVW MSRs
  x86: Fix fake apicid to node mapping for numa emulation

15 years agox86, cacheinfo: Turn off L3 cache index disable feature in virtualized environments
Frank Arnold [Thu, 22 Apr 2010 14:06:59 +0000 (16:06 +0200)]
x86, cacheinfo: Turn off L3 cache index disable feature in virtualized environments

When running a quest kernel on xen we get:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000038
IP: [<ffffffff8142f2fb>] cpuid4_cache_lookup_regs+0x2ca/0x3df
PGD 0
Oops: 0000 [#1] SMP
last sysfs file:
CPU 0
Modules linked in:

Pid: 0, comm: swapper Tainted: G        W  2.6.34-rc3 #1 /HVM domU
RIP: 0010:[<ffffffff8142f2fb>]  [<ffffffff8142f2fb>] cpuid4_cache_lookup_regs+0x
2ca/0x3df
RSP: 0018:ffff880002203e08  EFLAGS: 00010046
RAX: 0000000000000000 RBX: 0000000000000003 RCX: 0000000000000060
RDX: 0000000000000000 RSI: 0000000000000040 RDI: 0000000000000000
RBP: ffff880002203ed8 R08: 00000000000017c0 R09: ffff880002203e38
R10: ffff8800023d5d40 R11: ffffffff81a01e28 R12: ffff880187e6f5c0
R13: ffff880002203e34 R14: ffff880002203e58 R15: ffff880002203e68
FS:  0000000000000000(0000) GS:ffff880002200000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000038 CR3: 0000000001a3c000 CR4: 00000000000006f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process swapper (pid: 0, threadinfo ffffffff81a00000, task ffffffff81a44020)
Stack:
 ffffffff810d7ecb ffff880002203e20 ffffffff81059140 ffff880002203e30
<0> ffffffff810d7ec9 0000000002203e40 000000000050d140 ffff880002203e70
<0> 0000000002008140 0000000000000086 ffff880040020140 ffffffff81068b8b
Call Trace:
 <IRQ>
 [<ffffffff810d7ecb>] ? sync_supers_timer_fn+0x0/0x1c
 [<ffffffff81059140>] ? mod_timer+0x23/0x25
 [<ffffffff810d7ec9>] ? arm_supers_timer+0x34/0x36
 [<ffffffff81068b8b>] ? hrtimer_get_next_event+0xa7/0xc3
 [<ffffffff81058e85>] ? get_next_timer_interrupt+0x19a/0x20d
 [<ffffffff8142fa23>] get_cpu_leaves+0x5c/0x232
 [<ffffffff8106a7b1>] ? sched_clock_local+0x1c/0x82
 [<ffffffff8106a9a0>] ? sched_clock_tick+0x75/0x7a
 [<ffffffff8107748c>] generic_smp_call_function_single_interrupt+0xae/0xd0
 [<ffffffff8101f6ef>] smp_call_function_single_interrupt+0x18/0x27
 [<ffffffff8100a773>] call_function_single_interrupt+0x13/0x20
 <EOI>
 [<ffffffff8143c468>] ? notifier_call_chain+0x14/0x63
 [<ffffffff810295c6>] ? native_safe_halt+0xc/0xd
 [<ffffffff810114eb>] ? default_idle+0x36/0x53
 [<ffffffff81008c22>] cpu_idle+0xaa/0xe4
 [<ffffffff81423a9a>] rest_init+0x7e/0x80
 [<ffffffff81b10dd2>] start_kernel+0x40e/0x419
 [<ffffffff81b102c8>] x86_64_start_reservations+0xb3/0xb7
 [<ffffffff81b103c4>] x86_64_start_kernel+0xf8/0x107
Code: 14 d5 40 ff ae 81 8b 14 02 31 c0 3b 15 47 1c 8b 00 7d 0e 48 8b 05 36 1c 8b
 00 48 63 d2 48 8b 04 d0 c7 85 5c ff ff ff 00 00 00 00 <8b> 70 38 48 8d 8d 5c ff
 ff ff 48 8b 78 10 ba c4 01 00 00 e8 eb
RIP  [<ffffffff8142f2fb>] cpuid4_cache_lookup_regs+0x2ca/0x3df
 RSP <ffff880002203e08>
CR2: 0000000000000038
---[ end trace a7919e7f17c0a726 ]---

The L3 cache index disable feature of AMD CPUs has to be disabled if the
kernel is running as guest on top of a hypervisor because northbridge
devices are not available to the guest. Currently, this fixes a boot
crash on top of Xen. In the future this will become an issue on KVM as
well.

Check if northbridge devices are present and do not enable the feature
if there are none.

[ hpa: backported to 2.6.34 ]

Signed-off-by: Frank Arnold <frank.arnold@amd.com>
LKML-Reference: <1271945222-5283-3-git-send-email-bp@amd64.org>
Acked-by: Borislav Petkov <borislav.petkov@amd.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Cc: <stable@kernel.org>
15 years agox86, k8: Fix build error when K8_NB is disabled
Borislav Petkov [Sat, 24 Apr 2010 07:56:53 +0000 (09:56 +0200)]
x86, k8: Fix build error when K8_NB is disabled

K8_NB depends on PCI and when the last is disabled (allnoconfig) we fail
at the final linking stage due to missing exported num_k8_northbridges.
Add a header stub for that.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
LKML-Reference: <20100503183036.GJ26107@aftab>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Cc: <stable@kernel.org>
15 years agoMerge branch 'for-linus' of git://git.infradead.org/users/eparis/notify
Linus Torvalds [Fri, 14 May 2010 18:49:42 +0000 (11:49 -0700)]
Merge branch 'for-linus' of git://git.infradead.org/users/eparis/notify

* 'for-linus' of git://git.infradead.org/users/eparis/notify:
  inotify: don't leak user struct on inotify release
  inotify: race use after free/double free in inotify inode marks
  inotify: clean up the inotify_add_watch out path
  Inotify: undefined reference to `anon_inode_getfd'

Manual merge to remove duplicate "select ANON_INODES" from Kconfig file

15 years agoMerge branch 'davinci-fixes-for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Fri, 14 May 2010 18:43:52 +0000 (11:43 -0700)]
Merge branch 'davinci-fixes-for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/khilman/linux-davinci

* 'davinci-fixes-for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/khilman/linux-davinci:
  DA830: fix USB 2.0 clock entry

15 years agoDA830: fix USB 2.0 clock entry
Sergei Shtylyov [Thu, 13 May 2010 18:51:51 +0000 (22:51 +0400)]
DA830: fix USB 2.0 clock entry

DA8xx OHCI driver fails to load due to failing clk_get() call for the USB 2.0
clock. Arrange matching USB 2.0 clock by the clock name instead of the device.
(Adding another CLK() entry for "ohci.0" device won't do -- in the future I'll
also have to enable USB 2.0 clock to configure CPPI 4.1 module, in which case
I won't have any device at all.)

Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Kevin Hilman <khilman@deeprootsystems.com>
15 years agoinotify: don't leak user struct on inotify release
Pavel Emelyanov [Wed, 12 May 2010 22:34:07 +0000 (15:34 -0700)]
inotify: don't leak user struct on inotify release

inotify_new_group() receives a get_uid-ed user_struct and saves the
reference on group->inotify_data.user.  The problem is that free_uid() is
never called on it.

Issue seem to be introduced by 63c882a0 (inotify: reimplement inotify
using fsnotify) after 2.6.30.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Eric Paris <eparis@parisplace.org>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Eric Paris <eparis@redhat.com>
15 years agoinotify: race use after free/double free in inotify inode marks
Eric Paris [Tue, 11 May 2010 21:17:40 +0000 (17:17 -0400)]
inotify: race use after free/double free in inotify inode marks

There is a race in the inotify add/rm watch code.  A task can find and
remove a mark which doesn't have all of it's references.  This can
result in a use after free/double free situation.

Task A Task B
------------ -----------
inotify_new_watch()
 allocate a mark (refcnt == 1)
 add it to the idr
inotify_rm_watch()
 inotify_remove_from_idr()
  fsnotify_put_mark()
      refcnt hits 0, free
 take reference because we are on idr
 [at this point it is a use after free]
 [time goes on]
 refcnt may hit 0 again, double free

The fix is to take the reference BEFORE the object can be found in the
idr.

Signed-off-by: Eric Paris <eparis@redhat.com>
Cc: <stable@kernel.org>
15 years agoinotify: clean up the inotify_add_watch out path
Eric Paris [Tue, 11 May 2010 21:16:23 +0000 (17:16 -0400)]
inotify: clean up the inotify_add_watch out path

inotify_add_watch explictly frees the unused inode mark, but it can just
use the generic code.  Just do that.

Signed-off-by: Eric Paris <eparis@redhat.com>
15 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
Linus Torvalds [Fri, 14 May 2010 14:56:45 +0000 (07:56 -0700)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
  vhost: fix barrier pairing

15 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris...
Linus Torvalds [Fri, 14 May 2010 14:55:42 +0000 (07:55 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6:
  mmap_min_addr check CAP_SYS_RAWIO only for write

15 years agoMerge branch 'for-linus' of git://git.monstr.eu/linux-2.6-microblaze
Linus Torvalds [Fri, 14 May 2010 14:29:29 +0000 (07:29 -0700)]
Merge branch 'for-linus' of git://git.monstr.eu/linux-2.6-microblaze

* 'for-linus' of git://git.monstr.eu/linux-2.6-microblaze:
  microblaze: Fix module loading on system with WB cache
  microblaze: export assembly functions used by modules
  microblaze: Remove powerpc code from Microblaze port
  microblaze: Remove compilation warnings in cache macro
  microblaze: export assembly functions used by modules
  microblaze: fix get_user/put_user side-effects
  microblaze: re-enable interrupts before calling schedule

15 years agoMerge branch 'net-2.6' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
David S. Miller [Fri, 14 May 2010 10:42:49 +0000 (03:42 -0700)]
Merge branch 'net-2.6' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost

15 years agommap_min_addr check CAP_SYS_RAWIO only for write
Kees Cook [Thu, 22 Apr 2010 19:19:17 +0000 (12:19 -0700)]
mmap_min_addr check CAP_SYS_RAWIO only for write

Redirecting directly to lsm, here's the patch discussed on lkml:
http://lkml.org/lkml/2010/4/22/219

The mmap_min_addr value is useful information for an admin to see without
being root ("is my system vulnerable to kernel NULL pointer attacks?") and
its setting is trivially easy for an attacker to determine by calling
mmap() in PAGE_SIZE increments starting at 0, so trying to keep it private
has no value.

Only require CAP_SYS_RAWIO if changing the value, not reading it.

Comment from Serge :

  Me, I like to write my passwords with light blue pen on dark blue
  paper, pasted on my window - if you're going to get my password, you're
  gonna get a headache.

Signed-off-by: Kees Cook <kees.cook@canonical.com>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
(cherry picked from commit 822cceec7248013821d655545ea45d1c6a9d15b3)

15 years agomicroblaze: Fix module loading on system with WB cache
Michal Simek [Fri, 14 May 2010 05:40:46 +0000 (07:40 +0200)]
microblaze: Fix module loading on system with WB cache

There is necessary to flush whole dcache. Icache work should be
done in kernel/module.c.

Signed-off-by: Michal Simek <monstr@monstr.eu>
15 years agox86, amd: Check X86_FEATURE_OSVW bit before accessing OSVW MSRs
Andreas Herrmann [Tue, 27 Apr 2010 10:13:48 +0000 (12:13 +0200)]
x86, amd: Check X86_FEATURE_OSVW bit before accessing OSVW MSRs

If host CPU is exposed to a guest the OSVW MSRs are not guaranteed
to be present and a GP fault occurs. Thus checking the feature flag is
essential.

Cc: <stable@kernel.org> # .32.x .33.x
Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com>
LKML-Reference: <20100427101348.GC4489@alberich.amd.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
15 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-2.6
Linus Torvalds [Thu, 13 May 2010 21:48:10 +0000 (14:48 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-2.6

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-2.6:
  mfd: Clean up after WM83xx AUXADC interrupt if it arrives late

15 years agoMerge branch 'kvm-updates/2.6.34' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Linus Torvalds [Thu, 13 May 2010 21:36:19 +0000 (14:36 -0700)]
Merge branch 'kvm-updates/2.6.34' of git://git.kernel.org/pub/scm/virt/kvm/kvm

* 'kvm-updates/2.6.34' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: PPC: Keep index within boundaries in kvmppc_44x_emul_tlbwe()
  KVM: VMX: blocked-by-sti must not defer NMI injections
  KVM: x86: Call vcpu_load and vcpu_put in cpuid_update
  KVM: SVM: Fix wrong intercept masks on 32 bit
  KVM: convert ioapic lock to spinlock

15 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty-2.6
Linus Torvalds [Thu, 13 May 2010 19:21:44 +0000 (12:21 -0700)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty-2.6:
  serial: imx.c: fix CTS trigger level lower to avoid lost chars
  tty: Fix unbalanced BKL handling in error path
  serial: mpc52xx_uart: fix null pointer dereference

15 years agoserial: imx.c: fix CTS trigger level lower to avoid lost chars
Valentin Longchamp [Wed, 5 May 2010 09:47:07 +0000 (11:47 +0200)]
serial: imx.c: fix CTS trigger level lower to avoid lost chars

The imx CTS trigger level is left at its reset value that is 32
chars. Since the RX FIFO has 32 entries, when CTS is raised, the
FIFO already is full. However, some serial port devices first empty
their TX FIFO before stopping when CTS is raised, resulting in lost
chars.

This patch sets the trigger level lower so that other chars arrive
after CTS is raised, there is still room for 16 of them.

Signed-off-by: Valentin Longchamp<valentin.longchamp@epfl.ch>
Tested-by: Philippe Rétornaz<philippe.retornaz@epfl.ch>
Acked-by: Wolfram Sang<w.sang@pengutronix.de>
Cc: stable <stable@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agotty: Fix unbalanced BKL handling in error path
Alan Cox [Tue, 4 May 2010 19:42:36 +0000 (20:42 +0100)]
tty: Fix unbalanced BKL handling in error path

Arnd noted:

After the "retry_open:" label, we first get the tty_mutex
and then the BKL. However a the end of tty_open, we jump
back to retry_open with the BKL still held. If we run into
this case, the tty_open function will be left with the BKL
still held.

Signed-off-by: Alan Cox <alan@linux.intel.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Cc: stable <stable@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoserial: mpc52xx_uart: fix null pointer dereference
Anatolij Gustschin [Tue, 4 May 2010 22:18:59 +0000 (00:18 +0200)]
serial: mpc52xx_uart: fix null pointer dereference

Commit 6acc6833510db8f72b5ef343296d97480555fda9
introduced NULL pointer dereference and kernel crash
on ppc32 machines while booting. Fix this bug now.

Reported-by: Leonardo Chiquitto <leonardo.lists@gmail.com>
Tested-by: Leonardo Chiquitto <leonardo.lists@gmail.com>
Signed-off-by: Anatolij Gustschin <agust@denx.de>
Acked-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sfrench...
Linus Torvalds [Thu, 13 May 2010 17:36:16 +0000 (10:36 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6:
  cifs: guard against hardlinking directories

15 years agovfs: Fix O_NOFOLLOW behavior for paths with trailing slashes
Jan Kara [Thu, 13 May 2010 10:52:57 +0000 (12:52 +0200)]
vfs: Fix O_NOFOLLOW behavior for paths with trailing slashes

According to specification

mkdir d; ln -s d a; open("a/", O_NOFOLLOW | O_RDONLY)

should return success but currently it returns ELOOP.  This is a
regression caused by path lookup cleanup patch series.

Fix the code to ignore O_NOFOLLOW in case the provided path has trailing
slashes.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Reported-by: Marius Tolzmann <tolzmann@molgen.mpg.de>
Acked-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6
Linus Torvalds [Thu, 13 May 2010 14:35:26 +0000 (07:35 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6:
  ALSA: ice1724 - Fix ESI Maya44 capture source control
  ALSA: pcm - Use pgprot_noncached() for MIPS non-coherent archs
  ALSA: virtuoso: fix Xonar D1/DX front panel microphone
  ALSA: hda - Add hp-dv4 model for IDT 92HD71bx
  ALSA: hda - Fix mute-LED GPIO pin for HP dv series
  ALSA: hda: Fix 0 dB for Lenovo models using Conexant CX20549 (Venice)

15 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Linus Torvalds [Thu, 13 May 2010 14:28:43 +0000 (07:28 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Input: ad7877 - keep dma rx buffers in seperate cache lines
  Input: psmouse - reset all types of mice before reconnecting
  Input: elantech - use all 3 bytes when checking version
  Input: iforce - fix Guillemot Jet Leader 3D entry
  Input: iforce - add Guillemot Jet Leader Force Feedback

15 years agomfd: Clean up after WM83xx AUXADC interrupt if it arrives late
Mark Brown [Fri, 2 Apr 2010 12:08:39 +0000 (13:08 +0100)]
mfd: Clean up after WM83xx AUXADC interrupt if it arrives late

In certain circumstances, especially under heavy load, the AUXADC
completion interrupt may be detected after we've timed out waiting for
it.  That conversion would still succeed but the next conversion will
see the completion that was signalled by the interrupt for the previous
conversion and therefore not wait for the AUXADC conversion to run,
causing it to report failure.

Provide a simple, non-invasive cleanup by using try_wait_for_completion()
to ensure that the completion is not signalled before we wait.  Since
the AUXADC is run within a mutex we know there can only have been at
most one AUXADC interrupt outstanding.  A more involved change should
follow for the next merge window.

Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
15 years agomicroblaze: export assembly functions used by modules
Michal Simek [Thu, 13 May 2010 10:11:42 +0000 (12:11 +0200)]
microblaze: export assembly functions used by modules

Export __strncpy_user, memory_size, ioremap_bot for modules.

Signed-off-by: Michal Simek <monstr@monstr.eu>
15 years agomicroblaze: Remove powerpc code from Microblaze port
Michal Simek [Thu, 13 May 2010 10:09:54 +0000 (12:09 +0200)]
microblaze: Remove powerpc code from Microblaze port

Remove eeh_add_device_tree_late which is powerpc specific code.

Signed-off-by: Michal Simek <monstr@monstr.eu>
15 years agomicroblaze: Remove compilation warnings in cache macro
Michal Simek [Thu, 13 May 2010 08:55:47 +0000 (10:55 +0200)]
microblaze: Remove compilation warnings in cache macro

CC      arch/microblaze/kernel/cpu/cache.o
arch/microblaze/kernel/cpu/cache.c: In function '__invalidate_dcache_range_wb':
arch/microblaze/kernel/cpu/cache.c:398: warning: ISO C90 forbids mixed declarations and code
arch/microblaze/kernel/cpu/cache.c: In function '__flush_dcache_range_wb':
arch/microblaze/kernel/cpu/cache.c:509: warning: ISO C90 forbids mixed declara

Signed-off-by: Michal Simek <monstr@monstr.eu>
15 years agomicroblaze: export assembly functions used by modules
Steven J. Magnani [Tue, 27 Apr 2010 18:00:35 +0000 (13:00 -0500)]
microblaze: export assembly functions used by modules

Modules that use copy_{to,from}_user(), memcpy(), and memset() fail to build
in certain circumstances.

Signed-off-by: Steven J. Magnani <steve@digidescorp.com>
Signed-off-by: Michal Simek <monstr@monstr.eu>
15 years agoMerge branch 'fix/hda' into for-linus
Takashi Iwai [Thu, 13 May 2010 08:07:15 +0000 (10:07 +0200)]
Merge branch 'fix/hda' into for-linus

15 years agoInput: ad7877 - keep dma rx buffers in seperate cache lines
Oskar Schirmer [Thu, 13 May 2010 07:42:23 +0000 (00:42 -0700)]
Input: ad7877 - keep dma rx buffers in seperate cache lines

With dma based spi transmission, data corruption is observed
occasionally. With dma buffers located right next to msg and
xfer fields, cache lines correctly flushed in preparation for
dma usage may be polluted again when writing to fields in the
same cache line.

Make sure cache fields used with dma do not share cache lines
with fields changed during dma handling. As both fields are part
of a struct that is allocated via kzalloc, thus cache aligned,
moving the fields to the 1st position and insert padding for
alignment does the job.

Signed-off-by: Oskar Schirmer <os@emlix.com>
Signed-off-by: Daniel Glöckner <dg@emlix.com>
Signed-off-by: Oliver Schneidewind <osw@emlix.com>
Signed-off-by: Johannes Weiner <jw@emlix.com>
Acked-by: Mike Frysinger <vapier@gentoo.org>
[dtor@mail.ru - changed to use ___cacheline_aligned as suggested
 by akpm]
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
15 years agoInput: psmouse - reset all types of mice before reconnecting
Dmitry Torokhov [Thu, 13 May 2010 07:42:23 +0000 (00:42 -0700)]
Input: psmouse - reset all types of mice before reconnecting

Synaptics hardware requires resetting device after suspend to ram
in order for the device to be operational. The reset lives in
synaptics-specific reconnect handler, but it is not being invoked
if synaptics support is disabled and the device is handled as a
standard PS/2 device (bare or IntelliMouse protocol).

Let's add reset into generic reconnect handler as well.

Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
15 years agoInput: elantech - use all 3 bytes when checking version
Dmitry Torokhov [Thu, 13 May 2010 07:41:15 +0000 (00:41 -0700)]
Input: elantech - use all 3 bytes when checking version

Apparently all 3 bytes returned by ETP_FW_VERSION_QUERY are significant
and should be taken into account when matching hardware version/features.

Tested-by: Eric Piel <eric.piel@tremplin-utc.net>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
15 years agomicroblaze: fix get_user/put_user side-effects
Steven J. Magnani [Thu, 6 May 2010 21:38:33 +0000 (16:38 -0500)]
microblaze: fix get_user/put_user side-effects

The Microblaze implementations of get_user() and (MMU) put_user() evaluate
the address argument more than once. This causes unexpected side-effects for
invocations that include increment operators, i.e. get_user(foo, bar++).

This patch also removes the distinction between MMU and noMMU put_user().

Without the patch:
  $ echo 1234567890 > /proc/sys/kernel/core_pattern
  $ cat /proc/sys/kernel/core_pattern
  12345

Signed-off-by: Steven J. Magnani <steve@digidescorp.com>
15 years agomicroblaze: re-enable interrupts before calling schedule
Steven J. Magnani [Tue, 27 Apr 2010 18:00:23 +0000 (13:00 -0500)]
microblaze: re-enable interrupts before calling schedule

schedule() should not be called with interrupts disabled.

Signed-off-by: Steven J. Magnani <steve@digidescorp.com>
Signed-off-by: Michal Simek <monstr@monstr.eu>
15 years agoperf record: Add a fallback to the reference relocation symbol
Arnaldo Carvalho de Melo [Tue, 30 Mar 2010 21:27:39 +0000 (18:27 -0300)]
perf record: Add a fallback to the reference relocation symbol

Usually "_text" is enough, but I received reports that its not always
available, so fallback to "_stext" for the symbol we use to check if we
need to apply any relocation to all the symbols in the kernel symtab,
for when, for instance, kexec is being used.

Reported-by: Darren Hart <dvhltc@us.ibm.com>
Reported-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoKVM: PPC: Keep index within boundaries in kvmppc_44x_emul_tlbwe()
Roel Kluin [Sun, 9 May 2010 15:26:47 +0000 (17:26 +0200)]
KVM: PPC: Keep index within boundaries in kvmppc_44x_emul_tlbwe()

An index of KVM44x_GUEST_TLB_SIZE is already one too large.

Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Acked-by: Hollis Blanchard <hollis@penguinppc.org>
Acked-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
15 years agoKVM: VMX: blocked-by-sti must not defer NMI injections
Jan Kiszka [Tue, 11 May 2010 13:16:46 +0000 (15:16 +0200)]
KVM: VMX: blocked-by-sti must not defer NMI injections

As the processor may not consider GUEST_INTR_STATE_STI as a reason for
blocking NMI, it could return immediately with EXIT_REASON_NMI_WINDOW
when we asked for it. But as we consider this state as NMI-blocking, we
can run into an endless loop.

Resolve this by allowing NMI injection if just GUEST_INTR_STATE_STI is
active (originally suggested by Gleb). Intel confirmed that this is
safe, the processor will never complain about NMI injection in this
state.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
KVM-Stable-Tag
Acked-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
15 years agoKVM: x86: Call vcpu_load and vcpu_put in cpuid_update
Dongxiao Xu [Tue, 11 May 2010 10:21:33 +0000 (18:21 +0800)]
KVM: x86: Call vcpu_load and vcpu_put in cpuid_update

cpuid_update may operate VMCS, so vcpu_load() and vcpu_put()
should be called to ensure correctness.

Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
15 years agoKVM: SVM: Fix wrong intercept masks on 32 bit
Joerg Roedel [Wed, 5 May 2010 14:04:43 +0000 (16:04 +0200)]
KVM: SVM: Fix wrong intercept masks on 32 bit

This patch makes KVM on 32 bit SVM working again by
correcting the masks used for iret interception. With the
wrong masks the upper 32 bits of the intercepts are masked
out which leaves vmrun unintercepted. This is not legal on
svm and the vmrun fails.
Bug was introduced by commits 95ba827313 and 3cfc3092.

Cc: Jan Kiszka <jan.kiszka@siemens.com>
Cc: Gleb Natapov <gleb@redhat.com>
Cc: stable@kernel.org
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
15 years agoKVM: convert ioapic lock to spinlock
Marcelo Tosatti [Fri, 23 Apr 2010 17:03:38 +0000 (14:03 -0300)]
KVM: convert ioapic lock to spinlock

kvm_set_irq is used from non sleepable contexes, so convert ioapic from
mutex to spinlock.

KVM-Stable-Tag.
Tested-by: Ralf Bonenkamp <ralf.bonenkamp@swyx.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
15 years agoMerge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
Linus Torvalds [Thu, 13 May 2010 01:48:26 +0000 (18:48 -0700)]
Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc

* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
  powerpc/perf_event: Fix oops due to perf_event_do_pending call
  powerpc/swiotlb: Fix off by one in determining boundary of which ops to use

15 years agoMerge branch 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6
Linus Torvalds [Thu, 13 May 2010 01:47:55 +0000 (18:47 -0700)]
Merge branch 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6

* 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6:
  [S390] correct address of _stext with CONFIG_SHARED_KERNEL=y
  [S390] ptrace: fix return value of do_syscall_trace_enter()
  [S390] dasd: fix race between tasklet and dasd_sleep_on

15 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph...
Linus Torvalds [Thu, 13 May 2010 01:47:29 +0000 (18:47 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
  ceph: preserve seq # on requeued messages after transient transport errors
  ceph: fix cap removal races
  ceph: zero unused message header, footer fields
  ceph: fix locking for waking session requests after reconnect
  ceph: resubmit requests on pg mapping change (not just primary change)
  ceph: fix open file counting on snapped inodes when mds returns no caps
  ceph: unregister osd request on failure
  ceph: don't use writeback_control in writepages completion
  ceph: unregister bdi before kill_anon_super releases device name

15 years agoMerge commit 'kumar/merge' into merge
Benjamin Herrenschmidt [Thu, 13 May 2010 01:42:40 +0000 (11:42 +1000)]
Merge commit 'kumar/merge' into merge

15 years agoRevert "PCI: update bridge resources to get more big ranges in PCI assign unssigned"
Linus Torvalds [Thu, 13 May 2010 01:39:45 +0000 (18:39 -0700)]
Revert "PCI: update bridge resources to get more big ranges in PCI assign unssigned"

This reverts commit 977d17bb1749517b353874ccdc9b85abc7a58c2a, because it
can cause problems with some devices not getting any resources at all
when the resource tree is re-allocated.

For an example of this, see

https://bugzilla.kernel.org/show_bug.cgi?id=15960
(originally https://bugtrack.alsa-project.org/alsa-bug/view.php?id=4982)
(lkml thread: http://lkml.org/lkml/2010/4/19/20)

where Peter Henriksson reported his Xonar DX sound card gone, because
the IO port region was no longer allocated.

Reported-bisected-and-tested-by: Peter Henriksson <peter.henriksson@gmail.com>
Requested-by: Andrew Morton <akpm@linux-foundation.org>
Requested-by: Clemens Ladisch <clemens@ladisch.de>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoCacheFiles: Fix error handling in cachefiles_determine_cache_security()
David Howells [Wed, 12 May 2010 14:34:03 +0000 (15:34 +0100)]
CacheFiles: Fix error handling in cachefiles_determine_cache_security()

cachefiles_determine_cache_security() is expected to return with a
security override in place.  However, if set_create_files_as() fails, we
fail to do this.  In this case, we should just reinstate the security
override that was set by the caller.

Furthermore, if set_create_files_as() fails, we should dispose of the
new credentials we were in the process of creating.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agorwsem: Test for no active locks in __rwsem_do_wake undo code
Michel Lespinasse [Wed, 12 May 2010 10:38:45 +0000 (11:38 +0100)]
rwsem: Test for no active locks in __rwsem_do_wake undo code

If there are no active threasd using a semaphore, it is always correct
to unqueue blocked threads.  This seems to be what was intended in the
undo code.

What was done instead, was to look for a sem count of zero - this is an
impossible situation, given that at least one thread is known to be
queued on the semaphore.  The code might be correct as written, but it's
hard to reason about and it's not what was intended (otherwise the goto
out would have been unconditional).

Go for checking the active count - the alternative is not worth the
headache.

Signed-off-by: Michel Lespinasse <walken@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agovhost: fix barrier pairing
Michael S. Tsirkin [Tue, 11 May 2010 16:44:17 +0000 (19:44 +0300)]
vhost: fix barrier pairing

According to memory-barriers.txt, an smp memory barrier in guest
should always be paired with an smp memory barrier in host,
and I quote "a lack of appropriate pairing is almost certainly an
error". In case of vhost, failure to flush out used index
update before looking at the interrupt disable flag
could result in missed interrupts, resulting in
networking hang under stress.

This might happen when flags read bypasses used index write.
So we see interrupts disabled and do not interrupt, at the
same time guest writes flags value to enable interrupt,
reads an old used index value, thinks that
used ring is empty and waits for interrupt.

Note: the barrier we pair with here is in
drivers/virtio/virtio_ring.c, function
vring_enable_cb.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Juan Quintela <quintela@redhat.com>
15 years agoInotify: undefined reference to `anon_inode_getfd'
Russell King [Sun, 18 Apr 2010 20:25:11 +0000 (21:25 +0100)]
Inotify: undefined reference to `anon_inode_getfd'

Fix:

fs/built-in.o: In function `sys_inotify_init1':
summary.c:(.text+0x347a4): undefined reference to `anon_inode_getfd'

found by kautobuild with arms bcmring_defconfig, which ends up with
INOTIFY_USER enabled (through the 'default y') but leaves ANON_INODES
unset.  However, inotify_user.c uses anon_inode_getfd().

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Eric Paris <eparis@redhat.com>
15 years agoALSA: ice1724 - Fix ESI Maya44 capture source control
Takashi Iwai [Wed, 12 May 2010 14:43:32 +0000 (16:43 +0200)]
ALSA: ice1724 - Fix ESI Maya44 capture source control

The capture source control of maya44 was wrongly coded with the bit
shift instead of the bit mask.  Also, the slot for line-in was
wrongly assigned (slot 5 instead of 4).

Reported-by: Alex Chernyshoff <alexdsp@gmail.com>
Cc: <stable@kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
15 years agoARM: 6126/1: ARM mpcore_wdt: fix build failure and other fixes
Srinidhi Kasagar [Wed, 12 May 2010 04:53:26 +0000 (05:53 +0100)]
ARM: 6126/1: ARM mpcore_wdt: fix build failure and other fixes

This fixes the build failures seen when building mpcore_wdt and it
also removes the nonexistent ARM_MPCORE_PLATFORM dependency, instead
make it dependent on HAVE_ARM_TWD.

Also this fixes spinlock usage appropriately.

Signed-off-by: srinidhi kasagar <srinidhi.kasagar@stericsson.com>
Acked-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
15 years agoARM: 6125/1: ARM TWD: move TWD registers to common header
Srinidhi Kasagar [Wed, 12 May 2010 04:52:18 +0000 (05:52 +0100)]
ARM: 6125/1: ARM TWD: move TWD registers to common header

This moves the TWD register set of MPcore to a common
existing file so that watchdog driver can access it

Signed-off-by: srinidhi kasagar <srinidhi.kasagar@stericsson.com>
Acked-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
15 years agoALSA: pcm - Use pgprot_noncached() for MIPS non-coherent archs
Takashi Iwai [Wed, 12 May 2010 08:32:42 +0000 (10:32 +0200)]
ALSA: pcm - Use pgprot_noncached() for MIPS non-coherent archs

MIPS non-coherent archs need the noncached pgprot in mmap of PCM buffers.
But, since the coherency needs to be checked dynamically via
plat_device_is_coherent(), we need an ugly check dependent on MIPS
in ALSA core code.

This should be cleaned up in MIPS arch side (e.g. creating
dma_mmap_coherent()) in near future.

Tested-by: Wu Zhangjin <wuzhangjin@gmail.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
15 years agoALSA: virtuoso: fix Xonar D1/DX front panel microphone
Clemens Ladisch [Tue, 11 May 2010 14:34:39 +0000 (16:34 +0200)]
ALSA: virtuoso: fix Xonar D1/DX front panel microphone

Commit 65c3ac885ce9852852b895a4a62212f62cb5f2e9 in 2.6.33 accidentally
left out the initialization of the AC97 codec FMIC2MIC bit, which broke
recording from the front panel microphone.

Signed-off-by: Clemens Ladisch <clemens@ladisch.de>
Cc: <stable@kernel.org>
Signed-off-by: Jaroslav Kysela <perex@perex.cz>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
15 years agoALSA: hda - Add hp-dv4 model for IDT 92HD71bx
Takashi Iwai [Wed, 12 May 2010 08:16:20 +0000 (10:16 +0200)]
ALSA: hda - Add hp-dv4 model for IDT 92HD71bx

It turned out that HP dv series have inconsistent the mute-LED GPIO
mapping among various models.  dv4/7 seem to use GPIO 0 while dv 5/6
seem to use GPIO 3.  The previous commit
  26ebe0a28986f4845b2c5bea43ac5cc0b9f27f0a
  ALSA: hda - Fix mute-LED GPIO pin for HP dv series
breaks dv5/6.

This patch adds the new quirk model, hp-dv4, to handle HP dv4/7
separately from HP dv5/6.

Tested-by: Kunal Gangakhedkar <kunal.gangakhedkar@gmail.com> (for dv6-1110ax)
Acked-by: Kunal Gangakhedkar <kunal.gangakhedkar@gmail.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
15 years ago[S390] correct address of _stext with CONFIG_SHARED_KERNEL=y
Martin Schwidefsky [Wed, 12 May 2010 07:32:13 +0000 (09:32 +0200)]
[S390] correct address of _stext with CONFIG_SHARED_KERNEL=y

As of git commit 1844c9bc0b2fed3023551c1affe033ab38e90b9a head64.S/head31.S
are not included in head.S anymore but build as an extra object. This breaks
shared kernel support because the .org statement in head64.S/head31.S for
CONFIG_SHARED_KERNEL=y will have a different effect. The end address of the
head.text section in head.o will be added to the .org value, to compensate
for this subtract 0x11000 to get the required value of 0x100000 again.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
15 years ago[S390] ptrace: fix return value of do_syscall_trace_enter()
Gerald Schaefer [Wed, 12 May 2010 07:32:12 +0000 (09:32 +0200)]
[S390] ptrace: fix return value of do_syscall_trace_enter()

strace may change the system call number, so regs->gprs[2] must not
be read before tracehook_report_syscall_entry(). This fixes a bug
where "strace -f" will hang after a vfork().

Cc: <stable@kernel.org>
Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
15 years ago[S390] dasd: fix race between tasklet and dasd_sleep_on
Stefan Weinhuber [Wed, 12 May 2010 07:32:11 +0000 (09:32 +0200)]
[S390] dasd: fix race between tasklet and dasd_sleep_on

The various dasd_sleep_on functions use a global wait queue when
waiting for a cqr. The wait condition checks the status and devlist
fields of the cqr to determine if it is safe to continue. This
evaluation may return true, although the tasklet has not finished
processing of the cqr and the callback function has not been called
yet. When the callback is finally called, the data in the cqr may
already be invalid. The sleep_on wait condition needs a safe way to
determine if the tasklet has finished processing. Use the
callback_data field of the cqr to store a token, which is set by
the callback function itself.

Cc: <stable@kernel.org>
Signed-off-by: Stefan Weinhuber <wein@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
15 years agopowerpc/perf_event: Fix oops due to perf_event_do_pending call
Paul Mackerras [Tue, 13 Apr 2010 20:46:04 +0000 (20:46 +0000)]
powerpc/perf_event: Fix oops due to perf_event_do_pending call

Anton Blanchard found that large POWER systems would occasionally
crash in the exception exit path when profiling with perf_events.
The symptom was that an interrupt would occur late in the exit path
when the MSR[RI] (recoverable interrupt) bit was clear.  Interrupts
should be hard-disabled at this point but they were enabled.  Because
the interrupt was not recoverable the system panicked.

The reason is that the exception exit path was calling
perf_event_do_pending after hard-disabling interrupts, and
perf_event_do_pending will re-enable interrupts.

The simplest and cleanest fix for this is to use the same mechanism
that 32-bit powerpc does, namely to cause a self-IPI by setting the
decrementer to 1.  This means we can remove the tests in the exception
exit path and raw_local_irq_restore.

This also makes sure that the call to perf_event_do_pending from
timer_interrupt() happens within irq_enter/irq_exit.  (Note that
calling perf_event_do_pending from timer_interrupt does not mean that
there is a possible 1/HZ latency; setting the decrementer to 1 ensures
that the timer interrupt will happen immediately, i.e. within one
timebase tick, which is a few nanoseconds or 10s of nanoseconds.)

Signed-off-by: Paul Mackerras <paulus@samba.org>
Cc: stable@kernel.org
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
15 years agoceph: preserve seq # on requeued messages after transient transport errors
Sage Weil [Wed, 12 May 2010 04:20:38 +0000 (21:20 -0700)]
ceph: preserve seq # on requeued messages after transient transport errors

If the tcp connection drops and we reconnect to reestablish a stateful
session (with the mds), we need to resend previously sent (and possibly
received) messages with the _same_ seq # so that they can be dropped on
the other end if needed.  Only assign a new seq once after the message is
queued.

Signed-off-by: Sage Weil <sage@newdream.net>
15 years agoceph: fix cap removal races
Sage Weil [Wed, 12 May 2010 03:56:31 +0000 (20:56 -0700)]
ceph: fix cap removal races

The iterate_session_caps helper traverses the session caps list and tries
to grab an inode reference.  However, the __ceph_remove_cap was clearing
the inode backpointer _before_ removing itself from the session list,
causing a null pointer dereference.

Clear cap->ci under protection of s_cap_lock to avoid the race, and to
tightly couple the list and backpointer state.  Use a local flag to
indicate whether we are releasing the cap, as cap->session may be modified
by a racing thread in iterate_session_caps.

Signed-off-by: Sage Weil <sage@newdream.net>
15 years agoMerge branch 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelv...
Linus Torvalds [Wed, 12 May 2010 00:38:04 +0000 (17:38 -0700)]
Merge branch 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging

* 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging:
  hwmon: (applesmc) Correct sysfs fan error handling
  hwmon: (asc7621) Bug fixes

15 years agoMerge branch 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Wed, 12 May 2010 00:37:24 +0000 (17:37 -0700)]
Merge branch 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  kprobes/x86: Fix removed int3 checking order
  perf: Fix static strings treated like dynamic ones

15 years agodrivers/gpu/drm/i915/i915_irq.c:i915_error_object_create(): use correct kmap-atomic...
Andrew Morton [Tue, 11 May 2010 21:07:05 +0000 (14:07 -0700)]
drivers/gpu/drm/i915/i915_irq.c:i915_error_object_create(): use correct kmap-atomic slot

i915_error_object_create() is called from the timer interrupt and hence
can corrupt the KM_USER0 slot.  Use KM_IRQ0 instead.

Reported-by: Jaswinder Singh Rajput <jaswinderlinux@gmail.com>
Tested-by: Jaswinder Singh Rajput <jaswinderlinux@gmail.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Dave Airlie <airlied@linux.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agohp_accel: fix race in device removal
Oliver Neukum [Tue, 11 May 2010 21:07:03 +0000 (14:07 -0700)]
hp_accel: fix race in device removal

The work queue has to be flushed after the device has been made
inaccessible.  The patch closes a window during which a work queue might
remain active after the device is removed and would then lead to ACPI
calls with undefined behavior.

Signed-off-by: Oliver Neukum <oneukum@suse.de>
Acked-by: Eric Piel <eric.piel@tremplin-utc.net>
Acked-by: Pavel Machek <pavel@ucw.cz>
Cc: Pavel Herrmann <morpheus.ibis@gmail.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agomqueue: fix kernel BUG caused by double free() on mq_open()
André Goddard Rosa [Tue, 11 May 2010 21:07:03 +0000 (14:07 -0700)]
mqueue: fix kernel BUG caused by double free() on mq_open()

In case of aborting because we reach the maximum amount of memory which
can be allocated to message queues per user (RLIMIT_MSGQUEUE), we would
try to free the message area twice when bailing out: first by the error
handling code itself, and then later when cleaning up the inode through
delete_inode().

Signed-off-by: André Goddard Rosa <andre.goddard@gmail.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agofbdev: bfin-t350mcqb-fb: fix fbmem allocation with blanking lines
Michael Hennerich [Tue, 11 May 2010 21:07:00 +0000 (14:07 -0700)]
fbdev: bfin-t350mcqb-fb: fix fbmem allocation with blanking lines

The current allocation does not include the memory required for blanking
lines.  So avoid memory corruption when multiple devices are using the DMA
memory near each other.

Signed-off-by: Michael Hennerich <michael.hennerich@analog.com>
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agomemcg: fix css_is_ancestor() RCU locking
KAMEZAWA Hiroyuki [Tue, 11 May 2010 21:06:59 +0000 (14:06 -0700)]
memcg: fix css_is_ancestor() RCU locking

Some callers (in memcontrol.c) calls css_is_ancestor() without
rcu_read_lock.  Because css_is_ancestor() has to access RCU protected
data, it should be under rcu_read_lock().

This makes css_is_ancestor() itself does safe access to RCU protected
area.  (At least, "root" can have refcnt==0 if it's not an ancestor of
"child".  So, we need rcu_read_lock().)

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Cc: Balbir Singh <balbir@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agomemcg: fix css_id() RCU locking for real
KAMEZAWA Hiroyuki [Tue, 11 May 2010 21:06:58 +0000 (14:06 -0700)]
memcg: fix css_id() RCU locking for real

Commit ad4ba375373937817404fd92239ef4cadbded23b ("memcg: css_id() must be
called under rcu_read_lock()") modifies memcontol.c for fixing RCU check
message.  But Andrew Morton pointed out that the fix doesn't seems sane
and it was just for hidining lockdep messages.

This is a patch for do proper things.  Checking again, all places,
accessing without rcu_read_lock, that commit fixies was intentional....
all callers of css_id() has reference count on it.  So, it's not necessary
to be under rcu_read_lock().

Considering again, we can use rcu_dereference_check for css_id().  We know
css->id is valid if css->refcnt > 0.  (css->id never changes and freed
after css->refcnt going to be 0.)

This patch makes use of rcu_dereference_check() in css_id/depth and remove
unnecessary rcu-read-lock added by the commit.

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Cc: Balbir Singh <balbir@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agobsdacct: use del_timer_sync() in acct_exit_ns()
Vitaliy Gusev [Tue, 11 May 2010 21:06:56 +0000 (14:06 -0700)]
bsdacct: use del_timer_sync() in acct_exit_ns()

acct_exit_ns --> acct_file_reopen deletes timer without check timer
execution on other CPUs.  So acct_timeout() can change an unmapped memory.

Signed-off-by: Vitaliy Gusev <vgusev@openvz.org>
Cc: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agormap: remove anon_vma check in page_address_in_vma()
Naoya Horiguchi [Tue, 11 May 2010 21:06:55 +0000 (14:06 -0700)]
rmap: remove anon_vma check in page_address_in_vma()

Currently page_address_in_vma() compares vma->anon_vma and
page_anon_vma(page) for parameter check, but in 2.6.34 a vma can have
multiple anon_vmas with anon_vma_chain, so current check does not work.
(For anonymous page shared by multiple processes, some verified (page,vma)
pairs return -EFAULT wrongly.)

We can go to checking all anon_vmas in the "same_vma" chain, but it needs
to meet lock requirement.  Instead, we can remove anon_vma check safely
because page_address_in_vma() assumes that page and vma are already
checked to belong to the identical process.

Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agohugetlbfs: kill applications that use MAP_NORESERVE with SIGBUS instead of OOM-killer
Mel Gorman [Tue, 11 May 2010 21:06:53 +0000 (14:06 -0700)]
hugetlbfs: kill applications that use MAP_NORESERVE with SIGBUS instead of OOM-killer

Ordinarily, application using hugetlbfs will create mappings with
reserves.  For shared mappings, these pages are reserved before mmap()
returns success and for private mappings, the caller process is guaranteed
and a child process that cannot get the pages gets killed with sigbus.

An application that uses MAP_NORESERVE gets no reservations and mmap()
will always succeed at the risk the page will not be available at fault
time.  This might be used for example on very large sparse mappings where
the developer is confident the necessary huge pages exist to satisfy all
faults even though the whole mapping cannot be backed by huge pages.
Unfortunately, if an allocation does fail, VM_FAULT_OOM is returned to the
fault handler which proceeds to trigger the OOM-killer.  This is
unhelpful.

Even without hugetlbfs mounted, a user using mmap() can trivially trigger
the OOM-killer because VM_FAULT_OOM is returned (will provide example
program if desired - it's a whopping 24 lines long).  It could be
considered a DOS available to an unprivileged user.

This patch alters hugetlbfs to kill a process that uses MAP_NORESERVE
where huge pages were not available with SIGBUS instead of triggering the
OOM killer.

This change affects hugetlb_cow() as well.  I feel there is a failure case
in there, but I didn't create one.  It would need a fairly specific target
in terms of the faulting application and the hugepage pool size.  The
hugetlb_no_page() path is much easier to hit but both might as well be
closed.

Signed-off-by: Mel Gorman <mel@csn.ul.ie>
Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>