users/hch/misc.git
14 years ago  ARM: footbridge: convert to clockevents/clocksource
Russell King [Fri, 28 Jan 2011 21:00:39 +0000 (21:00 +0000)]
ARM: footbridge: convert to clockevents/clocksource

The Footbridge platforms have some reasonable timers in the host bridge,
which we use for most footbridge-based platforms.  However, NetWinders
clock these using a spread-spectrum clock, which makes them too unstable
for timekeeping.  So we have to rely on the PIT.

Convert both Footbridge timers and PIT timers to use the clocksource
and clockevent infrastructure.  Tested on Netwinder.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
14 years ago  ARM: footbridge: fix debug macros
Russell King [Fri, 28 Jan 2011 20:57:57 +0000 (20:57 +0000)]
ARM: footbridge: fix debug macros

0ea1293 (arm: return both physical and virtual addresses from addruart)
changed the way the 'addruart' macro worked, making it return both the
virt and phys addresses.  Unfortunately, for footbridge, these were
reversed.  Fix that.  Tested on Netwinder.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
14 years ago  ARM: mmci: round down the bytes transferred on error
Russell King [Sun, 30 Jan 2011 21:06:53 +0000 (21:06 +0000)]
ARM: mmci: round down the bytes transferred on error

We should not report incomplete blocks on error.  Return the number of
bytes successfully transferred, rounded down to the nearest block.
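
The rounding described above can be sketched in plain C. This is only an illustration: the helper name `round_down_to_block` and its parameters are hypothetical and are not taken from the mmci driver.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: given the number of bytes that made it across
 * before the error, report only the bytes that form complete blocks.
 */
static size_t round_down_to_block(size_t bytes_done, size_t blksz)
{
	assert(blksz > 0);	/* a zero block size is meaningless */
	return bytes_done - (bytes_done % blksz);
}
```

With 512-byte blocks, a transfer that failed after 1300 bytes would be reported as 1024 bytes, so the incomplete third block is never counted.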

Acked-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
14 years ago  ARM: mmci: complete the transaction on error
Russell King [Sun, 30 Jan 2011 21:03:50 +0000 (21:03 +0000)]
ARM: mmci: complete the transaction on error

When we encounter an error, make sure we complete the transaction,
otherwise we'll leave the request dangling.

Acked-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
14 years ago  ARM: 6642/1: mmci: calculate remaining bytes at error correctly
Linus Walleij [Thu, 27 Jan 2011 16:44:34 +0000 (17:44 +0100)]
ARM: 6642/1: mmci: calculate remaining bytes at error correctly

The MMCIDATACNT register contains the number of bytes left at error,
not the number of words, so lose the << 2 thing.  Further, if CRC
fails on the first block, we may end up with a negative number
of transferred bytes, which is not good, and the formula was in
the wrong order.
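
A minimal sketch of the corrected arithmetic, assuming the counter value is already in bytes as stated above. The function name and parameters are hypothetical, and the real driver code may differ in detail.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: bytes transferred so far, computed as
 * total minus remaining (in that order) and never allowed to
 * go below zero.
 */
static uint32_t bytes_transferred(uint32_t total, uint32_t remaining)
{
	assert(total > 0);		/* an empty transfer has nothing to report */
	if (remaining >= total)
		return 0;		/* nothing usable was transferred */
	return total - remaining;	/* remaining is bytes, so no "<< 2" */
}
```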

Signed-off-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
14 years ago  Merge branch 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Thu, 27 Jan 2011 20:45:04 +0000 (06:45 +1000)]
Merge branch 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  sched: Use rq->clock_task instead of rq->clock for correctly maintaining load averages
  sched: Fix/remove redundant cfs_rq checks
  sched: Fix sign under-flows in wake_affine

14 years ago  Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Thu, 27 Jan 2011 20:43:41 +0000 (06:43 +1000)]
Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  percpu, x86: Fix percpu_xchg_op()
  x86: Remove left over system_64.h
  x86-64: Don't use pointer to out-of-scope variable in dump_trace()

14 years ago  Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc
Linus Torvalds [Thu, 27 Jan 2011 20:39:08 +0000 (06:39 +1000)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc:
  mmc: bfin_sdh: fix alloc size for private data
  mmc: sdhci-s3c: add platform_8bit_width() hook
  mmc: jz4740: don't treat NULL clk as an error
  mmc: mmci: don't read command response when invalid
  mmc: ushc: Remove duplicate include of usb.h

14 years ago  Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
Linus Torvalds [Thu, 27 Jan 2011 20:35:51 +0000 (06:35 +1000)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (43 commits)
  bnx2: Eliminate AER error messages on systems not supporting it
  cnic: Fix big endian bug
  xfrm6: Don't forget to propagate peer into ipsec route.
  tg3: Use new VLAN code
  bonding: update documentation - alternate configuration.
  TCP: fix a bug that triggers large number of TCP RST by mistake
  MAINTAINERS: remove Reinette Chatre as iwlwifi maintainer
  rt2x00: add device id for windy31 usb device
  mac80211: fix a crash in ieee80211_beacon_get_tim on change_interface
  ipv6: Revert 'administrative down' address handling changes.
  textsearch: doc - fix spelling in lib/textsearch.c.
  USB NET KL5KUSB101: Fix mem leak in error path of kaweth_download_firmware()
  pch_gbe: don't use flush_scheduled_work()
  bnx2: Always set ETH_FLAG_TXVLAN
  net: clear heap allocation for ethtool_get_regs()
  ipv6: Always clone offlink routes.
  dcbnl: make get_app handling symmetric for IEEE and CEE DCBx
  tcp: fix bug in listening_get_next()
  inetpeer: Use correct AVL tree base pointer in inet_getpeer().
  GRO: fix merging a paged skb after non-paged skbs
  ...

14 years ago  Merge branch 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groec...
Linus Torvalds [Thu, 27 Jan 2011 20:34:19 +0000 (06:34 +1000)]
Merge branch 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/staging

* 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/staging:
  hwmon: (lis3) turn down the no IRQ message
  hwmon: (asus_atk0110) Override interface detection on Sabertooth X58
  hwmon: (applesmc) Properly initialize lockdep attributes

14 years ago  Merge branch 'pm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspe...
Linus Torvalds [Thu, 27 Jan 2011 20:32:49 +0000 (06:32 +1000)]
Merge branch 'pm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6

* 'pm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6:
  PM / Runtime: Don't enable interrupts while running in_interrupt

14 years ago  Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/egtvedt...
Linus Torvalds [Thu, 27 Jan 2011 20:32:05 +0000 (06:32 +1000)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/egtvedt/avr32-2.6

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/egtvedt/avr32-2.6:
  avr32: add missing include causing undefined pgtable_page_* references

14 years ago  bnx2: Eliminate AER error messages on systems not supporting it
Michael Chan [Tue, 25 Jan 2011 22:14:51 +0000 (22:14 +0000)]
bnx2: Eliminate AER error messages on systems not supporting it

On PPC, for example, AER is not supported and we see an unnecessary AER
error message without this patch:

bnx2 0003:01:00.1: pci_cleanup_aer_uncorrect_error_status failed 0xfffffffb

Reported-by: Breno Leitao <leitao@linux.vnet.ibm.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years ago  cnic: Fix big endian bug
Michael Chan [Tue, 25 Jan 2011 22:14:50 +0000 (22:14 +0000)]
cnic: Fix big endian bug

The chip's page tables were not set up properly on big endian machines,
causing EEH errors on PPC machines.

Reported-by: Breno Leitao <leitao@linux.vnet.ibm.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years ago  xfrm6: Don't forget to propagate peer into ipsec route.
David S. Miller [Wed, 26 Jan 2011 21:41:03 +0000 (13:41 -0800)]
xfrm6: Don't forget to propagate peer into ipsec route.

Like ipv4, we have to propagate the ipv6 route peer into
the ipsec top-level route during instantiation.

Signed-off-by: David S. Miller <davem@davemloft.net>
14 years ago  tg3: Use new VLAN code
Matt Carlson [Wed, 26 Jan 2011 21:13:10 +0000 (13:13 -0800)]
tg3: Use new VLAN code

This patch pivots the tg3 driver to the new VLAN infrastructure.
All references to vlgrp have been removed.  The driver still attempts to
disable VLAN tag stripping if CONFIG_VLAN_8021Q or
CONFIG_VLAN_8021Q_MODULE is not defined.

Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Reviewed-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years ago  Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wirel...
David S. Miller [Wed, 26 Jan 2011 19:49:49 +0000 (11:49 -0800)]
Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6

14 years ago  avr32: add missing include causing undefined pgtable_page_* references
Hans-Christian Egtvedt [Mon, 24 Jan 2011 12:51:58 +0000 (13:51 +0100)]
avr32: add missing include causing undefined pgtable_page_* references

This patch adds the linux/mm.h header file to the AVR32 arch pgalloc.c
implementation to fix the undefined reference to pgtable_page_ctor() and
pgtable_page_dtor().

Signed-off-by: Hans-Christian Egtvedt <hans-christian.egtvedt@atmel.com>
14 years ago  sched: Use rq->clock_task instead of rq->clock for correctly maintaining load averages
Paul Turner [Sat, 22 Jan 2011 04:45:02 +0000 (20:45 -0800)]
sched: Use rq->clock_task instead of rq->clock for correctly maintaining load averages

The delta in clock_task is a more fair attribution of how much time a tg has
been contributing load to the current cpu.

While not really important it also means we're more in sync (by magnitude)
with respect to periodic updates (since __update_curr deltas are clock_task
based).

Signed-off-by: Paul Turner <pjt@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20110122044852.007092349@google.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
14 years ago  sched: Fix/remove redundant cfs_rq checks
Paul Turner [Sat, 22 Jan 2011 04:45:00 +0000 (20:45 -0800)]
sched: Fix/remove redundant cfs_rq checks

Since updates are against an entity's queuing cfs_rq it's not possible to
enter update_cfs_{shares,load} with a NULL cfs_rq.  (Indeed, update_cfs_load
would crash prior to the check if we did anyway since load is examined
during the initializers).

Also, in the update_cfs_load case there's no point
in maintaining averages for rq->cfs_rq since we don't perform shares
distribution at that level -- NULL check is replaced accordingly.

Thanks to Dan Carpenter for pointing out the dereference before NULL check.

Signed-off-by: Paul Turner <pjt@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20110122044851.825284940@google.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
14 years ago  sched: Fix sign under-flows in wake_affine
Paul Turner [Sat, 22 Jan 2011 04:44:59 +0000 (20:44 -0800)]
sched: Fix sign under-flows in wake_affine

While care is taken around the zero-point in effective_load to not exceed
the instantaneous rq->weight, it's still possible (e.g. using wake_idx != 0)
for (load + effective_load) to underflow.

In this case, comparing the unsigned values can result in incorrect balanced
decisions.
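
The underflow can be illustrated with a small standalone sketch. The function names and the clamp-to-zero policy are assumptions made for illustration and do not reproduce the scheduler code.

```c
#include <assert.h>
#include <stdbool.h>

/* Unsigned sum: a negative delta larger than load wraps to a huge value. */
static bool under_limit_unsigned(unsigned long load, long delta,
				 unsigned long limit)
{
	return load + (unsigned long)delta <= limit;
}

/* Signed sum: a negative result stays negative and is clamped to zero. */
static bool under_limit_signed(long load, long delta, long limit)
{
	long sum;

	assert(load >= 0);	/* a load on its own is never negative */
	sum = load + delta;
	if (sum < 0)
		sum = 0;
	return sum <= limit;
}
```

With a load of 10 and a delta of -20, the unsigned version sees an enormous sum and reports the limit as exceeded, while the signed version treats the sum as zero.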

Signed-off-by: Paul Turner <pjt@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20110122044851.734245014@google.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
14 years ago  percpu, x86: Fix percpu_xchg_op()
Eric Dumazet [Tue, 25 Jan 2011 16:31:54 +0000 (17:31 +0100)]
percpu, x86: Fix percpu_xchg_op()

These recent percpu commits:

  2485b6464cf8: x86,percpu: Move out of place 64 bit ops into X86_64 section
  8270137a0d50: cpuops: Use cmpxchg for xchg to avoid lock semantics

Caused this 'perf top' crash:

 Kernel panic - not syncing: Fatal exception in interrupt
 Pid: 0, comm: swapper Tainted: G     D
 2.6.38-rc2-00181-gef71723 #413 Call Trace: <IRQ> [<ffffffff810465b5>]
    ? panic
    ? kmsg_dump
    ? kmsg_dump
    ? oops_end
    ? no_context
    ? __bad_area_nosemaphore
    ? perf_output_begin
    ? bad_area_nosemaphore
    ? do_page_fault
    ? __task_pid_nr_ns
    ? perf_event_tid
    ? __perf_event_header__init_id
    ? validate_chain
    ? perf_output_sample
    ? trace_hardirqs_off
    ? page_fault
    ? irq_work_run
    ? update_process_times
    ? tick_sched_timer
    ? tick_sched_timer
    ? __run_hrtimer
    ? hrtimer_interrupt
    ? account_system_vtime
    ? smp_apic_timer_interrupt
    ? apic_timer_interrupt
 ...

Looking at assembly code, I found:

list = this_cpu_xchg(irq_work_list, NULL);

gives this wrong code : (gcc-4.1.2 cross compiler)

ffffffff810bc45e:
mov    %gs:0xead0,%rax
cmpxchg %rax,%gs:0xead0
jne    ffffffff810bc45e <irq_work_run+0x3e>
test   %rax,%rax
je     ffffffff810bc4aa <irq_work_run+0x8a>

Tell gcc we dirty the eax/rax register in percpu_xchg_op().

The compiler must use another register to store pxo_new__.

We also don't need to reload the percpu value after a jump,
since a 'failed' cmpxchg already updated eax/rax.

Wrong generated code was :
xor     %rax,%rax   /* load 0 into %rax */
1: mov     %gs:0xead0,%rax
cmpxchg %rax,%gs:0xead0
jne     1b
test    %rax,%rax

After patch :

xor     %rdx,%rdx   /* load 0 into %rdx */
mov     %gs:0xead0,%rax
1: cmpxchg %rdx,%gs:0xead0
jne     1b
test    %rax,%rax
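
The same loop shape can be sketched in portable C11 atomics. This is not the kernel's percpu macro: the function name is hypothetical, and `atomic_compare_exchange_weak` stands in for the cmpxchg instruction. Like a failed cmpxchg updating eax/rax, a failed compare-exchange writes the current value back into `old`, so the value is loaded once and never reloaded inside the loop.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Exchange built from compare-and-swap, without a reload on failure. */
static long xchg_via_cmpxchg(_Atomic long *slot, long new_value)
{
	long old;

	assert(slot != NULL);
	old = atomic_load(slot);	/* initial guess, loaded once */

	/*
	 * new_value lives in its own variable, separate from old, just
	 * as the fixed code keeps the new value out of eax/rax.
	 */
	while (!atomic_compare_exchange_weak(slot, &old, new_value))
		;			/* old now holds the current value */

	return old;
}
```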

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Tejun Heo <tj@kernel.org>
LKML-Reference: <1295973114.3588.312.camel@edumazet-laptop>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
14 years ago  x86: Remove left over system_64.h
Yinghai Lu [Tue, 25 Jan 2011 01:13:53 +0000 (17:13 -0800)]
x86: Remove left over system_64.h

Left-over from the x86 merge ...

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <4D3E23D1.7010405@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
14 years ago  Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Linus Torvalds [Wed, 26 Jan 2011 06:31:44 +0000 (16:31 +1000)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Input: wacom - pass touch resolution to clients through input_absinfo
  Input: wacom - add 2 Bamboo Pen and touch models
  Input: sysrq - ensure sysrq_enabled and __sysrq_enabled are consistent
  Input: sparse-keymap - fix KEY_VSW handling in sparse_keymap_setup
  Input: tegra-kbc - add tegra keyboard driver
  Input: gpio_keys - switch to using request_any_context_irq
  Input: serio - allow registered drivers to get status flag
  Input: ct82710c - return proper error code for ct82c710_open
  Input: bu21013_ts - added regulator support
  Input: bu21013_ts - remove duplicate resolution parameters
  Input: tnetv107x-ts - don't treat NULL clk as an error
  Input: tnetv107x-keypad - don't treat NULL clk as an error

Fix up trivial conflicts in drivers/input/keyboard/Makefile due to
additions of tc3589x/Tegra drivers

14 years ago  mmc: bfin_sdh: fix alloc size for private data
Sonic Zhang [Wed, 12 Jan 2011 03:39:35 +0000 (22:39 -0500)]
mmc: bfin_sdh: fix alloc size for private data

The bfin_sdh driver allocates the wrong size for the private data
in the mmc_host.  The first parameter of mmc_alloc_host should be
the size of the local driver struct rather than the common mmc_host.
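
The allocation pattern can be modelled in userspace as follows. All names here (`toy_host`, `toy_driver_state`, `toy_alloc_host`, `toy_host_priv`) are hypothetical stand-ins, not the MMC core API; the point is only that the size argument describes the driver's private data that follows the common structure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-in for the common host structure. */
struct toy_host {
	int index;
	unsigned long private[];	/* driver-private data lives here */
};

/* Toy stand-in for a driver's own state. */
struct toy_driver_state {
	void *regs;
	int irq;
};

/* Allocates the common structure plus `extra` bytes of private data. */
static struct toy_host *toy_alloc_host(size_t extra)
{
	return calloc(1, sizeof(struct toy_host) + extra);
}

/* Returns the private area that follows the common structure. */
static void *toy_host_priv(struct toy_host *host)
{
	assert(host != NULL);
	return host->private;
}
```

A caller in this model would pass `sizeof(struct toy_driver_state)`, not `sizeof(struct toy_host)`, so that the private area is exactly large enough for the driver's state.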

Signed-off-by: Sonic Zhang <sonic.zhang@analog.com>
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Cc: <stable@kernel.org>
Signed-off-by: Chris Ball <cjb@laptop.org>
14 years ago  mmc: sdhci-s3c: add platform_8bit_width() hook
Jaehoon Chung [Wed, 12 Jan 2011 02:59:12 +0000 (11:59 +0900)]
mmc: sdhci-s3c: add platform_8bit_width() hook

The controller supports 8-bit width but is not a v3 controller,
so we need platform_8bit_width() to support 8-bit bus width.
We also need MMC_CAP_8_BIT_DATA, so we add it in platdata.

This gets 8-bit support working again on s3c, after we previously
disabled 8-bit by default on non-v3 controllers.

Signed-off-by: Jaehoon Chung <jh80.chung@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
14 years ago  mmc: jz4740: don't treat NULL clk as an error
Jamie Iles [Tue, 11 Jan 2011 12:43:50 +0000 (12:43 +0000)]
mmc: jz4740: don't treat NULL clk as an error

clk_get() returns a struct clk cookie to the driver and some platforms
may return NULL if they only support a single clock.  clk_get() has only
failed if it returns an ERR_PTR() encoded pointer.
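
The distinction can be shown with a userspace model of the error-pointer encoding. The lower-case helpers below are hypothetical and only mirror the idea behind the kernel's ERR_PTR() and IS_ERR(): small negative errno values are stored in the top of the address range, so NULL is not an error pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Encode a negative errno value as a pointer. */
static void *err_ptr(long error)
{
	assert(error < 0 && error >= -MAX_ERRNO);
	return (void *)(intptr_t)error;
}

/* True only for pointers in the last MAX_ERRNO addresses. */
static bool is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}
```

A driver following the commit's reasoning would test the returned cookie with the error-pointer check rather than comparing it against NULL.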

Signed-off-by: Jamie Iles <jamie@jamieiles.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
14 years ago  mmc: mmci: don't read command response when invalid
Russell King - ARM Linux [Tue, 11 Jan 2011 16:35:56 +0000 (16:35 +0000)]
mmc: mmci: don't read command response when invalid

Don't read the command response from the registers when either the
command timed out (because there was no response from the card) or
the checksum on the response was invalid.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Chris Ball <cjb@laptop.org>
14 years ago  mmc: ushc: Remove duplicate include of usb.h
Jesper Juhl [Mon, 10 Jan 2011 20:56:08 +0000 (21:56 +0100)]
mmc: ushc: Remove duplicate include of usb.h

Including usb.h once is enough in drivers/mmc/host/ushc.c
This removes the duplicate.

Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: Chris Ball <cjb@laptop.org>
14 years ago  Input: wacom - pass touch resolution to clients through input_absinfo
Ping Cheng [Wed, 26 Jan 2011 02:03:13 +0000 (18:03 -0800)]
Input: wacom - pass touch resolution to clients through input_absinfo

Also remove fake ABS_RX/ABS_RY "axes" that were used to report physical
dimensions now that we have a better way.

Signed-off-by: Ping Cheng <pingc@wacom.com>
Reviewed-by: Henrik Rydberg <rydberg@euromail.se>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
14 years ago  console: rename acquire/release_console_sem() to console_lock/unlock()
Torben Hohn [Tue, 25 Jan 2011 23:07:35 +0000 (15:07 -0800)]
console: rename acquire/release_console_sem() to console_lock/unlock()

The -rt patches change the console_semaphore to console_mutex.  As a
result, a quite large chunk of the patches changes all
acquire/release_console_sem() to acquire/release_console_mutex().

This commit makes things use more neutral function names which don't
imply anything about the underlying lock.

The only real change is the return value of console_trylock which is
inverted from try_acquire_console_sem()

This patch also paves the way to switching console_sem from a semaphore to
a mutex.
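
The inverted return convention can be sketched with a toy lock. The names below are hypothetical; the sketch assumes the old function reported success as 0 and failure as nonzero, while the new one returns 1 on success as noted in this commit.

```c
#include <assert.h>
#include <stdbool.h>

static bool lock_held;	/* toy stand-in for the console lock */

/* Old convention (assumed): 0 means the lock was acquired. */
static int old_style_trylock(void)
{
	if (lock_held)
		return 1;
	lock_held = true;
	return 0;
}

/* New convention: 1 means the lock was acquired, 0 means it was not. */
static int new_style_trylock(void)
{
	return !old_style_trylock();
}

static void toy_unlock(void)
{
	assert(lock_held);	/* unlocking a free lock is a caller bug */
	lock_held = false;
}
```

Callers converted from the old name therefore have to flip their test: `if (!old_style_trylock())` becomes `if (new_style_trylock())`.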

[akpm@linux-foundation.org: coding-style fixes]
[akpm@linux-foundation.org: make console_trylock return 1 on success, per Geert]
Signed-off-by: Torben Hohn <torbenh@gmx.de>
Cc: Thomas Gleixner <tglx@tglx.de>
Cc: Greg KH <gregkh@suse.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
14 years ago  squashfs: fix use of uninitialised variable in zlib & xz decompressors
Phillip Lougher [Tue, 25 Jan 2011 23:07:34 +0000 (15:07 -0800)]
squashfs: fix use of uninitialised variable in zlib & xz decompressors

Fix potential use of uninitialised variable caused by recent
decompressor code optimisations.

In zlib_uncompress (zlib_wrapper.c) we have

	int zlib_err, zlib_init = 0;
	...
	do {
		...
		if (avail == 0) {
			offset = 0;
			put_bh(bh[k++]);
			continue;
		}
		...
		zlib_err = zlib_inflate(stream, Z_SYNC_FLUSH);
		...
	} while (zlib_err == Z_OK);

If continue is executed (avail == 0) then the while condition will be
evaluated testing zlib_err, which is uninitialised first time around the
loop.

Fix this by getting rid of the 'if (avail == 0)' condition test.  This
edge condition should not be handled in the decompressor code; instead,
handle it generically in the caller code.

Similarly for xz_wrapper.c.

Incidentally, on most architectures (bar MIPS and PA-RISC), gcc generates
no uninitialised variable warning.  This is because the while condition
test on continue is optimised out and not performed (when executing
continue, zlib_err has not been changed since entering the loop, and
logically if the while condition was true previously, then it's still
true).
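
The control-flow point behind this bug is that `continue` inside a do/while loop jumps straight to the loop condition. The following standalone sketch (a hypothetical function, unrelated to the squashfs sources) shows the condition being evaluated immediately after the early `continue`, before the rest of the body has run.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Returns true if the loop condition was evaluated right after a
 * `continue`, i.e. before the rest of the body had a chance to run.
 */
static bool continue_reaches_condition(void)
{
	bool skipped = false;		/* set when the body bailed out early */
	bool seen_after_skip = false;	/* recorded by the loop condition */
	int rounds = 0;

	do {
		if (rounds++ == 0) {
			skipped = true;
			continue;	/* jumps straight to the while test */
		}
		skipped = false;	/* the part the first pass missed */
	} while ((seen_after_skip = seen_after_skip || skipped), rounds < 2);

	assert(rounds == 2);
	return seen_after_skip;
}
```

In the decompressor, the value tested by the condition is only assigned in the part of the body that `continue` skips, which is why the first pass could test it uninitialised.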

Signed-off-by: Phillip Lougher <phillip@lougher.demon.co.uk>
Reported-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
14 years ago  radix_tree: radix_tree_gang_lookup_tag_slot() may never return
Toshiyuki Okajima [Tue, 25 Jan 2011 23:07:32 +0000 (15:07 -0800)]
radix_tree: radix_tree_gang_lookup_tag_slot() may never return

Executed command: fsstress -d /mnt -n 600 -p 850

  crash> bt
  PID: 7947   TASK: ffff880160546a70  CPU: 0   COMMAND: "fsstress"
   #0 [ffff8800dfc07d00] machine_kexec at ffffffff81030db9
   #1 [ffff8800dfc07d70] crash_kexec at ffffffff810a7952
   #2 [ffff8800dfc07e40] oops_end at ffffffff814aa7c8
   #3 [ffff8800dfc07e70] die_nmi at ffffffff814aa969
   #4 [ffff8800dfc07ea0] do_nmi_callback at ffffffff8102b07b
   #5 [ffff8800dfc07f10] do_nmi at ffffffff814aa514
   #6 [ffff8800dfc07f50] nmi at ffffffff814a9d60
      [exception RIP: __lookup_tag+100]
      RIP: ffffffff812274b4  RSP: ffff88016056b998  RFLAGS: 00000287
      RAX: 0000000000000000  RBX: 0000000000000002  RCX: 0000000000000006
      RDX: 000000000000001d  RSI: ffff88016056bb18  RDI: ffff8800c85366e0
      RBP: ffff88016056b9c8   R8: ffff88016056b9e8   R9: 0000000000000000
      R10: 000000000000000e  R11: ffff8800c8536908  R12: 0000000000000010
      R13: 0000000000000040  R14: ffffffffffffffc0  R15: ffff8800c85366e0
      ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
  <NMI exception stack>
   #7 [ffff88016056b998] __lookup_tag at ffffffff812274b4
   #8 [ffff88016056b9d0] radix_tree_gang_lookup_tag_slot at ffffffff81227605
   #9 [ffff88016056ba20] find_get_pages_tag at ffffffff810fc110
  #10 [ffff88016056ba80] pagevec_lookup_tag at ffffffff81105e85
  #11 [ffff88016056baa0] write_cache_pages at ffffffff81104c47
  #12 [ffff88016056bbd0] generic_writepages at ffffffff81105014
  #13 [ffff88016056bbe0] do_writepages at ffffffff81105055
  #14 [ffff88016056bbf0] __filemap_fdatawrite_range at ffffffff810fb2cb
  #15 [ffff88016056bc40] filemap_write_and_wait_range at ffffffff810fb32a
  #16 [ffff88016056bc70] generic_file_direct_write at ffffffff810fb3dc
  #17 [ffff88016056bce0] __generic_file_aio_write at ffffffff810fcee5
  #18 [ffff88016056bda0] generic_file_aio_write at ffffffff810fd085
  #19 [ffff88016056bdf0] do_sync_write at ffffffff8114f9ea
  #20 [ffff88016056bf00] vfs_write at ffffffff8114fcf8
  #21 [ffff88016056bf30] sys_write at ffffffff81150691
  #22 [ffff88016056bf80] system_call_fastpath at ffffffff8100c0b2

I think the root cause is the following:

 radix_tree_range_tag_if_tagged() always tags the root tag with settag
 if the root tag is set with iftag even if there are no iftag tags
 in the specified range (Of course, there are some iftag tags
 outside the specified range).

===============================================================================
[[[Detailed description]]]

(1) Why can radix_tree_gang_lookup_tag_slot() never return?

__lookup_tag() can:
 - return 0 (no slots found), and
 - return a next index which is not bigger than the old one passed in
   as the input parameter.

Therefore the following "while" loop repeats forever: under the above
conditions "ret" is not updated and cur_index never advances to a
bigger value.

(So radix_tree_gang_lookup_tag_slot() can never return.)

radix_tree_gang_lookup_tag_slot():
1178         while (ret < max_items) {
1179                 unsigned int slots_found;
1180                 unsigned long next_index;       /* Index of next search */
1181
1182                 if (cur_index > max_index)
1183                         break;
1184                 slots_found = __lookup_tag(node, results + ret,
1185                                 cur_index, max_items - ret, &next_index, tag);
1186                 ret += slots_found;
// cannot update ret because slots_found == 0.
// so, this while loops forever.
1187                 if (next_index == 0)
1188                         break;
1189                 cur_index = next_index;
1190         }
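
The hang can be reproduced in a toy model of this loop. Everything below is hypothetical (the callback type, both lookup functions and the iteration cap, which exists only so that the sketch terminates); it keeps the same exit conditions as the loop quoted above.

```c
#include <assert.h>

typedef unsigned int (*lookup_fn)(unsigned long index,
				  unsigned long *next_index);

/* Buggy case: finds nothing and does not advance the index. */
static unsigned int stuck_lookup(unsigned long index,
				 unsigned long *next_index)
{
	*next_index = index;
	return 0;
}

/* Healthy case: finds nothing but moves on to the next node. */
static unsigned int advancing_lookup(unsigned long index,
				     unsigned long *next_index)
{
	*next_index = index + 64;
	return 0;
}

/* Returns the number of iterations, or -1 if the cap was reached. */
static int toy_gang_lookup(lookup_fn lookup, unsigned long max_index,
			   unsigned int max_items, int cap)
{
	unsigned long cur_index = 1;
	unsigned int ret = 0;
	int iterations = 0;

	assert(cap > 0);
	while (ret < max_items) {
		unsigned long next_index;

		if (cur_index > max_index)
			break;
		if (iterations == cap)
			return -1;	/* would have looped forever */
		iterations++;
		ret += lookup(cur_index, &next_index);
		if (next_index == 0)
			break;
		cur_index = next_index;
	}
	return iterations;
}
```

With `stuck_lookup` none of the three exit conditions can ever become true, which is the situation analysed in the rest of this message.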

(2) Why does __lookup_tag() return with 0 and doesn't update the index?

Assuming the following:
  - one of the slots in a radix_tree_node is NULL.
  - the tag which corresponds to that slot is set with
    PAGECACHE_TAG_TOWRITE or another tag.
  - at a certain height (!= 0), the corresponding index is 0.

a) __lookup_tag() notices that the tag is set.

1005 static unsigned int
1006 __lookup_tag(struct radix_tree_node *slot, void ***results, unsigned long index,
1007         unsigned int max_items, unsigned long *next_index, unsigned int tag)
1008 {
1009         unsigned int nr_found = 0;
1010         unsigned int shift, height;
1011
1012         height = slot->height;
1013         if (height == 0)
1014                 goto out;
1015         shift = (height-1) * RADIX_TREE_MAP_SHIFT;
1016
1017         while (height > 0) {
1018                 unsigned long i = (index >> shift) & RADIX_TREE_MAP_MASK ;
1019
1020                 for (;;) {
1021                         if (tag_get(slot, tag, i))
1022                                 break;
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
* the index is not updated yet.

b) __lookup_tag() notices that the slot is NULL.

1023                         index &= ~((1UL << shift) - 1);
1024                         index += 1UL << shift;
1025                         if (index == 0)
1026                                 goto out;       /* 32-bit wraparound */
1027                         i++;
1028                         if (i == RADIX_TREE_MAP_SIZE)
1029                                 goto out;
1030                 }
1031                 height--;
1032                 if (height == 0) {      /* Bottom level: grab some items */
...
1055                 }
1056                 shift -= RADIX_TREE_MAP_SHIFT;
1057                 slot = rcu_dereference_raw(slot->slots[i]);
1058                 if (slot == NULL)
1059                         break;
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

c) __lookup_tag() doesn't update the index and return with 0.

1060         }
1061 out:
1062         *next_index = index;
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1063         return nr_found;
1064 }

(3) Why is the slot NULL even if the tag is set?

Because radix_tree_range_tag_if_tagged() always sets the root tag with
PAGECACHE_TAG_TOWRITE if the root tag is set with PAGECACHE_TAG_DIRTY,
even if there is no tag which can be set with PAGECACHE_TAG_TOWRITE
in the specified range (from *first_indexp to last_index). Of course,
some PAGECACHE_TAG_DIRTY nodes must exist outside the specified range.
(radix_tree_range_tag_if_tagged() is called only from tag_pages_for_writeback())

 640 unsigned long radix_tree_range_tag_if_tagged(struct radix_tree_root *root,
 641                 unsigned long *first_indexp, unsigned long last_index,
 642                 unsigned long nr_to_tag,
 643                 unsigned int iftag, unsigned int settag)
 644 {
 645         unsigned int height = root->height;
 646         struct radix_tree_path path[height];
 647         struct radix_tree_path *pathp = path;
 648         struct radix_tree_node *slot;
 649         unsigned int shift;
 650         unsigned long tagged = 0;
 651         unsigned long index = *first_indexp;
 652
 653         last_index = min(last_index, radix_tree_maxindex(height));
 654         if (index > last_index)
 655                 return 0;
 656         if (!nr_to_tag)
 657                 return 0;
 658         if (!root_tag_get(root, iftag)) {
 659                 *first_indexp = last_index + 1;
 660                 return 0;
 661         }
 662         if (height == 0) {
 663                 *first_indexp = last_index + 1;
 664                 root_tag_set(root, settag);
 665                 return 1;
 666         }
...
 733         root_tag_set(root, settag);
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 734         *first_indexp = index;
 735
 736         return tagged;
 737 }

As the result, there is no radix_tree_node which is set with
PAGECACHE_TAG_TOWRITE but the root tag(radix_tree_root) is set with
PAGECACHE_TAG_TOWRITE.

[figure: inside radix_tree]
(Please see the figure with typewriter font)
===========================================
          [roottag = DIRTY]
                 |             tag=0:NOTHING
         tag[0 0 0 1]              1:DIRTY
            [x x x +]              2:WRITEBACK
                   |               3:DIRTY,WRITEBACK
                   p               4:TOWRITE
             <--->                 5:DIRTY,TOWRITE ...
     specified range (index: 0 to 2)

* There is no DIRTY tag within the specified range.
 (But there is a DIRTY tag outside that range.)

            | | | | | | | | |
    after calling tag_pages_for_writeback()
            | | | | | | | | |
            v v v v v v v v v

          [roottag = DIRTY,TOWRITE]
                 |                 p is "page".
         tag[0 0 0 1]              x is NULL.
            [x x x +]              +- is a pointer to "page".
                   |
                   p

* But TOWRITE tag is set on the root tag.
============================================

After that, radix_tree_extend() is called via radix_tree_insert()
when the page is added.
This function sets PAGECACHE_TAG_TOWRITE on the new radix_tree_node
so that it inherits the status of the root tag.

 246 static int radix_tree_extend(struct radix_tree_root *root, unsigned long index)
 247 {
 248         struct radix_tree_node *node;
 249         unsigned int height;
 250         int tag;
 251
 252         /* Figure out what the height should be.  */
 253         height = root->height + 1;
 254         while (index > radix_tree_maxindex(height))
 255                 height++;
 256
 257         if (root->rnode == NULL) {
 258                 root->height = height;
 259                 goto out;
 260         }
 261
 262         do {
 263                 unsigned int newheight;
 264                 if (!(node = radix_tree_node_alloc(root)))
 265                         return -ENOMEM;
 266
 267                 /* Increase the height.  */
 268                 node->slots[0] = radix_tree_indirect_to_ptr(root->rnode);
 269
 270                 /* Propagate the aggregated tag info into the new root */
 271                 for (tag = 0; tag < RADIX_TREE_MAX_TAGS; tag++) {
 272                         if (root_tag_get(root, tag))
 273                                 tag_set(node, tag, 0);
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 274                 }

===========================================
          [roottag = DIRTY,TOWRITE]
                 |     :
         tag[0 0 0 1] [0 0 0 0]
            [x x x +] [+ x x x]
                   |   |
                   p   p (new page)

            | | | | | | | | |
    after calling radix_tree_insert
            | | | | | | | | |
            v v v v v v v v v

          [roottag = DIRTY,TOWRITE]
                 |
         tag [5 0 0 0]    *  DIRTY and TOWRITE tags are
             [+ + x x]       succeeded to the new node.
              | |
  tag [0 0 0 1] [0 0 0 0]
      [x x x +] [+ x x x]
             |   |
             p   p
============================================

After that, the index 3 page is released by remove_from_page_cache().
This produces the situation where the tag is set with PAGECACHE_TAG_TOWRITE
but the slot which corresponds to the tag is NULL.
===========================================
          [roottag = DIRTY,TOWRITE]
                 |
         tag [5 0 0 0]
             [+ + x x]
              | |
  tag [0 0 0 1] [0 0 0 0]
      [x x x +] [+ x x x]
             |   |
             p   p
         (remove)

            | | | | | | | | |
    after calling remove_from_page_cache
            | | | | | | | | |
            v v v v v v v v v

          [roottag = DIRTY,TOWRITE]
                 |
         tag [4 0 0 0]      * Only the DIRTY tag is cleared
             [x + x x]        because no TOWRITE tag exists
                |             in the bottom node.
                [0 0 0 0]
                [+ x x x]
                 |
                 p
============================================

To solve this problem, change radix_tree_range_tag_if_tagged() so that
it does not set the root tag if it did not set any tags within the
specified range, like this:
============================================
 640 unsigned long radix_tree_range_tag_if_tagged(struct radix_tree_root *root,
 641                 unsigned long *first_indexp, unsigned long last_index,
 642                 unsigned long nr_to_tag,
 643                 unsigned int iftag, unsigned int settag)
 644 {
 650         unsigned long tagged = 0;
...
 733       if (tagged)
^^^^^^^^^^^^^^^^^^^^^^^^
 734            root_tag_set(root, settag);
 735         *first_indexp = index;
 736
 737         return tagged;
 738 }

============================================
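
The guard can be illustrated with a minimal user-space sketch.  It
assumes a toy one-level tree; struct toy_tree, toy_range_tag_if_tagged()
and the bit layout are hypothetical stand-ins, not the kernel's radix
tree API.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_SLOTS 4

/* Toy one-level "tree": one root tag word plus per-slot tag words. */
struct toy_tree {
	unsigned int root_tags;            /* bit n set: some slot may carry tag n */
	unsigned int slot_tags[TOY_SLOTS]; /* tag bits per slot */
	void *slots[TOY_SLOTS];            /* NULL means the slot is empty */
};

/*
 * Set 'settag' on every populated slot in [first, last] that already
 * carries 'iftag'.  Returns the number of slots tagged.
 */
static unsigned long toy_range_tag_if_tagged(struct toy_tree *t,
					     size_t first, size_t last,
					     unsigned int iftag,
					     unsigned int settag)
{
	unsigned long tagged = 0;
	size_t i;

	assert(first <= last && last < TOY_SLOTS);

	for (i = first; i <= last; i++) {
		if (t->slots[i] && (t->slot_tags[i] & (1u << iftag))) {
			t->slot_tags[i] |= 1u << settag;
			tagged++;
		}
	}

	/* The fix: only touch the root tag if a slot was actually tagged. */
	if (tagged)
		t->root_tags |= 1u << settag;

	return tagged;
}
```

With a single dirty page at index 3, scanning the range 0..2 tags
nothing and leaves the root tag word unchanged; scanning 0..3 tags one
slot and sets the root tag.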

Signed-off-by: Toshiyuki Okajima <toshi.okajima@jp.fujitsu.com>
Acked-by: Jan Kara <jack@suse.cz>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
14 years agodrivers/clocksource/tcb_clksrc.c: fix init sequence
Voss, Nikolaus [Tue, 25 Jan 2011 23:07:29 +0000 (15:07 -0800)]
drivers/clocksource/tcb_clksrc.c: fix init sequence

setup_irq() was called before clockevents_register_device() which is
needed by the irq handler.  Bug was reproducible by restarting the
kernel using kexec (reliable crash).

Signed-off-by: Nikolaus Voss <n.voss@weinmann.de>
Cc: David Brownell <dbrownell@users.sourceforge.net>
Cc: Haavard Skinnemoen <hskinnemoen@atmel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
14 years agomemcg: fix race at move_parent around compound_order()
KAMEZAWA Hiroyuki [Tue, 25 Jan 2011 23:07:29 +0000 (15:07 -0800)]
memcg: fix race at move_parent around compound_order()

Fix up mem_cgroup_move_parent(), which uses compound_order() in an
asynchronous manner.  Because we don't take the lock, compound_order()
may return an unknown value.  Use PageTransHuge() and HPAGE_SIZE
instead.

Also clean up mem_cgroup_move_parent():
 - remove unnecessary initialization of a local variable.
 - rename charge_size -> page_size
 - remove an unnecessary (wrong) comment.
 - add a comment about THP.

Note:
 The current design takes compound_page_lock() in the caller of
 move_account().  This should be revisited when we implement direct
 move_task of hugepages without splitting.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Reviewed-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Cc: Balbir Singh <balbir@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
14 years agomemcg: bugfix check mem_cgroup_disabled() at split fixup
KAMEZAWA Hiroyuki [Tue, 25 Jan 2011 23:07:28 +0000 (15:07 -0800)]
memcg: bugfix check mem_cgroup_disabled() at split fixup

mem_cgroup_disabled() should be checked at splitting.  If disabled, no
heavy work is necessary.

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Reviewed-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Balbir Singh <balbir@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
14 years agomemcg: fix account leak at failure of memsw acconting
KAMEZAWA Hiroyuki [Tue, 25 Jan 2011 23:07:27 +0000 (15:07 -0800)]
memcg: fix account leak at failure of memsw acconting

Commit 4b53433468 ("memcg: clean up try_charge main loop") removed the
cancellation of the charge in the case where the memory charge succeeds
but the mem+swap charge fails.

This leaks memory usage.  Fix it.

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Reviewed-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Cc: Balbir Singh <balbir@in.ibm.com>
Cc: <stable@kernel.org> [2.6.36+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
14 years agomm: migration: clarify migrate_pages() comment
Minchan Kim [Tue, 25 Jan 2011 23:07:26 +0000 (15:07 -0800)]
mm: migration: clarify migrate_pages() comment

Callers of migrate_pages() should call putback_lru_pages() to return the
isolated pages to the LRU or free list.  The current comment is rather
confusing: it says the caller always has to call it.

It is clearer to point out that the caller has to call it if
migrate_pages()'s return value isn't zero.

Signed-off-by: Minchan Kim <minchan.kim@gmail.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
14 years agomm: compaction: don't depend on HUGETLB_PAGE
Andrea Arcangeli [Tue, 25 Jan 2011 23:07:25 +0000 (15:07 -0800)]
mm: compaction: don't depend on HUGETLB_PAGE

Commit 5d6892407 ("thp: select CONFIG_COMPACTION if TRANSPARENT_HUGEPAGE
enabled") causes this warning during the configuration process:

  warning: (TRANSPARENT_HUGEPAGE) selects COMPACTION which has unmet
  direct dependencies (EXPERIMENTAL && HUGETLB_PAGE && MMU)

COMPACTION doesn't depend on HUGETLB_PAGE, and it doesn't depend on THP
either.  It is also useful for regular alloc_pages(order > 0), including
the kernel stack allocation during fork (THREAD_ORDER = 1).  It's always
better to enable COMPACTION.

The warning should be an error because we would end up with MIGRATION
not selected, and COMPACTION wouldn't work without migration (even
though it seems to build with an inline migrate_pages returning -ENOSYS).

I'd also like to remove EXPERIMENTAL: compaction has been in the kernel
for some releases (for full safety the default remains disabled which I
think is enough).

Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Reported-by: Luca Tettamanti <kronos.it@gmail.com>
Tested-by: Luca Tettamanti <kronos.it@gmail.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
14 years agomm/memcontrol.c: fix uninitialized variable use in mem_cgroup_move_parent()
Jesper Juhl [Tue, 25 Jan 2011 23:07:24 +0000 (15:07 -0800)]
mm/memcontrol.c: fix uninitialized variable use in mem_cgroup_move_parent()

In mm/memcontrol.c::mem_cgroup_move_parent() there's a path that jumps
to the 'put_back' label

   ret = __mem_cgroup_try_charge(NULL, gfp_mask, &parent, false, charge);
   if (ret || !parent)
           goto put_back;

where we'll

   if (charge > PAGE_SIZE)
           compound_unlock_irqrestore(page, flags);

but we have not assigned anything to 'flags' at this point, nor have we
called 'compound_lock_irqsave()' (which is what sets 'flags').  The
'put_back' label should be moved below the call to
compound_unlock_irqrestore() as per this patch.
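
The label placement can be sketched in user space as follows.  This is
only an illustration under stated assumptions: toy_lock_irqsave(),
toy_unlock_irqrestore() and toy_move_parent() are hypothetical stand-ins
and do not reproduce the real mm/memcontrol.c code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for compound_lock_irqsave()/..._irqrestore(). */
static int toy_lock_depth;

static unsigned long toy_lock_irqsave(void)
{
	toy_lock_depth++;
	return 0x200UL; /* pretend saved interrupt state */
}

static void toy_unlock_irqrestore(unsigned long flags)
{
	assert(flags == 0x200UL); /* must come from toy_lock_irqsave() */
	toy_lock_depth--;
}

/*
 * Returns 0 on success, -1 if the charge step fails.  On the failure
 * path control jumps past the unlock, so 'flags' is never read before
 * it has been assigned.
 */
static int toy_move_parent(bool charge_fails)
{
	unsigned long flags;
	int ret = 0;

	if (charge_fails) {
		ret = -1;
		goto put_back; /* the lock has not been taken yet */
	}

	flags = toy_lock_irqsave();
	/* ... move the accounting here ... */
	toy_unlock_irqrestore(flags);
put_back:
	/* ... return the page to the LRU here ... */
	return ret;
}
```

Because the label sits after the unlock call, both the failing and the
succeeding path leave the toy lock depth at zero.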

Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Cc: Balbir Singh <balbir@linux.vnet.ibm.com>
Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Pavel Emelianov <xemul@openvz.org>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
14 years agomm: clear pages_scanned only if draining a pcp adds pages to the buddy allocator
David Rientjes [Tue, 25 Jan 2011 23:07:23 +0000 (15:07 -0800)]
mm: clear pages_scanned only if draining a pcp adds pages to the buddy allocator

Commit 0e093d99763e ("writeback: do not sleep on the congestion queue if
there are no congested BDIs or if significant congestion is not being
encountered in the current zone") uncovered a livelock in the page
allocator that resulted in tasks infinitely looping trying to find
memory and kswapd running at 100% cpu.

The issue occurs because drain_all_pages() is called immediately
following direct reclaim when no memory is freed and try_to_free_pages()
returns non-zero because all zones in the zonelist do not have their
all_unreclaimable flag set.

When draining the per-cpu pagesets back to the buddy allocator for each
zone, the zone->pages_scanned counter is cleared to avoid erroneously
setting zone->all_unreclaimable later.  The problem is that no pages may
actually be drained and, thus, the unreclaimable logic never fails
direct reclaim so the oom killer may be invoked.

This apparently only manifested after wait_iff_congested() was
introduced and the zone was full of anonymous memory that would not
congest the backing store.  The page allocator would infinitely loop if
there were no other tasks waiting to be scheduled and clear
zone->pages_scanned because of drain_all_pages() as the result of this
change before kswapd could scan enough pages to trigger the reclaim
logic.  Additionally, with every loop of the page allocator and in the
reclaim path, kswapd would be kicked and would end up running at 100%
cpu.  In this scenario, current and kswapd are all running continuously
with kswapd incrementing zone->pages_scanned and current clearing it.

The problem is even more pronounced when current swaps some of its
memory to swap cache and the reclaimable logic then considers all active
anonymous memory in the all_unreclaimable logic, which requires a much
higher zone->pages_scanned value for try_to_free_pages() to return zero
that is never attainable in this scenario.

Before wait_iff_congested(), the page allocator would incur an
unconditional timeout and allow kswapd to elevate zone->pages_scanned to
a level that the oom killer would be called the next time it loops.

The fix is to only attempt to drain pcp pages if there is actually a
quantity to be drained.  The unconditional clearing of
zone->pages_scanned in free_pcppages_bulk() need not be changed since
other callers already ensure that draining will occur.  This patch
ensures that free_pcppages_bulk() will actually free memory before
calling into it from drain_all_pages() so zone->pages_scanned is only
cleared if appropriate.
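
A minimal sketch of the guard described above, assuming simplified
structures: struct toy_zone, struct toy_pcp and the toy_* functions are
hypothetical and only mirror the behaviour the text describes, not the
real page allocator.

```c
#include <assert.h>

struct toy_zone {
	unsigned long pages_scanned;
	unsigned long nr_free;
};

struct toy_pcp {
	unsigned long count; /* pages parked on the per-cpu list */
};

/* Mirrors the described behaviour: clears pages_scanned unconditionally. */
static void toy_free_pcppages_bulk(struct toy_zone *zone, struct toy_pcp *pcp)
{
	assert(pcp->count > 0); /* callers guarantee there is work to do */
	zone->pages_scanned = 0;
	zone->nr_free += pcp->count;
	pcp->count = 0;
}

/* The fix: only call into the bulk free if there is something to drain. */
static void toy_drain_pages(struct toy_zone *zone, struct toy_pcp *pcp)
{
	if (pcp->count)
		toy_free_pcppages_bulk(zone, pcp);
}
```

Draining an empty per-cpu list leaves pages_scanned untouched, so the
scan counter can keep growing until the unreclaimable logic triggers.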

Signed-off-by: David Rientjes <rientjes@google.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Reviewed-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Minchan Kim <minchan.kim@gmail.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
14 years agomm: fix deferred congestion timeout if preferred zone is not allowed
David Rientjes [Tue, 25 Jan 2011 23:07:20 +0000 (15:07 -0800)]
mm: fix deferred congestion timeout if preferred zone is not allowed

Before 0e093d99763e ("writeback: do not sleep on the congestion queue if
there are no congested BDIs or if significant congestion is not being
encountered in the current zone"), preferred_zone was only used for NUMA
statistics, to determine the zoneidx from which to allocate given
the type requested, and whether to utilize memory compaction.

wait_iff_congested(), though, uses preferred_zone to determine if the
congestion wait should be deferred because its dirty pages are backed by
a congested bdi.  This incorrectly defers the timeout and busy loops in
the page allocator with various cond_resched() calls if preferred_zone
is not allowed in the current context, usually consuming 100% of a cpu.

This patch ensures preferred_zone is an allowed zone in the fastpath
depending on whether current is constrained by its cpuset or nodes in
its mempolicy (when the nodemask passed is non-NULL).  This is correct
since the fastpath allocation always passes ALLOC_CPUSET when trying to
allocate memory.  In the slowpath, this patch resets preferred_zone to
the first zone of the allowed type when the allocation is not
constrained by current's cpuset, i.e.  it does not pass ALLOC_CPUSET.

This patch also ensures preferred_zone is from the set of allowed nodes
when called from within direct reclaim since allocations are always
constrained by cpusets in this context (it is blockable).

Both of these uses of cpuset_current_mems_allowed are protected by
get_mems_allowed().

Signed-off-by: David Rientjes <rientjes@google.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Minchan Kim <minchan.kim@gmail.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Acked-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
14 years agopps: claim parallel port exclusively
Alexander Gordeev [Tue, 25 Jan 2011 23:07:19 +0000 (15:07 -0800)]
pps: claim parallel port exclusively

Both pps_parport and pps_gen_parport are written in a way that they
can't share a port with any other driver.  This can result in locking up
the process that loads modules or even the whole kernel if the modules
are compiled in.  Use PARPORT_FLAG_EXCL to indicate this.

Signed-off-by: Alexander Gordeev <lasaine@lvk.cs.msu.su>
Cc: Alexander Gordeev <lasaine@lvk.cs.msu.su>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
14 years agopps ktimer: remove noisy message
Rodolfo Giometti [Tue, 25 Jan 2011 23:07:17 +0000 (15:07 -0800)]
pps ktimer: remove noisy message

Signed-off-by: Rodolfo Giometti <giometti@linux.it>
Reported-by: Ingo Molnar <mingo@elte.hu>
Cc: Alexander Gordeev <lasaine@lvk.cs.msu.su>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
14 years agoparport: make lockdep happy with waitlist_lock
Alexander Gordeev [Tue, 25 Jan 2011 23:07:16 +0000 (15:07 -0800)]
parport: make lockdep happy with waitlist_lock

parport_unregister_device() should never be used when interrupts are
enabled in hardware and an irq handler is registered, so there is no
need to disable interrupts when using waitlist_lock.  But there is no
way to explain these subtle semantics to the lockdep analyzer.

So disable interrupts here too to simplify things.  The price is
negligible.

Signed-off-by: Alexander Gordeev <lasaine@lvk.cs.msu.su>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
14 years agolangwell_gpio: modify EOI handling following change of kernel irq subsystem
Feng Tang [Tue, 25 Jan 2011 23:07:15 +0000 (15:07 -0800)]
langwell_gpio: modify EOI handling following change of kernel irq subsystem

The latest kernel has many changes in the IRQ subsystem and its
interfaces, such as the addition of "irq_eoi" to struct irq_chip.  This
patch is a follow-up change for that.

Also remove the unnecessary cast for a "void *".

Signed-off-by: Feng Tang <feng.tang@intel.com>
Cc: Alek Du <alek.du@intel.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
14 years agoleds: leds-pwm: return proper error if pwm_request failed
Axel Lin [Tue, 25 Jan 2011 23:07:14 +0000 (15:07 -0800)]
leds: leds-pwm: return proper error if pwm_request failed

Return PTR_ERR(led_dat->pwm) instead of 0 if pwm_request fails.
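
The error-pointer pattern can be sketched in user space.  The toy_*
helpers below are hypothetical re-implementations that only imitate the
kernel's ERR_PTR()/IS_ERR()/PTR_ERR() convention; toy_pwm_request() is
an assumed stand-in, not the real pwm_request().

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define TOY_MAX_ERRNO 4095

/* Encode a negative errno value in a pointer. */
static void *toy_err_ptr(long err)
{
	assert(err < 0 && err >= -TOY_MAX_ERRNO);
	return (void *)(intptr_t)err;
}

/* Decode the errno value from an error pointer. */
static long toy_ptr_err(const void *p)
{
	return (long)(intptr_t)p;
}

/* Error pointers occupy the top TOY_MAX_ERRNO addresses. */
static int toy_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-TOY_MAX_ERRNO;
}

/* Hypothetical request function: fails with -EBUSY for id 0. */
static void *toy_pwm_request(int id)
{
	static int device;

	return id == 0 ? toy_err_ptr(-EBUSY) : (void *)&device;
}

static int toy_probe(int id)
{
	void *pwm = toy_pwm_request(id);

	if (toy_is_err(pwm))
		return (int)toy_ptr_err(pwm); /* propagate, do not return 0 */
	return 0;
}
```

Returning 0 from the error branch would tell the caller that the probe
succeeded even though no device was obtained.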

Signed-off-by: Axel Lin <axel.lin@gmail.com>
Cc: Richard Purdie <rpurdie@rpsys.net>
Cc: Luotao Fu <l.fu@pengutronix.de>
Cc: Reviewed-by: Dmitry Torokhov <dtor@mail.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
14 years agomm/pgtable-generic.c: fix CONFIG_SWAP=n build
Andrew Morton [Tue, 25 Jan 2011 23:07:11 +0000 (15:07 -0800)]
mm/pgtable-generic.c: fix CONFIG_SWAP=n build

mips (and sparc32):

  In file included from arch/mips/include/asm/tlb.h:21,
                   from mm/pgtable-generic.c:9:
  include/asm-generic/tlb.h: In function `tlb_flush_mmu':
  include/asm-generic/tlb.h:76: error: implicit declaration of function `release_pages'
  include/asm-generic/tlb.h: In function `tlb_remove_page':
  include/asm-generic/tlb.h:105: error: implicit declaration of function `page_cache_release'

free_pages_and_swap_cache() and free_page_and_swap_cache() are macros
which call release_pages() and page_cache_release().  The obvious fix is
to include pagemap.h in swap.h, where those macros are defined.  But that
breaks sparc for weird reasons.

So fix it within mm/pgtable-generic.c instead.

Reported-by: Yoichi Yuasa <yuasa@linux-mips.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Cc: Sergei Shtylyov <sshtylyov@mvista.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
14 years agothp: fix PARAVIRT x86 32bit noPAE
Andrea Arcangeli [Tue, 25 Jan 2011 23:07:09 +0000 (15:07 -0800)]
thp: fix PARAVIRT x86 32bit noPAE

This fixes TRANSPARENT_HUGEPAGE=y with PARAVIRT=y and HIGHMEM64=n.

The #ifdef that this patch removes was erroneously introduced to fix a
build error for noPAE (where pmd.pmd doesn't exist).  The kernel then
built, but it failed at runtime because set_pmd_at was a noop.  This
patch corrects that by enabling set_pmd_at for noPAE mode too.

Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Reported-by: werner <w.landgraf@ru.ru>
Reported-by: Minchan Kim <minchan.kim@gmail.com>
Tested-by: Minchan Kim <minchan.kim@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
14 years agoMerge branch 'fixes' of master.kernel.org:/home/rmk/linux-2.6-arm
Linus Torvalds [Tue, 25 Jan 2011 23:04:18 +0000 (09:04 +1000)]
Merge branch 'fixes' of master.kernel.org:/home/rmk/linux-2.6-arm

* 'fixes' of master.kernel.org:/home/rmk/linux-2.6-arm:
  ALSA: AACI: fix timeout duration
  ALSA: AACI: fix timeout condition checking
  ARM: 6636/1: ep93xx: default multiplexed gpio ports to gpio mode
  ARM: 6637/1: Make the argument to virt_to_phys() "const volatile"
  ARM: twd: ensure timer reload is reprogrammed on entry to periodic mode
  ARM: 6635/2: Configure reference clock for Versatile Express timers
  ARM: versatile: name configuration options after actual board names
  ARM: realview: name configuration options after actual board names
  ARM: realview,vexpress: fix section mismatch warning for pen_release
  ARM: 6632/3: mmci: stop using the blockend interrupts

14 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ryusuke...
Linus Torvalds [Tue, 25 Jan 2011 23:03:36 +0000 (09:03 +1000)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ryusuke/nilfs2

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ryusuke/nilfs2:
  nilfs2: fix crash after one superblock became unavailable

14 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux...
Linus Torvalds [Tue, 25 Jan 2011 23:02:55 +0000 (09:02 +1000)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k:
  m68k/amiga: Fix "debug=mem"
  m68k/atari: Rename "scc" to "atari_scc"
  m68k: Uninline strchr()

14 years agoMerge branch 'rmobile-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Tue, 25 Jan 2011 23:02:14 +0000 (09:02 +1000)]
Merge branch 'rmobile-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6

* 'rmobile-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6:
  ARM: mach-shmobile: AG5EVM LCDC / MIPI-DSI platform data
  ARM: mach-shmobile: sh73a0 CPGA fix for PLL CFG bit
  ARM: mach-shmobile: mackerel: clarify shdi/mmcif switch settings
  ARM: mach-shmobile: sh73a0 CPGA fix for IrDA MSTP
  ARM: mach-shmobile: sh73a0 CPGA fix for FRQCRA M3
  ARM: mach-shmobile: remove sh7367 on-chip set_irq_type()
  ARM: mach-shmobile: sh7372 INTCS MFIS2 interrupt update
  ARM: mach-shmobile: ag5evm: Add IrDA support
  ARM: mach-shmobile: clock-sh7372: fixup pllc2 set_rate
  mmc: sh_mmcif: Convert to __raw_xxx() I/O accessors.
  ARM: mach-shmobile: ag5evm requires GPIOLIB
  ARM: mach-shmobile: fix cpu_base of gic_init() on sh73a0

14 years agoMerge branch 'fbdev-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Tue, 25 Jan 2011 23:01:22 +0000 (09:01 +1000)]
Merge branch 'fbdev-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/fbdev-2.6

* 'fbdev-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/fbdev-2.6:
  mailmap: Add an entry for Axel Lin.
  video: fix some comments in drivers/video/console/vgacon.c
  drivers/video/bf537-lq035.c: Add missing IS_ERR test
  video: pxa168fb: remove a redundant pxa168fb_check_var call
  video: da8xx-fb: fix fb_probe error path
  video: pxa3xx-gcu: Return -EFAULT when copy_from_user() fails
  video: nuc900fb: properly free resources in nuc900fb_remove
  video: nuc900fb: fix compile error

14 years agoMerge branch 'sh-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Tue, 25 Jan 2011 23:00:17 +0000 (09:00 +1000)]
Merge branch 'sh-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6

* 'sh-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6:
  sh: Fix build of sh7750 base boards
  sh: update INTC to clear IRQ sense valid flag
  sh: Fix sh build failure when CONFIG_SFC=m
  sh: fix MSIOF0 SPI on ecovec: it conflicts with VOU
  sh: support XZ-compressed kernel.
  sh: Fix up breakage from asm-generic/pgtable.h changes.

14 years agoKEYS: Fix __key_link_end() quota fixup on error
David Howells [Tue, 25 Jan 2011 16:34:28 +0000 (16:34 +0000)]
KEYS: Fix __key_link_end() quota fixup on error

Fix __key_link_end()'s attempt to fix up the quota if an error occurs.

There are two erroneous cases: Firstly, we always decrease the quota if
the preallocated replacement keyring needs cleaning up, irrespective of
whether or not we should (we may have replaced a pointer rather than
adding another pointer).

Secondly, we never clean up the quota if we added a pointer without the
keyring storage being extended (we allocate multiple pointers at a time,
even if we're not going to use them all immediately).

We handle this by setting the bottom bit of the preallocation pointer in
__key_link_begin() to indicate that the quota needs fixing up, which is
then passed to __key_link() (which clears the whole thing) and
__key_link_end().
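
Storing a flag in the bottom bit of a pointer can be sketched as
follows.  The names (TOY_QUOTA_FIXUP, toy_prealloc_*) are hypothetical
and the sketch assumes the pointed-to object is at least 2-byte aligned;
it is not the keyring code itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_QUOTA_FIXUP ((uintptr_t)1)

/* Pack a pointer and a "quota needs fixing up" flag into one word. */
static uintptr_t toy_prealloc_pack(void *p, bool need_fixup)
{
	/* The bottom bit must be free, i.e. the object is aligned. */
	assert(((uintptr_t)p & TOY_QUOTA_FIXUP) == 0);
	return (uintptr_t)p | (need_fixup ? TOY_QUOTA_FIXUP : 0);
}

/* Recover the original pointer by masking the flag off. */
static void *toy_prealloc_ptr(uintptr_t v)
{
	return (void *)(v & ~TOY_QUOTA_FIXUP);
}

/* Test the flag without disturbing the pointer bits. */
static bool toy_prealloc_needs_fixup(uintptr_t v)
{
	return (v & TOY_QUOTA_FIXUP) != 0;
}
```

The flag travels with the pointer between the begin and end calls, so
no extra argument or state is needed to remember whether the quota must
be adjusted.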

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
14 years agointel_scu_ipcutils: Fix the license tag
Alan Cox [Tue, 25 Jan 2011 14:33:36 +0000 (14:33 +0000)]
intel_scu_ipcutils: Fix the license tag

GPL V2 should be GPL v2

Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
14 years agoDocumentation: Fix kernel parameter ordering
Alan Cox [Tue, 25 Jan 2011 14:18:38 +0000 (14:18 +0000)]
Documentation: Fix kernel parameter ordering

A B C D E ...

Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
14 years agointel_scu_ipc: fix signedness bug
Axel Lin [Tue, 25 Jan 2011 14:12:12 +0000 (14:12 +0000)]
intel_scu_ipc: fix signedness bug

busy_loop() returns a negative error code, so change the err variable
from u32 to int to propagate the error code correctly.

Also remove unneeded initialization for err and i variables.
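
The commit does not say how the bug showed up, so the following is only
a generic sketch of the hazard, with hypothetical names
(toy_busy_loop() is a stand-in, not the driver's busy_loop()): once a
negative error code is held in an unsigned variable, a "less than zero"
test on its value can no longer see the failure.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for busy_loop(): always reports a timeout. */
static int toy_busy_loop(void)
{
	return -ETIMEDOUT;
}

/* Unsigned holder: the stored value is never below zero. */
static int toy_failed_with_u32(void)
{
	uint32_t err = (uint32_t)toy_busy_loop();

	return (int64_t)err < 0; /* 0: the failure goes unnoticed */
}

/* Signed holder: the usual "negative means error" test works. */
static int toy_failed_with_int(void)
{
	int err = toy_busy_loop();

	assert(err <= 0); /* convention: 0 on success, -errno on failure */
	return err < 0;
}
```

Keeping the variable's type identical to the callee's return type
avoids the question entirely.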

Signed-off-by: Axel Lin <axel.lin@gmail.com>
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
14 years agobonding: update documentation - alternate configuration.
Nicolas de Pesloüan [Mon, 24 Jan 2011 13:21:37 +0000 (13:21 +0000)]
bonding: update documentation - alternate configuration.

The bonding documentation used to provide configuration
details and examples for initscripts and sysconfig only.

This patch describes the third possible configuration:
/etc/network/interfaces.

Signed-off-by: Nicolas de Pesloüan <nicolas.2p.debian@free.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoTCP: fix a bug that triggers large number of TCP RST by mistake
Jerry Chu [Tue, 25 Jan 2011 21:46:30 +0000 (13:46 -0800)]
TCP: fix a bug that triggers large number of TCP RST by mistake

This patch fixes a bug that causes TCP RST packets to be generated
for otherwise well-behaved applications (e.g., ones that leave no
unread data on close). To trigger the bug, at least two conditions
must be met:

1. The FIN flag is set on the last data packet, i.e., it's not on a
separate, FIN only packet.
2. The size of the last data chunk on the receive side matches
exactly with the size of buffer posted by the receiver, and the
receiver closes the socket without any further read attempt.

This bug was first noticed on our netperf based testbed for our IW10
proposal to IETF where a large number of RST packets were observed.
netperf's read side code always meets condition 2 above.

Before the fix, tcp_data_queue() will queue the last skb that meets
condition 1 to sk_receive_queue even though it has fully copied out
(skb_copy_datagram_iovec()) the data. Then if condition 2 is also met,
tcp_recvmsg() often returns all the copied out data successfully
without actually consuming the skb, due to a check
"if ((chunk = len - tp->ucopy.len) != 0) {"
and
"len -= chunk;"
after tcp_prequeue_process() that causes "len" to become 0 and an
early exit from the big while loop.

I don't see any reason not to free the skb whose data have been fully
consumed in tcp_data_queue(), regardless of the FIN flag.  We won't
get there if MSG_PEEK is on. Am I missing some arcane cases related
to urgent data?

Signed-off-by: H.K. Jerry Chu <hkchu@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoMAINTAINERS: remove Reinette Chatre as iwlwifi maintainer
Reinette Chatre [Tue, 25 Jan 2011 16:38:06 +0000 (08:38 -0800)]
MAINTAINERS: remove Reinette Chatre as iwlwifi maintainer

Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Wey-Yi Guy <wey-yi.w.guy@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
14 years agort2x00: add device id for windy31 usb device
Greg Kroah-Hartman [Tue, 25 Jan 2011 09:42:29 +0000 (17:42 +0800)]
rt2x00: add device id for windy31 usb device

This patch adds the device id for the windy31 USB device to the rt73usb
driver.

Thanks to Ralf Flaxa for reporting this and providing testing and a
sample device.

Reported-by: Ralf Flaxa <rf@suse.de>
Tested-by: Ralf Flaxa <rf@suse.de>
Cc: stable <stable@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Acked-by: Ivo van Doorn <IvDoorn@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
14 years agomac80211: fix a crash in ieee80211_beacon_get_tim on change_interface
Felix Fietkau [Mon, 24 Jan 2011 18:28:49 +0000 (19:28 +0100)]
mac80211: fix a crash in ieee80211_beacon_get_tim on change_interface

Some drivers (e.g. ath9k) do not always disable beacons when they're
supposed to. When an interface is changed using the change_interface op,
the mode specific sdata part is in an undefined state and trying to
get a beacon at this point can produce weird crashes.

To fix this, add a check for ieee80211_sdata_running before using
anything from the sdata.

Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Cc: stable@kernel.org
Signed-off-by: John W. Linville <linville@tuxdriver.com>
14 years agoMerge branch 'mmci' into fixes
Russell King [Tue, 25 Jan 2011 21:18:11 +0000 (21:18 +0000)]
Merge branch 'mmci' into fixes

14 years agoALSA: AACI: fix timeout duration
Russell King [Wed, 12 Jan 2011 23:42:57 +0000 (23:42 +0000)]
ALSA: AACI: fix timeout duration

Relying on the access time of peripherals is unreliable - it depends
on the speed of the CPU and the bus.  On Versatile Express, these
timeouts were expiring, causing the driver to fail.

Add udelay(1) to ensure that they don't expire early, and adjust
timeouts to give a reasonable margin over the response times.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
14 years agoALSA: AACI: fix timeout condition checking
Russell King [Wed, 12 Jan 2011 23:17:24 +0000 (23:17 +0000)]
ALSA: AACI: fix timeout condition checking

Ensure that a timeout coincident with the condition being waited for
results in success rather than failure.  This helps avoid timeout
conditions being inappropriately flagged.
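
One way to get this behaviour is to let the awaited condition, not the
remaining count, decide the result.  The sketch below assumes a
hypothetical polled device (struct toy_dev, toy_ready(),
toy_wait_ready()); it is not the AACI driver code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical device: becomes ready on the 'ready_at'-th poll. */
struct toy_dev {
	int polls;
	int ready_at;
};

static bool toy_ready(struct toy_dev *d)
{
	return ++d->polls >= d->ready_at;
}

/*
 * Poll up to 'timeout' times.  The result is the last observed state of
 * the condition, so becoming ready on the final poll counts as success.
 */
static bool toy_wait_ready(struct toy_dev *d, int timeout)
{
	bool ready;

	assert(timeout > 0);

	do {
		ready = toy_ready(d);
	} while (!ready && --timeout > 0);

	return ready;
}
```

A loop that instead returned "timeout != 0" would report failure when
the device became ready exactly as the counter reached zero.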

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
14 years agoARM: 6636/1: ep93xx: default multiplexed gpio ports to gpio mode
Hartley Sweeten [Tue, 25 Jan 2011 00:05:35 +0000 (01:05 +0100)]
ARM: 6636/1: ep93xx: default multiplexed gpio ports to gpio mode

The EP93xx C and D GPIO ports are multiplexed with the Keypad Interface
peripheral.  At power-up they default into non-GPIO mode with the Key
Matrix controller enabled so these ports are unusable for GPIO.  Note
that the Keypad Interface peripheral is only available in the EP9307,
EP9312, and EP9315 processor variants.

The keypad support will clear the DeviceConfig bits appropriately to
enable the Keypad Interface when the driver is loaded.  And, when the
driver is unloaded it will set the bits to return the ports to GPIO mode.

To make these ports available for GPIO after power-up on all EP93xx
processor variants, set the KEYS and GONK bits in the DeviceConfig
register.

Similarly, the E, G, and H ports are multiplexed with the IDE Interface
peripheral.  At power-up these also default into non-GPIO mode.  Note
that the IDE peripheral is only available in the EP9312 and EP9315
processor variants.

Since an IDE driver is not even available in mainline, set the EONIDE,
GONIDE, and HONIDE bits in the DeviceConfig register so that these
ports will be available for GPIO use after power-up.

Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Acked-by: Ryan Mallon <ryan@bluewatersys.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
14 years agoARM: 6637/1: Make the argument to virt_to_phys() "const volatile"
Catalin Marinas [Tue, 25 Jan 2011 10:18:25 +0000 (11:18 +0100)]
ARM: 6637/1: Make the argument to virt_to_phys() "const volatile"

Changing the virt_to_phys() argument to "const volatile void *" avoids
compiler warnings in some situations where this function is used.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Stephen Boyd <sboyd@codeaurora.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
14 years agoARM: twd: ensure timer reload is reprogrammed on entry to periodic mode
Russell King [Tue, 25 Jan 2011 10:35:36 +0000 (10:35 +0000)]
ARM: twd: ensure timer reload is reprogrammed on entry to periodic mode

Ensure that the twd timer reload value is reprogrammed each time we
enter periodic mode.  This ensures that the reload value is always
reset correctly.

Tested-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Acked-by: Colin Cross <ccross@android.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
14 years agoipv6: Revert 'administrative down' address handling changes.
David S. Miller [Mon, 24 Jan 2011 07:27:15 +0000 (23:27 -0800)]
ipv6: Revert 'administrative down' address handling changes.

This reverts the following set of commits:

d1ed113f1669390da9898da3beddcc058d938587 ("ipv6: remove duplicate neigh_ifdown")
29ba5fed1bbd09c2cba890798c8f9eaab251401d ("ipv6: don't flush routes when setting loopback down")
9d82ca98f71fd686ef2f3017c5e3e6a4871b6e46 ("ipv6: fix missing in6_ifa_put in addrconf")
2de795707294972f6c34bae9de713e502c431296 ("ipv6: addrconf: don't remove address state on ifdown if the address is being kept")
8595805aafc8b077e01804c9a3668e9aa3510e89 ("IPv6: only notify protocols if address is compeletely gone")
27bdb2abcc5edb3526e25407b74bf17d1872c329 ("IPv6: keep tentative addresses in hash table")
93fa159abe50d3c55c7f83622d3f5c09b6e06f4b ("IPv6: keep route for tentative address")
8f37ada5b5f6bfb4d251a7f510f249cb855b77b3 ("IPv6: fix race between cleanup and add/delete address")
84e8b803f1e16f3a2b8b80f80a63fa2f2f8a9be6 ("IPv6: addrconf notify when address is unavailable")
dc2b99f71ef477a31020511876ab4403fb7c4420 ("IPv6: keep permanent addresses on admin down")

because the core semantic change to ipv6 address handling on ifdown
has broken some things, in particular "disable_ipv6" sysctl handling.

Stephen has made several attempts to get things back in working order,
but nothing has restored disable_ipv6 fully yet.

Reported-by: Eric W. Biederman <ebiederm@xmission.com>
Tested-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoPM / Runtime: Don't enable interrupts while running in_interrupt
Alan Stern [Tue, 25 Jan 2011 19:50:07 +0000 (20:50 +0100)]
PM / Runtime: Don't enable interrupts while running in_interrupt

This patch (as1445) fixes a bug in the runtime PM core left over from
the addition of the no_callbacks flag.  If this flag is set then it is
possible for rpm_suspend() to be called in_interrupt, so when
releasing spinlocks it's important not to re-enable interrupts.

To avoid an unnecessary save-and-restore of the interrupt flag, the
patch also inlines a pm_request_idle() call.

This fixes Bugzilla #27482.

(The offending code was added in 2.6.37, so it's not necessary to apply
this to any earlier stable kernels.)

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Reported-by: tim blechmann <tim@klingt.org>
CC: <stable@kernel.org>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
14 years agohwmon: (lis3) turn down the no IRQ message
Kalhan Trisal [Tue, 25 Jan 2011 14:24:37 +0000 (14:24 +0000)]
hwmon: (lis3) turn down the no IRQ message

Turn down the no IRQ message - on some platforms that's a normal state of
affairs.

Signed-off-by: Kalhan Trisal <kalhan.trisal@intel.com>
Signed-off-by: Alan Cox <alan@linux.intel.com>
Acked-by: Eric Piel <eric.piel@tremplin-utc.net>
Signed-off-by: Guenter Roeck <guenter.roeck@ericsson.com>
14 years agoARM: 6635/2: Configure reference clock for Versatile Express timers
Pawel Moll [Tue, 25 Jan 2011 14:53:03 +0000 (15:53 +0100)]
ARM: 6635/2: Configure reference clock for Versatile Express timers

Timers on the Versatile Express mainboard are used as system clock/event
sources. The driver assumes that they are clocked with a 1MHz signal.
Old V2M firmware apparently configured it by default, but on newer
boards one can observe that "sleep 1" command takes over 30 seconds
to finish, as the timers are fed with 32kHz instead...

This patch performs the required magic and also removes the code clearing
timer's control registers, as exactly the same operations are
performed by the timer driver a few jiffies later.

Signed-off-by: Pawel Moll <pawel.moll@arm.com>
Tested-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
14 years agoARM: versatile: name configuration options after actual board names
Russell King [Mon, 24 Jan 2011 12:00:01 +0000 (12:00 +0000)]
ARM: versatile: name configuration options after actual board names

Update the option text to those which appear on the front of the
appropriate board user guides.  This gives consistent board naming, and
makes it obvious which option is for which platform.

Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
14 years agoARM: realview: name configuration options after actual board names
Russell King [Mon, 24 Jan 2011 10:58:24 +0000 (10:58 +0000)]
ARM: realview: name configuration options after actual board names

As no one seems to really know which configuration options tie up with
which boards, I thought I'd do some investigation and try to work it
out.  After discussion with some folk in linaro, I think I have this
nailed.

The names are updated to use the name on the front of the appropriate
board user guide for the various baseboards, which I've taken to be
the official name for each board.

I haven't significantly updated the descriptions for the tiles as that
is even less clear - as far as I can see on ARM's website, there is no
Cortex-A9 tile for Realview EB - only ARM11MPCore, ARM1156T2F-S,
ARM1176TZF-S and Cortex-R4F.  So exactly what this 'Multicore Cortex-A9
Tile' is...

Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
14 years agoARM: realview,vexpress: fix section mismatch warning for pen_release
Russell King [Sat, 22 Jan 2011 17:22:34 +0000 (17:22 +0000)]
ARM: realview,vexpress: fix section mismatch warning for pen_release

Fix two section mismatch warnings in the platform SMP bringup code for
Realview and Versatile Express:

WARNING: arch/arm/mach-realview/built-in.o(.text+0x8ac): Section mismatch in reference from the function write_pen_release() to the variable .cpuinit.data:pen_release
The function write_pen_release() references
the variable __cpuinitdata pen_release.
This is often because write_pen_release lacks a __cpuinitdata
annotation or the annotation of pen_release is wrong.

WARNING: arch/arm/mach-vexpress/built-in.o(.text+0x7b4): Section mismatch in reference from the function write_pen_release() to the variable .cpuinit.data:pen_release
The function write_pen_release() references
the variable __cpuinitdata pen_release.
This is often because write_pen_release lacks a __cpuinitdata
annotation or the annotation of pen_release is wrong.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
14 years agotextsearch: doc - fix spelling in lib/textsearch.c.
Jesper Dangaard Brouer [Mon, 24 Jan 2011 02:41:37 +0000 (02:41 +0000)]
textsearch: doc - fix spelling in lib/textsearch.c.

Found the following spelling errors while reading the textsearch code:
  "facitilies"  -> "facilities"
  "continously" -> "continuously"
  "arbitary"    -> "arbitrary"
  "patern"      -> "pattern"
  "occurences"  -> "occurrences"

I'll try to push this patch through DaveM, given that the only users
of textsearch are in the net/ tree (nf_conntrack_amanda.c, xt_string.c
and em_text.c).

Signed-off-by: Jesper Sander <sander.contrib@gmail.com>
Signed-off-by: Jesper Dangaard Brouer <hawk@comx.dk>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoUSB NET KL5KUSB101: Fix mem leak in error path of kaweth_download_firmware()
Jesper Juhl [Sun, 23 Jan 2011 12:19:55 +0000 (12:19 +0000)]
USB NET KL5KUSB101: Fix mem leak in error path of kaweth_download_firmware()

We will leak the storage allocated by request_firmware() if the size of
the firmware is greater than KAWETH_FIRMWARE_BUF_SIZE.
This removes the leak by calling release_firmware() before we return
-ENOSPC.

Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agopch_gbe: don't use flush_scheduled_work()
Tejun Heo [Tue, 25 Jan 2011 07:19:10 +0000 (23:19 -0800)]
pch_gbe: don't use flush_scheduled_work()

Directly cancel adapter->reset_task instead of using to-be-deprecated
flush_scheduled_work().

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agobnx2: Always set ETH_FLAG_TXVLAN
Michael Chan [Mon, 24 Jan 2011 12:59:02 +0000 (12:59 +0000)]
bnx2: Always set ETH_FLAG_TXVLAN

TSO does not work if the VLAN tag is in the packet (non-accelerated).
We may be able to remove this restriction in future firmware.

Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agomailmap: Add an entry for Axel Lin.
Paul Mundt [Tue, 25 Jan 2011 06:30:55 +0000 (15:30 +0900)]
mailmap: Add an entry for Axel Lin.

Not all of Axel's patches have used a consistent casing, so fix it up
here.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
14 years agovideo: fix some comments in drivers/video/console/vgacon.c
Amerigo Wang [Wed, 19 Jan 2011 06:24:02 +0000 (06:24 +0000)]
video: fix some comments in drivers/video/console/vgacon.c

Now that vgacon_scrollback_startup() uses slab, not bootmem,
the comment above it is obsolete, and so is __init_refok.

Signed-off-by: WANG Cong <amwang@redhat.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
14 years agodrivers/video/bf537-lq035.c: Add missing IS_ERR test
Julia Lawall [Mon, 24 Jan 2011 19:55:21 +0000 (19:55 +0000)]
drivers/video/bf537-lq035.c: Add missing IS_ERR test

lcd_device_register may return ERR_PTR, so a check is added for this value
before the dereference.  All of the other changes reorganize the error
handling code in this function to avoid duplicating all of it in the added
case.

In the original code, in one case, the global variable fb_buffer was set to
NULL in error code that appears after this variable is initialized.  This
is done now in all error handling code that has this property.

The semantic match that finds this problem is as follows:
(http://coccinelle.lip6.fr/)

// <smpl>
@r@
identifier f;
@@
f(...) { ... return ERR_PTR(...); }

@@
identifier r.f, fld;
expression x;
statement S1,S2;
@@
 x = f(...)
 ... when != IS_ERR(x)
(
 if (IS_ERR(x) ||...) S1 else S2
|
*x->fld
)
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Acked-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
14 years agovideo: pxa168fb: remove a redundant pxa168fb_check_var call
axel lin [Fri, 21 Jan 2011 11:18:06 +0000 (11:18 +0000)]
video: pxa168fb: remove a redundant pxa168fb_check_var call

The current implementation calls pxa168fb_check_var twice in pxa168fb_probe.

Signed-off-by: Axel Lin <axel.lin@gmail.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
14 years agovideo: da8xx-fb: fix fb_probe error path
axel lin [Thu, 20 Jan 2011 03:50:51 +0000 (03:50 +0000)]
video: da8xx-fb: fix fb_probe error path

The current implementation puts CONFIG_CPU_FREQ in the wrong place:
CONFIG_CPU_FREQ is for lcd_da8xx_cpufreq_deregister, not for
unregister_framebuffer.

Signed-off-by: Axel Lin <axel.lin@gmail.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
14 years agosh: Fix build of sh7750 base boards
Nobuhiro Iwamatsu [Mon, 24 Jan 2011 01:40:17 +0000 (01:40 +0000)]
sh: Fix build of sh7750 base boards

Renamed platform_register_device to platform_device_register.

Signed-off-by: Nobuhiro Iwamatsu <nobuhiro.iwamatsu.yj@renesas.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
14 years agonet: clear heap allocation for ethtool_get_regs()
Eugene Teo [Tue, 25 Jan 2011 05:05:17 +0000 (21:05 -0800)]
net: clear heap allocation for ethtool_get_regs()

There is a conflict between commit b00916b1 and a77f5db3. This patch resolves
the conflict by clearing the heap allocation in ethtool_get_regs().

Cc: stable@kernel.org
Signed-off-by: Eugene Teo <eugeneteo@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6
Linus Torvalds [Tue, 25 Jan 2011 04:23:54 +0000 (14:23 +1000)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6:
  Make CIFS mount work in a container.
  CIFS: Remove pointless variable assignment in cifs_dfs_do_automount()

14 years agoMerge branch 'for-38-rc3' of git://codeaurora.org/quic/kernel/davidb/linux-msm
Linus Torvalds [Tue, 25 Jan 2011 01:01:33 +0000 (11:01 +1000)]
Merge branch 'for-38-rc3' of git://codeaurora.org/quic/kernel/davidb/linux-msm

* 'for-38-rc3' of git://codeaurora.org/quic/kernel/davidb/linux-msm:
  drivers: mmc: msm: remove clock disable in probe
  mmc: msm: fix dma usage not to use internal APIs

14 years agoMerge branch 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied...
Linus Torvalds [Tue, 25 Jan 2011 00:46:14 +0000 (10:46 +1000)]
Merge branch 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6

* 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6:
  drm/radeon/kms: add new radeon_info ioctl query for clock crystal freq
  drm/i915: Prevent uninitialised reads during error state capture
  drm/i915: Use consistent mappings for OpRegion between ACPI and i915
  drm/i915: Handle the no-interrupts case for UMS by polling
  drm/i915: Disable high-precision vblank timestamping for UMS
  drm/i915: Increase the amount of defense before computing vblank timestamps
  drm/i915,agp/intel: Do not clear stolen entries
  drm/radeon/kms: simplify atom adjust pll setup
  drm/radeon/kms: match r6xx/r7xx/evergreen asic_reset with previous asics
  drm/radeon/kms: make the mac rv630 quirk generic
  drm/radeon/kms: fix a spelling error in an error message
  drm/radeon/kms: Initialize pageflip spinlocks.
  drm/i915: Recognise non-VGA display devices
  drm/i915: Fix use of invalid array size for ring->sync_seqno
  drm/i915/ringbuffer: Fix use of stale HEAD position whilst polling for space
  drm/i915: Don't kick-off hangcheck after a DRI interrupt
  drm/i915: Add dependency on CONFIG_TMPFS
  drm/i915: Initialise ring vfuncs for old DRI paths
  drm/i915: make the blitter report buffer modifications to the FBC unit
  drm/i915: set more FBC chicken bits

14 years agoipv6: Always clone offlink routes.
David S. Miller [Tue, 25 Jan 2011 00:01:58 +0000 (16:01 -0800)]
ipv6: Always clone offlink routes.

Do not handle PMTU vs. route lookup creation any differently
wrt. offlink routes, always clone them.

Reported-by: PK <runningdoglackey@yahoo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agodcbnl: make get_app handling symmetric for IEEE and CEE DCBx
John Fastabend [Fri, 21 Jan 2011 16:35:18 +0000 (16:35 +0000)]
dcbnl: make get_app handling symmetric for IEEE and CEE DCBx

The IEEE get/set app handlers use generic routines and do not
require the net_device to implement the dcbnl_ops routines. This
patch makes it symmetric so user space and drivers do not have
to handle the CEE version and IEEE DCBx versions differently.

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoMerge branch 'can/at91_can-for-net-2.6' of git://git.pengutronix.de/git/mkl/linux-2.6
David S. Miller [Mon, 24 Jan 2011 23:16:11 +0000 (15:16 -0800)]
Merge branch 'can/at91_can-for-net-2.6' of git://git.pengutronix.de/git/mkl/linux-2.6

14 years agoMerge branch 'drm-intel-fixes-2' of ssh://master.kernel.org/pub/scm/linux/kernel...
Dave Airlie [Mon, 24 Jan 2011 22:41:58 +0000 (08:41 +1000)]
Merge branch 'drm-intel-fixes-2' of ssh://master.kernel.org/pub/scm/linux/kernel/git/ickle/drm-intel into drm-fixes

* 'drm-intel-fixes-2' of ssh://master.kernel.org/pub/scm/linux/kernel/git/ickle/drm-intel: (30 commits)
  drm/i915: Prevent uninitialised reads during error state capture
  drm/i915: Use consistent mappings for OpRegion between ACPI and i915
  drm/i915: Handle the no-interrupts case for UMS by polling
  drm/i915: Disable high-precision vblank timestamping for UMS
  drm/i915: Increase the amount of defense before computing vblank timestamps
  drm/i915,agp/intel: Do not clear stolen entries
  Remove MAYBE_BUILD_BUG_ON
  BUILD_BUG_ON: make it handle more cases
  module: fix missing semicolons in MODULE macro usage
  param: add null statement to compiled-in module params
  module: fix linker error for MODULE_VERSION when !MODULE and CONFIG_SYSFS=n
  module: show version information for built-in modules in sysfs
  selinux: return -ENOMEM when memory allocation fails
  tpm: fix panic caused by "tpm: Autodetect itpm devices"
  TPM: Long default timeout fix
  trusted keys: Fix a memory leak in trusted_update().
  keys: add trusted and encrypted maintainers
  encrypted-keys: rename encrypted_defined files to encrypted
  trusted-keys: rename trusted_defined files to trusted
  drm/i915: Recognise non-VGA display devices
  ...

14 years agotcp: fix bug in listening_get_next()
Eric Dumazet [Mon, 24 Jan 2011 22:41:20 +0000 (14:41 -0800)]
tcp: fix bug in listening_get_next()

commit a8b690f98baf9fb19 (tcp: Fix slowness in read /proc/net/tcp)
introduced a bug in handling of SYN_RECV sockets.

st->offset represents number of sockets found since beginning of
listening_hash[st->bucket].

We should not reset st->offset when iterating through
syn_table[st->sbucket], or else if more than ~25 sockets (if
PAGE_SIZE=4096) are in SYN_RECV state, we exit from listening_get_next()
with an st->offset that is too small.

Next time we enter tcp_seek_last_pos(), we are not able to seek past
already found sockets.

Reported-by: PK <runningdoglackey@yahoo.com>
CC: Tom Herbert <therbert@google.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agodrm/radeon/kms: add new radeon_info ioctl query for clock crystal freq
Alex Deucher [Mon, 24 Jan 2011 22:14:26 +0000 (17:14 -0500)]
drm/radeon/kms: add new radeon_info ioctl query for clock crystal freq

Needed for timer queries in the 3D driver.

Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Dave Airlie <airlied@gmail.com>
14 years agoinetpeer: Use correct AVL tree base pointer in inet_getpeer().
David S. Miller [Mon, 24 Jan 2011 22:37:46 +0000 (14:37 -0800)]
inetpeer: Use correct AVL tree base pointer in inet_getpeer().

Family was hard-coded to AF_INET but should be daddr->family.

This fixes crashes when unlinking ipv6 peer entries, since the
unlink code was looking up the base pointer properly.

Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoGRO: fix merging a paged skb after non-paged skbs
Michal Schmidt [Mon, 24 Jan 2011 12:08:48 +0000 (12:08 +0000)]
GRO: fix merging a paged skb after non-paged skbs

Suppose that several linear skbs of the same flow were received by GRO. They
were thus merged into one skb with a frag_list. Then a new skb of the same flow
arrives, but it is a paged skb with data starting in its frags[].

Before adding the skb to the frag_list skb_gro_receive() will of course adjust
the skb to throw away the headers. It correctly modifies the page_offset and
size of the frag, but it leaves incorrect information in the skb:
 ->data_len is not decreased at all.
 ->len is decreased only by headlen, as if no change were done to the frag.
Later in a receiving process this causes skb_copy_datagram_iovec() to return
-EFAULT and this is seen in userspace as the result of the recv() syscall.

In practice the bug can be reproduced with the sfc driver. By default the
driver uses an adaptive scheme when it switches between using
napi_gro_receive() (with skbs) and napi_gro_frags() (with pages). The bug is
reproduced when under rx load with enough successful GRO merging the driver
decides to switch from the former to the latter.

Manual control is also possible, so reproducing this is easy with netcat:
 - on machine1 (with sfc): nc -l 12345 > /dev/null
 - on machine2: nc machine1 12345 < /dev/zero
 - on machine1:
   echo 1 > /sys/module/sfc/parameters/rx_alloc_method  # use skbs
   echo 2 > /sys/module/sfc/parameters/rx_alloc_method  # use pages
 - See that nc has quit suddenly.

[v2: Modified by Eric Dumazet to avoid advancing skb->data past the end
     and to use a temporary variable.]

Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agox86-64: Don't use pointer to out-of-scope variable in dump_trace()
Jesper Juhl [Mon, 24 Jan 2011 21:41:11 +0000 (22:41 +0100)]
x86-64: Don't use pointer to out-of-scope variable in dump_trace()

In arch/x86/kernel/dumpstack_64.c::dump_trace() we have this code:

...
   if (!stack) {
           unsigned long dummy;
           stack = &dummy;
           if (task && task != current)
                   stack = (unsigned long *)task->thread.sp;
   }

   bp = stack_frame(task, regs);
   /*
    * Print function call entries in all stacks, starting at the
    * current stack address. If the stacks consist of nested
    * exceptions
    */
   tinfo = task_thread_info(task);

   for (;;) {
           char *id;
           unsigned long *estack_end;
           estack_end = in_exception_stack(cpu, (unsigned long)stack,
                                           &used, &id);
...

You'll notice that we assign to 'stack' the address of the variable
'dummy' which is only in-scope inside the 'if (!stack)'. So when we later
access stack (at the end of the above, and assuming we did not take the
'if (task && task != current)' branch) we'll be using the address of a
variable that is no longer in scope. I believe this patch is the proper
fix, but I freely admit that I'm not 100% certain.

Signed-off-by: Jesper Juhl <jj@chaosbits.net>
LKML-Reference: <alpine.LNX.2.00.1101242232590.10252@swampdragon.chaosbits.net>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>