]> www.infradead.org Git - users/jedix/linux-maple.git/log
users/jedix/linux-maple.git
16 years agoperfcounters: enable lowlevel pmc code to schedule counters
Ingo Molnar [Sun, 21 Dec 2008 12:50:42 +0000 (13:50 +0100)]
perfcounters: enable lowlevel pmc code to schedule counters

Allow lowlevel ->enable() op to return an error if a counter can not be
added. This can be used to handle counter constraints.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agoperfcounters: fix init context lock
Ingo Molnar [Sun, 21 Dec 2008 14:07:49 +0000 (15:07 +0100)]
perfcounters: fix init context lock

Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agoperfcounters: pull inherited counters
Ingo Molnar [Fri, 19 Dec 2008 09:20:42 +0000 (10:20 +0100)]
perfcounters: pull inherited counters

Change counter inheritance from a 'push' to a 'pull' model: instead of
child tasks pushing their final counts to the parent, reuse the wait4
infrastructure to pull counters as child tasks are exit-processed,
much like how cutime/cstime is collected.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agoperfcounters: fix task clock counter
Ingo Molnar [Wed, 17 Dec 2008 13:10:57 +0000 (14:10 +0100)]
perfcounters: fix task clock counter

Impact: fix per task clock counter precision

Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agoperfcounters: hw ops rename
Ingo Molnar [Wed, 17 Dec 2008 13:20:28 +0000 (14:20 +0100)]
perfcounters: hw ops rename

Impact: rename field names

Shorten them.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86, perfcounters: refactor code for fixed-function PMCs
Ingo Molnar [Wed, 17 Dec 2008 12:09:20 +0000 (13:09 +0100)]
x86, perfcounters: refactor code for fixed-function PMCs

Impact: clean up

Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agoperfcounters: add fixed-mode PMC enumeration
Ingo Molnar [Wed, 17 Dec 2008 09:51:15 +0000 (10:51 +0100)]
perfcounters: add fixed-mode PMC enumeration

Enumerate fixed-mode PMCs based on CPUID, and feed that into the
perfcounter code.

Does not use fixed-mode PMCs yet.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86, perfcounters: prepare for fixed-mode PMCs
Ingo Molnar [Wed, 17 Dec 2008 08:09:13 +0000 (09:09 +0100)]
x86, perfcounters: prepare for fixed-mode PMCs

Impact: refactor the x86 code for fixed-mode PMCs

Extend the data structures and rename the existing facilities
to allow for a 'generic' versus 'fixed' counter distinction.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86, perfcounters: rename intel_arch_perfmon.h => perf_counter.h
Ingo Molnar [Wed, 17 Dec 2008 08:02:19 +0000 (09:02 +0100)]
x86, perfcounters: rename intel_arch_perfmon.h => perf_counter.h

Impact: rename include file

We'll be providing an asm/perf_counter.h to the generic perfcounter code,
so use the already existing x86 file for this purpose and rename it.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agoperfcounters: tweak group scheduling
Ingo Molnar [Wed, 17 Dec 2008 07:54:56 +0000 (08:54 +0100)]
perfcounters: tweak group scheduling

Impact: schedule in groups atomically

If there are multiple groups in a task, make sure they are scheduled
in and out atomically.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agoperfcounters: remove warnings
Ingo Molnar [Tue, 23 Dec 2008 11:04:16 +0000 (12:04 +0100)]
perfcounters: remove warnings

Impact: remove debug checks

Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: perf_counter.c intel_perfmon_event_map and max_intel_perfmon_events should be...
Jaswinder Singh [Fri, 19 Dec 2008 17:07:58 +0000 (22:37 +0530)]
x86: perf_counter.c intel_perfmon_event_map and max_intel_perfmon_events should be static

Impact: cleanup, avoid sparse warnings, reduce kernel size a bit

Fixes these sparse warnings:
 arch/x86/kernel/cpu/perf_counter.c:44:11: warning: symbol 'intel_perfmon_event_map' was not declared. Should it be static?
 arch/x86/kernel/cpu/perf_counter.c:54:11: warning: symbol 'max_intel_perfmon_events' was not declared. Should it be static?

Signed-off-by: Jaswinder Singh <jaswinder@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agoperfcounters: use hw_event.disable flag
Ingo Molnar [Tue, 16 Dec 2008 23:43:10 +0000 (00:43 +0100)]
perfcounters: use hw_event.disable flag

Impact: implement default-off counters

Make sure that counters that are created with counter.hw_event.disabled=1,
get created in disabled state.

They can be enabled via:

        prctl(PR_TASK_PERF_COUNTERS_ENABLE);

Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agoperfcounters: flush on setuid exec
Ingo Molnar [Tue, 16 Dec 2008 12:40:44 +0000 (13:40 +0100)]
perfcounters: flush on setuid exec

Pavel Machek pointed out that performance counters should be flushed
when crossing protection domains on setuid execution.

Reported-by: Pavel Machek <pavel@suse.cz>
Acked-by: Pavel Machek <pavel@suse.cz>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agoperfcounters: release CPU context when exiting task counters
Ingo Molnar [Sun, 14 Dec 2008 22:20:36 +0000 (23:20 +0100)]
perfcounters: release CPU context when exiting task counters

If counters are exiting via do_exit() not via filp close, then
the CPU context needs to be released - otherwise future percpu
counter creations might fail.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agoMerge branch 'linus' into perfcounters/core
Ingo Molnar [Sun, 14 Dec 2008 20:58:33 +0000 (21:58 +0100)]
Merge branch 'linus' into perfcounters/core

16 years agoperfcounters: fix lapic initialization
Ingo Molnar [Sun, 14 Dec 2008 20:58:46 +0000 (21:58 +0100)]
perfcounters: fix lapic initialization

Fix non-working NMI sampling in certain bootup scenarios.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agoperfcounters, x86: fix sw counters on non-PMC CPUs
Ingo Molnar [Sun, 14 Dec 2008 19:21:00 +0000 (20:21 +0100)]
perfcounters, x86: fix sw counters on non-PMC CPUs

Make perf_max_counters default to at least 1 - this allows the sw
counters to be used.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agoperfcounters: fix non-intel-perfmon CPUs
Ingo Molnar [Sun, 14 Dec 2008 17:36:30 +0000 (18:36 +0100)]
perfcounters: fix non-intel-perfmon CPUs

Do not write MSR_CORE_PERF_GLOBAL_CTRL on CPUs where it does not exist.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agoperfcounters: add nr-of-faults counter
Ingo Molnar [Sun, 14 Dec 2008 13:44:31 +0000 (14:44 +0100)]
perfcounters: add nr-of-faults counter

Impact: add new feature, new sw counter

Add a counter that counts the number of pagefaults a task
is experiencing.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agoperfcounters: add task migrations counter
Ingo Molnar [Sun, 14 Dec 2008 11:34:15 +0000 (12:34 +0100)]
perfcounters: add task migrations counter

Impact: add new feature, new sw counter

Add a counter that counts the number of cross-CPU migrations a
task is suffering.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agoperfcounters: add context switch counter
Ingo Molnar [Sun, 14 Dec 2008 11:28:33 +0000 (12:28 +0100)]
perfcounters: add context switch counter

Impact: add new feature, new sw counter

Add a counter that counts the number of context-switches a task
is doing.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agoperfcounters: fix task clock counter
Ingo Molnar [Sun, 14 Dec 2008 11:22:31 +0000 (12:22 +0100)]
perfcounters: fix task clock counter

Impact: bugfix

Update the task clock counter to the new math.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agoperfcounters: implement "counter inheritance"
Ingo Molnar [Fri, 12 Dec 2008 12:49:45 +0000 (13:49 +0100)]
perfcounters: implement "counter inheritance"

Impact: implement new performance feature

Counter inheritance can be used to run performance counters in a workload,
transparently - and pipe back the counter results to the parent counter.

Inheritance for performance counters works the following way: when creating
a counter it can be marked with the .inherit=1 flag. Such counters are then
'inherited' by all child tasks (be they fork()-ed or clone()-ed). These
counters get inherited through exec() boundaries as well (except through
setuid boundaries).

The counter values get added back to the parent counter(s) when the child
task(s) exit - much like stime/utime statistics are gathered. So inherited
counters are ideal to gather summary statistics about an application's
behavior via shell commands, without having to modify that application.

The timec.c command utilizes counter inheritance:

  http://redhat.com/~mingo/perfcounters/timec.c

Sample output:

   $ ./timec -e 1 -e 3 -e 5 ls -lR /usr/include/ >/dev/null

   Performance counter stats for 'ls':

           163516953 instructions
                2295 cache-misses
             2855182 branch-misses

Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agoperfcounters: restructure x86 counter math
Ingo Molnar [Sat, 13 Dec 2008 08:00:03 +0000 (09:00 +0100)]
perfcounters: restructure x86 counter math

Impact: restructure code

Change counter math from absolute values to clear delta logic.

We try to extract elapsed deltas from the raw hw counter - and put
that into the generic counter.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: implement atomic64_t on 32-bit
Ingo Molnar [Sun, 14 Dec 2008 19:22:35 +0000 (20:22 +0100)]
x86: implement atomic64_t on 32-bit

Impact: new API

Implement the atomic64_t APIs on 32-bit as well. Will be used by
the performance counters code.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agoMerge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/jwboyer/powerpc-4xx
Linus Torvalds [Sat, 13 Dec 2008 19:32:24 +0000 (11:32 -0800)]
Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/jwboyer/powerpc-4xx

* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/jwboyer/powerpc-4xx:
  powerpc/40x: Add proper BOOTCFLAGS for cuboot-acadia

16 years agoMerge branch 'i2c-for-linus' of git://jdelvare.pck.nerim.net/jdelvare-2.6
Linus Torvalds [Sat, 13 Dec 2008 19:32:04 +0000 (11:32 -0800)]
Merge branch 'i2c-for-linus' of git://jdelvare.pck.nerim.net/jdelvare-2.6

* 'i2c-for-linus' of git://jdelvare.pck.nerim.net/jdelvare-2.6:
  i2c-highlander: Trivial endian casting fixes
  i2c-pmcmsp: Fix endianness misannotation

16 years agoMerge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block
Linus Torvalds [Sat, 13 Dec 2008 19:28:13 +0000 (11:28 -0800)]
Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block

* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
  Commands needing to be retried require a complete re-initialization.

16 years agoMerge branch 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus
Linus Torvalds [Sat, 13 Dec 2008 19:26:34 +0000 (11:26 -0800)]
Merge branch 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus

* 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus:
  MIPS: IP32: Update defconfig
  MIPS: Add missing calls to plat_unmap_dma_mem.
  MIPS: Kconfig: Fix the arch-specific header path
  MIPS: Use EI/DI for MIPS R2.

16 years agoconsole ASCII glyph 1:1 mapping
Ingo Brueckl [Wed, 10 Dec 2008 22:35:00 +0000 (23:35 +0100)]
console ASCII glyph 1:1 mapping

For the console, there is a 1:1 mapping of glyphs which cannot be found
in the current font.  This seems to be meant as a kind of 'emergency
fallback' for fonts without unicode mapping which otherwise would
display nothing readable on the screen.

At the moment it affects all chars for which no substitution character
is defined.  In particular this means that for all chars (>= 128) where
there is no iso88591-1/unicode character (e.g.  control character area)
you'll get the very strange 1:1 mapping of the (cp437) graphics card
glyphs.

I'm pretty sure that the 1:1 mapping should only affect strict ASCII
code characters, i.e.  chars < 128.

The patch limits the mapping as it probably was meant anyway.

Signed-off-by: Ingo Brueckl <ib@wupperonline.de>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Cc: Egmont Koblinger <egmont@uhulinux.hu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agounicode table for cp437
Ingo Brueckl [Wed, 10 Dec 2008 22:34:00 +0000 (23:34 +0100)]
unicode table for cp437

There is a major bug in the cp437 to unicode translation table.  Char
0x7c is mapped to U+00a5 which is the Yen sign and wrong.  The right
mapping is U+00a6 (broken bar).

Furthermore, a mapping for U+00b4 (a widely used character) is missing
even though easily possible.

The patch fixes these, as well as it provides a few other useful
mappings.

The changes are as follows:

  0x0f (enhancement) enables a sort of currency symbol
  0x27 (bug) enables a sort of acute accent which is a widely used character
  0x44 (enhancement) enables a sort of icelandic capital letter eth
  0x7c (major bug) corrects mapping
  0xeb (enhancement) enables a sort of icelandic small letter eth
  0xee (enhancement) enables a sort of math 'element of'

Signed-off-by: Ingo Brueckl <ib@wupperonline.de>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agoMIPS: IP32: Update defconfig
David Daney [Wed, 10 Dec 2008 23:27:13 +0000 (15:27 -0800)]
MIPS: IP32: Update defconfig

Signed-off-by: David Daney <ddaney@caviumnetworks.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
16 years agoMIPS: Add missing calls to plat_unmap_dma_mem.
David Daney [Thu, 11 Dec 2008 02:14:45 +0000 (18:14 -0800)]
MIPS: Add missing calls to plat_unmap_dma_mem.

dma_free_noncoherent() and dma_free_coherent() are missing calls to
plat_unmap_dma_mem().  This patch adds them.

Signed-off-by: David Daney <ddaney@caviumnetworks.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
16 years agoMIPS: Kconfig: Fix the arch-specific header path
Dmitri Vorobiev [Wed, 10 Dec 2008 20:38:36 +0000 (22:38 +0200)]
MIPS: Kconfig: Fix the arch-specific header path

The header path in the help text for the RUNTIME_DEBUG config option is
obsolete and needs to be updated to match the new location of
architecture-specific header files. While at it, fix the spelling mistake.

Signed-off-by: Dmitri Vorobiev <dmitri.vorobiev@movial.fi>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
16 years agoMIPS: Use EI/DI for MIPS R2.
David Daney [Wed, 10 Dec 2008 16:37:25 +0000 (08:37 -0800)]
MIPS: Use EI/DI for MIPS R2.

For MIPS R2, use the EI and DI instructions to enable and disable
interrupts.

Signed-off-by: Tomaso Paoletti <tpaoletti@caviumnetworks.com>
Signed-off-by: David Daney <ddaney@caviumnetworks.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
16 years agoCommands needing to be retried require a complete re-initialization.
Alan D. Brunelle [Tue, 9 Dec 2008 14:52:15 +0000 (15:52 +0100)]
Commands needing to be retried require a complete re-initialization.

The test-unit-ready portion of this patch was causing boots to fail on
my test machine (as in http://lkml.org/lkml/2008/12/5/161). With this
patch in place, the system is booting reliably.

Mike Anderson found the same problem in the hp_hw_start_stop code,
and I applied the same solution in cdrom_read_cdda_bpc.

Signed-off-by: Alan D. Brunelle <alan.brunelle@hp.com>
Cc: Mike Anderson <andmike@linux.vnet.ibm.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
16 years agoMerge branch 'x86/irq' into perfcounters/core
Ingo Molnar [Fri, 12 Dec 2008 11:00:02 +0000 (12:00 +0100)]
Merge branch 'x86/irq' into perfcounters/core

( with manual semantic merge of arch/x86/kernel/cpu/perf_counter.c )

16 years agox86: hardirq: introduce inc_irq_stat()
Hiroshi Shimamoto [Tue, 9 Dec 2008 03:19:26 +0000 (19:19 -0800)]
x86: hardirq: introduce inc_irq_stat()

Impact: cleanup

Introduce inc_irq_stat() macro and unify irq_stat accounting code.

Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agoMerge commit 'v2.6.28-rc8' into x86/irq
Ingo Molnar [Fri, 12 Dec 2008 10:59:39 +0000 (11:59 +0100)]
Merge commit 'v2.6.28-rc8' into x86/irq

16 years agoperf counters: update docs
Ingo Molnar [Thu, 11 Dec 2008 19:40:18 +0000 (20:40 +0100)]
perf counters: update docs

Impact: update docs

Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agoperf counters: clean up state transitions
Ingo Molnar [Thu, 11 Dec 2008 14:17:03 +0000 (15:17 +0100)]
perf counters: clean up state transitions

Impact: cleanup

Introduce a proper enum for the 3 states of a counter:

PERF_COUNTER_STATE_OFF = -1
PERF_COUNTER_STATE_INACTIVE =  0
PERF_COUNTER_STATE_ACTIVE =  1

and rename counter->active to counter->state and propagate the
changes everywhere.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agoperf counters: add prctl interface to disable/enable counters
Ingo Molnar [Thu, 11 Dec 2008 13:59:31 +0000 (14:59 +0100)]
perf counters: add prctl interface to disable/enable counters

Add a way for self-monitoring tasks to disable/enable counters summarily,
via a prctl:

PR_TASK_PERF_COUNTERS_DISABLE 31
PR_TASK_PERF_COUNTERS_ENABLE 32

Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agoperf counters: implement PERF_COUNT_TASK_CLOCK
Ingo Molnar [Thu, 11 Dec 2008 13:03:20 +0000 (14:03 +0100)]
perf counters: implement PERF_COUNT_TASK_CLOCK

Impact: add new perf-counter type

The 'task clock' counter counts the amount of time a task is executing,
in nanoseconds. It stops ticking when a task is scheduled out either due
to it blocking, sleeping or it being preempted.

This counter type is a Linux kernel based abstraction, it is available
even if the hardware does not support native hardware performance counters.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agoperf counters: consolidate hw_perf save/restore APIs
Ingo Molnar [Thu, 11 Dec 2008 12:45:51 +0000 (13:45 +0100)]
perf counters: consolidate hw_perf save/restore APIs

Impact: cleanup

Rename them to better match up the usual IRQ disable/enable APIs:

 hw_perf_disable_all()  => hw_perf_save_disable()
 hw_perf_restore_ctrl() => hw_perf_restore()

Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agoperf counters: implement PERF_COUNT_CPU_CLOCK
Ingo Molnar [Thu, 11 Dec 2008 12:21:10 +0000 (13:21 +0100)]
perf counters: implement PERF_COUNT_CPU_CLOCK

Impact: add new perf-counter type

The 'CPU clock' counter counts the amount of CPU clock time that is
elapsing, in nanoseconds. (regardless of how much of it the task is
spending on a CPU executing)

This counter type is a Linux kernel based abstraction, it is available
even if the hardware does not support native hardware performance counters.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agoperf counters: hw driver API
Ingo Molnar [Thu, 11 Dec 2008 11:46:46 +0000 (12:46 +0100)]
perf counters: hw driver API

Impact: restructure code, introduce hw_ops driver abstraction

Introduce this abstraction to handle counter details:

 struct hw_perf_counter_ops {
void (*hw_perf_counter_enable) (struct perf_counter *counter);
void (*hw_perf_counter_disable) (struct perf_counter *counter);
void (*hw_perf_counter_read) (struct perf_counter *counter);
 };

This will be useful to support assymetric hw details, and it will also
be useful to implement "software counters". (Counters that count kernel
managed sw events such as pagefaults, context-switches, wall-clock time
or task-local time.)

Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agoperf counters: group counter, fixes
Ingo Molnar [Thu, 11 Dec 2008 10:26:29 +0000 (11:26 +0100)]
perf counters: group counter, fixes

Impact: bugfix

Check that a group does not span outside the context of a CPU or a task.

Also, do not allow deep recursive hierarchies.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agoperf counters: add support for group counters
Ingo Molnar [Thu, 11 Dec 2008 07:38:42 +0000 (08:38 +0100)]
perf counters: add support for group counters

Impact: add group counters

This patch adds the "counter groups" abstraction.

Groups of counters behave much like normal 'single' counters, with a
few semantic and behavioral extensions on top of that.

A counter group is created by creating a new counter with the open()
syscall's group-leader group_fd file descriptor parameter pointing
to another, already existing counter.

Groups of counters are scheduled in and out in one atomic group, and
they are also roundrobin-scheduled atomically.

Counters that are member of a group can also record events with an
(atomic) extended timestamp that extends to all members of the group,
if the record type is set to PERF_RECORD_GROUP.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agoperf counters: restructure the API
Ingo Molnar [Wed, 10 Dec 2008 11:33:23 +0000 (12:33 +0100)]
perf counters: restructure the API

Impact: clean up new API

Thorough cleanup of the new perf counters API, we now get clean separation
of the various concepts:

 - introduce perf_counter_hw_event to separate out the event source details

 - move special type flags into separate attributes: PERF_COUNT_NMI,
   PERF_COUNT_RAW

 - extend the type to u64 and reserve it fully to the architecture in the
   raw type case.

And make use of all these changes in the core and x86 perfcounters code.

Also change the syscall signature to:

  asmlinkage int sys_perf_counter_open(

struct perf_counter_hw_event *hw_event_uptr __user,
pid_t pid,
int cpu,
int group_fd);

( Note that group_fd is unused for now - it's reserved for the counter
  groups abstraction. )

Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agoperf counters: expand use of counter->event
Thomas Gleixner [Mon, 8 Dec 2008 18:35:37 +0000 (19:35 +0100)]
perf counters: expand use of counter->event

Impact: change syscall, cleanup

Make use of the new perf_counters event type.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agoperf counters: clean up 'raw' type API
Thomas Gleixner [Mon, 8 Dec 2008 18:26:59 +0000 (19:26 +0100)]
perf counters: clean up 'raw' type API

Impact: cleanup

Introduce a separate hw_event type.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agoperf counters: protect them against CSTATE transitions
Thomas Gleixner [Tue, 9 Dec 2008 20:43:39 +0000 (21:43 +0100)]
perf counters: protect them against CSTATE transitions

Impact: fix rare lost events problem

There are CPUs whose performance counters misbehave on CSTATE transitions,
so provide a way to just disable/enable them around deep idle methods.

(hw_perf_enable_all() is cheap on x86.)

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agopowerpc/40x: Add proper BOOTCFLAGS for cuboot-acadia
Josh Boyer [Tue, 25 Nov 2008 06:33:35 +0000 (06:33 +0000)]
powerpc/40x: Add proper BOOTCFLAGS for cuboot-acadia

The cuboot-acadia.c wrapper can cause assembler errors on some
toolchains due to the lack of the proper BOOTCFLAGS.  This adds
the proper flags for the file.

Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
16 years agoi2c-highlander: Trivial endian casting fixes
Harvey Harrison [Thu, 11 Dec 2008 11:11:21 +0000 (12:11 +0100)]
i2c-highlander: Trivial endian casting fixes

Fixes sparse warnings:
drivers/i2c/busses/i2c-highlander.c:95:26: warning: incorrect type in argument 1 (different base types)
drivers/i2c/busses/i2c-highlander.c:95:26:    expected restricted __be16 const [usertype] *p
drivers/i2c/busses/i2c-highlander.c:95:26:    got unsigned short [usertype] *<noident>
drivers/i2c/busses/i2c-highlander.c:106:15: warning: incorrect type in assignment (different base types)
drivers/i2c/busses/i2c-highlander.c:106:15:    expected unsigned short [unsigned] [short] [usertype] <noident>
drivers/i2c/busses/i2c-highlander.c:106:15:    got restricted __be16

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Ben Dooks <ben-linux@fluff.org>
Acked-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
16 years agoi2c-pmcmsp: Fix endianness misannotation
Harvey Harrison [Thu, 11 Dec 2008 11:11:20 +0000 (12:11 +0100)]
i2c-pmcmsp: Fix endianness misannotation

tmp is used as host-endian and is loaded from a be64, fix the cast and the
endian accessor used.

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
16 years agoRevert "radeonfb: accelerate imageblit and other improvements"
Linus Torvalds [Wed, 10 Dec 2008 17:26:17 +0000 (09:26 -0800)]
Revert "radeonfb: accelerate imageblit and other improvements"

This reverts commit b1ee26bab14886350ba12a5c10cbc0696ac679bf, along with
the "fixes" for it that all just caused problems:

 - c4c6fa9891f3d1bcaae4f39fb751d5302965b566 "radeonfb: fix problem with
   color expansion & alignment"

 - f3179748a157c21d44d929fd3779421ebfbeaa93 "radeonfb: Disable new color
   expand acceleration unless explicitely enabled"

because even when disabled, it breaks for people. See

http://bugzilla.kernel.org/show_bug.cgi?id=12191

for the latest example.

Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: David S. Miller <davem@davemloft.net>
Cc: Krzysztof Halasa <khc@pm.waw.pl>
Cc: James Cloos <cloos@jhcloos.com>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Krzysztof Helt <krzysztof.h1@poczta.fm>
Cc: Jean-Luc Coulon <jean.luc.coulon@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agoLinux 2.6.28-rc8 v2.6.28-rc8
Linus Torvalds [Wed, 10 Dec 2008 23:11:51 +0000 (15:11 -0800)]
Linux 2.6.28-rc8

16 years agoMerge branch 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Wed, 10 Dec 2008 22:41:06 +0000 (14:41 -0800)]
Merge branch 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  sched: CPU remove deadlock fix

16 years agofix mapping_writably_mapped()
Hugh Dickins [Wed, 10 Dec 2008 20:48:52 +0000 (20:48 +0000)]
fix mapping_writably_mapped()

Lee Schermerhorn noticed yesterday that I broke the mapping_writably_mapped
test in 2.6.7!  Bad bad bug, good good find.

The i_mmap_writable count must be incremented for VM_SHARED (just as
i_writecount is for VM_DENYWRITE, but while holding the i_mmap_lock)
when dup_mmap() copies the vma for fork: it has its own more optimal
version of __vma_link_file(), and I missed this out.  So the count
was later going down to 0 (dangerous) when one end unmapped, then
wrapping negative (inefficient) when the other end unmapped.

The only impact on x86 would have been that setting a mandatory lock on
a file which has at some time been opened O_RDWR and mapped MAP_SHARED
(but not necessarily PROT_WRITE) across a fork, might fail with -EAGAIN
when it should succeed, or succeed when it should fail.

But those architectures which rely on flush_dcache_page() to flush
userspace modifications back into the page before the kernel reads it,
may in some cases have skipped the flush after such a fork - though any
repetitive test will soon wrap the count negative, in which case it will
flush_dcache_page() unnecessarily.

Fix would be a two-liner, but mapping variable added, and comment moved.

Reported-by: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agoMerge branch 'to-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/frob/linux...
Linus Torvalds [Wed, 10 Dec 2008 22:40:21 +0000 (14:40 -0800)]
Merge branch 'to-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/frob/linux-2.6-roland

* 'to-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/frob/linux-2.6-roland:
  tracehook: exec double-reporting fix

16 years agolib/idr.c: Fix bug introduced by RCU fix
Manfred Spraul [Wed, 10 Dec 2008 17:17:06 +0000 (18:17 +0100)]
lib/idr.c: Fix bug introduced by RCU fix

The last patch to lib/idr.c caused a bug if idr_get_new_above() was
called on an empty idr.

Usually, nodes stay on the same layer.  New layers are added to the top
of the tree.

The exception is idr_get_new_above() on an empty tree: In this case, the
new root node is first added on layer 0, then moved upwards.  p->layer
was not updated.

As usual: You shall never rely on the source code comments, they will
only mislead you.

Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agoMN10300: Give correct size when reserving interrupt vector table
Akira Takeuchi [Wed, 10 Dec 2008 12:43:39 +0000 (12:43 +0000)]
MN10300: Give correct size when reserving interrupt vector table

Give the correct size when reserving the interrupt vector table.  It should be
a page not a single byte.

Signed-off-by: Akira Takeuchi <takeuchi.akr@jp.panasonic.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agoMN10300: Fix __put_user_asm8()
Akira Takeuchi [Wed, 10 Dec 2008 12:43:34 +0000 (12:43 +0000)]
MN10300: Fix __put_user_asm8()

Fix __put_user_asm8() by jumping to the end label (3:) from the exception
handler, rather than jumping back to retry the second store instruction (label
2:).

Signed-off-by: Akira Takeuchi <takeuchi.akr@jp.panasonic.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agoMN10300: Fix the preemption resume_kernel() routine
Akira Takeuchi [Wed, 10 Dec 2008 12:43:29 +0000 (12:43 +0000)]
MN10300: Fix the preemption resume_kernel() routine

Fix the preemption resume_kernel() routine by inverting the test to see
whether interrupts are off (IM7 is all enabled, not all disabled).

Furthermore, interrupts should be disabled on entry to resume_kernel() so that
they're correctly set for jumping to restore_all() and doing the need
reschedule test.

Signed-off-by: Akira Takeuchi <takeuchi.akr@jp.panasonic.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agoMN10300: Discard low-priority Tx interrupts when closing an on-chip serial port
Akira Takeuchi [Wed, 10 Dec 2008 12:43:24 +0000 (12:43 +0000)]
MN10300: Discard low-priority Tx interrupts when closing an on-chip serial port

Discard low-prioriy Tx interrupts when closing an MN10300 on-chip serial port.

The MN10300 on-chip serial port uses three interrupts to manage its serial
ports:

 (1) A very high priority interrupt that drives virtual DMA for Rx.

 (2) A very high priority interrupt that drives virtual DMA for Tx.

 (3) A normal priority virtual interrupt that does the normal UART interrupt
     stuff and is shared between Rx and Tx.

mn10300_serial_stop_tx() only disables the high priority Tx interrupt.  It
doesn't also disable the normal priority one because it is shared with Rx.

However, the high priority interrupt may interrupt local_irq_disabled()
sections, and so may have queued up a low priority virtual interrupt whilst the
UART driver is asking for the Tx interrupt to be disabled.

The result of this can be an oops when we try to process the interrupt in
mn10300_serial_transmit_interrupt() as port->uart.info and port->uart.info->tty
may have gone away.

To deal with this, if either of those pointers is NULL, we make sure the
high-priority Tx interrupt is disabled and discard the interrupt.  The low
priority interrupt is disabled by the mn10300_serial_pic irq_chip table.

Signed-off-by: Akira Takeuchi <takeuchi.akr@jp.panasonic.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agoMN10300: vmlinux.lds.S cleanup - use PAGE_SIZE, PERCPU macros
Cyrill Gorcunov [Wed, 10 Dec 2008 12:43:19 +0000 (12:43 +0000)]
MN10300: vmlinux.lds.S cleanup - use PAGE_SIZE, PERCPU macros

Include the linux/page.h header into the MN10300 kernel linker script thus
allowing us to use PAGE_SIZE macro instead of a numeric constant.

Also use the PERCPU macro instead of an explicit section definition.

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Linus Torvalds [Wed, 10 Dec 2008 18:13:57 +0000 (10:13 -0800)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  crypto: api - Disallow cryptomgr as a module if algorithms are built-in

16 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes...
Linus Torvalds [Wed, 10 Dec 2008 18:04:50 +0000 (10:04 -0800)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
  PCIe: ASPM: Break out of endless loop waiting for PCI config bits to switch
  PCI: stop leaking 'slot_name' in pci_create_slot

16 years agoMerge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6
Linus Torvalds [Wed, 10 Dec 2008 18:04:25 +0000 (10:04 -0800)]
Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6

* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6:
  [IA64] SN: prevent IRQ retargetting in request_irq()
  [IA64] Fix section mismatch ioc3uart_init()/ioc3uart_submodule
  [IA64] Clear up section mismatch for ioc4_ide_attach_one.
  [IA64] Clear up section mismatch with arch_unregister_cpu()
  [IA64] Clear up section mismatch for sn_check_wars.
  [IA64] Updated the generic_defconfig to work with the 2.6.28-rc7 kernel.
  [IA64] Fix GRU compile error w/o CONFIG_HUGETLB_PAGE
  [IA64] eliminate NULL test and memset after alloc_bootmem
  [IA64] remove BUILD_BUG_ON from paravirt_getreg()

16 years agoMerge branch 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus
Linus Torvalds [Wed, 10 Dec 2008 18:03:55 +0000 (10:03 -0800)]
Merge branch 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus

* 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus:
  MIPS: Better than nothing implementation of PCI mmap to fix X.

16 years agopktcdvd: remove broken dev_t export of class devices
Kay Sievers [Sat, 6 Dec 2008 03:38:11 +0000 (04:38 +0100)]
pktcdvd: remove broken dev_t export of class devices

The pktcdvd created class devices only export some sysfs files,
but have no char dev_t registered in the driver.

At class device creation time they copy the dev_t value of the
block device to the char device, wich will register a new char
device in the driver core and userspace, with a conflicting dev_t
value.

In many cases the class devices dev_t just points to a random
USB device. This fixes the sysfs "duplicate entry" errors.

Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Acked-by: Peter Osterlund <petero2@telia.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394...
Linus Torvalds [Wed, 10 Dec 2008 18:02:17 +0000 (10:02 -0800)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6:
  firewire: fw-ohci: fix IOMMU resource exhaustion
  ieee1394: node manager causes up to ~3.25s delay in freezing tasks

16 years agodrivers/video/mb862xx/mb862xxfb.c: fix printk
Andrew Morton [Tue, 9 Dec 2008 21:14:31 +0000 (13:14 -0800)]
drivers/video/mb862xx/mb862xxfb.c: fix printk

sparc64:

drivers/video/mb862xx/mb862xxfb.c:929: warning: long long unsigned int format, resource_size_t arg (arg 4)
drivers/video/mb862xx/mb862xxfb.c:931: warning: long long unsigned int format, resource_size_t arg (arg 4)

We don't know what type the architecture uses to implement u64, hence they
cannot be printed.

Cc: Anatolij Gustschin <agust@denx.de>
Cc: Dmitry Baryshkov <dbaryshkov@gmail.com>
Cc: Anton Vorontsov <avorontsov@ru.mvista.com>
Cc: Matteo Fortini <m.fortini@selcomgroup.com>
Cc: Krzysztof Helt <krzysztof.h1@poczta.fm>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agoKSYM_SYMBOL_LEN fixes
Hugh Dickins [Tue, 9 Dec 2008 21:14:27 +0000 (13:14 -0800)]
KSYM_SYMBOL_LEN fixes

Miles Lane tailing /sys files hit a BUG which Pekka Enberg has tracked
to my 966c8c12dc9e77f931e2281ba25d2f0244b06949 sprint_symbol(): use
less stack exposing a bug in slub's list_locations() -
kallsyms_lookup() writes a 0 to namebuf[KSYM_NAME_LEN-1], but that was
beyond the end of page provided.

The 100 slop which list_locations() allows at end of page looks roughly
enough for all the other stuff it might print after the symbol before
it checks again: break out KSYM_SYMBOL_LEN earlier than before.

Latencytop and ftrace and are using KSYM_NAME_LEN buffers where they
need KSYM_SYMBOL_LEN buffers, and vmallocinfo a 2*KSYM_NAME_LEN buffer
where it wants a KSYM_SYMBOL_LEN buffer: fix those before anyone copies
them.

[akpm@linux-foundation.org: ftrace.h needs module.h]
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc Miles Lane <miles.lane@gmail.com>
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
Acked-by: Steven Rostedt <srostedt@redhat.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agoinotify: fix IN_ONESHOT unmount event watcher
Dmitri Monakhov [Tue, 9 Dec 2008 21:14:26 +0000 (13:14 -0800)]
inotify: fix IN_ONESHOT unmount event watcher

On umount two event will be dispatched to watcher:

1: inotify_dev_queue_event(.., IN_UNMOUNT,..)
2: remove_watch(watch, dev)
    ->inotify_dev_queue_event(.., IN_IGNORED, ..)

But if watcher has IN_ONESHOT bit set then the watcher will be released
inside first event.  Which result in accessing invalid object later.  IMHO
it is not pure regression.  This bug wasn't triggered while initial
inotify interface testing phase because of another bug in IN_ONESHOT
handling logic :)

  commit ac74c00e499ed276a965e5b5600667d5dc04a84a
  Author: Ulisses Furquim <ulissesf@gmail.com>
  Date:   Fri Feb 8 04:18:16 2008 -0800
    inotify: fix check for one-shot watches before destroying them
    As the IN_ONESHOT bit is never set when an event is sent we must check it
    in the watch's mask and not in the event's mask.

TESTCASE:
mkdir mnt
mount -ttmpfs none mnt
mkdir mnt/d
./inotify mnt/d&
umount mnt ## << lockup or crash here

TESTSOURCE:
/* gcc -oinotify inotify.c */
#include <stdio.h>
#include <stdlib.h>
#include <sys/inotify.h>

int main(int argc, char **argv)
{
        char buf[1024];
        struct inotify_event *ie;
        char *p;
        int i;
        ssize_t l;

        p = argv[1];
        i = inotify_init();
        inotify_add_watch(i, p, ~0);

        l = read(i, buf, sizeof(buf));
        printf("read %d bytes\n", l);
        ie = (struct inotify_event *) buf;
        printf("event mask: %d\n", ie->mask);
return 0;
}

Signed-off-by: Dmitri Monakhov <dmonakhov@openvz.org>
Cc: John McCutchan <ttb@tentacle.dhs.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Robert Love <rlove@google.com>
Cc: Ulisses Furquim <ulissesf@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agoatomic: fix a typo in atomic_long_xchg()
Eric Dumazet [Tue, 9 Dec 2008 21:14:25 +0000 (13:14 -0800)]
atomic: fix a typo in atomic_long_xchg()

atomic_long_xchg() is not correctly defined for 32bit arches.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agomm: no get_user/put_user while holding mmap_sem in do_pages_stat?
Brice Goglin [Tue, 9 Dec 2008 21:14:23 +0000 (13:14 -0800)]
mm: no get_user/put_user while holding mmap_sem in do_pages_stat?

Since commit 2f007e74bb85b9fc4eab28524052161703300f1a, do_pages_stat()
gets the page address from user-space and puts the corresponding status
back while holding the mmap_sem for read.  There is no need to hold
mmap_sem there while some page-faults may occur.

This patch adds a temporary address and status buffer so as to only
hold mmap_sem while working on these kernel buffers.  This is
implemented by extracting do_pages_stat_array() out of do_pages_stat().

Signed-off-by: Brice Goglin <Brice.Goglin@inria.fr>
Cc: Christoph Lameter <clameter@sgi.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Nick Piggin <npiggin@suse.de>
Cc: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agodrivers/serial/s3c2440.c: fix typo in MODULE_LICENSE
Balaji Rao [Tue, 9 Dec 2008 21:14:22 +0000 (13:14 -0800)]
drivers/serial/s3c2440.c: fix typo in MODULE_LICENSE

Signed-off-by: Balaji Rao <balajirrao@gmail.com>
Acked-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agopagemap: fix 32-bit pagemap regression
Matt Mackall [Tue, 9 Dec 2008 21:14:21 +0000 (13:14 -0800)]
pagemap: fix 32-bit pagemap regression

The large pages fix from bcf8039ed45 broke 32-bit pagemap by pulling the
pagemap entry code out into a function with the wrong return type.
Pagemap entries are 64 bits on all systems and unsigned long is only 32
bits on 32-bit systems.

Signed-off-by: Matt Mackall <mpm@selenic.com>
Reported-by: Doug Graham <dgraham@nortel.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Dave Hansen <dave@linux.vnet.ibm.com>
Cc: <stable@kernel.org> [2.6.26.x, 2.6.27.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agopage_cgroup should ignore empty nodes
KAMEZAWA Hiroyuki [Tue, 9 Dec 2008 21:14:20 +0000 (13:14 -0800)]
page_cgroup should ignore empty nodes

Fix a total bootup freeze on ia64.

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Tested-by: Li Zefan <lizf@cn.fujitsu.com>
Reported-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agortc twl4030: rename ioctl function when RTC_INTF_DEV=n
Randy Dunlap [Tue, 9 Dec 2008 21:14:18 +0000 (13:14 -0800)]
rtc twl4030: rename ioctl function when RTC_INTF_DEV=n

Fix build error when RTC_INTF_DEV=n:

drivers/rtc/rtc-twl4030.c:402: error: 'twl4030_rtc_ioctl' undeclared here (not in a function)
make[3]: *** [drivers/rtc/rtc-twl4030.o] Error 1

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: David Brownell <dbrownell@users.sourceforge.net>
Cc: Tony Lindgren <tony@atomide.com>
Cc: Samuel Ortiz <sameo@openedhand.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agofbcon: fix workqueue shutdown
Geoff Levand [Tue, 9 Dec 2008 21:14:17 +0000 (13:14 -0800)]
fbcon: fix workqueue shutdown

Add a call to cancel_work_sync() in fbcon_exit() to cancel any pending
work in the fbcon workqueue.

The current implementation of fbcon_exit() sets the fbcon workqueue
function info->queue.func to NULL, but does not assure that there is no
work pending when it does so.  On occasion, depending on system timing,
there will still be pending work in the queue when fbcon_exit() is
called.  This results in a null pointer deference when run_workqueue()
tries to call the queue's work function.

Fixes errors on shutdown similar to these:

  Console: switching to colour dummy device 80x25
  Unable to handle kernel paging request for data at address 0x00000000

Signed-off-by: Geoff Levand <geoffrey.levand@am.sony.com>
Cc: Krzysztof Helt <krzysztof.h1@poczta.fm>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agomm: remove UP version of lru_add_drain_all()
KOSAKI Motohiro [Tue, 9 Dec 2008 21:14:16 +0000 (13:14 -0800)]
mm: remove UP version of lru_add_drain_all()

Currently, lru_add_drain_all() has two version.
  (1) use schedule_on_each_cpu()
  (2) don't use schedule_on_each_cpu()

Gerald Schaefer reported it doesn't work well on SMP (not NUMA) S390
machine.

  offline_pages() calls lru_add_drain_all() followed by drain_all_pages().
  While drain_all_pages() works on each cpu, lru_add_drain_all() only runs
  on the current cpu for architectures w/o CONFIG_NUMA. This let us run
  into the BUG_ON(!PageBuddy(page)) in __offline_isolated_pages() during
  memory hotplug stress test on s390. The page in question was still on the
  pcp list, because of a race with lru_add_drain_all() and drain_all_pages()
  on different cpus.

Actually, Almost machine has CONFIG_UNEVICTABLE_LRU=y. Then almost machine use
(1) version lru_add_drain_all although the machine is UP.

Then this ifdef is not valueable.
simple removing is better.

Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
Acked-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Cc: Dave Hansen <dave@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agorevert "percpu_counter: new function percpu_counter_sum_and_set"
Andrew Morton [Tue, 9 Dec 2008 21:14:14 +0000 (13:14 -0800)]
revert "percpu_counter: new function percpu_counter_sum_and_set"

Revert

    commit e8ced39d5e8911c662d4d69a342b9d053eaaac4e
    Author: Mingming Cao <cmm@us.ibm.com>
    Date:   Fri Jul 11 19:27:31 2008 -0400

        percpu_counter: new function percpu_counter_sum_and_set

As described in

revert "percpu counter: clean up percpu_counter_sum_and_set()"

the new percpu_counter_sum_and_set() is racy against updates to the
cpu-local accumulators on other CPUs.  Revert that change.

This means that ext4 will be slow again.  But correct.

Reported-by: Eric Dumazet <dada1@cosmosbay.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mingming Cao <cmm@us.ibm.com>
Cc: <linux-ext4@vger.kernel.org>
Cc: <stable@kernel.org> [2.6.27.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agorevert "percpu counter: clean up percpu_counter_sum_and_set()"
Andrew Morton [Tue, 9 Dec 2008 21:14:13 +0000 (13:14 -0800)]
revert "percpu counter: clean up percpu_counter_sum_and_set()"

Revert

    commit 1f7c14c62ce63805f9574664a6c6de3633d4a354
    Author: Mingming Cao <cmm@us.ibm.com>
    Date:   Thu Oct 9 12:50:59 2008 -0400

        percpu counter: clean up percpu_counter_sum_and_set()

Before this patch we had the following:

percpu_counter_sum(): return the percpu_counter's value

percpu_counter_sum_and_set(): return the percpu_counter's value, copying
that value into the central value and zeroing the per-cpu counters before
returning.

After this patch, percpu_counter_sum_and_set() has gone, and
percpu_counter_sum() gets the old percpu_counter_sum_and_set()
functionality.

Problem is, as Eric points out, the old percpu_counter_sum_and_set()
functionality was racy and wrong.  It zeroes out counters on "other" cpus,
without holding any locks which will prevent races agaist updates from
those other CPUS.

This patch reverts 1f7c14c62ce63805f9574664a6c6de3633d4a354.  This means
that percpu_counter_sum_and_set() still has the race, but
percpu_counter_sum() does not.

Note that this is not a simple revert - ext4 has since started using
percpu_counter_sum() for its dirty_blocks counter as well.

Note that this revert patch changes percpu_counter_sum() semantics.

Before the patch, a call to percpu_counter_sum() will bring the counter's
central counter mostly up-to-date, so a following percpu_counter_read()
will return a close value.

After this patch, a call to percpu_counter_sum() will leave the counter's
central accumulator unaltered, so a subsequent call to
percpu_counter_read() can now return a significantly inaccurate result.

If there is any code in the tree which was introduced after
e8ced39d5e8911c662d4d69a342b9d053eaaac4e was merged, and which depends
upon the new percpu_counter_sum() semantics, that code will break.

Reported-by: Eric Dumazet <dada1@cosmosbay.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mingming Cao <cmm@us.ibm.com>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agopercpu_counter: fix CPU unplug race in percpu_counter_destroy()
Eric Dumazet [Tue, 9 Dec 2008 21:14:11 +0000 (13:14 -0800)]
percpu_counter: fix CPU unplug race in percpu_counter_destroy()

We should first delete the counter from percpu_counters list
before freeing memory, or a percpu_counter_hotcpu_callback()
could dereference a NULL pointer.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Acked-by: David S. Miller <davem@davemloft.net>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agortc: fix missing id_table in rtc-ds1672 and rtc-max6900 drivers
Alessandro Zummo [Tue, 9 Dec 2008 21:14:11 +0000 (13:14 -0800)]
rtc: fix missing id_table in rtc-ds1672 and rtc-max6900 drivers

Add missing id_table to the drivers in subject.  Patch is against the
latest git.  It should go in with 2.6.28 if possible, the drivers won't
work without the id_table bits.

Signed-off-by: Alessandro Zummo <a.zummo@towertech.it>
Reported-by: Imre Kaloz <kaloz@openwrt.org>
Tested-by: Imre Kaloz <kaloz@openwrt.org>
Cc: David Brownell <david-b@pacbell.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agorelayfs: fix infinite loop with splice()
Tom Zanussi [Tue, 9 Dec 2008 21:14:10 +0000 (13:14 -0800)]
relayfs: fix infinite loop with splice()

Running kmemtraced, which uses splice() on relayfs, causes a hard lock on
x86-64 SMP.  As described by Tom Zanussi:

  It looks like you hit the same problem as described here:

  commit 8191ecd1d14c6914c660dfa007154860a7908857

      splice: fix infinite loop in generic_file_splice_read()

  relay uses the same loop but it never got noticed or fixed.

Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Tested-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agouml: boot broken due to buffer overrun
Balbir Singh [Tue, 9 Dec 2008 21:14:07 +0000 (13:14 -0800)]
uml: boot broken due to buffer overrun

mconsole_init() passed 256 bytes as length in os_create_unix_socket, while
the sizeof UNIX_PATH_MAX is 108. This patch fixes that problem and avoids
a big overrun bug reported on UML bootup.

sockaddr_un.sun_path is UNIX_PATH_MAX long which causes the problem.
Reported-by: Vikas K Managutte <vikki.km@gmail.com>
Reported-by: Sarvesh Kumar Lal Das <skldas@gmail.com>
Signed-off-by: Balbir Singh <balbir@linux.vnet.ibm.com>
Reviewed-by: Pekka Enberg <penberg@cs.helsinki.fi>
Reviewed-by: WANG Cong <wangcong@zeuux.org>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: <stable@kernel.org> [please check with Jeff]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agomm/backing-dev.c: remove recently-added WARN_ON()
Andrew Morton [Tue, 9 Dec 2008 21:14:06 +0000 (13:14 -0800)]
mm/backing-dev.c: remove recently-added WARN_ON()

On second thoughts, this is just going to disturb people while telling us
things which we already knew.

Cc: Peter Korsgaard <jacmet@sunsite.dk>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Cc: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agocrypto: api - Disallow cryptomgr as a module if algorithms are built-in
Herbert Xu [Wed, 10 Dec 2008 12:29:44 +0000 (23:29 +1100)]
crypto: api - Disallow cryptomgr as a module if algorithms are built-in

If we have at least one algorithm built-in then it no longer makes
sense to have the testing framework, and hence cryptomgr to be a
module.  It should be either on or off, i.e., built-in or disabled.

This just happens to stop a potential runaway modprobe loop that
seems to trigger on at least one distro.

With fixes from Evgeniy Polyakov.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
16 years agofirewire: fw-ohci: fix IOMMU resource exhaustion
Stefan Richter [Tue, 9 Dec 2008 23:20:38 +0000 (00:20 +0100)]
firewire: fw-ohci: fix IOMMU resource exhaustion

There is a DMA map/ unmap imbalance whenever a block write request
packet is sent and then dequeued with ohci_cancel_packet.  The latter
may happen frequently if the AR resp tasklet is executed before the AT
req tasklet for the same transaction.

Add the missing dma_unmap_single.  This fixes
https://bugzilla.redhat.com/show_bug.cgi?id=475156

Reported-by: Emmanuel Kowalski
Tested-by: Emmanuel Kowalski
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
16 years agotracehook: exec double-reporting fix
Roland McGrath [Wed, 10 Dec 2008 03:36:38 +0000 (19:36 -0800)]
tracehook: exec double-reporting fix

The patch 6341c39 "tracehook: exec" introduced a small regression in
2.6.27 regarding binfmt_misc exec event reporting.  Since the reporting
is now done in the common search_binary_handler() function, an exec
of a misc binary will result in two (or possibly multiple) exec events
being reported, instead of just a single one, because the misc handler
contains a recursive call to search_binary_handler.

To add to the confusion, if PTRACE_O_TRACEEXEC is not active, the multiple
SIGTRAP signals will in fact cause only a single ptrace intercept, as the
signals are not queued.  However, if PTRACE_O_TRACEEXEC is on, the debugger
will actually see multiple ptrace intercepts (PTRACE_EVENT_EXEC).

The test program included below demonstrates the problem.

This change fixes the bug by calling tracehook_report_exec() only in the
outermost search_binary_handler() call (bprm->recursion_depth == 0).

The additional change to restore bprm->recursion_depth after each binfmt
load_binary call is actually superfluous for this bug, since we test the
value saved on entry to search_binary_handler().  But it keeps the use of
of the depth count to its most obvious expected meaning.  Depending on what
binfmt handlers do in certain cases, there could have been false-positive
tests for recursion limits before this change.

    /* Test program using PTRACE_O_TRACEEXEC.
       This forks and exec's the first argument with the rest of the arguments,
       while ptrace'ing.  It expects to see one PTRACE_EVENT_EXEC stop and
       then a successful exit, with no other signals or events in between.

       Test for kernel doing two PTRACE_EVENT_EXEC stops for a binfmt_misc exec:

       $ gcc -g traceexec.c -o traceexec
       $ sudo sh -c 'echo :test:M::foobar::/bin/cat: > /proc/sys/fs/binfmt_misc/register'
       $ echo 'foobar test' > ./foobar
       $ chmod +x ./foobar
       $ ./traceexec ./foobar; echo $?
       ==> good <==
       foobar test
       0
       $
       ==> bad <==
       foobar test
       unexpected status 0x4057f != 0
       3
       $

    */

    #include <stdio.h>
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <sys/ptrace.h>
    #include <unistd.h>
    #include <signal.h>
    #include <stdlib.h>

    static void
    wait_for (pid_t child, int expect)
    {
      int status;
      pid_t p = wait (&status);
      if (p != child)
{
  perror ("wait");
  exit (2);
}
      if (status != expect)
{
  fprintf (stderr, "unexpected status %#x != %#x\n", status, expect);
  exit (3);
}
    }

    int
    main (int argc, char **argv)
    {
      pid_t child = fork ();

      if (child < 0)
{
  perror ("fork");
  return 127;
}
      else if (child == 0)
{
  ptrace (PTRACE_TRACEME);
  raise (SIGUSR1);
  execv (argv[1], &argv[1]);
  perror ("execve");
  _exit (127);
}

      wait_for (child, W_STOPCODE (SIGUSR1));

      if (ptrace (PTRACE_SETOPTIONS, child,
  0L, (void *) (long) PTRACE_O_TRACEEXEC) != 0)
{
  perror ("PTRACE_SETOPTIONS");
  return 4;
}

      if (ptrace (PTRACE_CONT, child, 0L, 0L) != 0)
{
  perror ("PTRACE_CONT");
  return 5;
}

      wait_for (child, W_STOPCODE (SIGTRAP | (PTRACE_EVENT_EXEC << 8)));

      if (ptrace (PTRACE_CONT, child, 0L, 0L) != 0)
{
  perror ("PTRACE_CONT");
  return 6;
}

      wait_for (child, W_EXITCODE (0, 0));

      return 0;
    }

Reported-by: Arnd Bergmann <arnd@arndb.de>
CC: Ulrich Weigand <ulrich.weigand@de.ibm.com>
Signed-off-by: Roland McGrath <roland@redhat.com>
16 years agoPCIe: ASPM: Break out of endless loop waiting for PCI config bits to switch
Thomas Renninger [Tue, 9 Dec 2008 12:05:09 +0000 (13:05 +0100)]
PCIe: ASPM: Break out of endless loop waiting for PCI config bits to switch

Makes a Compaq 6735s boot reliably again.  It used to hang in the loop
on some boots.  Give the link one second to train, otherwise break out
of the loop and reset the previously set clock bits.

Cc: stable@vger.kernel.org
Signed-off-by: Thomas Renninger <trenn@suse.de>
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Matthew Garrett <mjg59@srcf.ucam.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
16 years agoPCI: stop leaking 'slot_name' in pci_create_slot
Alex Chiang [Tue, 2 Dec 2008 01:17:21 +0000 (18:17 -0700)]
PCI: stop leaking 'slot_name' in pci_create_slot

In pci_create_slot(), the local variable 'slot_name' is allocated by
make_slot_name(), but never freed. We never use it after passing it to
the kobject core, so we should free it upon function exit.

Cc: stable@kernel.org
Signed-off-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
16 years agoMIPS: Better than nothing implementation of PCI mmap to fix X.
Ralf Baechle [Tue, 9 Dec 2008 17:58:46 +0000 (17:58 +0000)]
MIPS: Better than nothing implementation of PCI mmap to fix X.

Certain X11 servers such as the SIS server will only work if PCI mmap is
implemented.  This patch implements PCI mmap but to be on the same side
so close to a release it only supports uncached mappings so performance
will not be optimal for some uses such as framebuffers.

Thanks to Zhang Le <r0bertz@gentoo.org> for the original report and
testing.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
16 years ago[IA64] SN: prevent IRQ retargetting in request_irq()
John Keller [Mon, 8 Dec 2008 17:44:11 +0000 (11:44 -0600)]
[IA64] SN: prevent IRQ retargetting in request_irq()

With the introduction of the generic affinity autoselector,
irq_select_affinity(), IRQs are now being retargetted,
using a default mask, via the request_irq() path.
This results in all IRQs targetted at CPU 0.

SN Altix assigns affinity in the SN PROM, and does not
expect that to be changed as part of request_irq().

Set the IRQ_AFFINITY_SET flag to prevent
request_irq() from resetting affinity.

Signed-off-by: John Keller <jpk@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
16 years agoieee1394: node manager causes up to ~3.25s delay in freezing tasks
Nigel Cunningham [Tue, 9 Dec 2008 11:40:20 +0000 (22:40 +1100)]
ieee1394: node manager causes up to ~3.25s delay in freezing tasks

The firewire nodemanager function "nodemgr_host_thread" contains a loop
that calls try_to_freeze near the top of the loop, but then delays for
up to 3.25 seconds (plus time to do work) before getting back to the top
of the loop. When starting a cycle post-boot, this doesn't seem to bite,
but it is causing a noticeable delay at boot time, when freezing
processes prior to starting to read the image.

The following patch adds invocation of try_to_freeze to the subloops
that are used in the body of this function. With these additions, the
time to freeze when starting to resume at boot time is virtually zero.
I'm no expert on firewire, and so don't know that we shouldn't check
the return value and jump back to the top of the loop or such like after
being frozen, but I submit it for your consideration.

Signed-off-by: Nigel Cunningham <nigel@tuxonice.net>
The delay until nodemgr freezes was up to 0.25s (plus time for node
probes) in Linux 2.6.27 and older and up to 3.25s (plus ~) since Linux
2.6.28-rc1, hence much more noticeable.

try_to_freeze() without any jump is correct.  The surrounding code in
the respective loops will catch whether another bus reset happens during
the freeze and handle it.

Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
16 years agoperfcounters: consolidate global-disable codepaths
Ingo Molnar [Tue, 9 Dec 2008 11:23:59 +0000 (12:23 +0100)]
perfcounters: consolidate global-disable codepaths

Impact: cleanup

Simplify global disable handling.

Signed-off-by: Ingo Molnar <mingo@elte.hu>