Ingo Molnar [Tue, 9 Dec 2008 10:40:46 +0000 (11:40 +0100)]
perfcounters, x86: simplify disable/enable of counters
Impact: fix spurious missed counter wakeups
In the case of NMI events, close a race window that can occur if an NMI
hits counter code that temporarily disables+enables a counter, and the NMI
leaks into the disabled section.
Ingo Molnar [Mon, 8 Dec 2008 13:20:16 +0000 (14:20 +0100)]
x86, perfcounters: read out MSR_CORE_PERF_GLOBAL_STATUS with counters disabled
Impact: make perfcounter NMI and IRQ sequence more robust
Make __smp_perf_counter_interrupt() a bit more conservative: first disable
all counters, then read out the status. Most invocations are because there
are real events, so there's no performance impact.
Thomas Gleixner [Thu, 4 Dec 2008 19:12:29 +0000 (20:12 +0100)]
performance counters: core code
Implement the core kernel bits of Performance Counters subsystem.
The Linux Performance Counter subsystem provides an abstraction of
performance counter hardware capabilities. It provides per task and per
CPU counters, and it provides event capabilities on top of those.
Performance counters are accessed via special file descriptors.
There's one file descriptor per virtual counter used.
The special file descriptor is opened via the perf_counter_open()
system call:
int
perf_counter_open(u32 hw_event_type,
u32 hw_event_period,
u32 record_type,
pid_t pid,
int cpu);
The syscall returns the new fd. The fd can be used via the normal
VFS system calls: read() can be used to read the counter, fcntl()
can be used to set the blocking mode, etc.
Multiple counters can be kept open at a time, and the counters
can be poll()ed.
See more details in Documentation/perf-counters.txt.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Jonathan Corbet [Fri, 5 Dec 2008 23:12:48 +0000 (16:12 -0700)]
Fix a race condition in FASYNC handling
Changeset a238b790d5f99c7832f9b73ac8847025815b85f7 (Call fasync()
functions without the BKL) introduced a race which could leave
file->f_flags in a state inconsistent with what the underlying
driver/filesystem believes. Revert that change, and also fix the same
races in ioctl_fioasync() and ioctl_fionbio().
This is a minimal, short-term fix; the real fix will not involve the
BKL.
Linus Torvalds [Fri, 5 Dec 2008 22:49:18 +0000 (14:49 -0800)]
Enforce a minimum SG_IO timeout
There's no point in having too short SG_IO timeouts, since if the
command does end up timing out, we'll end up through the reset sequence
that is several seconds long in order to abort the command that timed
out.
As a result, shorter timeouts than a few seconds simply do not make
sense, as the recovery would be longer than the timeout itself.
Add a BLK_MIN_SG_TIMEOUT to match the existign BLK_DEFAULT_SG_TIMEOUT.
Suggested-by: Alan Cox <alan@lxorguk.ukuu.org.uk> Acked-by: Tejun Heo <tj@kernel.org> Acked-by: Jens Axboe <jens.axboe@oracle.com> Cc: Jeff Garzik <jeff@garzik.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Nick Andrew [Fri, 5 Dec 2008 16:34:56 +0000 (16:34 +0000)]
Fix incorrect use of loose in i2o_block.c
Fix incorrect use of loose in i2o_block.c
It should be 'lose', not 'loose'.
Signed-off-by: Nick Andrew <nick@nick-andrew.net> Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Nick Andrew [Fri, 5 Dec 2008 16:34:46 +0000 (16:34 +0000)]
Fix incorrect use of loose in tty/serial drivers
[Folded together as one diff from 3]
It should be 'lose', not 'loose'.
Signed-off-by: Nick Andrew <nick@nick-andrew.net> Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Fri, 5 Dec 2008 21:30:03 +0000 (13:30 -0800)]
Revert "ACPI: battery: Convert discharge energy rate to current properly"
This reverts commit 558073dd56707864f09d563b64e7c37c021e89d2, along with
the failed try to fix the regression it caused ("ACPI: Fix ACPI battery
regression introduced by commit 558073"), which just made things worse.
Commit aaad077638be1a25871bcae5e43952d6b63abfca (that failed "Fix ACPI
battery regression") got the voltage conversion confused, and fixed the
problem with Rafael's battery monitor apparently just by mistake.
So revert them both, getting us back to the 2.6.27 state in this, and
let's revisit it when people understand what's going on.
Noted-by: Paul Martin <pm@debian.org> Requested-by: Rafael J. Wysocki <rjw@sisk.pl> Cc: Len Brown <len.brown@intel.com> Cc: Alexey Starikovskiy <astarikovskiy@suse.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Fri, 5 Dec 2008 05:45:44 +0000 (21:45 -0800)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/bdev
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/bdev:
[PATCH] fix bogus argument of blkdev_put() in pktcdvd
[PATCH 2/2] documnt FMODE_ constants
[PATCH 1/2] kill FMODE_NDELAY_NOW
[PATCH] clean up blkdev_get a little bit
[PATCH] Fix block dev compat ioctl handling
[PATCH] kill obsolete temporary comment in swsusp_close()
Linus Torvalds [Fri, 5 Dec 2008 05:44:40 +0000 (21:44 -0800)]
Merge branch 'drm-gem-update' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6
* 'drm-gem-update' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6:
drm/i915: Return error in i915_gem_set_to_gtt_domain if we're not in the GTT.
drm/i915: Retry execbuffer pinning after clearing the GTT
drm/i915: Move the execbuffer domain computations together
drm/i915: Rename object_set_domain to object_set_to_gpu_domain
drm/i915: Make a single set-to-cpu-domain path and use it wherever needed.
drm/i915: Make a single set-to-gtt-domain path.
drm/i915: If interrupted while setting object domains, still emit the flush.
drm/i915: Move flushing list cleanup from flush request retire to request emit.
drm/i915: Respect GM965/GM45 bit-17-instead-of-bit-11 option for swizzling.
Rafael J. Wysocki [Fri, 5 Dec 2008 00:07:51 +0000 (01:07 +0100)]
ACPI: Fix ACPI battery regression introduced by commit 558073
Commit 558073dd56707864f09d563b64e7c37c021e89d2 ("ACPI: battery: Convert
discharge energy rate to current properly") caused the battery subsystem
to report wrong values of the remaining time on battery power and the
time until fully charged on Toshiba Portege R500 (and presumably on
other boxes too).
Fix the issue by correcting the conversion from mW to mA.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Fri, 5 Dec 2008 05:40:29 +0000 (21:40 -0800)]
Merge branch 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
time: catch xtime_nsec underflows and fix them
posix-cpu-timers: fix clock_gettime with CLOCK_PROCESS_CPUTIME_ID
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6:
sparc64: Sync FPU state in VIS emulation handler.
sparc64: Fix VIS emulation bugs
sparc: asm/bitops.h should define __fls
sparc64: Fix bug in PTRACE_SETFPREGS64 handling.
Linus Torvalds [Fri, 5 Dec 2008 05:40:08 +0000 (21:40 -0800)]
Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86: fix early panic with boot option "nosmp"
x86/oprofile: fix Intel cpu family 6 detection
oprofile: fix CPU unplug panic in ppro_stop()
AMD IOMMU: fix possible race while accessing iommu->need_sync
AMD IOMMU: set device table entry for aliased devices
AMD IOMMU: struct amd_iommu remove padding on 64 bit
x86: fix broken flushing in GART nofullflush path
x86: fix dma_mapping_error for 32bit x86
Linus Torvalds [Fri, 5 Dec 2008 05:39:41 +0000 (21:39 -0800)]
Merge branch 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
check_hung_task(): unsigned sysctl_hung_task_warnings cannot be less than 0
documentation: local_ops fix on_each_cpu
Linus Torvalds [Fri, 5 Dec 2008 05:39:21 +0000 (21:39 -0800)]
Merge branch 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus
* 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus:
MIPS: Return ENOSYS from sys32_syscall on 64bit kernels like elsewhere.
MIPS: 64-bit: vmsplice needs to use the compat wrapper for o32 and N32.
MIPS: o32: Fix number of arguments to splice(2).
MIPS: Malta: Consolidate platform device code.
MIPS: IP22, Fulong, Malta: Update defconfigs.
MIPS: Malta: Add back RTC support
MIPS: Fix potential DOS by untrusted user app.
Dave Chinner [Wed, 3 Dec 2008 22:09:34 +0000 (09:09 +1100)]
[XFS] Fix hang after disallowed rename across directory quota domains
When project quota is active and is being used for directory tree
quota control, we disallow rename outside the current directory
tree. This requires a check to be made after all the inodes
involved in the rename are locked. We fail to unlock the inodes
correctly if we disallow the rename when the target is outside the
current directory tree. This results in a hang on the next access
to the inodes involved in failed rename.
Anton Vorontsov [Thu, 4 Dec 2008 17:52:31 +0000 (20:52 +0300)]
powerpc/83xx: Enable FIXED_PHY in mpc834x_itx and mpc83xx defconfigs
This is needed so that Vitesse 7385 5-port switch could work on
MPC8349E-mITX boards.
Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com> Acked-by: Timur Tabi <timur@freescale.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Ralf Baechle [Wed, 3 Dec 2008 16:33:17 +0000 (16:33 +0000)]
MIPS: o32: Fix number of arguments to splice(2).
The syscall code was assuming splice only takes 4 arguments so no stack
arguments were being copied from the userspace stack to the kernel stack.
As the result splice was likely to fail with EINVAL.
Ralf Baechle [Mon, 1 Dec 2008 08:09:10 +0000 (08:09 +0000)]
MIPS: IP22, Fulong, Malta: Update defconfigs.
These haven't seen much attention for too long but particularly important
enable RTC_CLASS and CONFIG_RTC_HCTOSYS so the wall clock time is set on
kernel startup.
Tiejun Chen [Tue, 25 Nov 2008 08:33:20 +0000 (16:33 +0800)]
MIPS: Malta: Add back RTC support
With the conversion of MIPS to RTC_LIB the old RTC driver CONFIG_RTC became
unselectable. Fix by setting up a platform device. Also enable
RTC_CLASS so system time gets set from RTC on kernel initialization.
[Ralf: Original patch by Tiejun; polished nice and shiny by me]
Vlad Malov [Tue, 18 Nov 2008 23:05:46 +0000 (15:05 -0800)]
MIPS: Fix potential DOS by untrusted user app.
On a 64 bit kernel if an o32 syscall was made with a syscall number less
than 4000, we would read the function from outside of the bounds of the
syscall table. This led to non-deterministic behavior including system
crashes.
While we were at it we reworked the 32 bit version as well to use fewer
instructions. Both 32 and 64 bit versions are use the same code now.
Christoph Hellwig [Wed, 5 Nov 2008 13:58:42 +0000 (14:58 +0100)]
[PATCH 1/2] kill FMODE_NDELAY_NOW
Update FMODE_NDELAY before each ioctl call so that we can kill the
magic FMODE_NDELAY_NOW. It would be even better to do this directly
in setfl(), but for that we'd need to have FMODE_NDELAY for all files,
not just block special files.
Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Andreas Schwab [Fri, 31 Oct 2008 21:39:46 +0000 (22:39 +0100)]
[PATCH] Fix block dev compat ioctl handling
Commit 33c2dca4957bd0da3e1af7b96d0758d97e708ef6 (trim file propagation
in block/compat_ioctl.c) removed the handling of some ioctls from
compat_blkdev_driver_ioctl. That caused them to be rejected as unknown
by the compat layer.
Signed-off-by: Andreas Schwab <schwab@suse.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Sun, 30 Nov 2008 06:47:12 +0000 (01:47 -0500)]
[PATCH] kill obsolete temporary comment in swsusp_close()
it had been put there to mark the call of blkdev_put() that
needed proper argument propagated to it; later patch in the
same series had done just that.
john stultz [Tue, 2 Dec 2008 02:34:41 +0000 (18:34 -0800)]
time: catch xtime_nsec underflows and fix them
Impact: fix time warp bug
Alex Shi, along with Yanmin Zhang have been noticing occasional time
inconsistencies recently. Through their great diagnosis, they found that
the xtime_nsec value used in update_wall_time was occasionally going
negative. After looking through the code for awhile, I realized we have
the possibility for an underflow when three conditions are met in
update_wall_time():
1) We have accumulated a second's worth of nanoseconds, so we
incremented xtime.tv_sec and appropriately decrement xtime_nsec.
(This doesn't cause xtime_nsec to go negative, but it can cause it
to be small).
2) The remaining offset value is large, but just slightly less then
cycle_interval.
3) clocksource_adjust() is speeding up the clock, causing a
corrective amount (compensating for the increase in the multiplier
being multiplied against the unaccumulated offset value) to be
subtracted from xtime_nsec.
This can cause xtime_nsec to underflow.
Unfortunately, since we notify the NTP subsystem via second_overflow()
whenever we accumulate a full second, and this effects the error
accumulation that has already occured, we cannot simply revert the
accumulated second from xtime nor move the second accumulation to after
the clocksource_adjust call without a change in behavior.
This leaves us with (at least) two options:
1) Simply return from clocksource_adjust() without making a change if we
notice the adjustment would cause xtime_nsec to go negative.
This would work, but I'm concerned that if a large adjustment was needed
(due to the error being large), it may be possible to get stuck with an
ever increasing error that becomes too large to correct (since it may
always force xtime_nsec negative). This may just be paranoia on my part.
2) Catch xtime_nsec if it is negative, then add back the amount its
negative to both xtime_nsec and the error.
This second method is consistent with how we've handled earlier rounding
issues, and also has the benefit that the error being added is always in
the oposite direction also always equal or smaller then the correction
being applied. So the risk of a corner case where things get out of
control is lessened.
This patch fixes bug 11970, as tested by Yanmin Zhang
http://bugzilla.kernel.org/show_bug.cgi?id=11970
Joseph Myers [Thu, 4 Dec 2008 03:36:05 +0000 (19:36 -0800)]
sparc64: Fix VIS emulation bugs
This patch fixes some bugs in VIS emulation that cause the GCC test
failure
FAIL: gcc.target/sparc/pdist-3.c execution test
for both 32-bit and 64-bit testing on hardware lacking these
instructions. The emulation code for the pdist instruction uses
RS1(insn) for both source registers rs1 and rs2, which is obviously
wrong and leads to the instruction doing nothing (the observed
problem), and further inspection of the code shows that RS1 uses a
shift of 24 and RD a shift of 25, which clearly cannot both be right;
examining SPARC documentation indicates the correct shift for RS1 is
14.
This patch fixes the bug if single-stepping over the affected
instruction in the debugger, but not if the testcase is run
standalone. For that, Wind River has another patch I hope they will
send as a followup to this patch submission.
Signed-off-by: Joseph Myers <joseph@codesourcery.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Anholt [Wed, 26 Nov 2008 21:58:13 +0000 (13:58 -0800)]
drm/i915: Return error in i915_gem_set_to_gtt_domain if we're not in the GTT.
It's only for flushing caches appropriately for GTT access, not for actually
getting it there. Prevents potential smashing of cpu read/write domains on
unbound objects.
Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Dave Airlie <airlied@redhat.com>
Keith Packard [Fri, 21 Nov 2008 07:23:03 +0000 (23:23 -0800)]
drm/i915: Move the execbuffer domain computations together
This eliminates the dev_set_domain function and just in-lines it
where its used, with the goal of moving the manipulation and use of
invalidate_domains and flush_domains closer together. This also
avoids calling add_request unless some domain has been flushed.
Signed-off-by: Keith Packard <keithp@keithp.com> Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Dave Airlie <airlied@redhat.com>
Keith Packard [Fri, 21 Nov 2008 07:11:08 +0000 (23:11 -0800)]
drm/i915: Rename object_set_domain to object_set_to_gpu_domain
Now that the CPU and GTT domain operations are isolated to their own
functions, the previously general-purpose set_domain function is now used
only to set GPU domains. It also has no failure cases, which is important as
this eliminates any possible interruption of the computation of new object
domains and subsequent emmission of the flushing instructions into the ring.
Signed-off-by: Keith Packard <keithp@keithp.com> Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Dave Airlie <airlied@redhat.com>
Eric Anholt [Fri, 14 Nov 2008 21:35:19 +0000 (13:35 -0800)]
drm/i915: Make a single set-to-cpu-domain path and use it wherever needed.
This fixes several domain management bugs, including potential lack of cache
invalidation for pread, potential failure to wait for set_domain(CPU, 0),
and more, along with producing more intelligible code.
Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Dave Airlie <airlied@redhat.com>
Eric Anholt [Mon, 10 Nov 2008 18:53:25 +0000 (10:53 -0800)]
drm/i915: Make a single set-to-gtt-domain path.
This fixes failure to flush caches in the relocation update path, and
failure to wait in the set_domain ioctl, each of which could lead to incorrect
rendering.
Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Dave Airlie <airlied@redhat.com>
Eric Anholt [Fri, 7 Nov 2008 00:00:31 +0000 (16:00 -0800)]
drm/i915: Move flushing list cleanup from flush request retire to request emit.
obj_priv->write_domain is "write domain if the GPU went idle now", not
"write domain at this moment." By postponing the clear, we confused the
concept, required more storage, and potentially emitted more flushes than
are required.
Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Dave Airlie <airlied@redhat.com>
Eric Anholt [Tue, 25 Nov 2008 22:02:05 +0000 (14:02 -0800)]
drm/i915: Respect GM965/GM45 bit-17-instead-of-bit-11 option for swizzling.
This fixes readpixels and buffer corruption when swapped out and in by
disabling tiling on them.
Now that we know that the bit 17 mode isn't just a mistake of older chipsets,
we'll need to work on a clever fix so that we can get the performance of
tiling on these chipsets, but that will require intrusive changes targeted
at the next kernel release, not this one.
Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Dave Airlie <airlied@redhat.com>
David Howells [Wed, 3 Dec 2008 16:33:14 +0000 (16:33 +0000)]
MN10300: Introduce barriers to replace removed volatiles in gdbstub 16550 driver
Introduce into the MN10300 gdbstub 16550 driver a couple of barrier() calls to
replace the removed volatility of the input/output index variables for the Rx
ring buffer. A previous patch added them into the on-chip serial port driver.
Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
James Morris [Wed, 3 Dec 2008 16:19:45 +0000 (03:19 +1100)]
MAINTAINERS: Add security subsystem maintainer
Add myself as overall maintainer of the security subsystem (generally,
components under the top-level security directory). This addresses
the lack of an official maintainer for the increasing number of
security projects being incorporated into the kernel.
Signed-off-by: James Morris <jmorris@namei.org> Acked-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Thu, 4 Dec 2008 00:45:56 +0000 (16:45 -0800)]
Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block
* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
block: fix setting of max_segment_size and seg_boundary mask
block: internal dequeue shouldn't start timer
block: set disk->node_id before it's being used
When block layer fails to map iov, it calls bio_unmap_user to undo
Linus Torvalds [Thu, 4 Dec 2008 00:41:15 +0000 (16:41 -0800)]
Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc:
powerpc/83xx: Fix MCU support merge issue in mpc8349emitx.dts
powerpc: Fix dma_map_sg() cache flushing on non coherent platforms
Linus Torvalds [Thu, 4 Dec 2008 00:40:37 +0000 (16:40 -0800)]
Merge branch 'for-2.6.28' of git://linux-nfs.org/~bfields/linux
* 'for-2.6.28' of git://linux-nfs.org/~bfields/linux:
NLM: client-side nlm_lookup_host() should avoid matching on srcaddr
nfsd: use of unitialized list head on error exit in nfs4recover.c
Add a reference to sunrpc in svc_addsock
nfsd: clean up grace period on early exit
Linus Torvalds [Thu, 4 Dec 2008 00:20:19 +0000 (16:20 -0800)]
iTCO_wdt: fix typo when setting TCO_EN bit
The code used '&= 0x00002000' when it tried to set the TCO_EN bit, which
obviously didn't set that bit at all, but instead just reset all the
other bits in the SMI_EN register.
This bug seemingly caused various random behavior, with Frans Pop
reporting that X.org just silently hung at startup and Rafael Wysocki
reports the fan spinning with full speed.
See
http://lkml.org/lkml/2008/12/3/178
http://bugzilla.kernel.org/show_bug.cgi?id=12162
The problem seems to have been triggered by "[WATCHDOG] iTCO_wdt :
problem with rebooting on new ICH9 based motherboards" (commit 7cd5b08be3c489df11b559fef210b81133764ad4), but the bogus code existed
before that too (in the "supermicro_old_pre_stop()" function), it just
apparently never showed up due to different logic.
In that commit the broken code got moved around and now gets executed
much more.
Reported-by: Rafael J. Wysocki <rjw@sisk.pl> Tested-by: Frans Pop <elendil@planet.nl> Cc: Wim Van Sebroeck <wim@iguana.be> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
William Cohen [Sun, 30 Nov 2008 20:39:10 +0000 (15:39 -0500)]
x86/oprofile: fix Intel cpu family 6 detection
Alan Jenkins wrote:
> This is on an EeePC 701, /proc/cpuinfo as attached.
>
> Is this expected? Will the next release work?
>
> Thanks, Alan
>
> # opcontrol --setup --no-vmlinux
> cpu_type 'unset' is not valid
> you should upgrade oprofile or force the use of timer mode
>
> # opcontrol -v
> opcontrol: oprofile 0.9.4 compiled on Nov 29 2008 22:44:10
>
> # cat /dev/oprofile/cpu_type
> i386/p6
> # uname -r
> 2.6.28-rc6eeepc
Hi Alan,
Looking at the kernel driver code for oprofile it can return the "i386/p6" for
the cpu_type. However, looking at the user-space oprofile code there isn't the
matching entry in libop/op_cpu_type.c or the events/unit_mask files in
events/i386 directory.
The Intel AP-485 says this is a "Intel Pentium M processor model D". Seems like
the oprofile kernel driver should be identifying the processor as "i386/p6_mobile"
The driver identification code doesn't look quite right in nmi_init.c
Anton Vorontsov [Thu, 27 Nov 2008 17:36:45 +0000 (20:36 +0300)]
powerpc/83xx: Fix MCU support merge issue in mpc8349emitx.dts
Just found the merge issue in 442746989d92afc125040e0f29b33602ad94da99
("powerpc/83xx: Add support for MCU microcontroller in .dts files"):
the commit adds the MCU controller node into the DMA node, which is
wrong because the MCU sits on the I2C bus. Fix this by moving the MCU
node into the I2C controller node.
The original patch[1] was OK though. ;-)
Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Milan Broz [Wed, 3 Dec 2008 11:55:08 +0000 (12:55 +0100)]
block: fix setting of max_segment_size and seg_boundary mask
Fix setting of max_segment_size and seg_boundary mask for stacked md/dm
devices.
When stacking devices (LVM over MD over SCSI) some of the request queue
parameters are not set up correctly in some cases by default, namely
max_segment_size and and seg_boundary mask.
If you create MD device over SCSI, these attributes are zeroed.
Problem become when there is over this mapping next device-mapper mapping
- queue attributes are set in DM this way:
Unfortunately bio_add_page (resp. bio_phys_segments) calculates number of
physical segments according to these parameters.
During the generic_make_request() is segment cout recalculated and can
increase bio->bi_phys_segments count over the allowed limit. (After
bio_clone() in stack operation.)
Thi is specially problem in CCISS driver, where it produce OOPS here
BUG_ON(creq->nr_phys_segments > MAXSGENTRIES);
(MAXSEGENTRIES is 31 by default.)
Sometimes even this command is enough to cause oops:
This command generates bios with 250 sectors, allocated in 32 4k-pages
(last page uses only 1024 bytes).
For LVM layer, it allocates bio with 31 segments (still OK for CCISS),
unfortunatelly on lower layer it is recalculated to 32 segments and this
violates CCISS restriction and triggers BUG_ON().
The patch tries to fix it by:
* initializing attributes above in queue request constructor
blk_queue_make_request()
* make sure that blk_queue_stack_limits() inherits setting
(DM uses its own function to set the limits because it
blk_queue_stack_limits() was introduced later. It should probably switch
to use generic stack limit function too.)
* sets the default seg_boundary value in one place (blkdev.h)
* use this mask as default in DM (instead of -1, which differs in 64bit)
Bugs related to this:
https://bugzilla.redhat.com/show_bug.cgi?id=471639
http://bugzilla.kernel.org/show_bug.cgi?id=8672
Signed-off-by: Milan Broz <mbroz@redhat.com> Reviewed-by: Alasdair G Kergon <agk@redhat.com> Cc: Neil Brown <neilb@suse.de> Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: Tejun Heo <htejun@gmail.com> Cc: Mike Miller <mike.miller@hp.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Tejun Heo [Wed, 3 Dec 2008 11:41:26 +0000 (12:41 +0100)]
block: internal dequeue shouldn't start timer
blkdev_dequeue_request() and elv_dequeue_request() are equivalent and
both start the timeout timer. Barrier code dequeues the original
barrier request but doesn't passes the request itself to lower level
driver, only broken down proxy requests; however, as the original
barrier code goes through the same dequeue path and timeout timer is
started on it. If barrier sequence takes long enough, this timer
expires but the low level driver has no idea about this request and
oops follows.
Timeout timer shouldn't have been started on the original barrier
request as it never goes through actual IO. This patch unexports
elv_dequeue_request(), which has no external user anyway, and makes it
operate on elevator proper w/o adding the timer and make
blkdev_dequeue_request() call elv_dequeue_request() and add timer.
Internal users which don't pass the request to driver - barrier code
and end_that_request_last() - are converted to use
elv_dequeue_request().
Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Mike Anderson <andmike@linux.vnet.ibm.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Petr Vandrovec [Wed, 19 Nov 2008 10:12:14 +0000 (11:12 +0100)]
When block layer fails to map iov, it calls bio_unmap_user to undo
mapping. Which is good if pages were mapped - but if they were provided
by someone else and just copied then bad things happen - pages are
released once here, and once by caller, leading to user triggerable BUG
at include/linux/mm.h:246.
Signed-off-by: Petr Vandrovec <petr@vandrovec.name> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Joerg Roedel [Wed, 3 Dec 2008 11:19:27 +0000 (12:19 +0100)]
AMD IOMMU: fix possible race while accessing iommu->need_sync
The access to the iommu->need_sync member needs to be protected by the
iommu->lock. Otherwise this is a possible race condition. Fix it with
this patch.
Joerg Roedel [Tue, 25 Nov 2008 11:56:12 +0000 (12:56 +0100)]
AMD IOMMU: set device table entry for aliased devices
In some rare cases a request can arrive an IOMMU with its originial
requestor id even it is aliased. Handle this by setting the device table
entry to the same protection domain for the original and the aliased
requestor id.
Joerg Roedel [Tue, 2 Dec 2008 19:16:03 +0000 (20:16 +0100)]
x86: fix broken flushing in GART nofullflush path
Impact: remove stale IOTLB entries
In the non-default nofullflush case the GART is only flushed when
next_bit wraps around. But it can happen that an unmap operation unmaps
memory which is behind the current next_bit location. If these addresses
are reused it may result in stale GART IO/TLB entries. Fix this by
setting the GART next_bit always behind an unmapped location.
Chris Torek [Wed, 3 Dec 2008 08:47:28 +0000 (00:47 -0800)]
sparc64: Fix bug in PTRACE_SETFPREGS64 handling.
From: Chris Torek <chris.torek@windriver.com>
>The SPARC64 kernel code for PTRACE_SETFPREGS64 appears to be an exact copy
>of that for PTRACE_GETFPREGS64. This means that gdbserver and native
>64-bit GDB cannot set floating-point registers.
It looks like a simple typo.
Signed-off-by: David S. Miller <davem@davemloft.net>
Benjamin Herrenschmidt [Sun, 30 Nov 2008 18:53:40 +0000 (18:53 +0000)]
powerpc: Fix dma_map_sg() cache flushing on non coherent platforms
On PowerPC 4xx or other non cache-coherent platforms, we lost the
appropriate cache flushing in dma_map_sg() when merging the 32 and
64-bit DMA code (commit 4fc665b88a79a45bae8bbf3a05563c27c7337c3d,
"powerpc: Merge 32 and 64-bit dma code"). This restores it.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: Becky Bruce <beckyb@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/wim/linux-2.6-watchdog:
[WATCHDOG] hpwdt: Fix kdump when using hpwdt
[WATCHDOG] hpwdt: set the mapped BIOS address space as executable
[WATCHDOG] iTCO_wdt: add PCI ID's for ICH9 & ICH10 chipsets
[WATCHDOG] iTCO_wdt : correct status clearing
[WATCHDOG] iTCO_wdt : problem with rebooting on new ICH9 based motherboards
[WATCHDOG] fix mtx1_wdt compilation failure
Linus Torvalds [Tue, 2 Dec 2008 23:56:55 +0000 (15:56 -0800)]
Merge branch 'linux-next' of git://git.infradead.org/ubifs-2.6
* 'linux-next' of git://git.infradead.org/ubifs-2.6:
UBIFS: pre-allocate bulk-read buffer
UBIFS: do not allocate too much
UBIFS: do not print scary memory allocation warnings
UBIFS: allow for gaps when dirtying the LPT
UBIFS: fix compilation warnings
MAINTAINERS: change UBI/UBIFS git tree URLs
UBIFS: endian handling fixes and annotations
UBIFS: remove printk
Linus Torvalds [Tue, 2 Dec 2008 23:56:17 +0000 (15:56 -0800)]
Merge branch 'kvm-updates/2.6.28' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm
* 'kvm-updates/2.6.28' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm:
KVM: MMU: avoid creation of unreachable pages in the shadow
KVM: ppc: stop leaking host memory on VM exit
KVM: MMU: fix sync of ptes addressed at owner pagetable
KVM: ia64: Fix: Use correct calling convention for PAL_VPS_RESUME_HANDLER
KVM: ia64: Fix incorrect kbuild CFLAGS override
KVM: VMX: Fix interrupt loss during race with NMI
KVM: s390: Fix problem state handling in guest sigp handler
Linus Torvalds [Tue, 2 Dec 2008 23:53:41 +0000 (15:53 -0800)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k:
m68k: Update defconfigs for 2.6.28-rc7
macfb: Do not overflow fb_fix_screeninfo.id
* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6:
[SCSI] stex: switch to block timeout
[SCSI] make scsi_eh_try_stu use block timeout
[SCSI] megaraid_sas: switch to block timeout
[SCSI] ibmvscsi: switch to block timeout
[SCSI] aacraid: switch to block timeout
[SCSI] zfcp: prevent double decrement on host_busy while being busy
[SCSI] zfcp: fix deadlock between wq triggered port scan and ERP
[SCSI] zfcp: eliminate race between validation and locking
[SCSI] zfcp: verify for correct rport state before scanning for SCSI devs
[SCSI] zfcp: returning an ERR_PTR where a NULL value is expected
[SCSI] zfcp: Fix opening of wka ports
[SCSI] zfcp: fix remote port status check
[SCSI] fc_transport: fix old bug on bitflag definitions
[SCSI] Fix hang in starved list processing
Mark Salter [Tue, 2 Dec 2008 14:38:09 +0000 (14:38 +0000)]
MN10300: Fix application of kernel module relocations
This fixes the MN10300 kernel module linking to match the toolchain. RELA
relocs don't use the value at the location being relocated. This has been
working because the tools always leave the value at the target location
cleared.
Signed-off-by: Mark Salter <msalter@redhat.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Dean Nelson [Tue, 2 Dec 2008 14:06:01 +0000 (08:06 -0600)]
sgi-gru: call fs_initcall() if statically linked
If xpc.ko and gru.ko are both statically linked into the kernel, then
xpc_init() can get called before gru_init() and make a call to one of the
gru's exported functions before the gru has initialized itself. The end
result is a NULL dereference.
Signed-off-by: Dean Nelson <dcn@sgi.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Kumar Gala [Tue, 2 Dec 2008 19:37:01 +0000 (13:37 -0600)]
powerpc: Use physical cpu id when setting the processor affinity
In the CONFIG_SMP case the irq_choose_cpu() code was returning back
a logical cpu id not the physical id. We were writing that directly
into the HW register.
We need to be calling get_hard_smp_processor_id() so irq_choose_cpu()
always returns a physical cpu id.
Signed-off-by: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Rik van Riel [Tue, 2 Dec 2008 18:31:52 +0000 (10:31 -0800)]
vmscan: evict streaming IO first
Count the insertion of new pages in the statistics used to drive the
pageout scanning code. This should help the kernel quickly evict
streaming file IO.
We count on the fact that new file pages start on the inactive file LRU
and new anonymous pages start on the active anon list. This means
streaming file IO will increment the recent scanned file statistic, while
leaving the recent rotated file statistic alone, driving pageout scanning
to the file LRUs.
Kay Sievers [Tue, 2 Dec 2008 18:31:50 +0000 (10:31 -0800)]
bdi: register sysfs bdi device only once per queue
Devices which share the same queue, like floppies and mtd devices, get
registered multiple times in the bdi interface, but bdi accounts only the
last registered device of the devices sharing one queue.
On remove, all earlier registered devices leak, stay around in sysfs, and
cause "duplicate filename" errors if the devices are re-created.
This prevents the creation of multiple bdi interfaces per queue, and the
bdi device will carry the dev_t name of the block device which is the
first one registered, of the pool of devices using the same queue.
[akpm@linux-foundation.org: add a WARN_ON so we know which drivers are misbehaving] Tested-by: Peter Korsgaard <jacmet@sunsite.dk> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Cc: David Woodhouse <dwmw2@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Junjiro R. Okajima [Tue, 2 Dec 2008 18:31:46 +0000 (10:31 -0800)]
nfsd: fix vm overcommit crash fix #2
The previous patch from Alan Cox ("nfsd: fix vm overcommit crash",
commit 731572d39fcd3498702eda4600db4c43d51e0b26) fixed the problem where
knfsd crashes on exported shmemfs objects and strict overcommit is set.
But the patch forgot supporting the case when CONFIG_SECURITY is
disabled.
This patch copies a part of his fix which is mainly for detecting a bug
earlier.
Acked-by: James Morris <jmorris@namei.org> Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Junjiro R. Okajima <hooanon05@yahoo.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Bartlomiej Zolnierkiewicz [Tue, 2 Dec 2008 19:40:03 +0000 (20:40 +0100)]
amd74xx: workaround unreliable AltStatus register for nVidia controllers
It seems that on some nVidia controllers using AltStatus register
can be unreliable so default to Status register if the PCI device
is in Compatibility Mode. In order to achieve this:
* Add ide_pci_is_in_compatibility_mode() inline helper to <linux/ide.h>.
* Add IDE_HFLAG_BROKEN_ALTSTATUS host flag and set it in amd74xx host
driver for nVidia controllers in Compatibility Mode.
* Teach actual_try_to_identify() and drive_is_ready() about the new flag.
This fixes the regression caused by removal of CONFIG_IDEPCI_SHARE_IRQ
config option in 2.6.25 and using AltStatus register unconditionally when
available (kernel.org bugs #11659 and #10216).
[ Moreover for CONFIG_IDEPCI_SHARE_IRQ=y (which is what most people
and distributions use) it never worked correctly. ]
Thanks to Remy LABENE and Lars Winterfeld for help with debugging the problem.
More info at:
http://bugzilla.kernel.org/show_bug.cgi?id=11659
http://bugzilla.kernel.org/show_bug.cgi?id=10216