www.infradead.org Git - users/jedix/linux-maple.git/log

SPEC: v2.6.39-100.0.5 xen/config/tmem support
Signed-off-by: Guru Anbalagane <guru.anbalagane@oracle.com>

config: Enable PARAVIRT and Xen options
Signed-off-by: Guru Anbalagane <guru.anbalagane@oracle.com>

Revert "ipc semaphores: reduce ipc_lock contention in semtimedop"

This reverts commit c7fa322dd72b08450a440ef800124705a1fa148c.

Revert "ipc semaphores: order wakeups based on waiter CPU"

This reverts commit 8102e1ff9d667661b581209323faaf7a84f0f528.

Revert "use rwlocks for ipc"

This reverts commit 78fe45325c8e2e3f4b6ebb1ee15b6c2e8af5ddb1.

Revert "IPC lock reduction corners"

This reverts commit 8385de45ab8e4b40eaf8341f599bf0c19b08bb64.

Revert "IPC reduce lock contention in semctl"

This reverts commit a8fc9c3f989c474f44e6d4b4f126961207261a1e.

Merge branch 'in-3.1/bug.fixes' of git://oss.oracle.com/git/kwilk/xen into uek2-stable

config: from 6.1 and review
Signed-off-by: Guru Anbalagane <guru.anbalagane@oracle.com>

xen/e820: if there is no dom0_mem=, don't tweak extra_pages.

The patch "xen: use maximum reservation to limit amount of usable RAM"
(d312ae878b6aed3912e1acaaf5d0b2a9d08a4f11) breaks machines that
do not use 'dom0_mem=' argument with:

reserve RAM buffer: 000000133f2e2000 - 000000133fffffff
(XEN) mm.c:4976:d0 Global bit is set to kernel page fffff8117e
(XEN) domain_crash_sync called from entry.S
(XEN) Domain 0 (vcpu#0) crashed on cpu#0:
...

The reason being that the last E820 entry is created using the
'extra_pages' (which is based on how many pages have been freed).
The mentioned git commit sets the initial value of 'extra_pages'
using a hypercall which returns the number of pages (if dom0_mem
has been used) or -1 otherwise. If the later we return with
MAX_DOMAIN_PAGES as basis for calculation:

    return min(max_pages, MAX_DOMAIN_PAGES);

and use it:

     extra_limit = xen_get_max_pages();
     if (extra_limit >= max_pfn)
             extra_pages = extra_limit - max_pfn;
     else
             extra_pages = 0;

which means we end up with extra_pages = 128GB in PFNs (33554432)
- 8GB in PFNs (2097152, on this specific box, can be larger or smaller),
and then we add that value to the E820 making it:

  Xen: 00000000ff000000 - 0000000100000000 (reserved)
  Xen: 0000000100000000 - 000000133f2e2000 (usable)

which is clearly wrong. It should look as so:

  Xen: 00000000ff000000 - 0000000100000000 (reserved)
  Xen: 0000000100000000 - 000000027fbda000 (usable)

Naturally this problem does not present itself if dom0_mem=max:X
is used.

CC: stable@kernel.org
CC: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

xen: disable PV spinlocks on HVM

PV spinlocks cannot possibly work with the current code because they are
enabled after pvops patching has already been done, and because PV
spinlocks use a different data structure than native spinlocks so we
cannot switch between them dynamically. A spinlock that has been taken
once by the native code (__ticket_spin_lock) cannot be taken by
__xen_spin_lock even after it has been released.

Reported-and-Tested-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

xen/smp: Warn user why they keel over - nosmp or noapic and what to use instead.

We have hit a couple of customer bugs where they would like to
use those parameters to run an UP kernel - but both of those
options turn of important sources of interrupt information so
we end up not being able to boot. The correct way is to
pass in 'dom0_max_vcpus=1' on the Xen hypervisor line and
the kernel will patch itself to be a UP kernel.

Fixes bug: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=637308

CC: stable@kernel.org
Acked-by: Ian Campbell <Ian.Campbell@eu.citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

xen: x86_32: do not enable iterrupts when returning from exception in interrupt context

If vmalloc page_fault happens inside of interrupt handler with interrupts
disabled then on exit path from exception handler when there is no pending
interrupts, the following code (arch/x86/xen/xen-asm_32.S:112):

cmpw $0x0001, XEN_vcpu_info_pending(%eax)
sete XEN_vcpu_info_mask(%eax)

will enable interrupts even if they has been previously disabled according to
eflags from the bounce frame (arch/x86/xen/xen-asm_32.S:99)

testb $X86_EFLAGS_IF>>8, 8+1+ESP_OFFSET(%esp)
setz XEN_vcpu_info_mask(%eax)

Solution is in setting XEN_vcpu_info_mask only when it should be set
according to
cmpw $0x0001, XEN_vcpu_info_pending(%eax)
but not clearing it if there isn't any pending events.

Reproducer for bug is attached to RHBZ 707552

CC: stable@kernel.org
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

xen: use maximum reservation to limit amount of usable RAM

Use the domain's maximum reservation to limit the amount of extra RAM
for the memory balloon. This reduces the size of the pages tables and
the amount of reserved low memory (which defaults to about 1/32 of the
total RAM).

On a system with 8 GiB of RAM with the domain limited to 1 GiB the
kernel reports:

Before:

Memory: 627792k/4472000k available

After:

Memory: 549740k/11132224k available

A increase of about 76 MiB (~1.5% of the unused 7 GiB). The reserved
low memory is also reduced from 253 MiB to 32 MiB. The total
additional usable RAM is 329 MiB.

For dom0, this requires at patch to Xen ('x86: use 'dom0_mem' to limit
the number of pages for dom0') (c/s 23790)

CC: stable@kernel.org
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

SCSI: Fix oops dereferencing queue

Commit 86cbfb5607d4b81b1a993ff689bbd2addd5d3a9b introduced a regression
where requests could be queued after a device had disappeared.
Subsequent commits have attempted to fix some but not all of these
issues.

Since there appears to be ongoing discussion about the proper way to fix
this we'll partially revert the upstream commit.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: guru.anbalagane <guru.anbalagane@oracle.com>

Merge branch 'in-3.1/bug.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen into uek2-stable

Merge branch 'stable/tracing' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen into uek2-stable

Merge branch 'stable/xen-pciback-0.6.3.bugfixes' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen into uek2-stable

Merge branches 'stable/drivers.other' and 'stable/drivers.bugfixes' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen into uek2-stable

Merge branch 'stable/pci.cleanups.v1' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen into uek2-stable

Merge branch 'stable/drivers' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen into uek2-stable

Merge branch 'for-guru' of git://git.kernel.org/pub/scm/linux/kernel/git/djm/tmem into uek2-stable

xen-blkback: fixed indentation and comments

This patch fixes belows:

1. Fix code style issue.
2. Fix incorrect functions name in comments.

Signed-off-by: Joe Jin <joe.jin@oracle.com>
Cc: Jens Axboe <jaxboe@fusionio.com>
Cc: Ian Campbell <Ian.Campbell@eu.citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

xen/blkback: Make description more obvious.

With the frontend having Xen but the backend not, it just looks odd:

<*> Xen virtual block device support
<*> Block-device backend driver

Fix it to have the 'Xen' in front of it.

Reported-by: Sander Eikelenboom <linux@eikelenboom.it>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

xen-blkfront: Drop name and minor adjustments for emulated scsi devices

These were intended to avoid the namespace clash when representing
emulated IDE and SCSI devices. However that seems to confuse users
more than expected (a disk defined as sda becomes xvde).
So for now go back to the scheme which does no adjustments. This
will break when mixing IDE and SCSI names in the configuration of
guests but should be by now expected.

Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

xen-blkfront: Fix one off warning about name clash

Avoid telling users to use xvde and onwards when using xvde.

Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

xen: Do not enable PV IPIs when vector callback not present

Fix regression for HVM case on older (<4.1.1) hypervisors caused by

  commit 99bbb3a84a99cd04ab16b998b20f01a72cfa9f4f
  Author: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
  Date:   Thu Dec 2 17:55:10 2010 +0000

    xen: PV on HVM: support PV spinlocks and IPIs

This change replaced the SMP operations with event based handlers without
taking into account that this only works when the hypervisor supports
callback vectors. This causes unexplainable hangs early on boot for
HVM guests with more than one CPU.

BugLink: http://bugs.launchpad.net/bugs/791850
CC: stable@kernel.org
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Tested-and-Reported-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

xen/x86: replace order-based range checking of M2P table by linear one

The order-based approach is not only less efficient (requiring a shift
and a compare, typical generated code looking like this

mov eax, [machine_to_phys_order]
mov ecx, eax
shr ebx, cl
test ebx, ebx
jnz ...

whereas a direct check requires just a compare, like in

cmp ebx, [machine_to_phys_nr]
jae ...

), but also slightly dangerous in the 32-on-64 case - the element
address calculation can wrap if the next power of two boundary is
sufficiently far away from the actual upper limit of the table, and
hence can result in user space addresses being accessed (with it being
unknown what may actually be mapped there).

Additionally, the elimination of the mistaken use of fls() here (should
have been __fls()) fixes a latent issue on x86-64 that would trigger
if the code was run on a system with memory extending beyond the 44-bit
boundary.

CC: stable@kernel.org
Signed-off-by: Jan Beulich <jbeulich@novell.com>
[v1: Based on Jeremy's feedback]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

xen: Fix misleading WARN message at xen_release_chunk

WARN message should not complain
"Failed to release memory %lx-%lx err=%d\n"
^^^^^^^
about range when it fails to release just one page,
instead it should say what pfn is not freed.

In addition line:
printk(KERN_INFO "xen_release_chunk: looking at area pfn %lx-%lx: "
...
printk(KERN_CONT "%lu pages freed\n", len);
will be broken if WARN in between this line is fired. So fix it
by using a single printk for this.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

xen: Fix printk() format in xen/setup.c

Use correct format specifier for unsigned long.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

xen/grant: Fix compile warning.

drivers/xen/grant-table.c:85: warning: ‘rc’ may be used uninitialized in this function

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

xen: xen-selfballoon.c needs more header files

Fix build errors (found when CONFIG_SYSFS is not enabled):

drivers/xen/xen-selfballoon.c:446: warning: data definition has no type or storage class
drivers/xen/xen-selfballoon.c:446: warning: type defaults to 'int' in declaration of 'EXPORT_SYMBOL'
drivers/xen/xen-selfballoon.c:446: warning: parameter names (without types) in function declaration
drivers/xen/xen-selfballoon.c:485: error: expected declaration specifiers or '...' before string constant
drivers/xen/xen-selfballoon.c:485: warning: data definition has no type or storage class
drivers/xen/xen-selfballoon.c:485: warning: type defaults to 'int' in declaration of 'MODULE_LICENSE'
drivers/xen/xen-selfballoon.c:485: warning: function declaration isn't a prototype

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

xen/self-balloon: Add dependency on tmem.

Without enabling CONFIG_XEN_TMEM we get this:

drivers/xen/xen-selfballoon.c:461: undefined reference to `tmem_enabled'

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

xen/balloon: Fix compile errors - missing header files.

With a specific enough .config file compile errors show
for missing workqueue declarations.

Reported-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

xen: convert to 64 bit stats interface

Convert xen driver to 64 bit statistics interface.
Use stats_sync to ensure that 64 bit update is read atomically on 32 bit platform.
Put hot statistics into per-cpu table.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

xen/netback: Add module alias for autoloading

Add xen-backend:vif module alias to the xen-netback module. This allows
automatic loading of the module.

Signed-off-by: Bastian Blank <waldi@debian.org>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

xen/blkback: Don't let in-flight requests defer pending ones.

Running RING_FINAL_CHECK_FOR_REQUESTS from make_response is a bad
idea. It means that in-flight I/O is essentially blocking continued
batches. This essentially kills throughput on frontends which unplug
(or even just notify) early and rightfully assume addtional requests
will be picked up on time, not synchronously.

Signed-off-by: Daniel Stodden <daniel.stodden@citrix.com>
[v1: Rebased and fixed compile problems]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

xen/blkback: Add module alias for autoloading

Add xen-backend:vbd module alias to the xen-blkback module. This allows
automatic loading of the module.

Signed-off-by: Bastian Blank <waldi@debian.org>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

mm: extend memory hotplug API to allow memory hotplug in virtual machines

This patch contains online_page_callback and apropriate functions for
registering/unregistering online page callbacks.  It allows to do some
machine specific tasks during online page stage which is required to
implement memory hotplug in virtual machines.  Currently this patch is
required by latest memory hotplug support for Xen balloon driver patch
which will be posted soon.

Additionally, originial online_page() function was splited into
following functions doing "atomic" operations:

  - __online_page_set_limits() - set new limits for memory management code,
  - __online_page_increment_counters() - increment totalram_pages and totalhigh_pages,
  - __online_page_free() - free page to allocator.

It was done to:
  - not duplicate existing code,
  - ease hotplug code devolpment by usage of well defined interface,
  - avoid stupid bugs which are unavoidable when the same code
    (by design) is developed in many places.

[akpm@linux-foundation.org: use explicit indirect-call syntax]
Signed-off-by: Daniel Kiper <dkiper@net-space.pl>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

xen/balloon: memory hotplug support for Xen balloon driver

Memory hotplug support for Xen balloon driver.  It should be mentioned
that hotplugged memory is not onlined automatically.  It should be onlined
by user through standard sysfs interface.

Memory could be hotplugged in following steps:

  1) dom0: xl mem-max <domU> <maxmem>
     where <maxmem> is >= requested memory size,

  2) dom0: xl mem-set <domU> <memory>
     where <memory> is requested memory size; alternatively memory
     could be added by writing proper value to
     /sys/devices/system/xen_memory/xen_memory0/target or
     /sys/devices/system/xen_memory/xen_memory0/target_kb on dumU,

  3) domU: for i in /sys/devices/system/memory/memory*/state; do \
             [ "`cat "$i"`" = offline ] && echo online > "$i"; done

Memory could be onlined automatically on domU by adding following line to
udev rules:

  SUBSYSTEM=="memory", ACTION=="add", RUN+="/bin/sh -c '[ -f /sys$devpath/state ] && echo online > /sys$devpath/state'"

In that case step 3 should be omitted.

Signed-off-by: Daniel Kiper <dkiper@net-space.pl>
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Input: xen-kbdfront - enable driver for HVM guests

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>

xen/tracing: Fix tracing config option properly

Steven Rostedt says we should use CONFIG_EVENT_TRACING.

Cc:Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

xen/trace: Fix compile error when CONFIG_XEN_PRIVILEGED_GUEST is not set

with CONFIG_XEN and CONFIG_FTRACE set we get this:

arch/x86/xen/trace.c:22: error: ‘__HYPERVISOR_console_io’ undeclared here (not in a function)
arch/x86/xen/trace.c:22: error: array index in initializer not of integer type
arch/x86/xen/trace.c:22: error: (near initialization for ‘xen_hypercall_names’)
arch/x86/xen/trace.c:23: error: ‘__HYPERVISOR_physdev_op_compat’ undeclared here (not in a function)

Issue was that the definitions of __HYPERVISOR were not pulled
if CONFIG_XEN_PRIVILEGED_GUEST was not set.

Reported-by: Randy Dunlap <rdunlap@xenotime.net>
Acked-by: Randy Dunlap <rdunlap@xenotime.net>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

xen/tracing: it looks like we wanted CONFIG_FTRACE

Apparently we wanted CONFIG_FTRACE rather the CONFIG_FUNCTION_TRACER.

Reported-by: Sander Eikelenboom <linux@eikelenboom.it>
Tested-by: Sander Eikelenboom <linux@eikelenboom.it>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

xen/tracing: fix compile errors when tracing is disabled.

When CONFIG_FUNCTION_TRACER is disabled, compilation fails as follows:
CC arch/x86/xen/setup.o
In file included from arch/x86/include/asm/xen/hypercall.h:42,
from arch/x86/xen/setup.c:19:
include/trace/events/xen.h:31: warning: 'struct multicall_entry' declared inside parameter list
include/trace/events/xen.h:31: warning: its scope is only this definition or declaration, which is probably not what you want
include/trace/events/xen.h:31: warning: 'struct multicall_entry' declared inside parameter list
include/trace/events/xen.h:31: warning: 'struct multicall_entry' declared inside parameter list
include/trace/events/xen.h:31: warning: 'struct multicall_entry' declared inside parameter list
[...]
arch/x86/xen/trace.c:5: error: '__HYPERVISOR_set_trap_table' undeclared here (not in a function)
arch/x86/xen/trace.c:5: error: array index in initializer not of integer type
arch/x86/xen/trace.c:5: error: (near initialization for 'xen_hypercall_names')
arch/x86/xen/trace.c:6: error: '__HYPERVISOR_mmu_update' undeclared here (not in a function)
arch/x86/xen/trace.c:6: error: array index in initializer not of integer type
arch/x86/xen/trace.c:6: error: (near initialization for 'xen_hypercall_names')

Fix this by making sure struct multicall_entry has a declaration in
scope at all times, and don't bother compiling xen/trace.c when tracing
is disabled.

Reported-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

xen/trace: use class for multicall trace

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

xen/trace: convert mmu events to use DECLARE_EVENT_CLASS()/DEFINE_EVENT()

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

xen/multicall: move *idx fields to start of mc_buffer

The CPU would prefer small offsets.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

xen/multicall: special-case singleton hypercalls

Singleton calls seem to end up being pretty common, so just
directly call the hypercall rather than going via multicall.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

xen/multicalls: add unlikely around slowpath in __xen_mc_entry()

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

xen/multicalls: disable MC_DEBUG

It's useful - and probably should be a config - but its very heavyweight,
especially with the tracing stuff to help sort out problems.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

xen/mmu: tune pgtable alloc/release

Make sure the fastpath code is inlined. Batch the page permission change
and the pin/unpin, and make sure that it can be batched with any
adjacent set_pte/pmd/etc operations.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

xen/mmu: use extend_args for more mmuext updates

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

xen/trace: add tlb flush tracepoints

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

xen/trace: add segment desc tracing

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

xen/trace: add xen_pgd_(un)pin tracepoints

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

xen/trace: add ptpage alloc/release tracepoints

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

xen/trace: add mmu tracepoints

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

xen/trace: add multicall tracing

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

xen/trace: set up tracepoint skeleton

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

xen/multicalls: remove debugfs stats

Remove debugfs stats to make way for tracing.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

trace/xen: add skeleton for Xen trace events

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

xen/pciback: remove duplicated #include

Remove duplicated #include('s) in
drivers/xen/xen-pciback/xenbus.c

Signed-off-by: Huang Weiyi <weiyi.huang@gmail.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

xen:pvhvm: Modpost section mismatch fix

Removing __init from check_platform_magic since it is called by
xen_unplug_emulated_devices in non-init contexts (It probably gets inlined
because of -finline-functions-called-once, removing __init is more to avoid
mismatch being reported).

Signed-off-by: Raghavendra D Prabhu <rprabhu@wnohang.net>
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

mm: frontswap: config and doc files

This fourth patch of four in the frontswap series adds configuration
and documentation files.

[v8: rebase to 3.0-rc4]
[v7: rebase to 3.0-rc3]
[v6: rebase to 3.0-rc1]
[v5: change config default to n]
[v4: rebase to 2.6.39]
Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>
Reviewed-by: Konrad Wilk <konrad.wilk@oracle.com>
Acked-by: Jan Beulich <JBeulich@novell.com>
Acked-by: Seth Jennings <sjenning@linux.vnet.ibm.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Cc: Matthew Wilcox <matthew@wil.cx>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Rik Riel <riel@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>

mm: frontswap: add swap hooks and extend try_to_unuse

This third patch of four in the frontswap series adds hooks in the swap
subsystem and extends try_to_unuse so that frontswap_shrink can do a
"partial swapoff". Also, declarations for the extern-ified swap variables
in the first patch are declared.

Note that failed frontswap_map allocation is safe... failure is noted
by lack of "FS" in the subsequent printk.

[v8: rebase to 3.0-rc4]
[v8: kamezawa.hiroyu@jp.fujitsu.com: add comment to clarify find_next_to_unuse]
[v7: rebase to 3.0-rc3]
[v7: JBeulich@novell.com: use new static inlines, no-ops if not config'd]
[v6: rebase to 3.1-rc1]
[v6: lliubbo@gmail.com: use vzalloc]
[v6: lliubbo@gmail.com: fix null pointer deref if vzalloc fails]
[v5: accidentally posted stale code for v4 that failed to compile :-(]
[v4: rebase to 2.6.39]
Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>
Reviewed-by: Konrad Wilk <konrad.wilk@oracle.com>
Acked-by: Jan Beulich <JBeulich@novell.com>
Acked-by: Seth Jennings <sjenning@linux.vnet.ibm.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Cc: Matthew Wilcox <matthew@wil.cx>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Rik Riel <riel@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>

mm: frontswap: core code

This second patch of four in this frontswap series provides the core code
for frontswap that interfaces between the hooks in the swap subsystem and
a frontswap backend via frontswap_ops.

Two new files are added: mm/frontswap.c and include/linux/frontswap.h

Credits: Frontswap_ops design derived from Jeremy Fitzhardinge
design for tmem; sysfs code modelled after mm/ksm.c

[v8: rebase to 3.0-rc4]
[v8: kamezawa.hiroyu@jp.fujitsu.com: change count to atomic_t to avoid races]
[v7: rebase to 3.0-rc3]
[v7: JBeulich@novell.com: new static inlines resolve to no-ops if not config'd]
[v7: JBeulich@novell.com: avoid redundant shifts/divides for *_bit lib calls]
[v6: rebase to 3.1-rc1]
[v6: lliubbo@gmail.com: fix null pointer deref if vzalloc fails]
[v6: konrad.wilk@oracl.com: various checks and code clarifications/comments]
[v5: no change from v4]
[v4: rebase to 2.6.39]
Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>
Reviewed-by: Konrad Wilk <konrad.wilk@oracle.com>
Acked-by: Jan Beulich <JBeulich@novell.com>
Acked-by: Seth Jennings <sjenning@linux.vnet.ibm.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Cc: Matthew Wilcox <matthew@wil.cx>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Rik Riel <riel@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>

mm: frontswap: swap data structure changes

This first patch of four in the frontswap series makes available core
swap data structures (swap_lock, swap_list and swap_info) that are
needed by frontswap.c but we don't need to expose them to the dozens
of files that include swap.h so we create a new swapfile.h just to
extern-ify these.

Also add frontswap-related elements to swap_info_struct. Frontswap_map
points to vzalloc'ed one-bit-per-swap-page metadata that indicates
whether the swap page is in frontswap or in the device and frontswap_pages
counts how many pages are in frontswap.

[v8: rebase to 3.0-rc4]
[v8: kamezawa.hiroyu@jp.fujitsu.com: frontswap_pages should be atomic_t]
[v8: kamezawa.hiroyu@jp.fujitsu.com: comment to clarify informational counters]
[v7: rebase to 3.0-rc3]
[v7: JBeulich@novell.com: add new swap struct elements only if config'd]
[v6: rebase to 3.0-rc1]
[v5: no change from v4]
[v4: rebase to 2.6.39]
Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>
Reviewed-by: Konrad Wilk <konrad.wilk@oracle.com>
Reviewed-by: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Jan Beulich <JBeulich@novell.com>
Acked-by: Seth Jennings <sjenning@linux.vnet.ibm.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Cc: Matthew Wilcox <matthew@wil.cx>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Rik Riel <riel@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>

v2.6.39-100.0.4 update spec file
Signed-off-by: Guru Anbalagane <guru.anbalagane@oracle.com>

update makefile to reflect 2.6.39
Signed-off-by: Guru Anbalagane <guru.anbalagane@oracle.com>

v2.6.39-100.0.3 update spec file for rebase to linux-3.0.3
Signed-off-by: Guru Anbalagane <guru.anbalagane@oracle.com>

Fix i686 compilation for 2.6.39

Patch also makes it possible to compile i686 on 64bit machine.

update and revision changelog v2.6.39-100.0.2
Signed-off-by: Guru Anbalagane <guru.anbalagane@oracle.com>

Update kernel-uek.spec for 2.6.39

Update ol5/kernel-uek.spec with changes from ol6

Guru's spec + latest changes from ol6 spec

Update kernel-uek.spec for 2.6.39

export list of msi irqs into sysfs

Signed-off-by: Chris Mason <chris.mason@oracle.com>
Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com>

[AUDIT/workaround] Increase AUDIT_NAMES array len

dmesg:
audit: name_count maxed, losing inode data: dev=00:05, inode=10118
audit: name_count maxed, losing inode data: dev=00:05, inode=10118
audit: name_count maxed, losing inode data: dev=00:05, inode=10118
audit: name_count maxed, losing inode data: dev=00:05, inode=10118
audit: name_count maxed, losing inode data: dev=00:05, inode=10118

This messages appear because of audit code has fixed array len for
handled inodes. This patch is workaround to increase array size for
inodes with minimal code changes. Mainstream does not have right fix for
it now so patch has to be back-ported from 3.0.1 or 3.0.1. Make debugging
message more verbose to track if we reached limit.

commit 449cedf099b23a250e7d61982e35555ccb871182
Author: Eric Paris <eparis@redhat.com>
Date:   Mon Apr 5 16:16:26 2010 -0400

    audit: preface audit printk with audit

    There have been a number of reports of people seeing the message:
    "name_count maxed, losing inode data: dev=00:05, inode=3185"
    in dmesg.  These usually lead to people reporting problems to the filesystem
    group who are in turn clueless what they mean.

    Eventually someone finds me and I explain what is going on and that
    these come from the audit system.  The basics of the problem is that the
    audit subsystem never expects a single syscall to 'interact' (for some
    wish washy meaning of interact) with more than 20 inodes.  But in fact
    some operations like loading kernel modules can cause changes to lots of
    inodes in debugfs.

    There are a couple real fixes being bandied about including removing the
    fixed compile time limit of 20 or not auditing changes in debugfs (or
    both) but neither are small and obvious so I am not sending them for
    immediate inclusion (I hope Al forwards a real solution next devel
    window).

    In the meantime this patch simply adds 'audit' to the beginning of the
    crap message so if a user sees it, they come blame me first and we can
    talk about what it means and make sure we understand all of the reasons
    it can happen and make sure this gets solved correctly in the long run.

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Increase kernel log buffer to 1MB (SHIFT=20)

2.6.39 generic-config update

Module signature checker and key manager

Port to 2.6.39 following commit:

commit b0df24aa16fdaf7ccb0fb072a0f48b3f9b0c6b59
Author: David Howells <dhowells@redhat.com>
Date:   Wed Nov 11 15:47:32 2009 -0500

    Module signature checker and key manager

    Add a facility to retain public keys and to verify signatures made with those
    public keys, given a signature and crypto_hash of the data that was signed.

Signed-Off-By: David Howells <dhowells@redhat.com>
Signed-off-by: Aristeu Rozanski <arozansk@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Multiprecision maths library

2.6.39 port of following commit:

    Add a multiprecision maths library (MPILIB) required for doing cryptographic
    operations based on very large numbers.

    This is derived from GPG, reduced to the minimum necessary bits for doing DSA
    signature verification with error handling added.  This is used to do kernel
    module signing.

Signed-Off-By: David Howells <dhowells@redhat.com>
Signed-off-by: Aristeu Rozanski <arozansk@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

add uek config/spec files
Signed-off-by: guru.anbalagane <guru.anbalagane@oracle.com>

IPC reduce lock contention in semctl

Signed-off-by: Chris Mason <chris.mason@oracle.com>

IPC lock reduction corners

Deal properly with out of order semtimedop args and other corners.

Signed-off-by: Chris Mason <chris.mason@oracle.com>

use rwlocks for ipc

Signed-off-by: Chris Mason <chris.mason@oracle.com>

ipc semaphores: order wakeups based on waiter CPU

When IPC semaphores are used in a bulk post and wait system, we
can end up waking a very large number of processes per semtimedop call.
At least one major database will use a single process to kick hundreds
of other processes at a time.

This patch tries to reduce the runqueue lock contention by ordering the
wakeups based on the CPU the waiting process was on when it went to
sleep.

A later patch could add some code in the scheduler to help
wake these up in bulk and take the various runqueue locks less often.

Signed-off-by: Chris Mason <chris.mason@oracle.com>

ipc semaphores: reduce ipc_lock contention in semtimedop

One feature of the ipc semaphores is they are defined to be
atomic for the full set of operations done per syscall.  So if you do a
semtimedop syscall changing 100 semaphores, the kernel needs to try all
100 changes and only apply them when all 100 are able to succeed without
putting the process to sleep.

Today we use a single lock per semaphore array (the ipc lock).  This lock is
held every time we try a set of operations requested by userland, and
when taken again when a process is woken up.

Whenever a given set of changes sent to semtimedop would sleep, that
set is queued up on a big list of pending changes for the entire
semaphore array.

Whenever a semtimedop call changes a single semaphore value, it
walks the entire list of pending operations to see if any of them
can now succeed.  The ipc lock is held for this entire loop.

This change makes two major changes, pushing both the list of pending
operations and a spinlock down to each individual semaphore.  Now:

Whenever a given semaphore modification is going to block, the set of
operations semtimedop wants to do is saved onto that semaphore's list.

Whenever a givem semtimedop call changes a single semaphore value, it
walks the list of pending operations on that single semaphore to see if
they can now succeed.  If any of the operations will block on a
different semaphore, they are moved to that semaphore's list.

The locking is now done per-semaphore.  In order to get the changes done
atomically, the lock of every semaphore being changed is taken while we
test the requested operations.  We sort the operations by semaphore id
to make sure we don't deadlock in the kernel.

I have a microbenchmark to test how quickly we can post and wait in
bulk.  With this change, semtimedop is able do to more than twice
as much work in the same run.  On a large numa machine, it brings
the IPC lock system time (reported by perf) down from 85% to 15%.

Signed-off-by: Chris Mason <chris.mason@oracle.com>

Linux 3.0.3

perf tools: do not look at ./config for configuration

commit aba8d056078e47350d85b06a9cabd5afcc4b72ea upstream.

In addition to /etc/perfconfig and $HOME/.perfconfig, perf looks for
configuration in the file ./config, imitating git which looks at
$GIT_DIR/config. If ./config is not a perf configuration file, it
fails, or worse, treats it as a configuration file and changes behavior
in some unexpected way.

"config" is not an unusual name for a file to be lying around and perf
does not have a private directory dedicated for its own use, so let's
just stop looking for configuration in the cwd. Callers needing
context-sensitive configuration can use the PERF_CONFIG environment
variable.

Requested-by: Christian Ohm <chr.ohm@gmx.net>
Cc: 632923@bugs.debian.org
Cc: Ben Hutchings <ben@decadent.org.uk>
Cc: Christian Ohm <chr.ohm@gmx.net>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20110805165838.GA7237@elie.gateway.2wire.net
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

drm/radeon/kms: don't try to be smart in the hpd handler

commit d5811e8731213f80c80d89e980505052f16aca1c upstream.

Attempting to try and turn off disconnected display hw in the
hotput handler lead to more problems than it helped. For
now just register an event and only attempt the do something
interesting with DP. Other connectors are just too problematic:
- Some systems have an HPD pin assigned to LVDS, but it's rarely
if ever connected properly and we don't really care about hpd
events on LVDS anyway since it's always connected.
- The HPD pin is wired up correctly for eDP, but we don't really
have to do anything since the events since it's always connected.
- Some HPD pins fire more than once when you connect/disconnect
- etc.

Fixes:
https://bugs.freedesktop.org/show_bug.cgi?id=39882

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

drm/radeon/kms: fix regression is handling >2 heads on cedar/caicos

commit 33ae1827d6c3c79c5957536ec29d5a8780623147 upstream.

Need to add support for 4 crtcs when setting the possible crtcs
for the encoders.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

drm/radeon/kms: don't enable connectors that are off in the hotplug handler

commit 73104b5cfe3067d68f2c2de3f3d4d4964c55873e upstream.

If we get a hotplug event on an connector that is off, don't
attempt to turn it on or off, it should already be off.

Fixes:
https://bugzilla.redhat.com/show_bug.cgi?id=728228

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

lguest: allow booting guest with CONFIG_RELOCATABLE=y

commit e22a539824e8ddb82c87b4f415165ede82e6ab56 upstream.

The CONFIG_RELOCATABLE code tries to align the unpack destination to
the value of 'kernel_alignment' in the setup_hdr. If that's 0, it
tries to unpack to address 0, which in fact causes the gunzip code
to call 'error("Out of memory while allocating output buffer")'.

The bootloader (ie. the lguest Launcher in this case) should be doing
setting this field; the normal bzImage is 16M, we can use the same.

Reported-by: Stefanos Geraggelos <sgerag@cslab.ece.ntua.gr>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

mm: fix wrong vmap address calculations with odd NR_CPUS values

commit f982f91516fa4cfd9d20518833cd04ad714585be upstream.

Commit db64fe02258f ("mm: rewrite vmap layer") introduced code that does
address calculations under the assumption that VMAP_BLOCK_SIZE is a
power of two. However, this might not be true if CONFIG_NR_CPUS is not
set to a power of two.

Wrong vmap_block index/offset values could lead to memory corruption.
However, this has never been observed in practice (or never been
diagnosed correctly); what caught this was the BUG_ON in vb_alloc() that
checks for inconsistent vmap_block indices.

To fix this, ensure that VMAP_BLOCK_SIZE always is a power of two.

BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=31572
Reported-by: Pavel Kysilka <goldenfish@linuxsoft.cz>
Reported-by: Matias A. Fonzo <selk@dragora.org>
Signed-off-by: Clemens Ladisch <clemens@ladisch.de>
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Cc: Nick Piggin <npiggin@suse.de>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Krzysztof Helt <krzysztof.h1@poczta.fm>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

ath5k: fix error handling in ath5k_beacon_send

commit bdc71bc59231f5542af13b5061b9ab124d093050 upstream.

This cleans up error handling for the beacon in case of dma mapping
failure. We need to free the skb when dma mapping fails instead of
nulling and leaking the pointer, and we should bail out to avoid
giving the hardware the bad descriptor.

Finally, we need to perform the null check after trying to update
the beacon, or else beacons will never be sent after a single
mapping failure.

Signed-off-by: Bob Copeland <me@bobcopeland.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

ASoC: Tegra: wm8903 machine driver: Allow re-insertion of module

commit 29591ed4ac6fe00e3ff23b5be0cdc7016ef9c47e upstream.

Two issues were preventing module snd-soc-tegra-wm8903.ko from being
removed and re-inserted:

a) The speaker-enable GPIO is hosted by the WM8903 chip. This GPIO must
   be freed before snd_soc_unregister_card() is called, because that
   triggers wm8903.c:wm8903_remove(), which calls gpiochip_remove(), which
   then fails if any of the GPIOs are in use. To solve this, free all GPIOs
   first, so the code doesn't care where they come from.

b) We need to call snd_soc_jack_free_gpios() to match the call to
   snd_soc_jack_add_gpios() during initialization. Without this, the
   call to snd_soc_jack_add_gpios() fails during any subsequent modprobe
   and initialization, since the GPIO and IRQ are already registered. In
   turn, this causes the headphone state not to be monitored, so the
   headphone is assumed not to be plugged in, and the audio path to it is
   never enabled.

Signed-off-by: Stephen Warren <swarren@nvidia.com>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

ASoC: Tegra: tegra_pcm_deallocate_dma_buffer: Don't OOPS

commit a96edd59b2bc88b3d1ea47e0ba48076d65db9302 upstream.

Not all PCM devices have all sub-streams. Specifically, the SPDIF driver
only supports playback and hence has no capture substream. Check whether
a substream exists before dereferencing it, when de-allocating DMA
buffers in tegra_pcm_deallocate_dma_buffer.

Signed-off-by: Stephen Warren <swarren@nvidia.com>
Acked-by: Liam Girdwood <lrg@ti.com>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

ASoC: Fix binding of WM8750 on Jive

commit 6678050442e90a4e9511a9ed14b9bdfc5e393323 upstream.

The I2C address is misformatted and would never match.

Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Acked-by: Liam Girdwood <lrg@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

ALSA: snd-usb-caiaq: Correct offset fields of outbound iso_frame_desc

commit 15439bde3af7ff88459ea2b5520b77312e958df2 upstream.

This fixes faulty outbount packets in case the inbound packets
received from the hardware are fragmented and contain bogus input
iso frames. The bug has been there for ages, but for some strange
reasons, it was only triggered by newer machines in 64bit mode.

Signed-off-by: Daniel Mack <zonque@gmail.com>
Reported-and-tested-by: William Light <wrl@illest.net>
Reported-by: Pedro Ribeiro <pedrib@gmail.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>