Konrad Rzeszutek Wilk [Fri, 11 May 2012 20:37:46 +0000 (16:37 -0400)]
Merge branch 'stable/not-upstreamed' into uek2-merge
* stable/not-upstreamed:
xen/mce: Register native mce handler as vMCE bounce back point
xen/mce: Add mcelog support for Xen platform
Revert "Add mcelog support from xen platform"
Revert "xen/mce: Change the machine check point"
When MCA error occurs, it would be handled by Xen hypervisor first,
and then the error information would be sent to initial domain for logging.
This patch gets error information from Xen hypervisor and convert
Xen format error into Linux format mcelog. This logic is basically
self-contained, not touching other kernel components.
By using tools like mcelog tool users could read specific error information,
like what they did under native Linux.
To test follow directions outlined in Documentation/acpi/apei/einj.txt
Konrad Rzeszutek Wilk [Fri, 11 May 2012 20:29:09 +0000 (16:29 -0400)]
xen/gntdev: Fix merge error.
Somehow a merge error ensued were an important part of
"xen/gnt{dev,alloc}: reserve event channels for notify"
went missing. Fortunatly for us we aren't using this driver yet.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Konrad Rzeszutek Wilk [Fri, 11 May 2012 18:40:21 +0000 (14:40 -0400)]
Merge branch 'stable/for-linus-3.5.rebased' into uek2-merge
* stable/for-linus-3.5.rebased: (22 commits)
x86/apic: Fix UP boot crash
xen/apic: implement io apic read with hypercall
xen/x86: Implement x86_apic_ops
x86/apic: Replace io_apic_ops with x86_io_apic_ops.
x86/ioapic: Add io_apic_ops driver layer to allow interception
xen: implement IRQ_WORK_VECTOR handler
xen: implement apic ipi interface
xen/gnttab: add deferred freeing logic
xen: enter/exit lazy_mmu_mode around m2p_override calls
xen/setup: update VA mapping when releasing memory during setup
xen/setup: Combine the two hypercall functions - since they are quite similar.
xen/setup: Populate freed MFNs from non-RAM E820 entries and gaps to E820 RAM
xen/setup: Only print "Freeing XXX-YYY pfn range: Z pages freed" if Z > 0
xen/p2m: An early bootup variant of set_phys_to_machine
xen/p2m: Collapse early_alloc_p2m_middle redundant checks.
xen/p2m: Allow alloc_p2m_middle to call reserve_brk depending on argument
xen/p2m: Move code around to allow for better re-usage.
xen: only limit memory map to maximum reservation for domain 0.
xen: release all pages within 1-1 p2m mappings
xen: allow extra memory to be in multiple regions
...
Lin Ming [Mon, 30 Apr 2012 16:16:27 +0000 (00:16 +0800)]
xen/apic: implement io apic read with hypercall
Implements xen_io_apic_read with hypercall, so it returns proper
IO-APIC information instead of fabricated one.
Fallback to return an emulated IO_APIC values if hypercall fails.
[v2: fallback to return an emulated IO_APIC values if hypercall fails] Signed-off-by: Lin Ming <mlin@ss.pku.edu.cn> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Konrad Rzeszutek Wilk [Tue, 20 Mar 2012 22:53:10 +0000 (18:53 -0400)]
xen/x86: Implement x86_apic_ops
Or rather just implement one different function as opposed
to the native one : the read function.
We synthesize the values.
[upstream git commit 31b3c9d723407b395564d1fff3624cc0083ae520] Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
[v1: Rebased on top of tip/x86/urgent]
[v2: Return 0xfd instead of 0xff in the default case] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Conflicts:
arch/x86/xen/Makefile
arch/x86/xen/enlighten.c
arch/x86/xen/xen-ops.h
[In the makefile, the vga.o didn't exist in 3.0 - it was added
in 3.1]
Konrad Rzeszutek Wilk [Wed, 28 Mar 2012 16:37:36 +0000 (12:37 -0400)]
x86/apic: Replace io_apic_ops with x86_io_apic_ops.
Which makes the code fit within the rest of the x86_ops functions.
Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
[v1: Changed x86_apic -> x86_ioapic per Yinghai Lu <yinghai@kernel.org> suggestion]
[v2: Rebased on tip/x86/urgent and redid to match Ingo's syntax style] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Jeremy Fitzhardinge [Thu, 22 Mar 2012 02:58:08 +0000 (22:58 -0400)]
x86/ioapic: Add io_apic_ops driver layer to allow interception
Xen dom0 needs to paravirtualize IO operations to the IO APIC,
so add a io_apic_ops for it to intercept. Do this as ops
structure because there's at least some chance that another
paravirtualized environment may want to intercept these.
[upstream git commit 136d249ef7dbf0fefa292082cc40be1ea864cbd6] Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: jwboyer@redhat.com Cc: yinghai@kernel.org Link: http://lkml.kernel.org/r/1332385090-18056-2-git-send-email-konrad.wilk@oracle.com
[ Made all the affected code easier on the eyes ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
Conflicts:
Lin Ming [Fri, 20 Apr 2012 16:11:05 +0000 (00:11 +0800)]
xen: implement IRQ_WORK_VECTOR handler
[upstream git commit 1ff2b0c303698e486f1e0886b4d9876200ef8ca5] Signed-off-by: Lin Ming <mlin@ss.pku.edu.cn> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Ben Guthro [Fri, 20 Apr 2012 16:11:04 +0000 (00:11 +0800)]
xen: implement apic ipi interface
Map native ipi vector to xen vector.
Implement apic ipi interface with xen_send_IPI_one.
[upstream git commit f447d56d36af18c5104ff29dcb1327c0c0ac3634] Tested-by: Steven Noonan <steven@uplinklabs.net> Signed-off-by: Ben Guthro <ben@guthro.net> Signed-off-by: Lin Ming <mlin@ss.pku.edu.cn> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Conflicts:
Jan Beulich [Thu, 5 Apr 2012 15:10:07 +0000 (16:10 +0100)]
xen/gnttab: add deferred freeing logic
Rather than just leaking pages that can't be freed at the point where
access permission for the backend domain gets revoked, put them on a
list and run a timer to (infrequently) retry freeing them. (This can
particularly happen when unloading a frontend driver when devices are
still present, and the backend still has them in non-closed state or
hasn't finished closing them yet.)
[upstream git commit 569ca5b3f94cd0b3295ec5943aa457cf4a4f6a3a] Signed-off-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
xen: enter/exit lazy_mmu_mode around m2p_override calls
This patch is a significant performance improvement for the
m2p_override: about 6% using the gntdev device.
Each m2p_add/remove_override call issues a MULTI_grant_table_op and a
__flush_tlb_single if kmap_op != NULL. Batching all the calls together
is a great performance benefit because it means issuing one hypercall
total rather than two hypercall per page.
If paravirt_lazy_mode is set PARAVIRT_LAZY_MMU, all these calls are
going to be batched together, otherwise they are issued one at a time.
Adding arch_enter_lazy_mmu_mode/arch_leave_lazy_mmu_mode around the
m2p_add/remove_override calls forces paravirt_lazy_mode to
PARAVIRT_LAZY_MMU, therefore makes sure that they are always batched.
However it is not safe to call arch_enter_lazy_mmu_mode if we are in
interrupt context or if we are already in PARAVIRT_LAZY_MMU mode, so
check for both conditions before doing so.
Changes in v4:
- rebased on 3.4-rc4: all the m2p_override users call gnttab_unmap_refs
and gnttab_map_refs;
- check whether we are in interrupt context and the lazy_mode we are in
before calling arch_enter/leave_lazy_mmu_mode.
Changes in v3:
- do not call arch_enter/leave_lazy_mmu_mode in xen_blkbk_unmap, that
can be called in interrupt context.
David Vrabel [Thu, 3 May 2012 15:15:42 +0000 (16:15 +0100)]
xen/setup: update VA mapping when releasing memory during setup
In xen_memory_setup(), if a page that is being released has a VA
mapping this must also be updated. Otherwise, the page will be not
released completely -- it will still be referenced in Xen and won't be
freed util the mapping is removed and this prevents it from being
reallocated at a different PFN.
This was already being done for the ISA memory region in
xen_ident_map_ISA() but on many systems this was omitting a few pages
as many systems marked a few pages below the ISA memory region as
reserved in the e820 map.
This fixes errors such as:
(XEN) page_alloc.c:1148:d0 Over-allocation for domain 0: 2097153 > 2097152
(XEN) memory.c:133:d0 Could not allocate order=0 extent: id=0 memflags=0 (0 of 17)
[upstream git commit 83d51ab473dddde7df858015070ed22b84ebe9a9] Signed-off-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
The fun comes when a PV guest that is run with a machine E820 - that
can either be the initial domain or a PCI PV guest, where the E820
looks like the normal thing:
Those 283999 pages are subtracted from the nr_pages and are returned
to the hypervisor. The end result is that the initial domain
boots with 1GB less memory as the nr_pages has been subtracted by
the amount of pages residing within the PCI hole. It can balloon up
to that if desired using 'xl mem-set 0 8092', but the balloon driver
is not always compiled in for the initial domain.
This patch, implements the populate hypercall (XENMEM_populate_physmap)
which increases the the domain with the same amount of pages that
were released.
The other solution (that did not work) was to transplant the MFN in
the P2M tree - the ones that were going to be freed were put in
the E820_RAM regions past the nr_pages. But the modifications to the
M2P array (the other side of creating PTEs) were not carried away.
As the hypervisor is the only one capable of modifying that and the
only two hypercalls that would do this are: the update_va_mapping
(which won't work, as during initial bootup only PFNs up to nr_pages
are mapped in the guest) or via the populate hypercall.
The end result is that the kernel can now boot with the
nr_pages without having to subtract the 283999 pages.
On a 8GB machine, with various dom0_mem= parameters this is what we get:
no dom0_mem
-Memory: 6485264k/9435136k available (5817k kernel code, 1136060k absent, 1813812k reserved, 2899k data, 696k init)
+Memory: 7619036k/9435136k available (5817k kernel code, 1136060k absent, 680040k reserved, 2899k data, 696k init)
Konrad Rzeszutek Wilk [Fri, 30 Mar 2012 19:37:07 +0000 (15:37 -0400)]
xen/setup: Only print "Freeing XXX-YYY pfn range: Z pages freed" if Z > 0
Otherwise we can get these meaningless:
Freeing bad80-badf4 pfn range: 0 pages freed
We also can do this for the summary ones - no point of printing
"Set 0 page(s) to 1-1 mapping"
[upstream git commit ca1182387e57470460294ce1e39e2d5518809811] Acked-by: David Vrabel <david.vrabel@citrix.com>
[v1: Extended to the summary printks] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Konrad Rzeszutek Wilk [Fri, 30 Mar 2012 18:33:14 +0000 (14:33 -0400)]
xen/p2m: An early bootup variant of set_phys_to_machine
During early bootup we can't use alloc_page, so to allocate
leaf pages in the P2M we need to use extend_brk. For that
we are utilizing the early_alloc_p2m and early_alloc_p2m_middle
functions to do the job for us. This function follows the
same logic as set_phys_to_machine.
Konrad Rzeszutek Wilk [Fri, 30 Mar 2012 18:15:14 +0000 (14:15 -0400)]
xen/p2m: Allow alloc_p2m_middle to call reserve_brk depending on argument
For identity cases we want to call reserve_brk only on the boundary
conditions of the middle P2M (so P2M[x][y][0] = extend_brk). This is
to work around identify regions (PCI spaces, gaps in E820) which are not
aligned on 2MB regions.
However for the case were we want to allocate P2M middle leafs at the
early bootup stage, irregardless of this alignment check we need some
means of doing that. For that we provide the new argument.
Ian Campbell [Wed, 14 Dec 2011 12:16:08 +0000 (12:16 +0000)]
xen: only limit memory map to maximum reservation for domain 0.
d312ae878b6a "xen: use maximum reservation to limit amount of usable RAM"
clamped the total amount of RAM to the current maximum reservation. This is
correct for dom0 but is not correct for guest domains. In order to boot a guest
"pre-ballooned" (e.g. with memory=1G but maxmem=2G) in order to allow for
future memory expansion the guest must derive max_pfn from the e820 provided by
the toolstack and not the current maximum reservation (which can reflect only
the current maximum, not the guest lifetime max). The existing algorithm
already behaves this correctly if we do not artificially limit the maximum
number of pages for the guest case.
David Vrabel [Wed, 28 Sep 2011 16:46:36 +0000 (17:46 +0100)]
xen: release all pages within 1-1 p2m mappings
In xen_memory_setup() all reserved regions and gaps are set to an
identity (1-1) p2m mapping. If an available page has a PFN within one
of these 1-1 mappings it will become inaccessible (as it MFN is lost)
so release them before setting up the mapping.
This can make an additional 256 MiB or more of RAM available
(depending on the size of the reserved regions in the memory map) if
the initial pages overlap with reserved regions.
The 1:1 p2m mappings are also extended to cover partial pages. This
fixes an issue with (for example) systems with a BIOS that puts the
DMI tables in a reserved region that begins on a non-page boundary.
[upstream git commit f3f436e] Signed-off-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
David Vrabel [Thu, 29 Sep 2011 11:26:19 +0000 (12:26 +0100)]
xen: allow extra memory to be in multiple regions
Allow the extra memory (used by the balloon driver) to be in multiple
regions (typically two regions, one for low memory and one for high
memory). This allows the balloon driver to increase the number of
available low pages (if the initial number if pages is small).
As a side effect, the algorithm for building the e820 memory map is
simpler and more obviously correct as the map supplied by the
hypervisor is (almost) used as is (in particular, all reserved regions
and gaps are preserved). Only RAM regions are altered and RAM regions
above max_pfn + extra_pages are marked as unused (the region is split
in two if necessary).
[upstream git commit dc91c72] Signed-off-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Conflicts:
David Vrabel [Wed, 28 Sep 2011 16:46:34 +0000 (17:46 +0100)]
xen: allow balloon driver to use more than one memory region
Allow the xen balloon driver to populate its list of extra pages from
more than one region of memory. This will allow platforms to provide
(for example) a region of low memory and a region of high memory.
The maximum possible number of extra regions is 128 (== E820MAX) which
is quite large so xen_extra_mem is placed in __initdata. This is safe
as both xen_memory_setup() and balloon_init() are in __init.
The balloon regions themselves are not altered (i.e., there is still
only the one region).
[upstream git commit 8b5d44a] Signed-off-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Conflicts:
drivers/xen/balloon.c
[The "Add support for pv hugepages and support for huge balloon pages." has
a lot of these fixes in]
Konrad Rzeszutek Wilk [Mon, 7 May 2012 16:38:51 +0000 (12:38 -0400)]
Merge branch 'stable/for-linus-3.4.rebased' into uek2-merge
* stable/for-linus-3.4.rebased:
xen/pci: don't use PCI BIOS service for configuration space accesses
xen/pte: Fix crashes when trying to see non-existent PGD/PMD/PUD/PTEs
xen/apic: Return the APIC ID (and version) for CPU 0.
drivers/video/xen-fbfront.c: add missing cleanup code
xen/x86: Workaround 'x86/ioapic: Add register level checks to detect bogus io-apic entries'
David Vrabel [Fri, 4 May 2012 13:29:46 +0000 (14:29 +0100)]
xen/pci: don't use PCI BIOS service for configuration space accesses
The accessing PCI configuration space with the PCI BIOS32 service does
not work in PV guests.
On systems without MMCONFIG or where the BIOS hasn't marked the
MMCONFIG region as reserved in the e820 map, the BIOS service is
probed (even though direct access is preferred) and this hangs.
CC: stable@kernel.org Acked-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
[upstream git commit 76a8df7b49168509df02461f83fab117a4a86e08]
Conflicts:
which is due to the fact we are trying to access a PFN that is not
accessible to us. The reason (at least in this case) was that
PGD[256] is set to __HYPERVISOR_VIRT_START which setup (by the
hypervisor) to point to a read-only linear map of the MFN->PFN array.
During our parsing we would get the MFN (a valid one), try to look
it up in the MFN->PFN tree and find it invalid and return ~0 as PFN.
Then pte_mfn_to_pfn would happilly feed that in, attach the flags
and return it back to the caller. In this case the ptdump_show
bitshifts it and we get this invalid value.
Instead of doing all of that, we detect the ~0 case and just return
!_PAGE_PRESENT.
CC: stable@kernel.org Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Konrad Rzeszutek Wilk [Wed, 2 May 2012 19:04:51 +0000 (15:04 -0400)]
xen/apic: Return the APIC ID (and version) for CPU 0.
On x86_64 on AMD machines where the first APIC_ID is not zero, we get:
ACPI: LAPIC (acpi_id[0x01] lapic_id[0x10] enabled)
BIOS bug: APIC version is 0 for CPU 1/0x10, fixing up to 0x10
BIOS bug: APIC version mismatch, boot CPU: 0, CPU 1: version 10
which means that when the ACPI processor driver loads and
tries to parse the _Pxx states it fails to do as, as it
ends up calling acpi_get_cpuid which does this:
for_each_possible_cpu(i) {
if (cpu_physical_id(i) == apic_id)
return i;
}
And the bootup CPU, has not been found so it fails and returns -1
for the first CPU - which then subsequently in the loop that
"acpi_processor_get_info" does results in returning an error, which
means that "acpi_processor_add" failing and per_cpu(processor)
is never set (and is NULL).
That means that when xen-acpi-processor tries to load (much much
later on) and parse the P-states it gets -ENODEV from
acpi_processor_register_performance() (which tries to read
the per_cpu(processor)) and fails to parse the data.
Reported-by-and-Tested-by: Stefan Bader <stefan.bader@canonical.com> Suggested-by: Boris Ostrovsky <boris.ostrovsky@amd.com>
[v2: Bit-shift APIC ID by 24 bits] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
The reason we are dying is b/c the call acpi_get_override_irq() is used,
which returns the polarity and trigger for the IRQs. That function calls
mp_find_ioapics to get the 'struct ioapic' structure - which along with the
mp_irq[x] is used to figure out the default values and the polarity/trigger
overrides. Since the mp_find_ioapics now returns -1 [b/c the IOAPIC is filled
with 0xffffffff], the acpi_get_override_irq() stops trying to lookup in the
mp_irq[x] the proper INT_SRV_OVR and we can't install the SCI interrupt.
The proper fix for this is going in v3.5 and adds an x86_io_apic_ops
struct so that platforms can override it. But for v3.4 lets carry this
work-around. This patch does that by providing a slightly different variant
of the fake IOAPIC entries.
[upstream git commit] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Merge branch 'stable/for-linus-3.4.rebased' into uek2-merge
* stable/for-linus-3.4.rebased:
drivers/video/xen-fbfront.c: add missing cleanup code
xen: correctly check for pending events when restoring irq flags
xen/smp: Fix crash when booting with ACPI hotplug CPUs.
xen: use the pirq number to check the pirq_eoi_map
We did a similar check for the P-states but did not do it for
the C-states. What we want to do is ignore cases where the DSDT
has definition for sixteen CPUs, but the machine only has eight
CPUs and we get:
xen-acpi-processor: (CX): Hypervisor error (-22) for ACPI CPU14
xen/enlighten: Disable MWAIT_LEAF so that acpi-pad won't be loaded.
There are exactly four users of __monitor and __mwait:
- cstate.c (which allows acpi_processor_ffh_cstate_enter to be called
when the cpuidle API drivers are used. However patch
"cpuidle: replace xen access to x86 pm_idle and default_idle"
provides a mechanism to disable the cpuidle and use safe_halt.
- smpboot (which allows mwait_play_dead to be called). However
safe_halt is always used so we skip that.
- intel_idle (same deal as above).
- acpi_pad.c. This the one that we do not want to run as we
will hit the below crash.
Why do we want to expose MWAIT_LEAF in the first place?
We want it for the xen-acpi-processor driver - which uploads
C-states to the hypervisor. If MWAIT_LEAF is set, the cstate.c
sets the proper address in the C-states so that the hypervisor
can benefit from using the MWAIT functionality. And that is
the sole reason for using it.
Without this patch, if a module performs mwait or monitor we
get this:
David Vrabel [Thu, 26 Apr 2012 18:44:06 +0000 (19:44 +0100)]
xen: correctly check for pending events when restoring irq flags
In xen_restore_fl_direct(), xen_force_evtchn_callback() was being
called even if no events were pending. This resulted in (depending on
workload) about a 100 times as many xen_version hypercalls as
necessary.
Fix this by correcting the sense of the conditional jump.
This seems to give a significant performance benefit for some
workloads.
There is some subtle tricksy "..since the check here is trying to
check both pending and masked in a single cmpw, but I think this is
correct. It will call check_events now only when the combined
mask+pending word is 0x0001 (aka unmasked, pending)." (Ian)
[upstream git commit 7eb7ce4d2e8991aff4ecb71a81949a907ca755ac] CC: stable@kernel.org Acked-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
xen/smp: Fix crash when booting with ACPI hotplug CPUs.
When we boot on a machine that can hotplug CPUs and we
are using 'dom0_max_vcpus=X' on the Xen hypervisor line
to clip the amount of CPUs available to the initial domain,
we get this:
(XEN) Command line: com1=115200,8n1 dom0_mem=8G noreboot dom0_max_vcpus=8 sync_console mce_verbosity=verbose console=com1,vga loglvl=all guest_loglvl=all
.. snip..
DMI: Intel Corporation S2600CP/S2600CP, BIOS SE5C600.86B.99.99.x032.072520111118 07/25/2011
.. snip.
SMP: Allowing 64 CPUs, 32 hotplug CPUs
installing Xen timer for CPU 7
cpu 7 spinlock event irq 361
NMI watchdog: disabled (cpu7): hardware events not enabled
Brought up 8 CPUs
.. snip..
[acpi processor finds the CPUs are not initialized and starts calling
arch_register_cpu, which creates /sys/devices/system/cpu/cpu8/online]
CPU 8 got hotplugged
CPU 9 got hotplugged
CPU 10 got hotplugged
.. snip..
initcall 1_acpi_battery_init_async+0x0/0x1b returned 0 after 406 usecs
calling erst_init+0x0/0x2bb @ 1
[and the scheduler sticks newly started tasks on the new CPUs, but
said CPUs cannot be initialized b/c the hypervisor has limited the
amount of vCPUS to 8 - as per the dom0_max_vcpus=8 flag.
The spinlock tries to kick the other CPU, but the structure for that
is not initialized and we crash.]
BUG: unable to handle kernel paging request at fffffffffffffed8
IP: [<ffffffff81035289>] xen_spin_lock+0x29/0x60
PGD 180d067 PUD 180e067 PMD 0
Oops: 0002 [#1] SMP
CPU 7
Modules linked in:
xen: use the pirq number to check the pirq_eoi_map
In pirq_check_eoi_map use the pirq number rather than the Linux irq
number to check whether an eoi is needed in the pirq_eoi_map.
The reason is that the irq number is not always identical to the
pirq number so if we wrongly use the irq number to check the
pirq_eoi_map we are going to check for the wrong pirq to EOI.
As a consequence some interrupts might not be EOI'ed by the
guest correctly.
[upstream git commit 521394e4e679996955bc351cb6b64639751db2ff] Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Tested-by: Tobias Geiger <tobias.geiger@vido.info>
[v1: Added some extra wording to git commit] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Merge branch 'stable/for-linus-3.4.rebased' into uek2-merge
* stable/for-linus-3.4.rebased: (29 commits)
xen/blkback: Fix warning error.
xen/blkback: Make optional features be really optional.
xen-blkfront: module exit handling adjustments
xen-blkfront: properly name all devices
xen-blkfront: set pages are FOREIGN_FRAME when sharing them
xen: EXPORT_SYMBOL set_phys_to_machine
xen-blkfront: make blkif_io_lock spinlock per-device
xen/blkfront: don't put bdev right after getting it
xen-blkfront: use bitmap_set() and bitmap_clear()
xen/blkback: Enable blkback on HVM guests
xen/blkback: use grant-table.c hypercall wrappers
xen/p2m: m2p_find_override: use list_for_each_entry_safe
xen/gntdev: do not set VM_PFNMAP
xen/grant-table: add error-handling code on failure of gnttab_resume
xen: only check xen_platform_pci_unplug if hvm
xen: initialize platform-pci even if xen_emul_unplug=never
xen kconfig: relax INPUT_XEN_KBDDEV_FRONTEND deps
xen: support pirq_eoi_map
xen/smp: Remove unnecessary call to smp_processor_id()
xen/smp: Fix bringup bug in AP code.
...
drivers/block/xen-blkback/xenbus.c: In function 'xen_blkbk_discard':
drivers/block/xen-blkback/xenbus.c:419:4: warning: passing argument 1 of 'dev_warn' makes pointer from integer without a cast
+[enabled by default]
include/linux/device.h:894:5: note: expected 'const struct device *' but argument is of type 'long int'
It is unclear how that mistake made it in. It surely is wrong.
[upstream git commit a71e23d] Acked-by: Jens Axboe <axboe@kernel.dk> Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Conflicts:
Konrad Rzeszutek Wilk [Wed, 14 Mar 2012 17:04:00 +0000 (13:04 -0400)]
xen/blkback: Make optional features be really optional.
They were using the xenbus_dev_fatal() function which would
change the state of the connection immediately. Which is not
what we want when we advertise optional features.
So make 'feature-discard','feature-barrier','feature-flush-cache'
optional.
[upstream git commit 3389bb8] Suggested-by: Jan Beulich <JBeulich@suse.com>
[v1: Made the discard function void and static] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Conflicts:
Merge branch 'stable/xen-network-3.3.rebase' into uek2-merge
* stable/xen-network-3.3.rebase:
xen-netback: make ops structs const
netback: fix typo in comment
netback: remove redundant assignment
netback: Fix alert message.
xen-netback: use correct index for invalidation in xen_netbk_tx_check_gop()
net: xen-netback: correctly restart Tx after a VM restore/migrate
xen/netback: Add module alias for autoloading
Jan Beulich [Thu, 5 Apr 2012 15:04:52 +0000 (16:04 +0100)]
xen-blkfront: module exit handling adjustments
The blkdev major must be released upon exit, or else the module can't
attach to devices using the same majors upon being loaded again. Also
avoid leaking the minor tracking bitmap.
[upstream git commit 4e55b3c] Signed-off-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Jan Beulich [Thu, 5 Apr 2012 15:37:22 +0000 (16:37 +0100)]
xen-blkfront: properly name all devices
- devices beyond xvdzz didn't get proper names assigned at all
- extended devices with minors not representable within the kernel's
major/minor bit split spilled into foreign majors
[upstream git commit 85b6984] Signed-off-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
xen-blkfront: set pages are FOREIGN_FRAME when sharing them
Set pages as FOREIGN_FRAME whenever blkfront shares them with another
domain. Then when blkfront un-share them, also removes the
FOREIGN_FRAME_BIT from the p2m.
We do it so that when the source and the destination domain are the same
(blkfront connected to a disk backend in the same domain) we can more easily
recognize which ones are the source pfns and which ones are the
destination pfns (both are going to be pointing to the same mfns).
Without this patch enstablishing a connection between blkfront and QEMU
qdisk in the same domain causes QEMU to hang and never return.
The scenario where this used is when a disk image in QCOW2 is used
for extracting the kernel and initrd image. The QCOW2 image file cannot
be loopback-ed and to run 'pygrub', the weird scaffolding of:
- setup QEMU and qdisk with the qcow2 image [disk backend]
- setup xen-blkfront mounting said disk backend in the domain.
- extract kernel and initrd
- tear it down.
The MFNs shared shared by the frontend are going to back two
different sets of PFNs: the original PFNs allocated by the frontend and
the new ones allocated by gntdev for the backend.
The problem is that when Linux calls mfn_to_pfn, passing as argument
one of the MFN shared by the frontend, we want to get the PFN returned by
m2p_find_override_pfn (that is the PFN setup by gntdev) but actually we
get the original PFN allocated by the frontend because considering that
the frontend and the backend are in the same domain:
One possible solution would be to always call m2p_find_override_pfn to
check out whether we have an entry for a given MFN. However it is not
very efficient or scalable.
The other option (that this patch is implementing) is to mark the pages
shared by the frontend as "foreign", so that mfn != mfn2.
It makes sense because from the frontend point of view they are donated
to the backend and while so they are not supposed to be used by the
frontend. In a way, they don't belong to the frontend anymore, at least
temporarily.
[upstream git commit 6a2c6177]
[v3: only set_phys_to_machine if xen_pv_domain] Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
[v1: Redid description a bit] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Steven Noonan [Fri, 17 Feb 2012 20:04:44 +0000 (12:04 -0800)]
xen-blkfront: make blkif_io_lock spinlock per-device
This patch moves the global blkif_io_lock to the per-device structure. The
spinlock seems to exists for two reasons: to disable IRQs when in the interrupt
handlers for blkfront, and to protect the blkfront VBDs when a detachment is
requested.
Having a global blkif_io_lock doesn't make sense given the use case, and it
drastically hinders performance due to contention. All VBDs with pending IOs
have to take the lock in order to get work done, which serializes everything
pretty badly.
[upstream git commit 3467811] Signed-off-by: Steven Noonan <snoonan@amazon.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Andrew Jones [Thu, 16 Feb 2012 12:16:25 +0000 (13:16 +0100)]
xen/blkfront: don't put bdev right after getting it
We should hang onto bdev until we're done with it.
[upstream git commit dad5cf6] Signed-off-by: Andrew Jones <drjones@redhat.com>
[v1: Fixed up git commit description] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Since we are using the m2p_override we do have struct pages
corresponding to the user vma mmap'ed by gntdev.
Removing the VM_PFNMAP flag makes get_user_pages work on that vma.
An example test case would be using a Xen userspace block backend
(QDISK) on a file on NFS using O_DIRECT.
Igor Mammedov [Tue, 27 Mar 2012 17:31:08 +0000 (19:31 +0200)]
xen: only check xen_platform_pci_unplug if hvm
commit b9136d207f08
xen: initialize platform-pci even if xen_emul_unplug=never
breaks blkfront/netfront by not loading them because of
xen_platform_pci_unplug=0 and it is never set for PV guest.
[upstream git commit e95ae5a] Signed-off-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Igor Mammedov [Wed, 21 Mar 2012 14:08:38 +0000 (15:08 +0100)]
xen: initialize platform-pci even if xen_emul_unplug=never
When xen_emul_unplug=never is specified on kernel command line
reading files from /sys/hypervisor is broken (returns -EBUSY).
It is caused by xen_bus dependency on platform-pci and
platform-pci isn't initialized when xen_emul_unplug=never is
specified.
Fix it by allowing platform-pci to ignore xen_emul_unplug=never,
and do not intialize xen_[blk|net]front instead.
[upstream git commit b9136d2] Signed-off-by: Igor Mammedov <imammedo@redhat.com> Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Andrew Jones [Fri, 6 Jan 2012 09:43:09 +0000 (10:43 +0100)]
xen kconfig: relax INPUT_XEN_KBDDEV_FRONTEND deps
PV-on-HVM guests may want to use the xen keyboard/mouse frontend, but
they don't use the xen frame buffer frontend. For this case it doesn't
make much sense for INPUT_XEN_KBDDEV_FRONTEND to depend on
XEN_FBDEV_FRONTEND. The opposite direction always makes more sense, i.e.
if you're using xenfb, then you'll want xenkbd. Switch the dependencies.
[upstream git commit 4bc25af] Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Stefano Stabellini [Mon, 30 Jan 2012 16:21:48 +0000 (16:21 +0000)]
xen: support pirq_eoi_map
The pirq_eoi_map is a bitmap offered by Xen to check which pirqs need to
be EOI'd without having to issue an hypercall every time.
We use PHYSDEVOP_pirq_eoi_gmfn_v2 to map the bitmap, then if we
succeed we use pirq_eoi_map to check whether pirqs need eoi.
Changes in v3:
- explicitly use PHYSDEVOP_pirq_eoi_gmfn_v2 rather than
PHYSDEVOP_pirq_eoi_gmfn;
- introduce pirq_check_eoi_map, a function to check if a pirq needs an
eoi using the map;
-rename pirq_needs_eoi into pirq_needs_eoi_flag;
- introduce a function pointer called pirq_needs_eoi that is going to be
set to the right implementation depending on the availability of
PHYSDEVOP_pirq_eoi_gmfn_v2.
Stefano Stabellini [Tue, 13 Mar 2012 18:30:44 +0000 (18:30 +0000)]
xen/xenbus: ignore console/0
Unfortunately xend creates a bogus console/0 frotend/backend entry pair
on xenstore that console backends cannot properly cope with.
Any guest behavior that is not completely ignoring console/0 is going
to either cause problems with xenconsoled or qemu.
Returning 0 or -ENODEV from xencons_probe is not enough because it is
going to cause the frontend state to become 4 or 6 respectively.
The best possible thing we can do here is just ignore the entry from
xenbus_probe_frontend.
Stefano Stabellini [Mon, 30 Jan 2012 16:02:31 +0000 (16:02 +0000)]
hvc_xen: implement multiconsole support
This patch implements support for multiple consoles:
consoles other than the first one are setup using the traditional xenbus
and grant-table based mechanism.
We use a list to keep track of the allocated consoles, we don't
expect too many of them anyway.
Changes in v3:
- call hvc_remove before removing the console from xenconsoles;
- do not lock xencons_lock twice in the destruction path;
- use the DEFINE_XENBUS_DRIVER macro.
Alex Shi [Fri, 13 Jan 2012 15:53:35 +0000 (23:53 +0800)]
xen: use this_cpu_xxx replace percpu_xxx funcs
percpu_xxx funcs are duplicated with this_cpu_xxx funcs, so replace them
for further code clean up.
I don't know much of xen code. But, since the code is in x86 architecture,
the percpu_xxx is exactly same as this_cpu_xxx serials functions. So, the
change is safe.
[upstream git commit 2113f46] Signed-off-by: Alex Shi <alex.shi@intel.com> Acked-by: Christoph Lameter <cl@gentwo.org> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Jan Beulich [Mon, 5 Mar 2012 17:11:31 +0000 (17:11 +0000)]
xenbus: don't free other end details too early
The individual drivers' remove functions could legitimately attempt to
access this information (for logging messages if nothing else). Note
that I did not in fact observe a problem anywhere, but I came across
this while looking into the reasons for what turned out to need the
fix at https://lkml.org/lkml/2012/3/5/336 to vsprintf().
[upstream git commit bd0d5aa] Signed-off-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
linux/drivers/xen/manage.c: In function 'do_suspend':
linux/drivers/xen/manage.c:160:5: warning: 'si.cancelled' may be used uninitialized in this function
[git upstream commit 186bab1] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Conflicts:
xen/xenbus: Add quirk to deal with misconfigured backends.
A rather annoying and common case is when booting a PVonHVM guest
and exposing the PV KBD and PV VFB - as broken toolstacks don't
always initialize the backends correctly.
Normally The HVM guest is using the VGA driver and the emulated
keyboard for this (though upstream version of QEMU implements
PV KBD, but still uses a VGA driver). We provide a very basic
two-stage wait mechanism - where we wait for 30 seconds for all
devices, and then for 270 for all them except the two mentioned.
That allows us to wait for the essential devices, like network
or disk for the full 6 minutes.
To trigger this, put this in your guest config:
vfb = [ 'vnc=1, vnclisten=0.0.0.0 ,vncunused=1']
instead of this:
vnc=1
vnclisten="0.0.0.0"
[upstream git commit 3066616] CC: stable@kernel.org Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
[v3: Split delay in non-essential (30 seconds) and essential
devices per Ian and Stefano suggestion]
[v4: Added comments per Stefano suggestion] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Merge branch 'stable/xen-pciback-0.6.3.bugfixes' into uek2-merge
* stable/xen-pciback-0.6.3.bugfixes:
xen/pcifront: avoid pci_frontend_enable_msix() falsely returning success
xen/pciback: fix XEN_PCI_OP_enable_msix result
xen/pciback: Support pci_reset_function, aka FLR or D3 support.
PCI: Introduce __pci_reset_function_locked to be used when holding device_lock.
The original XenoLinux code has always had things this way, and for
compatibility reasons (in particular with a subsequent pciback
adjustment) upstream Linux should behave the same way (allowing for two
distinct error indications to be returned by the backend).
[upstream git commit f09d843] Signed-off-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Jan Beulich [Mon, 2 Apr 2012 14:32:22 +0000 (15:32 +0100)]
xen/pciback: fix XEN_PCI_OP_enable_msix result
Prior to 2.6.19 and as of 2.6.31, pci_enable_msix() can return a
positive value to indicate the number of vectors (less than the amount
requested) that can be set up for a given device. Returning this as an
operation value (secondary result) is fine, but (primary) operation
results are expected to be negative (error) or zero (success) according
to the protocol. With the frontend fixed to match the XenoLinux
behavior, the backend can now validly return zero (success) here,
passing the upper limit on the number of vectors in op->value.
[upstream git commit 0ee46ec] Signed-off-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Konrad Rzeszutek Wilk [Thu, 12 Jan 2012 17:06:47 +0000 (12:06 -0500)]
xen/pciback: Support pci_reset_function, aka FLR or D3 support.
We use the __pci_reset_function_locked to perform the action.
Also on attaching ("bind") and detaching ("unbind") we save and
restore the configuration states. When the device is disconnected
from a guest we use the "pci_reset_function" to also reset the
device before being passed to another guest.
and ends up calling:
driver_bind:
device_lock(dev); <=== TAKES LOCK
XXXX_probe:
.. pci_enable_device()
...__pci_reset_function(), which calls
pci_dev_reset(dev, 0):
if (!0) {
device_lock(dev) <==== DEADLOCK
The __pci_reset_function_locked function allows the the drivers
'probe' function to call the "pci_reset_function" while still holding
the driver mutex lock.
Konrad Rzeszutek Wilk [Sat, 24 Mar 2012 13:18:57 +0000 (09:18 -0400)]
xen/acpi: Fix Kconfig dependency on CPU_FREQ
The functions: "acpi_processor_*" sound like they depend on CONFIG_ACPI_PROCESSOR
but in reality they are exposed when CONFIG_CPU_FREQ=[y|m]. As such
update the Kconfig to have this dependency and fix compile issues:
Konrad Rzeszutek Wilk [Tue, 13 Mar 2012 17:28:12 +0000 (13:28 -0400)]
xen/acpi-processor: Do not depend on CPU frequency scaling drivers.
With patch "xen/cpufreq: Disable the cpu frequency scaling drivers
from loading." we do not have to worry about said drivers loading
themselves before the xen-acpi-processor driver. Hence we can remove
the default selection (=y if CPU frequency drivers were built-in, or
=m if CPU frequency drivers were built as modules), and just
select =m for the default case.
[git commit 102b208] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Konrad Rzeszutek Wilk [Wed, 14 Mar 2012 00:06:57 +0000 (20:06 -0400)]
xen/cpufreq: Disable the cpu frequency scaling drivers from loading.
By using the functionality provided by "[CPUFREQ]: provide
disable_cpuidle() function to disable the API."
Under the Xen hypervisor we do not want the initial domain to exercise
the cpufreq scaling drivers. This is b/c the Xen hypervisor is
in charge of doing this as well and we can end up with both the
Linux kernel and the hypervisor trying to change the P-states
leading to weird performance issues.
[upstream 48cdd82] Acked-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
[v2: Fix compile error spotted by Benjamin Schweikert <b.schweikert@googlemail.com>]
Merge branch 'stable/not-upstreamed' into uek2-merge
* stable/not-upstreamed:
Xen: Export host physical CPU information to dom0
xen/mce: Change the machine check point
Add mcelog support from xen platform
mismerged the Makefile. The end result that the line
obj-$(CONFIG_XEN_PVHVM) += platform_pci.o was removed
leading to no PVonHVM driver to tell QEMU to unplug the device.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
x86/PCI: reduce severity of host bridge window conflict warnings
Host bridge windows are top-level resources, so if we find a host bridge
window conflict, it's probably with a hard-coded legacy reservation.
Moving host bridge windows is theoretically possible, but we don't support
it; we just ignore windows with conflicts, and it's not worth making this
a user-visible error.
Konrad Rzeszutek Wilk [Tue, 13 Mar 2012 22:05:58 +0000 (18:05 -0400)]
xen/blkback: Disable DISCARD support for loopback device (but leave for phy).
Until we back-port the changes from 3.3 which alter the loopback device
to support the full gamma of discard attributes. Otherwise we have to
punch through loop device to retrieve the underlaying disk size and
do other nasty things to get the proper information.
Also there is the outstanding issue that Logical Volumes won't pass
through the DISCARD support, so in most cases we can't take advantage of
this code until that gets fixed.
Fixes Oracle BZ#13779884 Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Mike Snitzer [Wed, 6 Jul 2011 19:30:50 +0000 (21:30 +0200)]
block: eliminate potential for infinite loop in blkdev_issue_discard
Due to the recently identified overflow in read_capacity_16() it was
possible for max_discard_sectors to be zero but still have discards
enabled on the associated device's queue.
Eliminate the possibility for blkdev_issue_discard to infinitely loop.
Interestingly this issue wasn't identified until a device, whose
discard_granularity was 0 due to read_capacity_16 overflow, was consumed
by blk_stack_limits() to construct limits for a higher-level DM
multipath device. The multipath device's resulting limits never had the
discard limits stacked because blk_stack_limits() will only do so if
the bottom device's discard_granularity != 0. This resulted in the
multipath device's limits.max_discard_sectors being 0.
Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
[Fixes Oracle BZ13779884] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk>
Konrad Rzeszutek Wilk [Wed, 14 Mar 2012 00:47:44 +0000 (20:47 -0400)]
config: Use the xen-acpi-processor instead of the cpufreq-xen driver.
The xen-acpi-processor (CONFIG_XEN_ACPI_PROCESSOR=y) is a more modern
version of the cpufreq-xen driver that can automatically inhibit
the cpufreq scaling drivers.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Konrad Rzeszutek Wilk [Fri, 3 Feb 2012 21:03:20 +0000 (16:03 -0500)]
xen/acpi-processor: C and P-state driver that uploads said data to hypervisor.
This driver solves three problems:
1). Parse and upload ACPI0007 (or PROCESSOR_TYPE) information to the
hypervisor - aka P-states (cpufreq data).
2). Upload the the Cx state information (cpuidle data).
3). Inhibit CPU frequency scaling drivers from loading.
The reason for wanting to solve 1) and 2) is such that the Xen hypervisor
is the only one that knows the CPU usage of different guests and can
make the proper decision of when to put CPUs and packages in proper states.
Unfortunately the hypervisor has no support to parse ACPI DSDT tables, hence it
needs help from the initial domain to provide this information. The reason
for 3) is that we do not want the initial domain to change P-states while the
hypervisor is doing it as well - it causes rather some funny cases of P-states
transitions.
For this to work, the driver parses the Power Management data and uploads said
information to the Xen hypervisor. It also calls acpi_processor_notify_smm()
to inhibit the other CPU frequency scaling drivers from being loaded.
Everything revolves around the 'struct acpi_processor' structure which
gets updated during the bootup cycle in different stages. At the startup, when
the ACPI parser starts, the C-state information is processed (processor_idle)
and saved in said structure as 'power' element. Later on, the CPU frequency
scaling driver (powernow-k8 or acpi_cpufreq), would call the the
acpi_processor_* (processor_perflib functions) to parse P-states information
and populate in the said structure the 'performance' element.
Since we do not want the CPU frequency scaling drivers from loading
we have to call the acpi_processor_* functions to parse the P-states and
call "acpi_processor_notify_smm" to stop them from loading.
There is also one oddity in this driver which is that under Xen, the
physical online CPU count can be different from the virtual online CPU count.
Meaning that the macros 'for_[online|possible]_cpu' would process only
up to virtual online CPU count. We on the other hand want to process
the full amount of physical CPUs. For that, the driver checks if the ACPI IDs
count is different from the APIC ID count - which can happen if the user
choose to use dom0_max_vcpu argument. In such a case a backup of the PM
structure is used and uploaded to the hypervisor.
[v1-v2: Initial RFC implementations that were posted]
[v3: Changed the name to passthru suggested by Pasi Kärkkäinen <pasik@iki.fi>]
[v4: Added vCPU != pCPU support - aka dom0_max_vcpus support]
[v5: Cleaned up the driver, fix bug under Athlon XP]
[v6: Changed the driver to a CPU frequency governor]
[v7: Jan Beulich <jbeulich@suse.com> suggestion to make it a cpufreq scaling driver
made me rework it as driver that inhibits cpufreq scaling driver]
[v8: Per Jan's review comments, fixed up the driver] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Conflicts: