www.infradead.org Git - users/jedix/linux-maple.git/log

Merge branch 'uek-2.6.39-200_be2netv2' of git://ca-git.us.oracle.com/linux-muvarov-public into uek-2.6.39-200

Orabug: 13448441

[mpt2sas] Bump driver vesion to 13.100.00.00

Orabug: 14040678
Signed-off-by: Nagalakshmi Nandigama <nagalakshmi.nandigama@lsi.com>
Signed-off-by: Maxim Uvarov <maxim.uvarov@oracle.com>

[mpt2sas] fix NULL pointer at ioc->pfacts

Orabug: 14040678
The ioc->pfacts member in the IOC structure is getting set to zero
following a call to _base_get_ioc_facts due to the memset in that routine.
So if the ioc->pfacts was read after a host reset, there would be a NULL
pointer dereference. The routine _base_get_ioc_facts is called from context
of host reset. The problem in _base_get_ioc_facts is the size of
Mpi2IOCFactsReply is 64, whereas the sizeof "struct mpt2sas_facts" is 60,
so there is a four byte overflow resulting from the memset.

Also, there is memset in _base_get_port_facts using the incorrect structure,
it should be "struct mpt2sas_port_facts" instead of Mpi2PortFactsReply.

Signed-off-by: Nagalakshmi Nandigama <nagalakshmi.nandigama@lsi.com>
Signed-off-by: Maxim Uvarov <maxim.uvarov@oracle.com>

[mpt2sas] A hard drive is going OFFLINE when there is a hard reset issued
and simultaneously another hard drive is hot unplugged

Orabug: 14040678
Following the host reset, the firmware discovery is reassigning another
hard drive in the topology to the same device handle as that device is
getting hot removed. Until the driver device removal routine is called,
there will be two hard drive with the matching device handle in the
internal device link list. In the device removal routine, a separate
function which moves the device from BLOCKED into OFFLINE state.
Since this routine is passed with the device handle passed as input parameter,
the routine will be traversing the internal device link list searching for
matching device handle. This results in two devices with matching
device handle, therefore both devices goes OFFLINE.

To fix this issue,the input parameter is changed from device handle to
SAS address, therefore only the device that is hot unplugged will be placed
in OFFLINE state.

Signed-off-by: Nagalakshmi Nandigama <nagalakshmi.nandigama@lsi.com>
Signed-off-by: Maxim Uvarov <maxim.uvarov@oracle.com>

[mpt2sas] Set the phy identifier of the end device to to the phy number of the parent device
it is linked to

Orabug: 14040678
The phy_identifier inside the routine _transport_set_identify()
is set to sas_device_page_zero->PhyNum. This returns the
phy number of the parent device this device is linked to.

Signed-off-by: Nagalakshmi Nandigama <nagalakshmi.nandigama@lsi.com>
Signed-off-by: Maxim Uvarov <maxim.uvarov@oracle.com>

[mpt2sas] While enabling phy, read the current port number from sas iounit page 0
instead of page 1

Orabug: 14040678
The port number is changing after disabling/enabling phys using the SysFS interface
This is because the firmware behavour changed where it would read the the port number
then set it to some different value even though Auto Port Config is turned on.
With this change of behavour in FW, it is possible that the expanders are moved
from one port to another after disabling /enabling phys. This is occuring because
the port number in sas iounit page 1 is not matching up to the current port in
page 0. In order to fix this the driver is modified to read the current
port number from sas iounit page 0 instead of page 1. Also copy the
port and phy flags over from page 0 to page 1.

Signed-off-by: Nagalakshmi Nandigama <nagalakshmi.nandigama@lsi.com>
Signed-off-by: Maxim Uvarov <maxim.uvarov@oracle.com>

[mpt2sas] Fix several endian issues found by runing sparse

Orabug: 14040678
Signed-off-by: Nagalakshmi Nandigama <nagalakshmi.nandigama@lsi.com>
Signed-off-by: Maxim Uvarov <maxim.uvarov@oracle.com>

[mpt2sas] Modify the source code as per the findings reported by the source
code analysis tool

Orabug: 14040678
Modified the source code as per the findings reported by the source
code analysis tool. Source code for the following functionalities
has been touched. None of the driver functionalities has changed.

- SMP Passthrough IOCTL
- Debug messages for MPT Replies (i.e. bit 9 of Logging Level)
- Task Management using sysfs
- Device removal, i.e. when a target device (including any PD within a volume) is removed, and Volume Deletion.
- Trace Buffer

Signed-off-by: Nagalakshmi Nandigama <nagalakshmi.nandigama@lsi.com>
Signed-off-by: Maxim Uvarov <maxim.uvarov@oracle.com>

[mpt2sas] Improvement were made to better protect the sas_device, raid_device,
and expander_device lists

There were possible race conditions surrounding reading an object
from the link list while from another context in the driver was
removing it. The nature of this enhancement is to rearrange locking
so the link lists are better protected.

Change set:
(1) numerous routines were rearranged so spin locks are held through
the entire time a link list object is being read from or written to.
(2) added new routines for object deletion from link list.  Thus ensuring
lock was held during the deletion of the link list object, then and memory
for object freed outside the lock. The memory was freed outside the lock
so driver had access to device object info which was required for
notifying the scsi mid layer that a device was getting deleted.
(3) added the ioc->blocking_handles parameter.  This is a bitmask used
to identify which devices need blocking when there is device loss.  This was
introduced so that lock can be held for the entire time traversing the link
list objects, and the bitmask was set to indicate which device handles need
blocking. Oustide the lock the ioc->blocking_handles bitmask is traversed,
with the respective device handle the scsi mid layer is called for moving
devices into blocking state.

Signed-off-by: Nagalakshmi Nandigama <nagalakshmi.nandigama@lsi.com>
Signed-off-by: Maxim Uvarov <maxim.uvarov@oracle.com>

[mpt2sas] Perform Target Reset instead of HBA reset when a SATA_PASSTHROUGH cmd
timeout happens

Orabug: 14040678
Signed-off-by: Nagalakshmi Nandigama <nagalakshmi.nandigama@lsi.com>
Signed-off-by: Maxim Uvarov <maxim.uvarov@oracle.com>

[mpt2sas] Added multisegment mode support for Linux BSG Driver

Orabug: 14040678
Added support for Block IO requests with multiple segments (vectors) in
the SMP handler of the SAS Transport Class. This is required by the
BSG driver. Multisegment support added for both, Request and Response.

Signed-off-by: Nagalakshmi Nandigama <nagalakshmi.nandigama@lsi.com>
Signed-off-by: Maxim Uvarov <maxim.uvarov@oracle.com>

[mpt2sas] remove the global mutex

Orabug: 14040678
When the lock_kernel and unlock_kernel routines were removed in the
2.6.39 kernel, a global mutex was added on top of the existing mutex
which already existed. With this implementation, only one IOCTL
will be active at any time no matter how many ever controllers
are present. This causes poor performance.

Removed the global mutex so that the driver can work with the existing
semaphore that was already part of the existing code.

Signed-off-by: Nagalakshmi Nandigama <nagalakshmi.nandigama@lsi.com>
Signed-off-by: Maxim Uvarov <maxim.uvarov@oracle.com>

[mpt2sas] MPI next revision header update

Orabug: 14040678
Changeset in MPI headers:
1) Bumped MPI2_HEADER_VERSION_UNIT
2) Added 4K sectors supported bit to CapabilitiesFlags field of IOC Page 6.
3) Added UEFIVersion field to BIOS Page 1 and defined additional
BiosOptions bits to control UEFI behavior.

Signed-off-by: Nagalakshmi Nandigama <nagalakshmi.nandigama@lsi.com>
Signed-off-by: Maxim Uvarov <maxim.uvarov@oracle.com>

be2net: make be_vlan_add_vid() void

Make compatibility for 3.0.x kernel, and make be_vlan_add_vid() void.
Signed-off-by: Maxim Uvarov <maxim.uvarov@oracle.com>

be2net: Record receive queue index in skb to aid RPS.

Signed-off-by: Sarveshwar Bandi <Sarveshwar.Bandi@emulex.com>
Signed-off-by: Somnath Kotur <somnath.kotur@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

be2net: Fix FW download for BE

Skip flashing a FW component if that component is not present in a
particular FW UFI image.

Signed-off-by: Somnath Kotur <somnath.kotur@emulex.com>
Signed-off-by: Padmanabh Ratnakar <padmanabh.ratnakar@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

be2net: Fix traffic stall INTx mode

EQ is getting armed wrongly in INTx mode as INTx interrupt is taking
some time to deassert. This can cause another interrupt while NAPI is
scheduled and scheduling a NAPI in interrupt does not take effect.
This causes interrupt to be missed and traffic stalls. Fixing this by
preventing wrong arming of EQ.

Signed-off-by: Padmanabh Ratnakar <padmanabh.ratnakar@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

be2net: fix ethtool get settings

ethtool get settings was not displaying all the settings correctly.
use the get_phy_info to get more information about the PHY to fix this.

Signed-off-by: Ajit Khaparde <ajit.khaparde@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

be2net: fix programming of VLAN tags for VF

Signed-off-by: Ajit Khaparde <ajit.khaparde@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

be2net: reset queue address after freeing

This will prevent double free in some cases where be_clear() is called
for cleanup when be_setup() fails half-way.

Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

be2net: fix tx completion cleanup

As a part of be_close(), instead of waiting for a max of 200ms for each TXQ,
wait for a total of 200ms for completions from all TXQs to arrive.

Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

be2net: refactor/cleanup vf configuration code

Mainline commit:
11ac75ed1eb9d8f5ff067fa9a82ebf5075989281
- use adapter->num_vfs (and not the module param) to store the actual
number of vfs created. Use the same variable to reflect SRIOV
enable/disable state. So, drop the adapter->sriov_enabled field.

- use for_all_vfs() macro in VF configuration code

- drop the "vf_" prefix for the fields of be_vf_cfg; the prefix is
redundant and removing it helps reduce line wrap

Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 11ac75ed1eb9d8f5ff067fa9a82ebf5075989281)

Conflicts:

drivers/net/benet/be_main.c

be2net: event queue re-design

v2: Fixed up the bad typecasting pointed out by David...

In the current design 8 TXQs are serviced by 1 EQ, while each RSS queue
is serviced by a separate EQ. This is being changed as follows:

- Upto 8 EQs will be used (based on the availabilty of msix vectors).
Each EQ will handle 1 RSS and 1 TX ring. The default non-RSS RX queue and
MCC queue are handled by the last EQ.

- On cards which provide support, upto 8 RSS rings will be used, instead
of the current limit of 4.

The new design allows spreading the TX multi-queue completion processing
across multiple CPUs unlike the previous design.

Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Maxim Uvarov <maxim.uvarov@oracle.com>

be2net: update the driver version

Signed-off-by: Sarveshwar Bandi <sarveshwar.bandi@emulex.com>

be2net: Fix EEH error reset before a flash dump completes

An EEH error can cause the FW to trigger a flash debug dump.
Resetting the card while flash dump is in progress can cause it not to recover.
Wait for it to finish before letting EEH flow to reset the card.

Signed-off-by: Sathya Perla <Sathya.Perla@emulex.com>
Signed-off-by: Somnath Kotur <somnath.kotur@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

be2net: Ignore status of some ioctls during driver load

Signed-off-by: Ajit Khaparde <ajit.khaparde@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

be2net: Fix wrong status getting returned for MCC commands

MCC Response CQEs are processed as part of NAPI poll routine and
also synchronously. If MCC completions are consumed by NAPI poll
routine, wrong status is returned to synchronously waiting routine.
Fix this by getting status of MCC command from command response
instead of response CQEs.

Signed-off-by: Padmanabh Ratnakar <padmanabh.ratnakar@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

be2net: Fix Lancer statistics

Fix port num sent in command to get stats. Also skip unnecessary
parsing of stats for Lancer.

Signed-off-by: Padmanabh Ratnakar <padmanabh.ratnakar@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

be2net: Fix ethtool self test for Lancer

Lancer does not support DDR self test. Fix ethtool self test by
skipping this test for Lancer.

Signed-off-by: Padmanabh Ratnakar <padmanabh.ratnakar@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

be2net: Fix FW download in Lancer

Increase time given by driver to adapter for completing FW download
to 30 seconds. Also return correct status when FW download times out.

Signed-off-by: Padmanabh Ratnakar <padmanabh.ratnakar@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

be2net: Fix VLAN/multicast packet reception

VLAN and multicast hardware filters are limited and can get
exhausted in adapters with many PCI functions. If setting
a VLAN or multicast filter fails due to lack of sufficient
hardware resources, these packets get dropped. Fix this by
switching to VLAN or multicast promiscous mode so that these
packets are not dropped.

Signed-off-by: Padmanabh Ratnakar <padmanabh.ratnakar@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

be2net: Fix number of vlan slots in flex mode

In flex10 mode the number of vlan slots supported is halved.

Signed-off-by: Ajit Khaparde <ajit.khaparde@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

be2net: enable WOL by default if h/w supports it

Signed-off-by: Ajit Khaparde <ajit.khaparde@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

be2net: Remove unused OFFSET_IN_PAGE() macro

Signed-off-by: Roland Dreier <roland@purestorage.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

be2net: enable RSS for ipv6 pkts

Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

be2net: Use new implementation of get mac list command

VFs use get mac list command to get their mac address. The format of
this command has changed. Update driver to use the new format.

Signed-off-by: Mammatha Edhala <mammatha.edhala@emulex.com>
Signed-off-by: Padmanabh Ratnakar <padmanabh.ratnakar@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

be2net: Fix link status query command

Version number in query link status command is getting overwritten in
be_wrb_cmd_hdr_prepare() routine. Move the initialization to fix this
issue. Also initialize the domain field.

Signed-off-by: Padmanabh Ratnakar <padmanabh.ratnakar@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ethtool: Null-terminate filename passed to ethtool_ops::flash_device

The parameters for ETHTOOL_FLASHDEV include a filename, which ought to
be null-terminated. Currently the only driver that implements
ethtool_ops::flash_device attempts to add a null terminator if
necessary, but does it wrongly. Do it in the ethtool core instead.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

be2net: add descriptions for stat counters reported via ethtool

Also rename a few counters appropritely and delete 2 counters that are not
implemented in HW.

vlan_mismatch_drops does not exist in BE3 and is accounted for in
address_mismatch_drops. Do the same thing for BE2 and Lancer.

Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

be2net: allocate more headroom in incoming skbs

Allocation of 64 bytes in skb headroom is not enough if we have to pull
ethernet + ipv6 + tcp headers, and/or extra tunneling header.

Its currently not noticed because netdev_alloc_skb_ip_align(64) give us
more room, thanks to power-of-two kmalloc() roundups.

Make sure we ask for 128 bytes so that side effects of upcoming patches
from Ian Campbell dont decrease benet rx performance, because of extra
skb head reallocations.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Ian Campbell <Ian.Campbell@citrix.com>
Cc: Vasundhara Volam <vasundhara.volam@emulex.com>
Cc: Sathya Perla <sathya.perla@emulex.com>
Cc: Ajit Khaparde <ajit.khaparde@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

netdev: make net_device_ops const

More drivers where net_device_ops should be const.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

be2net: fix be_vlan_add/rem_vid

1) fix be_vlan_add/rem_vid to return proper status
2) perform appropriate housekeeping if firmware command succeeds.

Signed-off-by: Ajit Khaparde <ajit.khaparde@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

be2net: Fix INTx processing for Lancer

Lancer does not have HW registers to indicate the EQ causing the INTx
interrupt. As a result EQE entries of one EQ may be consumed when interrupt
is caused by another EQ. Fix this by arming CQs at the end of NAPI poll
routine to regenerate the EQEs.

Signed-off-by: Padmanabh Ratnakar <padmanabh.ratnakar@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

be2net: Add support for Skyhawk cards

Signed-off-by: Ajit Khaparde <ajit.khaparde@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

be2net: fix ethtool ringparam reporting

The ethtool "-g" option is supposed to report the max queue length and
user modified queue length for RX and TX queues. be2net doesn't support
user modification of queue lengths. So, the correct values for these
would be the max numbers.
be2net incorrectly reports the queue used values for these fields.

Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

be2net: workaround to fix a bug in BE

disable Tx vlan offloading in certain cases.

Signed-off-by: Ajit Khaparde <ajit.khaparde@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

be2net: update some counters to display via ethtool

update pmem_fifo_overflow_drop, rx_priority_pause_frames counters.

Signed-off-by: Ajit Khaparde <ajit.khaparde@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: make vlan ndo_vlan_rx_[add/kill]_vid return error value

Let caller know the result of adding/removing vlan id to/from vlan
filter.

In some drivers I make those functions to just return 0. But in those
where there is able to see if hw setup went correctly, return value is
set appropriately.

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

be2net: netpoll support

Add missing netpoll support.

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'uek2-merge' of git://ca-git.us.oracle.com/linux-konrad-public into uek-2.6.39-200

Merge branch 'stable/for-linus-3.4.rebased' into uek2-merge

* stable/for-linus-3.4.rebased:
  xen/pci: don't use PCI BIOS service for configuration space accesses
  xen/pte: Fix crashes when trying to see non-existent PGD/PMD/PUD/PTEs
  xen/apic: Return the APIC ID (and version) for CPU 0.
  drivers/video/xen-fbfront.c: add missing cleanup code
  xen/x86: Workaround 'x86/ioapic: Add register level checks to detect bogus io-apic entries'

Conflicts:
arch/x86/xen/enlighten.c

xen/pci: don't use PCI BIOS service for configuration space accesses

The accessing PCI configuration space with the PCI BIOS32 service does
not work in PV guests.

On systems without MMCONFIG or where the BIOS hasn't marked the
MMCONFIG region as reserved in the e820 map, the BIOS service is
probed (even though direct access is preferred) and this hangs.

CC: stable@kernel.org
Acked-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
[upstream git commit 76a8df7b49168509df02461f83fab117a4a86e08]
Conflicts:

arch/x86/xen/enlighten.c

xen/Kconfig: fix Kconfig layout

Fit it into 80 columns so that it is readable in menuconfig.

[upstream git commit 0d8deb3fb1ea198aeaf1359b69f15e173ef8d3ed]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

xen/pte: Fix crashes when trying to see non-existent PGD/PMD/PUD/PTEs

If I try to do "cat /sys/kernel/debug/kernel_page_tables"
I end up with:

BUG: unable to handle kernel paging request at ffffc7fffffff000
IP: [<ffffffff8106aa51>] ptdump_show+0x221/0x480
PGD 0
Oops: 0000 [#1] SMP
CPU 0
.. snip..
RAX: 0000000000000000 RBX: ffffc00000000fff RCX: 0000000000000000
RDX: 0000800000000000 RSI: 0000000000000000 RDI: ffffc7fffffff000

which is due to the fact we are trying to access a PFN that is not
accessible to us. The reason (at least in this case) was that
PGD[256] is set to __HYPERVISOR_VIRT_START which setup (by the
hypervisor) to point to a read-only linear map of the MFN->PFN array.
During our parsing we would get the MFN (a valid one), try to look
it up in the MFN->PFN tree and find it invalid and return ~0 as PFN.
Then pte_mfn_to_pfn would happilly feed that in, attach the flags
and return it back to the caller. In this case the ptdump_show
bitshifts it and we get this invalid value.

Instead of doing all of that, we detect the ~0 case and just return
!_PAGE_PRESENT.

CC: stable@kernel.org
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

xen/apic: Return the APIC ID (and version) for CPU 0.

On x86_64 on AMD machines where the first APIC_ID is not zero, we get:

ACPI: LAPIC (acpi_id[0x01] lapic_id[0x10] enabled)
BIOS bug: APIC version is 0 for CPU 1/0x10, fixing up to 0x10
BIOS bug: APIC version mismatch, boot CPU: 0, CPU 1: version 10

which means that when the ACPI processor driver loads and
tries to parse the _Pxx states it fails to do as, as it
ends up calling acpi_get_cpuid which does this:

for_each_possible_cpu(i) {
if (cpu_physical_id(i) == apic_id)
return i;
}

And the bootup CPU, has not been found so it fails and returns -1
for the first CPU - which then subsequently in the loop that
"acpi_processor_get_info" does results in returning an error, which
means that "acpi_processor_add" failing and per_cpu(processor)
is never set (and is NULL).

That means that when xen-acpi-processor tries to load (much much
later on) and parse the P-states it gets -ENODEV from
acpi_processor_register_performance() (which tries to read
the per_cpu(processor)) and fails to parse the data.

Reported-by-and-Tested-by: Stefan Bader <stefan.bader@canonical.com>
Suggested-by: Boris Ostrovsky <boris.ostrovsky@amd.com>
[v2: Bit-shift APIC ID by 24 bits]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

drivers/video/xen-fbfront.c: add missing cleanup code

The operations in the subsequent error-handling code appear to be also
useful here.

Acked-by: Florian Tobias Schandinat <FlorianSchandinat@gmx.de>
Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
[v1: Collapse some of the error handling functions]
[v2: Fix compile warning]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

xen/x86: Workaround 'x86/ioapic: Add register level checks to detect bogus io-apic entries'

The above mentioned patch checks the IOAPIC and if it contains
-1, then it unmaps said IOAPIC. But under Xen we get this:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000040
IP: [<ffffffff8134e51f>] xen_irq_init+0x1f/0xb0
PGD 0
Oops: 0002 [#1] SMP
CPU 0
Modules linked in:

Pid: 1, comm: swapper/0 Not tainted 3.2.10-3.fc16.x86_64 #1 Dell Inc. Inspiron
1525                  /0U990C
RIP: e030:[<ffffffff8134e51f>]  [<ffffffff8134e51f>] xen_irq_init+0x1f/0xb0
RSP: e02b: ffff8800d42cbb70  EFLAGS: 00010202
RAX: 0000000000000000 RBX: 00000000ffffffef RCX: 0000000000000001
RDX: 0000000000000040 RSI: 00000000ffffffef RDI: 0000000000000001
RBP: ffff8800d42cbb80 R08: ffff8800d6400000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 00000000ffffffef
R13: 0000000000000001 R14: 0000000000000001 R15: 0000000000000010
FS:  0000000000000000(0000) GS:ffff8800df5fe000(0000) knlGS:0000000000000000
CS:  e033 DS: 0000 ES: 0000 CR0:000000008005003b
CR2: 0000000000000040 CR3: 0000000001a05000 CR4: 0000000000002660
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process swapper/0 (pid: 1, threadinfo ffff8800d42ca000, task ffff8800d42d0000)
Stack:
00000000ffffffef 0000000000000010 ffff8800d42cbbe0 ffffffff8134f157
ffffffff8100a9b2 ffffffff8182ffd1 00000000000000a0 00000000829e7384
0000000000000002 0000000000000010 00000000ffffffff 0000000000000000
Call Trace:
[<ffffffff8134f157>] xen_bind_pirq_gsi_to_irq+0x87/0x230
[<ffffffff8100a9b2>] ? check_events+0x12+0x20
[<ffffffff814bab42>] xen_register_pirq+0x82/0xe0
[<ffffffff814bac1a>] xen_register_gsi.part.2+0x4a/0xd0
[<ffffffff814bacc0>] acpi_register_gsi_xen+0x20/0x30
[<ffffffff8103036f>] acpi_register_gsi+0xf/0x20
[<ffffffff8131abdb>] acpi_pci_irq_enable+0x12e/0x202
[<ffffffff814bc849>] pcibios_enable_device+0x39/0x40
[<ffffffff812dc7ab>] do_pci_enable_device+0x4b/0x70
[<ffffffff812dc878>] __pci_enable_device_flags+0xa8/0xf0
[<ffffffff812dc8d3>] pci_enable_device+0x13/0x20

The reason we are dying is b/c the call acpi_get_override_irq() is used,
which returns the polarity and trigger for the IRQs. That function calls
mp_find_ioapics to get the 'struct ioapic' structure - which along with the
mp_irq[x] is used to figure out the default values and the polarity/trigger
overrides. Since the mp_find_ioapics now returns -1 [b/c the IOAPIC is filled
with 0xffffffff], the acpi_get_override_irq() stops trying to lookup in the
mp_irq[x] the proper INT_SRV_OVR and we can't install the SCI interrupt.

The proper fix for this is going in v3.5 and adds an x86_io_apic_ops
struct so that platforms can override it. But for v3.4 lets carry this
work-around. This patch does that by providing a slightly different variant
of the fake IOAPIC entries.

[upstream git commit]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

SPEC: v2.6.39-200.6.0

Signed-off-by: Maxim Uvarov <maxim.uvarov@oracle.com>

[USB] cdc-acm: Increase number of devices to 64

Orabug: 13693812
Increase usb acm devices to 64.

Signed-off-by: Joe Jin <joe.jin@oracle.com>

git-changelog: generate date entry

Signed-off-by: Maxim Uvarov <maxim.uvarov@oracle.com>

[scsi] hpsa: Remove some PCI IDs if for OL5.

Remove some PCI IDs if for OL5.

Signed-off-by: Joe Jin <joe.jin@oracle.com>

[block] cciss: fix incorrect PCI IDs and add two new ones

This patch backport from https://lkml.org/lkml/2011/3/4/207
commit d241b7cbd5b05c591aff96c5f1f0b7d616fdc0c3
Author: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Date: Fri Mar 4 21:45:14 2011 -0600

hpsa: fix some incorrect PCI IDs and add a couple of new ones.
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Joe Jin <joe.jin@oracle.com>

[scsi] hpsa: add some older controllers to the kdump blacklist

This backport from RHEL6U3.

Signed-off-by: Joe Jin <joe.jin@oracle.com>

[block] cciss: Add IRQF_SHARED back in for the non-MSI(X) interrupt handler

This backport from RHEL6U3.

Signed-off-by: Joe Jin <joe.jin@oracle.com>

[block] cciss: add some older controllers to the kdump blacklist

This backport from RHEL6U3.

Signed-off-by: Joe Jin <joe.jin@oracle.com>

SPEC: v2.6.39-200.5.0

Signed-off-by: Maxim Uvarov <maxim.uvarov@oracle.com>

be2net: query link status in be_open()

be2net gets an async link status notification from the FW when it creates
an MCC queue. There are some cases in which this gratuitous notification
is not received from FW. To cover this explicitly query the link status
in be_open().

Signed-off-by: Sarveshwar Bandi <sarveshwar.bandi@emulex.com>

Merge branch 'stable/for-linus-3.4.rebased' into uek2-merge

* stable/for-linus-3.4.rebased:
  drivers/video/xen-fbfront.c: add missing cleanup code
  xen: correctly check for pending events when restoring irq flags
  xen/smp: Fix crash when booting with ACPI hotplug CPUs.
  xen: use the pirq number to check the pirq_eoi_map

xen/acpi: Workaround broken BIOSes exporting non-existing C-states.

We did a similar check for the P-states but did not do it for
the C-states. What we want to do is ignore cases where the DSDT
has definition for sixteen CPUs, but the machine only has eight
CPUs and we get:
xen-acpi-processor: (CX): Hypervisor error (-22) for ACPI CPU14

[upstream git commit b930fe5e1f5646e071facda70b25b137ebeae5af]
Reported-by: Tobias Geiger <tobias.geiger@vido.info>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

xen/enlighten: Disable MWAIT_LEAF so that acpi-pad won't be loaded.

There are exactly four users of __monitor and __mwait:

- cstate.c (which allows acpi_processor_ffh_cstate_enter to be called
   when the cpuidle API drivers are used. However patch
   "cpuidle: replace xen access to x86 pm_idle and default_idle"
   provides a mechanism to disable the cpuidle and use safe_halt.
- smpboot (which allows mwait_play_dead to be called). However
   safe_halt is always used so we skip that.
- intel_idle (same deal as above).
- acpi_pad.c. This the one that we do not want to run as we
   will hit the below crash.

Why do we want to expose MWAIT_LEAF in the first place?
We want it for the xen-acpi-processor driver - which uploads
C-states to the hypervisor. If MWAIT_LEAF is set, the cstate.c
sets the proper address in the C-states so that the hypervisor
can benefit from using the MWAIT functionality. And that is
the sole reason for using it.

Without this patch, if a module performs mwait or monitor we
get this:

invalid opcode: 0000 [#1] SMP
CPU 2
.. snip..
Pid: 5036, comm: insmod Tainted: G           O 3.4.0-rc2upstream-dirty #2 Intel Corporation S2600CP/S2600CP
RIP: e030:[<ffffffffa000a017>]  [<ffffffffa000a017>] mwait_check_init+0x17/0x1000 [mwait_check]
RSP: e02b:ffff8801c298bf18  EFLAGS: 00010282
RAX: ffff8801c298a010 RBX: ffffffffa03b2000 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffff8801c29800d8 RDI: ffff8801ff097200
RBP: ffff8801c298bf18 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000
R13: ffffffffa000a000 R14: 0000005148db7294 R15: 0000000000000003
FS:  00007fbb364f2700(0000) GS:ffff8801ff08c000(0000) knlGS:0000000000000000
CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 000000000179f038 CR3: 00000001c9469000 CR4: 0000000000002660
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process insmod (pid: 5036, threadinfo ffff8801c298a000, task ffff8801c29cd7e0)
Stack:
ffff8801c298bf48 ffffffff81002124 ffffffffa03b2000 00000000000081fd
000000000178f010 000000000178f030 ffff8801c298bf78 ffffffff810c41e6
00007fff3fb30db9 00007fff3fb30db9 00000000000081fd 0000000000010000
Call Trace:
[<ffffffff81002124>] do_one_initcall+0x124/0x170
[<ffffffff810c41e6>] sys_init_module+0xc6/0x220
[<ffffffff815b15b9>] system_call_fastpath+0x16/0x1b
Code: <0f> 01 c8 31 c0 0f 01 c9 c9 c3 00 00 00 00 00 00 00 00 00 00 00 00
RIP  [<ffffffffa000a017>] mwait_check_init+0x17/0x1000 [mwait_check]
RSP <ffff8801c298bf18>
---[ end trace 16582fc8a3d1e29a ]---
Kernel panic - not syncing: Fatal exception

With this module (which is what acpi_pad.c would hit):

MODULE_AUTHOR("Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>");
MODULE_DESCRIPTION("mwait_check_and_back");
MODULE_LICENSE("GPL");
MODULE_VERSION();

static int __init mwait_check_init(void)
{
__monitor((void *)&current_thread_info()->flags, 0, 0);
__mwait(0, 0);
return 0;
}
static void __exit mwait_check_exit(void)
{
}
module_init(mwait_check_init);
module_exit(mwait_check_exit);

[upstream git commit df88b2d96e36d9a9e325bfcd12eb45671cbbc937]
Reported-by: Liu, Jinsong <jinsong.liu@intel.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Conflicts:

arch/x86/xen/enlighten.c

drivers/video/xen-fbfront.c: add missing cleanup code

The operations in the subsequent error-handling code appear to be also
useful here.

[upstream git commit c92c928b19bdc5da64d85fcc5bac95b5ab3a5743]
Acked-by: Florian Tobias Schandinat <FlorianSchandinat@gmx.de>
Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
[v1: Collapse some of the error handling functions]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

xen: correctly check for pending events when restoring irq flags

In xen_restore_fl_direct(), xen_force_evtchn_callback() was being
called even if no events were pending. This resulted in (depending on
workload) about a 100 times as many xen_version hypercalls as
necessary.

Fix this by correcting the sense of the conditional jump.

This seems to give a significant performance benefit for some
workloads.

There is some subtle tricksy "..since the check here is trying to
check both pending and masked in a single cmpw, but I think this is
correct. It will call check_events now only when the combined
mask+pending word is 0x0001 (aka unmasked, pending)." (Ian)

[upstream git commit 7eb7ce4d2e8991aff4ecb71a81949a907ca755ac]
CC: stable@kernel.org
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

xen/smp: Fix crash when booting with ACPI hotplug CPUs.

When we boot on a machine that can hotplug CPUs and we
are using 'dom0_max_vcpus=X' on the Xen hypervisor line
to clip the amount of CPUs available to the initial domain,
we get this:

(XEN) Command line: com1=115200,8n1 dom0_mem=8G noreboot dom0_max_vcpus=8 sync_console mce_verbosity=verbose console=com1,vga loglvl=all guest_loglvl=all
.. snip..
DMI: Intel Corporation S2600CP/S2600CP, BIOS SE5C600.86B.99.99.x032.072520111118 07/25/2011
.. snip.
SMP: Allowing 64 CPUs, 32 hotplug CPUs
installing Xen timer for CPU 7
cpu 7 spinlock event irq 361
NMI watchdog: disabled (cpu7): hardware events not enabled
Brought up 8 CPUs
.. snip..
[acpi processor finds the CPUs are not initialized and starts calling
arch_register_cpu, which creates /sys/devices/system/cpu/cpu8/online]
CPU 8 got hotplugged
CPU 9 got hotplugged
CPU 10 got hotplugged
.. snip..
initcall 1_acpi_battery_init_async+0x0/0x1b returned 0 after 406 usecs
calling  erst_init+0x0/0x2bb @ 1

[and the scheduler sticks newly started tasks on the new CPUs, but
said CPUs cannot be initialized b/c the hypervisor has limited the
amount of vCPUS to 8 - as per the dom0_max_vcpus=8 flag.
The spinlock tries to kick the other CPU, but the structure for that
is not initialized and we crash.]
BUG: unable to handle kernel paging request at fffffffffffffed8
IP: [<ffffffff81035289>] xen_spin_lock+0x29/0x60
PGD 180d067 PUD 180e067 PMD 0
Oops: 0002 [#1] SMP
CPU 7
Modules linked in:

Pid: 1, comm: swapper/0 Not tainted 3.4.0-rc2upstream-00001-gf5154e8 #1 Intel Corporation S2600CP/S2600CP
RIP: e030:[<ffffffff81035289>]  [<ffffffff81035289>] xen_spin_lock+0x29/0x60
RSP: e02b:ffff8801fb9b3a70  EFLAGS: 00010282

With this patch, we cap the amount of vCPUS that the initial domain
can run, to exactly what dom0_max_vcpus=X has specified.

In the future, if there is a hypercall that will allow a running
domain to expand past its initial set of vCPUS, this patch should
be re-evaluated.

[upstream git commit cf405ae612b0f7e2358db7ff594c0e94846137aa]
CC: stable@kernel.org
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

xen: use the pirq number to check the pirq_eoi_map

In pirq_check_eoi_map use the pirq number rather than the Linux irq
number to check whether an eoi is needed in the pirq_eoi_map.

The reason is that the irq number is not always identical to the
pirq number so if we wrongly use the irq number to check the
pirq_eoi_map we are going to check for the wrong pirq to EOI.

As a consequence some interrupts might not be EOI'ed by the
guest correctly.

[upstream git commit 521394e4e679996955bc351cb6b64639751db2ff]
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Tested-by: Tobias Geiger <tobias.geiger@vido.info>
[v1: Added some extra wording to git commit]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

SPEC: v2.6.39-200.4.0

Signed-off-by: Maxim Uvarov <maxim.uvarov@oracle.com>

Merge branch 'uek2-merge' of git://ca-git.us.oracle.com/linux-konrad-public into uek-2.6.39-200

SPEC: v2.6.39-200.3.0

Signed-off-by: Maxim Uvarov <maxim.uvarov@oracle.com>

Merge branch 'loop' of git://ca-git.us.oracle.com/linux-dkleikam-public into uek-2.6.39-200

loop: loop_thread needs to set the PF_LESS_THROTTLE flag

The underlying file system may call balance_dirty_pages. We don't want
it to throttle there since we may be in the process of writing dirty
pages. This patch addresses the problem in the same manner as a local
nfs mount, as nfsd does the same.

Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com>

Merge branch 'stable/for-linus-3.4.rebased' into uek2-merge

* stable/for-linus-3.4.rebased:
Revert "xen/p2m: m2p_find_override: use list_for_each_entry_safe"

Revert "xen/p2m: m2p_find_override: use list_for_each_entry_safe"

This reverts commit f977bf653bafd05c48e8db3f789da5e8c3006df1.

[upstream git commit 3d81acb1cdb242378a1acb3eb1bc28c6bb5895f1]

iov_iter: missing assignment of ii_bvec_ops.ii_shorten

This seems to have been an oversight when porting the patches. This
function should never be called, but it should be defined just in case.

Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com>

SPEC: v2.6.39-200.2.0

Signed-off-by: Maxim Uvarov <maxim.uvarov@oracle.com>

regset: Return -EFAULT, not -EIO, on host-side memory fault

There is only one error code to return for a bad user-space buffer
pointer passed to a system call in the same address space as the
system call is executed, and that is EFAULT. Furthermore, the
low-level access routines, which catch most of the faults, return
EFAULT already.
This fixes: CVE-2012-1097

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Roland McGrath <roland@hack.frob.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

regset: Prevent null pointer reference on readonly regsets

The regset common infrastructure assumed that regsets would always
have .get and .set methods, but not necessarily .active methods.
Unfortunately people have since written regsets without .set methods.

Rather than putting in stub functions everywhere, handle regsets with
null .get or .set methods explicitly.
This fixes: CVE-2012-1097

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Roland McGrath <roland@hack.frob.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

cifs: fix dentry refcount leak when opening a FIFO on lookup

commit 5bccda0ebc7c0331b81ac47d39e4b920b198b2cd upstream.

The cifs code will attempt to open files on lookup under certain
circumstances. What happens though if we find that the file we opened
was actually a FIFO or other special file?

Currently, the open filehandle just ends up being leaked leading to
a dentry refcount mismatch and oops on umount. Fix this by having the
code close the filehandle on the server if it turns out not to be a
regular file. While we're at it, change this spaghetti if statement
into a switch too.
This fixes: CVE-2012-1090

Reported-by: CAI Qian <caiqian@redhat.com>
Tested-by: CAI Qian <caiqian@redhat.com>
Reviewed-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

git-changelog: add brackets around cve

Signed-off-by: Maxim Uvarov <maxim.uvarov@oracle.com>

Merge branch 'stable/for-linus-3.4.rebased' into uek2-merge

* stable/for-linus-3.4.rebased: (29 commits)
  xen/blkback: Fix warning error.
  xen/blkback: Make optional features be really optional.
  xen-blkfront: module exit handling adjustments
  xen-blkfront: properly name all devices
  xen-blkfront: set pages are FOREIGN_FRAME when sharing them
  xen: EXPORT_SYMBOL set_phys_to_machine
  xen-blkfront: make blkif_io_lock spinlock per-device
  xen/blkfront: don't put bdev right after getting it
  xen-blkfront: use bitmap_set() and bitmap_clear()
  xen/blkback: Enable blkback on HVM guests
  xen/blkback: use grant-table.c hypercall wrappers
  xen/p2m: m2p_find_override: use list_for_each_entry_safe
  xen/gntdev: do not set VM_PFNMAP
  xen/grant-table: add error-handling code on failure of gnttab_resume
  xen: only check xen_platform_pci_unplug if hvm
  xen: initialize platform-pci even if xen_emul_unplug=never
  xen kconfig: relax INPUT_XEN_KBDDEV_FRONTEND deps
  xen: support pirq_eoi_map
  xen/smp: Remove unnecessary call to smp_processor_id()
  xen/smp: Fix bringup bug in AP code.
  ...

xen/blkback: Fix warning error.

drivers/block/xen-blkback/xenbus.c: In function 'xen_blkbk_discard':
drivers/block/xen-blkback/xenbus.c:419:4: warning: passing argument 1 of 'dev_warn' makes pointer from integer without a cast
+[enabled by default]
include/linux/device.h:894:5: note: expected 'const struct device *' but argument is of type 'long int'

It is unclear how that mistake made it in. It surely is wrong.

[upstream git commit a71e23d]
Acked-by: Jens Axboe <axboe@kernel.dk>
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Conflicts:

drivers/block/xen-blkback/xenbus.c

xen/blkback: Make optional features be really optional.

They were using the xenbus_dev_fatal() function which would
change the state of the connection immediately. Which is not
what we want when we advertise optional features.

So make 'feature-discard','feature-barrier','feature-flush-cache'
optional.

[upstream git commit 3389bb8]
Suggested-by: Jan Beulich <JBeulich@suse.com>
[v1: Made the discard function void and static]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Conflicts:

drivers/block/xen-blkback/xenbus.c

Merge branch 'stable/xen-network-3.3.rebase' into uek2-merge

* stable/xen-network-3.3.rebase:
  xen-netback: make ops structs const
  netback: fix typo in comment
  netback: remove redundant assignment
  netback: Fix alert message.
  xen-netback: use correct index for invalidation in xen_netbk_tx_check_gop()
  net: xen-netback: correctly restart Tx after a VM restore/migrate
  xen/netback: Add module alias for autoloading

xen-blkfront: module exit handling adjustments

The blkdev major must be released upon exit, or else the module can't
attach to devices using the same majors upon being loaded again. Also
avoid leaking the minor tracking bitmap.

[upstream git commit 4e55b3c]
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

xen-blkfront: properly name all devices

- devices beyond xvdzz didn't get proper names assigned at all
- extended devices with minors not representable within the kernel's
major/minor bit split spilled into foreign majors

[upstream git commit 85b6984]
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

xen-blkfront: set pages are FOREIGN_FRAME when sharing them

Set pages as FOREIGN_FRAME whenever blkfront shares them with another
domain. Then when blkfront un-share them, also removes the
FOREIGN_FRAME_BIT from the p2m.

We do it so that when the source and the destination domain are the same
(blkfront connected to a disk backend in the same domain) we can more easily
recognize which ones are the source pfns and which ones are the
destination pfns (both are going to be pointing to the same mfns).

Without this patch enstablishing a connection between blkfront and QEMU
qdisk in the same domain causes QEMU to hang and never return.

The scenario where this used is when a disk image in QCOW2 is used
for extracting the kernel and initrd image. The QCOW2 image file cannot
be loopback-ed and to run 'pygrub', the weird scaffolding of:
- setup QEMU and qdisk with the qcow2 image [disk backend]
- setup xen-blkfront mounting said disk backend in the domain.
- extract kernel and initrd
- tear it down.

The MFNs shared shared by the frontend are going to back two
different sets of PFNs: the original PFNs allocated by the frontend and
the new ones allocated by gntdev for the backend.

The problem is that when Linux calls mfn_to_pfn, passing as argument
one of the MFN shared by the frontend, we want to get the PFN returned by
m2p_find_override_pfn (that is the PFN setup by gntdev) but actually we
get the original PFN allocated by the frontend because considering that
the frontend and the backend are in the same domain:

pfn = machine_to_phys_mapping[mfn];
mfn2 = get_phys_to_machine(pfn);

in this case mfn == mfn2.

One possible solution would be to always call m2p_find_override_pfn to
check out whether we have an entry for a given MFN. However it is not
very efficient or scalable.

The other option (that this patch is implementing) is to mark the pages
shared by the frontend as "foreign", so that mfn != mfn2.

It makes sense because from the frontend point of view they are donated
to the backend and while so they are not supposed to be used by the
frontend. In a way, they don't belong to the frontend anymore, at least
temporarily.

[upstream git commit 6a2c6177]
[v3: only set_phys_to_machine if xen_pv_domain]
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
[v1: Redid description a bit]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

xen: EXPORT_SYMBOL set_phys_to_machine

[upstream git commit 134c1d7]
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

xen-blkfront: make blkif_io_lock spinlock per-device

This patch moves the global blkif_io_lock to the per-device structure. The
spinlock seems to exists for two reasons: to disable IRQs when in the interrupt
handlers for blkfront, and to protect the blkfront VBDs when a detachment is
requested.

Having a global blkif_io_lock doesn't make sense given the use case, and it
drastically hinders performance due to contention. All VBDs with pending IOs
have to take the lock in order to get work done, which serializes everything
pretty badly.

[upstream git commit 3467811]
Signed-off-by: Steven Noonan <snoonan@amazon.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

xen/blkfront: don't put bdev right after getting it

We should hang onto bdev until we're done with it.

[upstream git commit dad5cf6]
Signed-off-by: Andrew Jones <drjones@redhat.com>
[v1: Fixed up git commit description]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

xen-blkfront: use bitmap_set() and bitmap_clear()

Use bitmap_set and bitmap_clear rather than modifying individual bits
in a memory region.

[upstream git commit 34ae2e4]
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: xen-devel@lists.xensource.com
Cc: virtualization@lists.linux-foundation.org
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

xen/blkback: Enable blkback on HVM guests

[upstream git commit b2167ba]
Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

xen/blkback: use grant-table.c hypercall wrappers

[upstream git commit 4f14faa]
Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>