Joe Jin [Fri, 20 Jul 2012 13:30:51 +0000 (21:30 +0800)]
[scsi] hpsa: add all support devices for ol5
Orabug: 14106006
To support uek2 on ol5, commit 29a8828 disabled some devices in the support list.
This caused ovm3 upgrades from 3.0.3 to 3.1.1 to fail to address the local disk,
because the disk device name changed.
If the kernel runs as an ovm3.1.1 dom0 kernel on ol5, pass cciss_allow_hpsa=1 when
loading the cciss driver.
Adnan Misherfi [Thu, 2 Aug 2012 20:17:44 +0000 (16:17 -0400)]
Disable VLAN 0 tagging for none VLAN traffic
Orabug: 14406424
The Cisco enic driver on UCS blades tags non-VLAN traffic with VLAN 0. This causes VMs
that do not have the kernel patch "VLAN 0 should be treated as no vlan tag" to drop all
received traffic, as these VMs do not know how to deal with the VLAN 0 tag.
This is also a problem for older VMs that cannot take the mentioned patch.
This fix stops the enic driver from tagging non-VLAN traffic with VLAN 0. The
fix is controlled by the driver parameter "disable_vlan0". The default value is disable_vlan0=1,
which stops the driver from tagging traffic with VLAN 0. To revert to the original behavior,
add "options enic disable_vlan0=0" to /etc/modprobe.con
Jeff Mahoney [Thu, 2 Aug 2012 12:04:00 +0000 (05:04 -0700)]
dl2k: Clean up rio_ioctl
Orabug: 14126896
The dl2k driver's rio_ioctl call has a few issues:
- No permissions checking
- Implements SIOCGMIIREG and SIOCSMIIREG using the SIOCDEVPRIVATE numbers
- Has a few ioctls that may have been used for debugging at one point
but have no place in the kernel proper.
This patch removes all but the MII ioctls, renumbers them to use the
standard ones, and adds the proper permission check for SIOCSMIIREG.
We can also get rid of the dl2k-specific struct mii_data in favor of
the generic struct mii_ioctl_data.
Since we have the phyid on hand, we can add the SIOCGMIIPHY ioctl too.
Most of the MII code for the driver could probably be converted to use
the generic MII library but I don't have a device to test the results.
This fixes: CVE-2012-2313
Reported-by: Stephan Mueller <stephan.mueller@atsec.com> Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Maxim Uvarov <maxim.uvarov@oracle.com> Signed-off-by: Guangyu Sun <guangyu.sun@oracle.com>
Jeff Mahoney [Thu, 2 Aug 2012 12:04:00 +0000 (05:04 -0700)]
dl2k: Clean up rio_ioctl
Orabug: 14126896
The dl2k driver's rio_ioctl call has a few issues:
- No permissions checking
- Implements SIOCGMIIREG and SIOCSMIIREG using the SIOCDEVPRIVATE numbers
- Has a few ioctls that may have been used for debugging at one point
but have no place in the kernel proper.
This patch removes all but the MII ioctls, renumbers them to use the
standard ones, and adds the proper permission check for SIOCSMIIREG.
We can also get rid of the dl2k-specific struct mii_data in favor of
the generic struct mii_ioctl_data.
Since we have the phyid on hand, we can add the SIOCGMIIPHY ioctl too.
Most of the MII code for the driver could probably be converted to use
the generic MII library but I don't have a device to test the results.
This fixes: CVE-2012-2313
Reported-by: Stephan Mueller <stephan.mueller@atsec.com> Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Maxim Uvarov <maxim.uvarov@oracle.com>
This is not for upstream, as memblock_x86_reserve_range is no longer
used upstream.
When I back-ported the patches:
xen/x86: Use memblock_reserve for sensitive areas.
xen/mmu: Recycle the Xen provided L4, L3, and L2 pages
I simply used sed s/memblock_reserve/memblock_x86_reserve_range/.
That was incorrect as the parameters are different - memblock_reserve
expects the size as its second argument, while memblock_x86_reserve_range
expects the end physical address. This patch fixes those bugs.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
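The argument mismatch described in the back-port fix above can be illustrated with a small userspace model. This is only a sketch: the model_* functions are hypothetical stand-ins, not the kernel implementations, and they merely record the range they were asked to reserve, assuming the (base, size) and (start, end) conventions.

```c
#include <stdint.h>

/* Userspace stand-ins for illustration only; the real kernel functions
 * update the memblock tables. Each returns the [start, end) it was given. */
struct range {
	uint64_t start;
	uint64_t end;
};

/* Models memblock_reserve(base, size): the second argument is a size. */
static struct range model_memblock_reserve(uint64_t base, uint64_t size)
{
	struct range r = { base, base + size };
	return r;
}

/* Models memblock_x86_reserve_range(start, end, name): the second
 * argument is the end physical address (name omitted here). */
static struct range model_memblock_x86_reserve_range(uint64_t start, uint64_t end)
{
	struct range r = { start, end };
	return r;
}

/* Correct translation of a (base, size) call into the (start, end) form. */
static struct range reserve_backport(uint64_t base, uint64_t size)
{
	return model_memblock_x86_reserve_range(base, base + size);
}
```

Passing (0x1000, 0x2000) straight through to the (start, end) form reserves only 0x1000 bytes, whereas reserve_backport() covers the same 0x2000 bytes as the (base, size) call.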
Merge branch 'stable/for-linus-3.7.rebased' into uek2-merge
* stable/for-linus-3.7.rebased:
xen/p2m: Reserve 8MB of _brk space for P2M leafs when populating back.
xen/mmu: Remove from __ka space PMD entries for pagetables.
xen/mmu: Copy and revector the P2M tree.
xen/p2m: Add logic to revector a P2M tree to use __va leafs.
xen/mmu: Recycle the Xen provided L4, L3, and L2 pages
xen/mmu: For 64-bit do not call xen_map_identity_early
xen/mmu: use copy_page instead of memcpy.
xen/mmu: Provide comments describing the _ka and _va aliasing issue
xen/mmu: The xen_setup_kernel_pagetable doesn't need to return anything.
xen/x86: Use memblock_reserve for sensitive areas.
xen/p2m: Fix the comment describing the P2M tree.
xen/perf: Define .glob for the different hypercalls.
We then try to populate those pages back. In the P2M tree however
the space for those leafs must be reserved - as such we use extend_brk.
We reserve 8MB of _brk space, which means we can fit over 1048576 PFNs - which is more than we should ever need.
[v1: Made it 8MB of _brk space instead of 4MB per Jan's suggestion]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
(cherry picked from commit 99266871de5006ba7ad0bfece6bb283ede4094b9)
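The PFN count quoted for the 8MB reservation can be checked with a short calculation. The sketch below assumes each P2M leaf entry is a single 8-byte value, as on a 64-bit kernel; P2M_ENTRY_BYTES and p2m_entries_in() are hypothetical names used only for this illustration.

```c
#include <stdint.h>

/* Assumption: one P2M leaf entry is one 64-bit value (8 bytes). */
#define P2M_ENTRY_BYTES 8u

/* Number of PFN entries that fit in a reservation of the given size. */
static uint64_t p2m_entries_in(uint64_t reserved_bytes)
{
	return reserved_bytes / P2M_ENTRY_BYTES;
}
```

Under that assumption, 8MB of _brk space holds 1048576 entries, and the earlier 4MB proposal would have held half as many.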
xen/mmu: Remove from __ka space PMD entries for pagetables.
Please first read the description in "xen/mmu: Copy and revector the
P2M tree."
At this stage, the __ka address space (which is what the old
P2M tree was using) is partially disassembled. The cleanup_highmap
has removed the PMD entries from 0-16MB and anything past _brk_end
up to the max_pfn_mapped (which is the end of the ramdisk).
The xen_remove_p2m_tree and the code around it have ripped out the __ka for
the old P2M array.
Here we continue on doing it to where the Xen page-tables were.
It is safe to do it, as the page-tables are addressed using __va.
For good measure we delete anything that is within MODULES_VADDR
and up to the end of the PMD.
At this point the __ka only contains PMD entries for the start
of the kernel up to __brk.
[v1: Per Stefano's suggestion wrapped the MODULES_VADDR in debug]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
(cherry picked from commit 4e928e1a48b6b76e0b8384160213a32d03197e4b)
Please first read the description in "xen/p2m: Add logic to revector a
P2M tree to use __va leafs" patch.
The 'xen_revector_p2m_tree()' function allocates a new P2M tree,
copies the contents of the old one into it, and returns the new one.
At this stage, the __ka address space (which is what the old
P2M tree was using) is partially disassembled. The cleanup_highmap
has removed the PMD entries from 0-16MB and anything past _brk_end
up to the max_pfn_mapped (which is the end of the ramdisk).
We have revectored the P2M tree (and the one for save/restore as well)
to use new shiny __va address to new MFNs. The xen_start_info
has been taken care of already in 'xen_setup_kernel_pagetable()' and
xen_start_info->shared_info in 'xen_setup_shared_info()', so
we are free to roam and delete PMD entries - which is exactly what
we are going to do. We rip out the __ka for the old P2M array.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Conflicts:
As can be seen, the ramdisk, P2M and pagetables take up
a fair bit of __ka address space. This is a problem since
MODULES_VADDR starts at 0xffffffffa0000000 - and the P2M sits
right in there! During bootup this results in the inability to
load modules, with this error:
Since the __va and __ka are 1:1 up to MODULES_VADDR and
cleanup_highmap rids __ka of the ramdisk mapping, what
we want to do is similar - get rid of the P2M in the __ka
address space. There are two ways of fixing this:
1) All P2M lookups instead of using the __ka address would
use the __va address. This means we can safely erase from
__ka space the PMD pointers that point to the PFNs for
P2M array and be OK.
2) Allocate a new array, copy the existing P2M into it,
revector the P2M tree to use that, and return the old
P2M to the memory allocator. This has the advantage that
it sets the stage for using XEN_ELF_NOTE_INIT_P2M
feature. That feature allows us to set the exact virtual
address space we want for the P2M - and allows us to
boot as initial domain on large machines.
So we pick option 2).
This patch only lays the groundwork in the P2M code. The patch
that modifies the MMU is called "xen/mmu: Copy and revector the P2M tree."
xen/mmu: Recycle the Xen provided L4, L3, and L2 pages
As we are not using them. We end up only using the L1 pagetables
and grafting those to our page-tables.
[v1: Per Stefano's suggestion squashed two commits]
[v2: Per Stefano's suggestion simplified loop]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Conflicts:
Acked-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
(cherry picked from commit cbc09be35990fb3d15671507f11c3e90479ef816)
xen/mmu: Provide comments describing the _ka and _va aliasing issue
Which is that the level2_kernel_pgt (__ka virtual addresses)
and level2_ident_pgt (__va virtual address) contain the same
PMD entries. So if you modify a PTE in __ka, it will be reflected
in __va (and vice-versa).
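The aliasing described above can be pictured with a toy model in which two views reference the same page of PTEs. This is a deliberately simplified sketch: struct pmd_view, set_pte() and get_pte() are hypothetical helpers, and each view holds a plain pointer rather than a real PMD entry.

```c
#include <stdint.h>

#define PTES_PER_PAGE 512

/* One shared page of PTEs, referenced from both address-space views. */
struct pte_page {
	uint64_t pte[PTES_PER_PAGE];
};

/* Toy stand-in for level2_kernel_pgt (__ka) or level2_ident_pgt (__va):
 * each view holds a pointer to a PTE page instead of a real PMD entry. */
struct pmd_view {
	struct pte_page *pmd0;
};

/* Write a PTE through one view. */
static void set_pte(struct pmd_view *v, unsigned int idx, uint64_t val)
{
	v->pmd0->pte[idx] = val;
}

/* Read a PTE through one view. */
static uint64_t get_pte(const struct pmd_view *v, unsigned int idx)
{
	return v->pmd0->pte[idx];
}
```

If a __ka view and a __va view are both initialized to point at the same struct pte_page, a set_pte() through either one is immediately visible to get_pte() through the other, which is the behavior the comments warn about.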
xen/x86: Use memblock_reserve for sensitive areas.
instead of a big memblock_reserve. This way we can be more
selective in freeing regions (and it also makes it easier
to understand where is what).
[v1: Move the auto_translate_physmap to proper line]
[v2: Per Stefano suggestion add more comments]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
[upstream git commit 91addbf07abfdd109a9da4e02061e6ed3728b298]
Conflicts:
The P2M code is smart enough to return false (which means that it
cannot allocate any more) and the error can percolate up the calling
stack without trouble - with the error logic doing the proper thing.
So check the __brk_limit values before allocating from extend_brk.
This allows us to boot on machines where we do not have enough
__brk space, and we would get this:
Interestingly enough, most of the time we are not going to hit this
b/c the _brk space is quite large (v3.5):
 ffffffff81a25000 B __brk_base
 ffffffff81e43000 B __brk_limit
= ~4MB.
vs earlier kernels (with this back-ported), the space is smaller:
 ffffffff81a25000 B __brk_base
 ffffffff81a7b000 B __brk_limit
= 344 kBytes.
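The two sizes quoted above follow directly from subtracting the symbol addresses. A minimal sketch of that arithmetic, where brk_bytes() and brk_kib() are hypothetical helper names:

```c
#include <stdint.h>

/* Size in bytes of the region between __brk_base and __brk_limit. */
static uint64_t brk_bytes(uint64_t brk_base, uint64_t brk_limit)
{
	return brk_limit - brk_base;
}

/* Same size in KiB, rounded down. */
static uint64_t brk_kib(uint64_t brk_base, uint64_t brk_limit)
{
	return brk_bytes(brk_base, brk_limit) / 1024;
}
```

With the v3.5 addresses this gives 4216 KiB (a little over 4MB), and with the earlier kernel's addresses it gives 344 KiB.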
With this patch, we now get a limited number of pages populated back:
Freeing 9f-100 pfn range: 97 pages freed
Freeing b7ee0-ecd9b pfn range: 216763 pages freed
Released 216860 pages of unused memory
Set 295297 page(s) to 1-1 mapping
Populating 100000-134f1c pfn range: 30720 pages added
[while it was instructed to populate 216860 pages back
on this particular machine]
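The page counts in the log above can be cross-checked from the printed PFN ranges, treating each range as half-open (start inclusive, end exclusive). That interpretation is an assumption, and pfn_range_pages() is a hypothetical helper:

```c
#include <stdint.h>

/* Pages in a PFN range [start_pfn, end_pfn), as printed in the log. */
static uint64_t pfn_range_pages(uint64_t start_pfn, uint64_t end_pfn)
{
	return end_pfn - start_pfn;
}
```

The ranges 9f-100 and b7ee0-ecd9b give 97 and 216763 pages, which sum to the 216860 released pages, while only 30720 of those were populated back.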
qla2xxx: Perform ROM mbx cmd access only after ISP soft-reset during f/w recovery.
The driver's initial assumption was that the ROM mbx cmds would be accessible
even when the FCoE operational f/w is in reset recovery. However, it seems that
in case of "ISP System error" (i.e. 0x8002) there is a period when the
ISP is not operational and firmware waits in a tight loop for the driver either
to take a dump or to perform a soft-reset. During this time none of the ROM mbx
cmds will get serviced by f/w.
Hence the patch makes sure the driver sends mbx cmds only after soft reset is complete.
Arun Easi [Tue, 19 Jun 2012 23:56:27 +0000 (16:56 -0700)]
qla2xxx: Fix for continuous rescan attempts in arbitrated loop topology.
Stale information in the temporary fcport created in
qla2x00_configure_local_loop() causes qla2x00_get_port_database() call
to fail. This reschedules scan, which gets stuck continuously in the
rescheduling-of-scan loop due to the failure.
Chad Dupuis [Thu, 19 Jul 2012 09:13:50 +0000 (14:43 +0530)]
qla2xxx: Use bitmap to store loop_id's for fcports.
Store used fcport loop_id's in a bitmap so that, as opposed to looping through
all fcports to find the next free loop_id, a new loop_id lookup can be
done via bitops.
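The idea can be sketched in portable userspace C. This is not the driver's code: MAX_LOOP_IDS, struct loop_id_map and the two helpers are hypothetical, and the kernel would use its own bitmap helpers instead of the open-coded loop shown here.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical limit for illustration; real limits depend on the ISP. */
#define MAX_LOOP_IDS 2048
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define BITMAP_WORDS ((MAX_LOOP_IDS + BITS_PER_WORD - 1) / BITS_PER_WORD)

struct loop_id_map {
	unsigned long bits[BITMAP_WORDS];
};

/* Returns the lowest free loop_id and marks it used, or -1 if none is free. */
static int loop_id_alloc(struct loop_id_map *m)
{
	for (size_t w = 0; w < BITMAP_WORDS; w++) {
		if (m->bits[w] == ~0UL)
			continue; /* every id in this word is taken */
		for (size_t b = 0; b < BITS_PER_WORD; b++) {
			size_t id = w * BITS_PER_WORD + b;

			if (id >= MAX_LOOP_IDS)
				return -1;
			if (!(m->bits[w] & (1UL << b))) {
				m->bits[w] |= 1UL << b;
				return (int)id;
			}
		}
	}
	return -1;
}

/* Marks a loop_id as free again; out-of-range ids are ignored. */
static void loop_id_free(struct loop_id_map *m, int id)
{
	if (id < 0 || id >= MAX_LOOP_IDS)
		return;
	m->bits[(size_t)id / BITS_PER_WORD] &= ~(1UL << ((size_t)id % BITS_PER_WORD));
}
```

The lookup cost depends on the size of the bitmap rather than on the number of fcports, and freeing an id is a single bit clear.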
Joe Carnuccio [Wed, 18 Apr 2012 20:19:06 +0000 (13:19 -0700)]
qla2xxx: Add I2C BSG interface.
Add BSG interface to generically access I2C attached devices.
The transferred data limitations:
- the address must be on an even byte boundary,
- the chunk size must be no more than 64 bytes,
(these being limitations of the firmware).
So, to transfer more than 64 bytes, the caller must chunk up
the data into 64 byte chunks and perform multiple accesses.
The caller is responsible for setting the device address,
the data offset, the option bits, and the data length.
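A caller-side chunking loop along the lines described above might look like the following sketch. It is not the BSG interface itself: i2c_xfer_fn is a hypothetical callback standing in for one I2C access, and the code assumes the even-boundary rule applies to the data offset.

```c
#include <stddef.h>
#include <stdint.h>

#define I2C_MAX_CHUNK 64u /* firmware limit stated in the commit message */

/* Hypothetical callback type standing in for one BSG I2C access. */
typedef int (*i2c_xfer_fn)(uint16_t offset, const uint8_t *buf, size_t len,
			   void *ctx);

/* Example stub: adds each chunk's length to a size_t total passed via ctx. */
static int count_bytes_xfer(uint16_t offset, const uint8_t *buf, size_t len,
			    void *ctx)
{
	(void)offset;
	(void)buf;
	*(size_t *)ctx += len;
	return 0;
}

/* Splits a transfer into accesses of at most 64 bytes each.
 * Returns the number of accesses issued, or -1 on an odd starting
 * offset (assumed alignment rule) or a failed access. */
static int i2c_write_chunked(uint16_t offset, const uint8_t *buf, size_t len,
			     i2c_xfer_fn xfer, void *ctx)
{
	int accesses = 0;

	if (offset & 1)
		return -1;
	while (len > 0) {
		size_t chunk = len < I2C_MAX_CHUNK ? len : I2C_MAX_CHUNK;

		if (xfer(offset, buf, chunk, ctx) != 0)
			return -1;
		offset = (uint16_t)(offset + chunk);
		buf += chunk;
		len -= chunk;
		accesses++;
	}
	return accesses;
}
```

Because every full chunk is 64 bytes, an even starting offset stays even for each subsequent access; a 150-byte transfer becomes three accesses of 64, 64 and 22 bytes.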
Giridhar Malavali [Thu, 29 Mar 2012 20:14:09 +0000 (13:14 -0700)]
qla2xxx: Proper completion to scsi-ml for scsi status task_set_full and busy.
In case of a firmware-detected under-run condition and a scsi status of task_set_full
or busy_condition, DID_OK is returned to scsi-ml for faster recovery.
Chetan Loke [Thu, 16 Feb 2012 19:34:56 +0000 (13:34 -0600)]
qla2xxx: Micro optimization in queuecommand handler
Optimized the queuecommand handler to eliminate double head-room checks.
The checks are moved inside the first if-block; otherwise you would end up checking twice when there is
enough head room.
Add an extra layer of logging granularity for messages that are necessary in
some circumstances but may flood the kernel log buffer with too many messages
otherwise.
Giridhar Malavali [Fri, 27 Jan 2012 15:09:15 +0000 (09:09 -0600)]
qla2xxx: Block flash access from application when device is initialized for ISP82xx.
This could lead to CRB initialization failures or a failure to capture minidump
data. So access to flash needs to be avoided when the device is doing a reset for
ISP82xx.
Chad Dupuis [Tue, 6 Dec 2011 19:53:04 +0000 (14:53 -0500)]
qla2xxx: Handle interrupt registration failures more gracefully.
If interrupt registration failed we could crash the machine, as we were trying
to dereference some pointers which weren't allocated yet. Move the allocation
a little earlier and add some checks to the resource-freeing code to make sure
that we don't try to free a resource that was never allocated.
Andre Przywara [Tue, 29 May 2012 11:07:31 +0000 (13:07 +0200)]
xen/setup: filter APERFMPERF cpuid feature out
Xen PV kernels allow access to the APERF/MPERF registers to read the
effective frequency. Access to the MSRs is however redirected to the
currently scheduled physical CPU, making consecutive read and
compares unreliable. In addition each rdmsr traps into the hypervisor.
So to avoid bogus readouts and expensive traps, disable the kernel
internal feature flag for APERF/MPERF if running under Xen.
This will
a) remove the aperfmperf flag from /proc/cpuinfo
b) not mislead the power scheduler (arch/x86/kernel/cpu/sched.c) to
use the feature to improve scheduling (by default disabled)
c) not mislead the cpufreq driver to use the MSRs
This does not cover userland programs which access the MSRs via the
device file interface, but this will be addressed separately.
[upstream git commit 5e626254206a709c6e937f3dda69bf26c7344f6f]
Signed-off-by: Andre Przywara <andre.przywara@amd.com> Cc: stable@vger.kernel.org # v3.0+ Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Merge branch 'stable/for-linus-3.6.rebased' into uek2-merge
* stable/for-linus-3.6.rebased:
xen PVonHVM: move shared_info to MMIO before kexec
xen: simplify init_hvm_pv_info
xen: remove cast from HYPERVISOR_shared_info assignment
xen: enable platform-pci only in a Xen guest
xen/pv-on-hvm kexec: shutdown watches from old kernel
Revert "xen/pv-on-hvm kexec: add xs_reset_watches to shutdown watches from old kernel"
xen/hvc: Fix up checks when the info is allocated.
xen/mm: zero PTEs for non-present MFNs in the initial page table
xen/mm: do direct hypercall in xen_set_pte() if batching is unavailable
xen/x86: add desc_equal() to compare GDT descriptors
x86/xen: avoid updating TLS descriptors if they haven't changed
xen: populate correct number of pages when across mem boundary (v2)
xen/mce: add .poll method for mcelog device driver
Olaf Hering [Tue, 17 Jul 2012 15:43:35 +0000 (17:43 +0200)]
xen PVonHVM: move shared_info to MMIO before kexec
Currently kexec in a PVonHVM guest fails with a triple fault because the
new kernel overwrites the shared info page. The exact failure depends on
the size of the kernel image. This patch moves the pfn from RAM into
MMIO space before the kexec boot.
The pfn containing the shared_info is located somewhere in RAM. This
will cause trouble if the current kernel is doing a kexec boot into a
new kernel. The new kernel (and its startup code) can not know where the
pfn is, so it can not reserve the page. The hypervisor will continue to
update the pfn, and as a result memory corruption occurs in the new
kernel.
One way to work around this issue is to allocate a page in the
xen-platform pci device's BAR memory range. But pci init is done very
late and the shared_info page is already in use very early to read the
pvclock. So moving the pfn from RAM to MMIO is racy because some code
paths on other vcpus could access the pfn during the small window when
the old pfn is moved to the new pfn. There is even a small window where
the old pfn is not backed by a mfn, and during that time all reads
return -1.
Because it is not known upfront where the MMIO region is located it can
not be used right from the start in xen_hvm_init_shared_info.
To minimise trouble the move of the pfn is done shortly before kexec.
This does not eliminate the race because all vcpus are still online when
the syscore_ops will be called. But hopefully there is no work pending
at this point in time. Also the syscore_op is run last which reduces the
risk further.
Signed-off-by: Olaf Hering <olaf@aepfle.de> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Olaf Hering [Tue, 17 Jul 2012 09:59:15 +0000 (11:59 +0200)]
xen: simplify init_hvm_pv_info
init_hvm_pv_info is called only in PVonHVM context, move it into ifdef.
init_hvm_pv_info does not fail, make it a void function.
remove arguments from init_hvm_pv_info because they are not used by the
caller.
Signed-off-by: Olaf Hering <olaf@aepfle.de> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Olaf Hering [Tue, 10 Jul 2012 13:31:39 +0000 (15:31 +0200)]
xen: enable platform-pci only in a Xen guest
While debugging kexec issues in a PVonHVM guest I modified
xen_hvm_platform() to return false to disable all PV drivers. This
caused a crash in platform_pci_init() because it expects certain data
structures to be initialized properly.
To avoid such a crash make sure the driver is initialized only if
running in a Xen guest.
Signed-off-by: Olaf Hering <olaf@aepfle.de> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Olaf Hering [Tue, 10 Jul 2012 12:50:03 +0000 (14:50 +0200)]
xen/pv-on-hvm kexec: shutdown watches from old kernel
Add xs_reset_watches function to shutdown watches from old kernel after
kexec boot. The old kernel does not unregister all watches in the
shutdown path. They are still active, the double registration can not
be detected by the new kernel. When the watches fire, unexpected events
will arrive and the xenwatch thread will crash (jumps to NULL). An
orderly reboot of a hvm guest will destroy the entire guest with all its
resources (including the watches) before it is rebuilt from scratch, so
the missing unregister is not an issue in that case.
With this change the xenstored is instructed to wipe all active watches
for the guest. However, a patch for xenstored is required so that it
accepts the XS_RESET_WATCHES request from a client (see changeset
23839:42a45baf037d in xen-unstable.hg). Without the patch for xenstored
the registration of watches will fail and some features of a PVonHVM
guest are not available. The guest is still able to boot, but repeated
kexec boots will fail.
Signed-off-by: Olaf Hering <olaf@aepfle.de> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
David Vrabel [Mon, 9 Jul 2012 10:39:06 +0000 (11:39 +0100)]
xen/mm: zero PTEs for non-present MFNs in the initial page table
When constructing the initial page tables, if the MFN for a usable PFN
is missing in the p2m then that frame is initially ballooned out. In
this case, zero the PTE (as in decrease_reservation() in
drivers/xen/balloon.c).
This is obviously safer than having a valid PTE with an MFN of
INVALID_P2M_ENTRY (~0).
Signed-off-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
David Vrabel [Mon, 9 Jul 2012 10:39:05 +0000 (11:39 +0100)]
xen/mm: do direct hypercall in xen_set_pte() if batching is unavailable
In xen_set_pte() if batching is unavailable (because the caller is in
an interrupt context such as handling a page fault) it would fall back
to using native_set_pte() and trapping and emulating the PTE write.
On 32-bit guests this requires two traps for each PTE write (one for
each dword of the PTE). Instead, do one mmu_update hypercall
directly.
During construction of the initial page tables, continue to use
native_set_pte() because most of the PTEs being set are in writable
and unpinned pages (see phys_pmd_init() in arch/x86/mm/init_64.c) and
using a hypercall for this is very expensive.
This significantly improves page fault performance in 32-bit PV
guests.
lmbench3 test  Before   After     Improvement
----------------------------------------------
lat_pagefault  3.18 us  2.32 us   27%
lat_proc fork  356 us   313.3 us  11%
Signed-off-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
David Vrabel [Mon, 9 Jul 2012 10:39:08 +0000 (11:39 +0100)]
x86/xen: avoid updating TLS descriptors if they haven't changed
When switching tasks in a Xen PV guest, avoid updating the TLS
descriptors if they haven't changed. This improves the speed of
context switches by almost 10% as much of the time the descriptors are
the same or only one is different.
The descriptors written into the GDT by Xen are modified from the
values passed in the update_descriptor hypercall so we keep shadow
copies of the three TLS descriptors to compare against.
lmbench3 test     Before  After  Improvement
--------------------------------------------
lat_ctx -s 32 24  7.19    6.52   9%
lat_pipe          12.56   11.66  7%
Signed-off-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
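The shadow-copy comparison described in the TLS descriptor patch above can be modelled in userspace as follows. This is a simplified sketch, not the kernel code: struct desc is a reduced 8-byte descriptor, struct tls_state and load_tls() are hypothetical, and a counter stands in for the update_descriptor hypercall.

```c
#include <stdbool.h>
#include <stdint.h>

#define TLS_ENTRIES 3 /* three TLS descriptors, per the commit message */

/* Simplified 8-byte descriptor: two 32-bit halves. */
struct desc {
	uint32_t a;
	uint32_t b;
};

/* True if both halves of the two descriptors match. */
static bool desc_equal(const struct desc *d1, const struct desc *d2)
{
	return d1->a == d2->a && d1->b == d2->b;
}

/* Shadow copies of the values last passed down, plus a counter that
 * stands in for the update_descriptor hypercall. */
struct tls_state {
	struct desc shadow[TLS_ENTRIES];
	unsigned int hypercalls;
};

/* Loads the next task's TLS descriptors, skipping unchanged entries. */
static void load_tls(struct tls_state *s, const struct desc *next)
{
	for (int i = 0; i < TLS_ENTRIES; i++) {
		if (desc_equal(&s->shadow[i], &next[i]))
			continue;
		s->shadow[i] = next[i];
		s->hypercalls++; /* real code would issue the hypercall here */
	}
}
```

The comparison is made against the shadow copies rather than the GDT contents because, as the message notes, Xen modifies the values it writes into the GDT.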
xen: populate correct number of pages when across mem boundary (v2)
When populating pages across a mem boundary at bootup, the populated page
count isn't correct. This is because memory populated into a non-mem
region is ignored.
The pfn range is also wrongly aligned when the mem boundary isn't page aligned.
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@oracle.com>
[v2: If xen_do_chunk fails (populate), abort this chunk and any others]
Suggested by David, thanks.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Orabug: 14306942
RDS will be using devinet_ioctl() to implement IP failover/fallback for IB
devices to support active/active. This is an enhancement request to export
devinet_ioctl() so non-kernel modules such as RDS can use it.
Signed-off-by: Maxim Uvarov <maxim.uvarov@oracle.com>