www.infradead.org Git - users/jedix/linux-maple.git/log
12 years ago be2net: reduce gso_max_size setting to account for ethernet header.
Sarveshwar Bandi [Wed, 13 Jun 2012 19:51:43 +0000 (19:51 +0000)]
be2net: reduce gso_max_size setting to account for ethernet header.

The maximum packet size the controller can handle, including the ethernet
header, is 65535 bytes. Reduce gso_max_size accordingly.

Signed-off-by: Sarveshwar Bandi <sarveshwar.bandi@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
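
Note on the commit above: a minimal sketch of the arithmetic, not the driver's code. If the 65535-byte controller limit includes the Ethernet header, the GSO payload limit has to shrink by the header length. HW_MAX_FRAME, ETH_HEADER_LEN and gso_limit() are hypothetical names; 14 bytes is the standard untagged Ethernet header length.

```c
#define HW_MAX_FRAME   65535u  /* controller limit, header included (from the commit text) */
#define ETH_HEADER_LEN 14u     /* dst MAC + src MAC + EtherType, no VLAN tag */

/* Largest GSO payload that still fits once the Ethernet header is prepended. */
static unsigned int gso_limit(unsigned int hw_max_frame, unsigned int hdr_len)
{
    return hw_max_frame - hdr_len;
}
```

With these assumed values the payload limit comes out 14 bytes below the controller limit.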
12 years ago be2net: fix a race in be_xmit()
Eric Dumazet [Thu, 7 Jun 2012 22:59:59 +0000 (22:59 +0000)]
be2net: fix a race in be_xmit()

As soon as hardware is notified of a transmit, we no longer can assume
skb can be dereferenced, as TX completion might have freed the packet.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Sathya Perla <sathya.perla@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
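
Note on the commit above: the usual way to avoid this kind of race is to copy whatever is needed from the packet before ownership passes to the hardware. The sketch below models that pattern in plain C; struct pkt, struct txq_stats, notify_hw() and xmit() are hypothetical stand-ins, not the be2net code.

```c
#include <stddef.h>

/* Hypothetical stand-ins for a packet buffer and per-queue statistics. */
struct pkt { size_t len; };
struct txq_stats { size_t bytes; size_t pkts; };

/* Stand-in for ringing the doorbell. After this call the completion path
 * may free the packet at any time, so the caller must not touch it again.
 * Clearing the slot models "ownership has passed to hardware". */
static void notify_hw(struct pkt **slot)
{
    *slot = NULL;
}

static void xmit(struct pkt **slot, struct txq_stats *stats)
{
    size_t len = (*slot)->len;   /* copy what we need while we still own it */

    notify_hw(slot);             /* packet may be freed from here on */

    stats->bytes += len;         /* use the saved copy, never the packet */
    stats->pkts++;
}
```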
12 years ago be2net: Fix driver load for VFs for Lancer
Padmanabh Ratnakar [Fri, 24 Aug 2012 13:26:33 +0000 (18:56 +0530)]
be2net: Fix driver load for VFs for Lancer

The permanent MAC is wrongly supplied in the create iface command. Call the
command with no MAC address; the MAC address should be queried and applied
later.

Signed-off-by: Padmanabh Ratnakar <padmanabh.ratnakar@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years ago be2net: update driver version
Sathya Perla [Fri, 24 Aug 2012 13:08:24 +0000 (18:38 +0530)]
be2net: update driver version

Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years ago be2net: do not use SCRATCHPAD register
Sathya Perla [Fri, 24 Aug 2012 13:05:10 +0000 (18:35 +0530)]
be2net: do not use SCRATCHPAD register

The CUST_SCRATCHPAD_CSR register is used for marking if FW cleanup is
needed. This is used in a crash kernel scenario. Do not use this register as
it is not available for some functions. Instead, always issue an FLR when
a function is probed *except* when VFs are preset (enabled in the previous
PF load).

Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years ago be2net: remove unnecessary usage of unlikely()
Sathya Perla [Tue, 5 Jun 2012 19:37:21 +0000 (19:37 +0000)]
be2net: remove unnecessary usage of unlikely()

Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years ago be2net: fix reporting number of actual rx queues
Sathya Perla [Tue, 5 Jun 2012 19:37:20 +0000 (19:37 +0000)]
be2net: fix reporting number of actual rx queues

Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years ago be2net: do not modify PCI MaxReadReq size
Sathya Perla [Tue, 5 Jun 2012 19:37:19 +0000 (19:37 +0000)]
be2net: do not modify PCI MaxReadReq size

Setting the PCI MRRS to a value of 4096 (overriding the system decided
value) had provided perf improvement in TX.
But, IBM has provided feedback that on POWER platforms, this value is set
by the system firmware, and drivers modifying this value can cause
unpredictable results (like EEH errors.) So, backing off this change.
On POWER7 platforms most slots, it seems, do get a MRRS of 4096.

This patch reverts the following commit:
"be2net: Modified PCI MaxReadReq size to 4096 bytes"
commit 5a56eb10babbcd7b3796dc3c28c271260aa3608d.

Suggested-by: Brian King <bjking1@us.ibm.com>
Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years ago be2net: cleanup be_vid_config()
Sathya Perla [Tue, 5 Jun 2012 19:37:18 +0000 (19:37 +0000)]
be2net: cleanup be_vid_config()

- get rid of 2 unused arguments to the routine and some unused code
- don't use the term "vlan_tag" in place of "vid" as they are different

Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years ago be2net: don't call vid_config() when there's no vlan config
Sathya Perla [Tue, 5 Jun 2012 19:37:17 +0000 (19:37 +0000)]
be2net: don't call vid_config() when there's no vlan config

be_vid_config() is called from be_setup() to replay config cmds after
a card reset. Skip calling it when no vlans are configured.

Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years ago be2net: Fix to allow get/set of debug levels in the firmware.
Somnath Kotur [Fri, 24 Aug 2012 13:00:51 +0000 (18:30 +0530)]
be2net: Fix to allow get/set of debug levels in the firmware.

Patch re-spin.
Incorporated review comments by Ben Hutchings.

Signed-off-by: Suresh Reddy <suresh.reddy@emulex.com>
Signed-off-by: Somnath Kotur <somnath.kotur@emulex.com>
Acked-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years ago be2net: avoid disabling sriov while VFs are assigned
Sathya Perla [Fri, 24 Aug 2012 12:08:10 +0000 (17:38 +0530)]
be2net: avoid disabling sriov while VFs are assigned

Calling pci_disable_sriov() while VFs are assigned to VMs causes
kernel panic. This patch uses PCI_DEV_FLAGS_ASSIGNED bit state of the
VF's pci_dev to avoid this. Also, the unconditional function reset cmd
issued on a PF probe can delete the VF configuration for the
previously enabled VFs. A scratchpad register is now used to issue a
function reset only when needed (i.e., in a crash dump scenario.)

Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
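
Note on the commit above: the guard it describes can be pictured as "scan the VFs, and skip the disable if any one is marked assigned". The sketch below is a user-space model under that assumption; struct vf_dev, DEV_FLAG_ASSIGNED (with a made-up bit value) and try_disable_sriov() are hypothetical and do not reproduce the kernel's PCI API.

```c
#include <stdbool.h>
#include <stddef.h>

#define DEV_FLAG_ASSIGNED 0x4u  /* hypothetical bit value, for illustration only */

struct vf_dev { unsigned int flags; };

/* True if at least one VF is currently handed to a guest. */
static bool any_vf_assigned(const struct vf_dev *vfs, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (vfs[i].flags & DEV_FLAG_ASSIGNED)
            return true;
    return false;
}

/* Returns true if SR-IOV was disabled, false if it was left enabled
 * because a VF is still assigned. */
static bool try_disable_sriov(const struct vf_dev *vfs, size_t n,
                              bool *sriov_enabled)
{
    if (any_vf_assigned(vfs, n))
        return false;
    *sriov_enabled = false;
    return true;
}
```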
12 years ago be2net: Add functionality to support RoCE driver
Parav Pandit [Fri, 24 Aug 2012 10:01:27 +0000 (15:31 +0530)]
be2net: Add functionality to support RoCE driver

- Increase MSI-X vectors by 5 for RoCE traffic.
- Add macro to check roce support on a device.
- Add device-specific doorbell and MSI-X vector fields shared with NIC
  functionality.
- Provide RoCE driver registration and deregistration functions.
- Add support functions which will be invoked on adapter add/remove
  and port up/down events.
- Traverse through the list of adapters to invoke callback functions.

Signed-off-by: Parav Pandit <parav.pandit@emulex.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Roland Dreier <roland@purestorage.com>
12 years ago be2net: Add function to issue mailbox cmd on MQ
Parav Pandit [Mon, 26 Mar 2012 14:27:12 +0000 (14:27 +0000)]
be2net: Add function to issue mailbox cmd on MQ

- Add generic function to issue mailbox cmd on MQ as an exported function.
- RoCE driver will use this before it sets up its own MQ.

Signed-off-by: Parav Pandit <parav.pandit@emulex.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Roland Dreier <roland@purestorage.com>
12 years ago Merge branch '2.6.39-300#bug14096387' of git://ca-git.us.oracle.com/linux-guasun...
Maxim Uvarov [Mon, 27 Aug 2012 13:53:10 +0000 (06:53 -0700)]
Merge branch '2.6.39-300#bug14096387' of git://ca-git.us.oracle.com/linux-guasun-public

12 years ago qla2xxx: Update the driver version to 8.04.00.07.39.0-k.
Saurav Kashyap [Mon, 27 Aug 2012 07:35:32 +0000 (13:05 +0530)]
qla2xxx: Update the driver version to 8.04.00.07.39.0-k.

Signed-off-by: Saurav Kashyap <saurav.kashyap@qlogic.com>
12 years ago qla2xxx: Delay for legacy interrupts not required for all boards for ISP83xx.
Giridhar Malavali [Tue, 21 Aug 2012 21:18:00 +0000 (14:18 -0700)]
qla2xxx: Delay for legacy interrupts not required for all boards for ISP83xx.

JIRA Key: V2632FC-263

Acked-by: Chad Dupuis <chad.dupuis@qlogic.com>
12 years ago qla2xxx: Use the right field for container_of.
Arun Easi [Thu, 9 Aug 2012 19:01:17 +0000 (12:01 -0700)]
qla2xxx: Use the right field for container_of.

JIRA Key: V2632FC-259

Acked-by: Giridhar Malavali <giridhar.malavali@qlogic.com>
Acked-by: Chad Dupuis <chad.dupuis@qlogic.com>
12 years ago qla2xxx: Allow MSI interrupt registration for ISP82xx.
Giridhar Malavali [Fri, 10 Aug 2012 13:39:56 +0000 (06:39 -0700)]
qla2xxx: Allow MSI interrupt registration for ISP82xx.

JIRA Key: V2632FC-260

Acked-by: Chad Dupuis <chad.dupuis@qlogic.com>
12 years ago qla2xxx: Don't toggle RISC interrupt bits after IRQ lines are attached.
Giridhar Malavali [Wed, 8 Aug 2012 14:21:28 +0000 (07:21 -0700)]
qla2xxx: Don't toggle RISC interrupt bits after IRQ lines are attached.

JIRA Key: V2632FC-246.

Acked-by: Giridhar Malavali <giridhar.malavali@qlogic.com>
Acked-by: Joe Carnuccio <joe.carnuccio@qlogic.com>
12 years ago qla2xxx: Fix incorrect status reporting on DIF errors.
Arun Easi [Fri, 3 Aug 2012 17:09:35 +0000 (10:09 -0700)]
qla2xxx: Fix incorrect status reporting on DIF errors.

JIRA Key: V2632FC-254

Acked-by: Giridhar Malavali <giridhar.malavali@qlogic.com>
12 years ago qla2xxx: Remove dumping fw on timeout for bidirectional commands.
Chad Dupuis [Fri, 3 Aug 2012 19:19:42 +0000 (12:19 -0700)]
qla2xxx: Remove dumping fw on timeout for bidirectional commands.

JIRA Key: V2632FC-255

12 years ago qla2xxx: T10 DIF - ISP83xx changes.
Arun Easi [Mon, 27 Aug 2012 07:31:59 +0000 (13:01 +0530)]
qla2xxx: T10 DIF - ISP83xx changes.

JIRA Key: V2632FC-136

Acked-by: Giridhar Malvali <giridhar.malvali@qlogic.com>
Signed-off-by: Saurav Kashyap <saurav.kashyap@qlogic.com>
12 years ago qla2xxx: Fix for legacy interrupts for ISP83xx.
Chad Dupuis [Fri, 3 Aug 2012 02:46:01 +0000 (08:16 +0530)]
qla2xxx: Fix for legacy interrupts for ISP83xx.

JIRA Key: V2632FC-253

Acked-by: Saurav Kashyap <saurav.kashyap@qlogic.com>
Acked-by: Chad Dupuis <chad.dupuis@qlogic.com>
Signed-off-by: Saurav Kashyap <saurav.kashyap@qlogic.com>
12 years ago qla2xxx: Restrict nic core reset to one function for mctp.
Saurav Kashyap [Thu, 2 Aug 2012 06:04:17 +0000 (11:34 +0530)]
qla2xxx: Restrict nic core reset to one function for mctp.

On an mctp-enabled board, both functions receive the 8200 AEN, both capture
the dump, and both end up restarting the nic core. This patch allows only
one function to perform the nic core reset.

JIRA Key: V2632FC-251

ER: ER96691

Acked-by: Giridhar Malavali <giridhar.malavali@qlogic.com>
12 years ago qla2xxx: Update to Implementation of the mctp.
Saurav Kashyap [Mon, 27 Aug 2012 07:28:00 +0000 (12:58 +0530)]
qla2xxx: Update to Implementation of the mctp.

- Changes the size to 0x86064.
- The fw expects dword len instead of bytes.
- Removed family version; it has been removed from the requirements.
- Do nic core reset even on dump failure.
- mctp dump failure used to return failure even in case of success.
- memset mailbox regs and correctly set them.

Jira Key: V2632FC-213

ER: ER95908

Signed-off-by: Saurav Kashyap <saurav.kashyap@qlogic.com>
12 years ago qla2xxx: Enable fw attributes for ISP24xx and above.
Saurav Kashyap [Tue, 31 Jul 2012 07:59:46 +0000 (13:29 +0530)]
qla2xxx: Enable fw attributes for ISP24xx and above.

JIRA Key: V2632FC-250

Acked-by: Chad Dupuis <chad.dupuis@qlogic.com>
Acked-by: Armen Baloyan <armen.baloyan@qlogic.com>
12 years ago qla2xxx: Get fcal position map should not be called for p2p topology.
Saurav Kashyap [Tue, 31 Jul 2012 07:34:51 +0000 (13:04 +0530)]
qla2xxx: Get fcal position map should not be called for p2p topology.

JIRA Key: V2632FC-249

Acked-by: Chad Dupuis <chad.dupuis@qlogic.com>
Acked-by: Armen Baloyan <armen.baloyan@qlogic.com>
12 years ago qla2xxx: Change log messages to dbg and remove dumping fw on timeout for bidirectional.
Saurav Kashyap [Wed, 1 Aug 2012 19:48:36 +0000 (12:48 -0700)]
qla2xxx: Change log messages to dbg and remove dumping fw on timeout for bidirectional.

JIRA Key: V2632FC-248

Acked-by: Giridhar Malavali <giridhar.malavali@qlogic.com>
Acked-by: Chad Dupuis <chad.dupuis@qlogic.com>
12 years ago qla2xxx: Set Maximum Read Request Size to 4K.
Chad Dupuis [Mon, 30 Jul 2012 14:36:13 +0000 (07:36 -0700)]
qla2xxx: Set Maximum Read Request Size to 4K.

JIRA Key: V2632FC-247

Acked-by: Saurav Kashyap <saurav.kashyap@qlogic.com>
Acked-by: Giridhar Malavali <giridhar.malavali@qlogic.com>
12 years ago qla2xxx: Enclose adapter related calls in adapter check in failed state handler.
Saurav Kashyap [Mon, 30 Jul 2012 14:30:50 +0000 (07:30 -0700)]
qla2xxx: Enclose adapter related calls in adapter check in failed state handler.

JIRA Key: V2632FC-243

12 years ago qla2xxx: Fix for handling some error conditions in loopback.
Chad Dupuis [Mon, 27 Aug 2012 07:24:57 +0000 (12:54 +0530)]
qla2xxx: Fix for handling some error conditions in loopback.

Fixes the bug where we did not return an error status and fail the loopback
operation when we do not receive the DCBX completion AEN after a set-port
mbx, or when we get a bad status for the IDC completion (mbx 8100) AEN
(i.e. bit 15 of mb2 is set).

JIRA Key: V2632FC-245

Acked-by: Giridhar Malavali <giridhar.malavali@qlogic.com>
Acked-by: Saurav Kashyap <saurav.kashyap@qlogic.com>
12 years ago qla2xxx: Fix description of qla2xmaxqdepth parameter.
Chad Dupuis [Mon, 23 Jul 2012 15:07:59 +0000 (08:07 -0700)]
qla2xxx: Fix description of qla2xmaxqdepth parameter.

JIRA Key: V2632FC-241

12 years ago qla2xxx: set idc version if function is first one to come.
Saurav Kashyap [Mon, 27 Aug 2012 07:22:39 +0000 (12:52 +0530)]
qla2xxx: set idc version if function is first one to come.

JIRA Key: V2632FC-235

ER: ER95274

Signed-off-by: Saurav Kashyap <saurav.kashyap@qlogic.com>
12 years ago qla2xxx: Do not restrict the number of NPIV ports for ISP83xx.
Saurav Kashyap [Fri, 13 Jul 2012 20:14:13 +0000 (15:14 -0500)]
qla2xxx: Do not restrict the number of NPIV ports for ISP83xx.

JIRA Key: V2632FC-236

12 years ago qla2xxx: Do PCI fundamental reset for 83xx
Joe Carnuccio [Wed, 11 Jul 2012 01:20:35 +0000 (01:20 +0000)]
qla2xxx: Do PCI fundamental reset for 83xx

On ISP83xx cards perform a fundamental reset instead of hot reset.

JIRA Key: V2632FC-234

ER: ER88065

Acked-by: Saurav Kashyap <saurav.kashyap@qlogic.com>
Acked-by: Giridhar Malavali <giridhar.malavali@qlogic.com>
12 years ago qla2xxx: Fail initialization if unable to load RISC code.
Andrew Vasquez [Fri, 3 Aug 2012 01:54:56 +0000 (07:24 +0530)]
qla2xxx: Fail initialization if unable to load RISC code.

JIRA Key: V2632FC-233

ER: ER95512

Signed-off-by: Saurav Kashyap <saurav.kashyap@qlogic.com>
12 years ago qla2xxx: Ensure PLOGI is sent to Fabric Management-Server upon request.
Andrew Vasquez [Tue, 3 Jul 2012 16:51:56 +0000 (09:51 -0700)]
qla2xxx: Ensure PLOGI is sent to Fabric Management-Server upon request.

The internal firmware state for this 'well known port' may
be out-of-sync after a link-flop, causing a follow-on
CT-request to be dropped due to the requestor not having
been 'logged in'.  Correct the code by not passing the
'conditional' directive for the PLOGI request.

JIRA Key: V2632FC-232

12 years ago qla2xxx: Remove setting Scsi_host->this_id during adapter probe.
Chad Dupuis [Fri, 6 Jul 2012 19:44:58 +0000 (14:44 -0500)]
qla2xxx: Remove setting Scsi_host->this_id during adapter probe.

Setting this to 255 will cause any target with id 255 to not show up so leave
it at the default in our host template.

JIRA Key: V2632FC-231

ER: ER93655

12 years ago qla2xxx: Use #defines instead of hardcoded values for intr status.
Arun Easi [Thu, 15 Sep 2011 03:54:33 +0000 (20:54 -0700)]
qla2xxx: Use #defines instead of hardcoded values for intr status.

JIRA Key: V2632FC-191

12 years ago don't warn on for mlock ulimits on shm_hugetlb
chris.mason@oracle.com [Sat, 17 Jul 2010 02:49:36 +0000 (22:49 -0400)]
don't warn on for mlock ulimits on shm_hugetlb

Orabug: 14096387

cherry-picked from git://ca-git.us.oracle.com/linux-2.6-unbreakable.git
commit ec208c6e401a2b6100ac8fa00eeb03bd79434c07

We get this once per DB startup, let's turn it off for now.

Signed-off-by: Chris Mason <chris.mason@oracle.com>
Conflicts:

fs/hugetlbfs/inode.c

Signed-off-by: Guangyu Sun <guangyu.sun@oracle.com>
12 years ago SPEC: v2.6.39-300.5.0
Maxim Uvarov [Mon, 20 Aug 2012 12:37:15 +0000 (05:37 -0700)]
SPEC: v2.6.39-300.5.0

Signed-off-by: Maxim Uvarov <maxim.uvarov@oracle.com>
12 years ago Merge branch 'uek2-merge' of git://ca-git.us.oracle.com/linux-konrad-public
Maxim Uvarov [Mon, 20 Aug 2012 11:01:39 +0000 (04:01 -0700)]
Merge branch 'uek2-merge' of git://ca-git.us.oracle.com/linux-konrad-public

Conflicts:
arch/x86/xen/enlighten.c

12 years ago [ovmapi] fix memcpy overrun, leaks and mutex unlock
Cathy Avery [Fri, 17 Aug 2012 19:15:28 +0000 (15:15 -0400)]
[ovmapi] fix memcpy overrun, leaks and mutex unlock

Added bug fixes:
memcpy overrun of the name and value buffers when strings are too long.
Fixed memory leaks.
Fixed not unlocking mutex on some error returns.

Signed-off-by: Cathy Avery <cathy.avery@oracle.com>
12 years ago Merge branch 'stable/for-linus-3.7.rebased' into uek2-merge
Konrad Rzeszutek Wilk [Fri, 17 Aug 2012 14:26:21 +0000 (10:26 -0400)]
Merge branch 'stable/for-linus-3.7.rebased' into uek2-merge

* stable/for-linus-3.7.rebased:
  xen/mmu: If the revector fails, don't attempt to revector anything else.
  xen/p2m: When revectoring deal with holes in the P2M array.
  xen/p2m: Reuse existing P2M leafs if they are filled with 1:1 PFNs or INVALID.
  Revert "xen PVonHVM: move shared_info to MMIO before kexec"
  xen/mmu: Release just the MFN list, not MFN list and part of pagetables.

12 years ago xen/mmu: If the revector fails, don't attempt to revector anything else.
Konrad Rzeszutek Wilk [Fri, 17 Aug 2012 13:35:31 +0000 (09:35 -0400)]
xen/mmu: If the revector fails, don't attempt to revector anything else.

If the P2M revectoring would fail, we would try to continue on by
cleaning the PMD for L1 (PTE) page-tables. The xen_cleanhighmap
is greedy and erases the PMD on both boundaries. Since the P2M
array can share the PMD, we would wipe out part of the __ka
that is still used in the P2M tree to point to P2M leafs.

This fixes it by bypassing the revectoring and continuing on.
If the revector fails, a nice WARN is printed so we can still
troubleshoot this.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
12 years ago xen/p2m: When revectoring deal with holes in the P2M array.
Konrad Rzeszutek Wilk [Thu, 16 Aug 2012 20:38:55 +0000 (16:38 -0400)]
xen/p2m: When revectoring deal with holes in the P2M array.

When we free the PFNs and then subsequently populate them back
during bootup:

Freeing 20000-20200 pfn range: 512 pages freed
1-1 mapping on 20000->20200
Freeing 40000-40200 pfn range: 512 pages freed
1-1 mapping on 40000->40200
Freeing bad80-badf4 pfn range: 116 pages freed
1-1 mapping on bad80->badf4
Freeing badf6-bae7f pfn range: 137 pages freed
1-1 mapping on badf6->bae7f
Freeing bb000-100000 pfn range: 282624 pages freed
1-1 mapping on bb000->100000
Released 283999 pages of unused memory
Set 283999 page(s) to 1-1 mapping
Populating 1acb8a-1f20e9 pfn range: 283999 pages added

We end up having the P2M array (that is, the one that was
grafted on the P2M tree) filled with IDENTITY_FRAME or
INVALID_P2M_ENTRY entries. The patch titled

"xen/p2m: Reuse existing P2M leafs if they are filled with 1:1 PFNs or INVALID."
recycles said slots and replaces the P2M tree leaves that point to
&mfn_list[xx] with p2m_identity or p2m_missing.

It also re-uses the P2M array sections for other P2M tree leaves.
For the above mentioned bootup excerpt, the PFNs at
0x20000->0x20200 are going to be IDENTITY based:

P2M[0][256][0] -> P2M[0][257][0] get turned into IDENTITY_FRAME.

We can re-use that and replace P2M[0][256] to point to p2m_identity.
The "old" page (the grafted P2M array provided by Xen) that was at
P2M[0][256] gets put somewhere else. Specifically at P2M[6][358],
b/c when we populate back:

Populating 1acb8a-1f20e9 pfn range: 283999 pages added

we fill P2M[6][358][0] (and P2M[6][358], P2M[6][359], ...) with
the new MFNs.

That is all OK, except when we revector we assume that the PFN
count would be the same in the grafted P2M array and in the
newly allocated. Since that is no longer the case, as we have
holes in the P2M that point to p2m_missing or p2m_identity we
have to take that into account.

[v2: Check for overflow]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
12 years ago xen/p2m: Reuse existing P2M leafs if they are filled with 1:1 PFNs or INVALID.
Konrad Rzeszutek Wilk [Fri, 17 Aug 2012 13:27:35 +0000 (09:27 -0400)]
xen/p2m: Reuse existing P2M leafs if they are filled with 1:1 PFNs or INVALID.

If a P2M leaf is completely packed with INVALID_P2M_ENTRY or with
1:1 PFNs (so IDENTITY_FRAME type PFNs), we can swap the P2M leaf
with either a p2m_missing or p2m_identity respectively. The old
page (which was created via extend_brk or was grafted on from the
mfn_list) can be re-used for setting new PFNs.

This also means we can remove git commit:
5bc6f9888db5739abfa0cae279b4b442e4db8049
xen/p2m: Reserve 8MB of _brk space for P2M leafs when populating back
which tried to fix this.

and make the amount that is required to be reserved much smaller.

CC: stable@vger.kernel.org # for 3.5 only.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
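
Note on the commit above: the test it describes amounts to scanning one leaf page and asking whether every slot is invalid, or every slot is the identity mapping of its own PFN. The sketch below models that scan only. LEAF_ENTRIES, INVALID_ENTRY, IDENTITY_BIT and classify_leaf() are hypothetical stand-ins; the kernel's real constants and encoding may differ.

```c
#include <stddef.h>
#include <stdint.h>

#define LEAF_ENTRIES  512                        /* hypothetical leaf size */
#define INVALID_ENTRY (~(uint64_t)0)             /* stand-in for INVALID_P2M_ENTRY */
#define IDENTITY_BIT  ((uint64_t)1 << 62)        /* stand-in identity marker */
#define IDENTITY(pfn) ((uint64_t)(pfn) | IDENTITY_BIT)

enum leaf_kind { LEAF_MIXED, LEAF_ALL_INVALID, LEAF_ALL_IDENTITY };

/* Classify one leaf whose first slot describes first_pfn. A leaf that is
 * uniformly invalid or uniformly 1:1 could be swapped for a shared
 * "missing" or "identity" page, freeing the leaf for re-use. */
static enum leaf_kind classify_leaf(const uint64_t *leaf, uint64_t first_pfn)
{
    size_t invalid = 0, identity = 0;

    for (size_t i = 0; i < LEAF_ENTRIES; i++) {
        if (leaf[i] == INVALID_ENTRY)
            invalid++;
        else if (leaf[i] == IDENTITY(first_pfn + i))
            identity++;
    }
    if (invalid == LEAF_ENTRIES)
        return LEAF_ALL_INVALID;
    if (identity == LEAF_ENTRIES)
        return LEAF_ALL_IDENTITY;
    return LEAF_MIXED;
}
```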
12 years ago x86, mtrr: Fix a type overflow in range_to_mtrr func
zhenzhong.duan [Wed, 30 May 2012 04:52:15 +0000 (12:52 +0800)]
x86, mtrr: Fix a type overflow in range_to_mtrr func

Orabug: 14073173
When booting on a Sun G5+ with 4T of memory, we see an overflow in mtrr cleanup as below.

*BAD*gran_size: 2G      chunk_size: 2G  num_reg: 10     lose cover RAM:
-18014398505283592M

This is because 1<<31 is sign-extended. Use an unsigned long constant to
fix it.  Useful for memory sizes larger than or equal to 4T.

-v2: Use 64bit constant instead of explicit type conversion as suggested
by Yinghai. Description updated too.

Signed-off-by: Zhenzhong Duan <zhenzhong.duan@oracle.com>
Link: http://lkml.kernel.org/r/4FC5A77F.6060505@oracle.com
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
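
Note on the commit above: the sign-extension effect can be reproduced in user space. The sketch below compares a shift carried out in 32-bit signed arithmetic and then widened with a shift done directly on a 64-bit constant. size_via_int() and size_via_u64() are hypothetical helpers, not the mtrr code. Converting 0x80000000 to int32_t is implementation-defined in C; the comment assumes the usual two's-complement result.

```c
#include <stdint.h>

/* Shift done in 32 bits, stored in a signed 32-bit value, then widened.
 * With align == 31 the narrow value is INT32_MIN on common compilers,
 * and widening it sign-extends into the upper 32 bits. */
static uint64_t size_via_int(unsigned int align)
{
    int32_t narrow = (int32_t)(UINT32_C(1) << align);
    return (uint64_t)(int64_t)narrow;
}

/* Shift done on a 64-bit unsigned constant: no sign bit is involved. */
static uint64_t size_via_u64(unsigned int align)
{
    return UINT64_C(1) << align;
}
```

Both helpers agree for shifts below 31 and diverge only when bit 31 is set, which matches the commit's remark that the bug shows up from 4T of memory.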
12 years ago Fetch dmi version from SMBIOS if it exist
Zhenzhong Duan [Tue, 24 Jul 2012 11:59:04 +0000 (19:59 +0800)]
Fetch dmi version from SMBIOS if it exist

Orabug: 14267379
The right DMI version is in SMBIOS if it is zero in the DMI region.

Signed-off-by: Zhenzhong Duan <zhenzhong.duan@oracle.com>
12 years ago Check dmi version when get system uuid
Zhenzhong Duan [Wed, 11 Jul 2012 04:02:57 +0000 (12:02 +0800)]
Check dmi version when get system uuid

Orabug: 14267379
As of version 2.6 of the SMBIOS specification, the first 3
fields of the UUID are supposed to be encoded as little-endian.

Also a minor fix to match variable meaning and mute checkpatch.pl

Signed-off-by: Zhenzhong Duan <zhenzhong.duan@oracle.com>
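
Note on the commit above: a version-dependent UUID formatter might look like the sketch below. It byte-swaps the first three fields when the reported version is 2.6 or later and prints the bytes in storage order otherwise. format_uuid() and the (major << 8) | minor version encoding are assumptions made for this illustration, not the kernel's DMI code.

```c
#include <stdio.h>

/* ver is encoded as (major << 8) | minor, e.g. 0x0206 for SMBIOS 2.6.
 * out must have room for 36 characters plus the terminating NUL. */
static void format_uuid(const unsigned char u[16], unsigned int ver,
                        char out[37])
{
    if (ver >= 0x0206)
        /* First three fields are stored little-endian: swap for display. */
        snprintf(out, 37,
                 "%02X%02X%02X%02X-%02X%02X-%02X%02X-"
                 "%02X%02X-%02X%02X%02X%02X%02X%02X",
                 u[3], u[2], u[1], u[0], u[5], u[4], u[7], u[6],
                 u[8], u[9], u[10], u[11], u[12], u[13], u[14], u[15]);
    else
        /* Older versions: print the bytes in the order they are stored. */
        snprintf(out, 37,
                 "%02X%02X%02X%02X-%02X%02X-%02X%02X-"
                 "%02X%02X-%02X%02X%02X%02X%02X%02X",
                 u[0], u[1], u[2], u[3], u[4], u[5], u[6], u[7],
                 u[8], u[9], u[10], u[11], u[12], u[13], u[14], u[15]);
}
```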
12 years ago Merge git://ca-git.us.oracle.com/linux-zduan-public.git v2.6.39-200.18.0#bug13993738
Maxim Uvarov [Thu, 16 Aug 2012 09:06:56 +0000 (02:06 -0700)]
Merge git://ca-git.us.oracle.com/linux-zduan-public.git v2.6.39-200.18.0#bug13993738

13 years ago Revert "xen PVonHVM: move shared_info to MMIO before kexec"
Konrad Rzeszutek Wilk [Tue, 14 Aug 2012 20:57:14 +0000 (16:57 -0400)]
Revert "xen PVonHVM: move shared_info to MMIO before kexec"

This reverts commit cfa1df57c9047dcd1b743a4c0487eb686bdea013.

It causes an infinite reading loop of pvclock on shutdown in
PVHVM case. Will revisit once it's fixed upstream.

13 years ago xen/mmu: Release just the MFN list, not MFN list and part of pagetables.
Konrad Rzeszutek Wilk [Tue, 14 Aug 2012 20:37:31 +0000 (16:37 -0400)]
xen/mmu: Release just the MFN list, not MFN list and part of pagetables.

We call memblock_reserve for [start of mfn list] -> [PMD aligned end
of mfn list] instead of [start of mfn list] -> [page aligned end of mfn list].

If the end of mfn_list is not PMD aligned at bootup, we end up returning
to memblock parts of the region past the mfn_list array. Those parts are
the PTE tables, with the disastrous effect of seeing this at bootup:

Write protecting the kernel read-only data: 10240k
Freeing unused kernel memory: 1860k freed
Freeing unused kernel memory: 200k freed
(XEN) mm.c:2429:d0 Bad type (saw 1400000000000002 != exp 7000000000000000) for mfn 116a80 (pfn 14e26)
...
(XEN) mm.c:908:d0 Error getting mfn 116a83 (pfn 14e2a) from L1 entry 8000000116a83067 for l1e_owner=0, pg_owner=0
(XEN) mm.c:908:d0 Error getting mfn 4040 (pfn 5555555555555555) from L1 entry 0000000004040601 for l1e_owner=0, pg_owner=0
.. and so on.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
(cherry picked from commit 0ebf4641eb144e633ae9a6466f4c9eaa1db6dc9b)

Conflicts:

arch/x86/xen/mmu.c
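
Note on the commit above: the size of the mistake is the gap between rounding the end of the list up to a page boundary and rounding it up to a PMD boundary. The sketch below computes that gap. PAGE_SZ (4 KiB) and PMD_SZ (2 MiB) are the usual x86-64 values and are assumed here; round_up_pow2() and overshoot() are hypothetical helpers.

```c
#include <stdint.h>

#define PAGE_SZ (UINT64_C(1) << 12)  /* 4 KiB page, assumed */
#define PMD_SZ  (UINT64_C(1) << 21)  /* 2 MiB covered by one PMD entry, assumed */

/* Round x up to a multiple of align; align must be a power of two. */
static uint64_t round_up_pow2(uint64_t x, uint64_t align)
{
    return (x + align - 1) & ~(align - 1);
}

/* Bytes past the page-aligned end that a PMD-aligned range also covers. */
static uint64_t overshoot(uint64_t end)
{
    return round_up_pow2(end, PMD_SZ) - round_up_pow2(end, PAGE_SZ);
}
```

When the end happens to be PMD aligned the gap is zero, which is consistent with the problem only appearing on some boots.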

13 years ago Merge branch 'uek-2.6.39-300_bug14472774' of git://ca-git.us.oracle.com/linux-muvarov...
Maxim Uvarov [Mon, 13 Aug 2012 13:03:21 +0000 (06:03 -0700)]
Merge branch 'uek-2.6.39-300_bug14472774' of git://ca-git.us.oracle.com/linux-muvarov-public

13 years ago x86/nmi: Add new NMI queues to deal with IO_CHK and SERR
Maxim Uvarov [Thu, 9 Aug 2012 15:14:24 +0000 (08:14 -0700)]
x86/nmi: Add new NMI queues to deal with IO_CHK and SERR

In discussions with Thomas Mingarelli about hpwdt, he explained
to me some issues they were seeing when using their virtual NMI
button to test the hpwdt driver.

It turns out the virtual NMI button used on HP's machines does not
send unknown NMIs but instead sends IO_CHK NMIs.  The way the
kernel code is written, the hpwdt driver can not register itself
against that type of NMI and therefore can not successfully
capture system information before panic'ing.

To solve this I created two new NMI queues to allow drivers to
register against the IO_CHK and SERR NMIs, or in the hpwdt case all
three (if you include unknown NMIs too).

The change is straightforward and just mimics what the unknown
NMI does.

Reported-and-tested-by: Thomas Mingarelli <thomas.mingarelli@hp.com>
Signed-off-by: Don Zickus <dzickus@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/1333051877-15755-3-git-send-email-dzickus@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Conflicts:

arch/x86/kernel/nmi.c
drivers/watchdog/hpwdt.c

13 years ago x86, nmi: Create new NMI handler routines
Don Zickus [Fri, 30 Sep 2011 19:06:20 +0000 (15:06 -0400)]
x86, nmi: Create new NMI handler routines

The NMI handlers used to rely on the notifier infrastructure.  This worked
great until we wanted to support handling multiple events better.

One of the key ideas to the nmi handling is to process _all_ the handlers for
each NMI.  The reason behind this switch is because NMIs are edge triggered.
If enough NMIs are triggered, then they could be lost because the cpu can
only latch at most one NMI (besides the one currently being processed).

In order to deal with this we have decided to process all the NMI handlers
for each NMI.  This allows the handlers to determine if they received an
event or not (the ones that can not determine this will be left to fend
for themselves on the unknown NMI list).

As a result of this change it is now possible to have an extra NMI that
was destined to be received for an already processed event.  Because the
event was processed in the previous NMI, this NMI gets dropped and becomes
an 'unknown' NMI.  This of course will cause printks that scare people.

However, we prefer to have extra NMIs as opposed to losing NMIs and as such
have developed a basic mechanism to catch most of them.  That will be
a later patch.

To accomplish this idea, I unhooked the nmi handlers from the notifier
routines and created a new mechanism loosely based on doIRQ.  The reason
for this is the notifier routines have a few shortcomings.  First, we
couldn't guarantee all future NMI handlers used NOTIFY_OK instead of
NOTIFY_STOP.  Second, we couldn't keep track of the number of events being
handled in each routine (most only handle one, perf can handle more than one).
Third, I wanted to eventually display which nmi handlers are registered in
the system in /proc/interrupts to help see who is generating NMIs.

The patch below just implements the new infrastructure but doesn't wire it up
yet (that is the next patch).  Its design is based on doIRQ structs and the
atomic notifier routines.  So the rcu stuff in the patch isn't entirely untested
(as the notifier routines have soaked it) but it should be double checked in
case I copied the code wrong.

Signed-off-by: Don Zickus <dzickus@redhat.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1317409584-23662-3-git-send-email-dzickus@redhat.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
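
Note on the commit above: the "run every handler, count what was handled" idea can be modelled as a walk over a list with no early exit. The sketch below is a user-space illustration only. struct nmi_action, run_all_handlers() and count_pending() are hypothetical, and the RCU protection the commit mentions is left out.

```c
#include <stddef.h>

struct nmi_action {
    int (*handler)(void *ctx);   /* returns the number of events it handled */
    void *ctx;
    struct nmi_action *next;
};

/* Call every registered handler and sum the handled counts. There is no
 * early exit, so one handler cannot hide an event meant for another. */
static int run_all_handlers(struct nmi_action *head)
{
    int handled = 0;

    for (struct nmi_action *a = head; a != NULL; a = a->next)
        handled += a->handler(a->ctx);
    return handled;
}

/* Example handler: consume the pending-event counter passed via ctx. */
static int count_pending(void *ctx)
{
    int *pending = ctx;
    int n = *pending;

    *pending = 0;
    return n;
}
```

A total of zero corresponds to the case the commit calls an 'unknown' NMI: an extra NMI arriving after its event was already consumed.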
13 years ago Merge commit 'v2.6.39-300.4.0#bug14479881v2'
Maxim Uvarov [Mon, 13 Aug 2012 12:39:23 +0000 (05:39 -0700)]
Merge commit 'v2.6.39-300.4.0#bug14479881v2'

13 years ago Merge branch 'bug14063941' of git://ca-git.us.oracle.com/linux-jubi-public
Maxim Uvarov [Mon, 13 Aug 2012 12:36:32 +0000 (05:36 -0700)]
Merge branch 'bug14063941' of git://ca-git.us.oracle.com/linux-jubi-public

13 years ago tick: Add tick skew boot option
Mike Galbraith [Tue, 8 May 2012 10:20:58 +0000 (12:20 +0200)]
tick: Add tick skew boot option

Let the user decide whether power consumption or jitter is the
more important consideration for their machines.

Quoting removal commit af5ab277ded04bd9bc6b048c5a2f0e7d70ef0867:

"Historically, Linux has tried to make the regular timer tick on the
 various CPUs not happen at the same time, to avoid contention on
 xtime_lock.

 Nowadays, with the tickless kernel, this contention no longer happens
 since time keeping and updating are done differently. In addition,
 this skew is actually hurting power consumption in a measurable way on
 many-core systems."

Problems:

- Contrary to the above, systems do encounter contention on both
  xtime_lock and RCU structure locks when the tick is synchronized.

- Moderate sized RT systems suffer intolerable jitter due to the tick
  being synchronized.

- SGI reports the same for their large systems.

- Fully utilized systems reap no power saving benefit from skew removal,
  but do suffer from resulting induced lock contention.

0209f649 rcu: limit rcu_node leaf-level fanout
  This patch was born to combat lock contention which testing showed
  to have been _induced by_ skew removal.  Skew the tick, contention
  disappeared virtually completely.

Signed-off-by: Mike Galbraith <mgalbraith@suse.de>
Link: http://lkml.kernel.org/r/1336472458.21924.78.camel@marge.simpson.net
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
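
Note on the commit above: the message does not spell out how the skew is computed. One plausible scheme, assumed here for illustration, spreads the CPUs evenly across half a tick period so that no two ticks fire at the same instant. tick_skew_ns() is a hypothetical helper.

```c
#include <stdint.h>

/* Per-CPU offset, in nanoseconds, added to that CPU's first tick.
 * Assumed scheme: divide half a tick period evenly among the CPUs. */
static uint64_t tick_skew_ns(uint64_t tick_period_ns, unsigned int ncpus,
                             unsigned int cpu)
{
    uint64_t offset = tick_period_ns / 2;

    offset /= ncpus;
    return offset * cpu;
}
```

With a 1 ms tick and 4 CPUs this gives offsets 125 us apart; CPU 0 keeps an offset of zero.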
13 years ago mm/vmstat.c: cache align vm_stat
Dimitri Sivanich [Tue, 1 Nov 2011 00:09:46 +0000 (17:09 -0700)]
mm/vmstat.c: cache align vm_stat

Avoid false sharing of the vm_stat array.

This was found to adversely affect tmpfs I/O performance.

Tests run on a 640 cpu UV system.

With 120 threads doing parallel writes, each to different tmpfs mounts:
No patch: ~300 MB/sec
With vm_stat alignment: ~430 MB/sec

Signed-off-by: Dimitri Sivanich <sivanich@sgi.com>
Acked-by: Christoph Lameter <cl@gentwo.org>
Acked-by: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
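
Note on the commit above: false sharing is avoided by making a hot array start on its own cache line, so writes to neighbouring data do not invalidate it. The sketch below shows the idea with the C11 _Alignas specifier. The 64-byte line size, struct counters_demo and its fields are assumptions for illustration; the kernel uses its own alignment annotations.

```c
#include <stddef.h>

#define CACHE_LINE 64   /* assumed cache line size in bytes */
#define NR_ITEMS   4    /* hypothetical number of counters */

struct counters_demo {
    char hot_neighbour;                       /* data written by someone else */
    _Alignas(CACHE_LINE) long stat[NR_ITEMS]; /* starts on its own cache line */
};
```

For scale, the throughput figures quoted in the commit (about 300 MB/sec to about 430 MB/sec) correspond to roughly a 43% improvement.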
13 years ago vfs: fix panic in __d_lookup() with high dentry hashtable counts
Dimitri Sivanich [Fri, 10 Aug 2012 11:46:56 +0000 (04:46 -0700)]
vfs: fix panic in __d_lookup() with high dentry hashtable counts

When the number of dentry cache hash table entries gets too high
(2147483648 entries), as happens by default on a 16TB system, use of a
signed integer in the dcache_init() initialization loop prevents the
dentry_hashtable from getting initialized, causing a panic in
__d_lookup().  Fix this in dcache_init() and similar areas.

Signed-off-by: Dimitri Sivanich <sivanich@sgi.com>
Acked-by: David S. Miller <davem@davemloft.net>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Conflicts:

net/ipv4/tcp.c

13 years ago cpusets: randomize node rotor used in cpuset_mem_spread_node()
Jack Steiner [Fri, 10 Aug 2012 11:45:33 +0000 (04:45 -0700)]
cpusets: randomize node rotor used in cpuset_mem_spread_node()

Some workloads that create a large number of small files tend to assign
too many pages to node 0 (multi-node systems).  Part of the reason is that
the rotor (in cpuset_mem_spread_node()) used to assign nodes starts at
node 0 for newly created tasks.

This patch changes the rotor to be initialized to a random node number of
the cpuset.
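
The idea can be sketched in user space as follows. This is an illustration only: the struct, the function names, and the use of rand() are assumptions, not the cpuset implementation. The rotor still advances round-robin, but its starting point is drawn at random instead of always being node 0.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical per-task spread state. */
struct spread_state {
    int rotor;     /* next node to hand out */
    int nr_nodes;  /* number of nodes in the set */
};

/* Start the rotor at a random node instead of node 0. */
static void rotor_init_random(struct spread_state *s, int nr_nodes,
                              unsigned int seed)
{
    assert(s != NULL && nr_nodes > 0);
    srand(seed);
    s->nr_nodes = nr_nodes;
    s->rotor = rand() % nr_nodes;
}

/* Return the current node and advance round-robin. */
static int spread_next_node(struct spread_state *s)
{
    int node = s->rotor;

    s->rotor = (s->rotor + 1) % s->nr_nodes;
    return node;
}
```

With many short-lived tasks that each allocate only a few pages, a random start spreads the first allocations across nodes rather than piling them onto node 0.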

[akpm@linux-foundation.org: fix layout]
[Lee.Schermerhorn@hp.com: Define stub numa_random() for !NUMA configuration]
Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Paul Menage <menage@google.com>
Cc: Jack Steiner <steiner@sgi.com>
Cc: Robin Holt <holt@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Conflicts:

arch/x86/mm/numa.c

13 years agox86: Reduce clock calibration time during slave cpu startup
Jack Steiner [Fri, 10 Aug 2012 11:44:37 +0000 (04:44 -0700)]
x86: Reduce clock calibration time during slave cpu startup

Reduce the startup time for slave cpus.

Adds hooks for an arch-specific function for clock calibration.
These hooks are used on x86.  If a newly started cpu has the
same phys_proc_id as a core already active, uses the TSC for the
delay loop and has a CONSTANT_TSC, use the already-calculated
value of loops_per_jiffy.
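
The reuse condition described above might be modelled like this. The struct and function are hypothetical stand-ins, not the arch hook added by the patch: a new cpu borrows loops_per_jiffy from an already-online cpu in the same package only when it uses the TSC for the delay loop and has a constant TSC; a return value of 0 means a full calibration is still needed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-cpu record for this sketch. */
struct demo_cpu {
    int phys_proc_id;     /* package the cpu belongs to */
    bool online;          /* already started and calibrated */
    bool constant_tsc;    /* TSC rate does not vary */
    bool tsc_delay;       /* delay loop is TSC based */
    unsigned long lpj;    /* calibrated loops_per_jiffy, 0 if none */
};

/* Return a reusable lpj for new_cpu, or 0 if it must calibrate itself. */
static unsigned long reuse_lpj(const struct demo_cpu *cpus, size_t n,
                               size_t new_cpu)
{
    assert(cpus != NULL && new_cpu < n);
    const struct demo_cpu *c = &cpus[new_cpu];

    if (!c->constant_tsc || !c->tsc_delay)
        return 0;
    for (size_t i = 0; i < n; i++) {
        if (i != new_cpu && cpus[i].online && cpus[i].lpj != 0 &&
            cpus[i].phys_proc_id == c->phys_proc_id)
            return cpus[i].lpj;
    }
    return 0;
}
```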

This patch reduces the time required to start slave cpus on a
4096 cpu system from 465 sec (old) to 62 sec (new).

This reduces boot time on a 4096p system by almost 7 minutes.
Nice...

Signed-off-by: Jack Steiner <steiner@sgi.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: John Stultz <john.stultz@linaro.org>
[fix CONFIG_SMP=n build]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Conflicts:

init/calibrate.c

13 years agox86, UV: Enable 64-bit ACPI MCFG support for SGI UV2 platform
Jack Steiner [Thu, 2 Jun 2011 19:59:43 +0000 (14:59 -0500)]
x86, UV: Enable 64-bit ACPI MCFG support for SGI UV2 platform

Enable 64-bit ACPI MCFG support for SGI UV2 platform. The check
is similar to the check on UV1. UV2 has a different oem_id
string.

Signed-off-by: Jack Steiner <steiner@sgi.com>
Link: http://lkml.kernel.org/r/20110602195943.GA27079@sgi.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
13 years agox86, pci: Increase the number of iommus supported to be MAX_IO_APICS
Mike Travis [Fri, 10 Aug 2012 11:40:36 +0000 (04:40 -0700)]
x86, pci: Increase the number of iommus supported to be MAX_IO_APICS

The number of IOMMUs supported should be the same as the number of IO APICS.
This limit comes into play when the IOMMUs are identity mapped, thus the
number of possible IOMMUs in the "static identity" (si) domain should be
this same number.

Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Jack Steiner <steiner@sgi.com>
13 years agox86 PCI: Fix identity mapping for sandy bridge
Mike Travis [Fri, 10 Aug 2012 11:37:48 +0000 (04:37 -0700)]
x86 PCI: Fix identity mapping for sandy bridge

With SandyBridge, Intel has changed these Socket PCI devices to have a class
type of "System Peripheral" & "Performance counter", rather than "HostBridge".
So instead of using a "special" case to detect which devices will not be
doing DMA, use the fact that a device that is not associated with an IOMMU,
will not need an identity map.

Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Mike Habeck <habeck@sgi.com>
13 years agox86, nmi: Split out nmi from traps.c
Don Zickus [Fri, 30 Sep 2011 19:06:19 +0000 (15:06 -0400)]
x86, nmi: Split out nmi from traps.c

The nmi stuff is changing a lot and adding more functionality.  Split it
out from the traps.c file so it doesn't continue to pollute that file.

This makes it easier to find and expand all the future nmi related work.

No real functional changes here.

Signed-off-by: Don Zickus <dzickus@redhat.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1317409584-23662-2-git-send-email-dzickus@redhat.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
13 years agoSPEC: v2.6.39-300.4.0
Maxim Uvarov [Tue, 7 Aug 2012 07:59:16 +0000 (00:59 -0700)]
SPEC: v2.6.39-300.4.0

Signed-off-by: Maxim Uvarov <maxim.uvarov@oracle.com>
13 years agocciss: only enable cciss_allow_hpsa for ol5
Joe Jin [Fri, 20 Jul 2012 23:48:32 +0000 (07:48 +0800)]
cciss: only enable cciss_allow_hpsa for ol5

Orabug: 14106006

Commit 55cd818 enabled cciss_allow_hpsa for both ol5 and ol6. ol6 does not
need this feature, so enable it only for ol5.

Signed-off-by: Joe Jin <joe.jin@oracle.com>
13 years agoRevert "cciss: remove controllers supported by hpsa"
Joe Jin [Tue, 17 Jul 2012 06:06:07 +0000 (14:06 +0800)]
Revert "cciss: remove controllers supported by hpsa"

Orabug: 14106006

This reverts commit 4205df34003eec4371020872cdfa228ffae5bd6a.

Conflicts:
drivers/block/cciss.c

Signed-off-by: Joe Jin <joe.jin@oracle.com>
13 years ago[scsi] hpsa: add all supported devices for ol5
Joe Jin [Fri, 20 Jul 2012 13:30:51 +0000 (21:30 +0800)]
[scsi] hpsa: add all supported devices for ol5

Orabug: 14106006
To support uek2 on ol5, commit 29a8828 disabled some devices in the support list.
As a result, the ovm3 upgrade from 3.0.3 to 3.1.1 failed to address the local disk
because the disk device name changed.

If the kernel runs as the ovm3.1.1 dom0 kernel, please pass cciss_allow_hpsa=1 when
loading the cciss driver on ol5.

Signed-off-by: Joe Jin <joe.jin@oracle.com>
13 years agoDisable VLAN 0 tagging for non-VLAN traffic
Adnan Misherfi [Thu, 2 Aug 2012 20:17:44 +0000 (16:17 -0400)]
Disable VLAN 0 tagging for non-VLAN traffic

Orabug: 14406424
The Cisco enic driver on UCS blades tags non-VLAN traffic with VLAN 0. This causes VMs
that do not have the kernel patch "VLAN 0 should be treated as no vlan tag" to drop all
received traffic, as these VMs do not know how to deal with the VLAN 0 tag.
This is also a problem for older VMs that cannot take the mentioned patch.

This fix stops the enic driver from tagging non-VLAN traffic with VLAN 0. The
fix is controlled by the driver parameter "disable_vlan0". The default value is disable_vlan0=1,
which stops the driver from tagging traffic with VLAN 0. To revert to the original behavior,
add "options enic disable_vlan0=0" to /etc/modprobe.con

Signed-off-by: Adnan Misherfi <adnan.misherfi@oracle.com>
13 years agoSPEC: v2.6.39-300.3.0
Maxim Uvarov [Mon, 6 Aug 2012 14:17:59 +0000 (07:17 -0700)]
SPEC: v2.6.39-300.3.0

Signed-off-by: Maxim Uvarov <maxim.uvarov@oracle.com>
13 years agoMerge branch '2.6.39-300#bug14126896' of git://ca-git.us.oracle.com/linux-guasun...
Maxim Uvarov [Mon, 6 Aug 2012 10:49:11 +0000 (03:49 -0700)]
Merge branch '2.6.39-300#bug14126896' of git://ca-git.us.oracle.com/linux-guasun-public

13 years agoMerge branch '2.6.39-300#bug14233738' of git://ca-git.us.oracle.com/linux-guasun...
Maxim Uvarov [Mon, 6 Aug 2012 10:48:14 +0000 (03:48 -0700)]
Merge branch '2.6.39-300#bug14233738' of git://ca-git.us.oracle.com/linux-guasun-public

13 years agodl2k: Clean up rio_ioctl
Jeff Mahoney [Thu, 2 Aug 2012 12:04:00 +0000 (05:04 -0700)]
dl2k: Clean up rio_ioctl

Orabug: 14126896
The dl2k driver's rio_ioctl call has a few issues:
- No permissions checking
- Implements SIOCGMIIREG and SIOCSMIIREG using the SIOCDEVPRIVATE numbers
- Has a few ioctls that may have been used for debugging at one point
  but have no place in the kernel proper.

This patch removes all but the MII ioctls, renumbers them to use the
standard ones, and adds the proper permission check for SIOCSMIIREG.
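
A rough model of the resulting dispatch, for illustration only: the command enum, the caller struct, and the function name are hypothetical, and the real driver works with the kernel's ioctl numbers and capability checks rather than these stand-ins. Read-only MII requests are open to any caller, the register write requires a privileged caller, and the former debugging commands are rejected.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the MII requests and a removed debug request. */
enum demo_cmd { DEMO_GET_PHY, DEMO_GET_REG, DEMO_SET_REG, DEMO_DEBUG_DUMP };

/* Hypothetical caller credentials. */
struct demo_caller {
    bool net_admin;  /* caller may change network configuration */
};

static int demo_mii_ioctl(const struct demo_caller *who, enum demo_cmd cmd)
{
    assert(who != NULL);
    switch (cmd) {
    case DEMO_GET_PHY:
    case DEMO_GET_REG:
        return 0;                            /* reads need no privilege */
    case DEMO_SET_REG:
        return who->net_admin ? 0 : -EPERM;  /* writes are privileged */
    default:
        return -EOPNOTSUPP;                  /* debug requests are gone */
    }
}
```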

We can also get rid of the dl2k-specific struct mii_data in favor of
the generic struct mii_ioctl_data.

Since we have the phyid on hand, we can add the SIOCGMIIPHY ioctl too.

Most of the MII code for the driver could probably be converted to use
the generic MII library but I don't have a device to test the results.
This fixes: CVE-2012-2313

Reported-by: Stephan Mueller <stephan.mueller@atsec.com>
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Maxim Uvarov <maxim.uvarov@oracle.com>
Signed-off-by: Guangyu Sun <guangyu.sun@oracle.com>
13 years agodl2k: use standard #defines from mii.h
Guangyu Sun [Fri, 3 Aug 2012 22:16:33 +0000 (15:16 -0700)]
dl2k: use standard #defines from mii.h

upstream commit 78f6a6bd89e9a33e4be1bc61e6990a1172aa396e

Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Signed-off-by: Guangyu Sun <guangyu.sun@oracle.com>
13 years ago[SCSI] vmw_pvscsi: Try setting host->max_id as suggested by the device.
Arvind Kumar [Thu, 8 Mar 2012 10:18:53 +0000 (15:48 +0530)]
[SCSI] vmw_pvscsi: Try setting host->max_id as suggested by the device.

Fetch the config page from the device to learn max target id to set
host->max_id.

Also, fix some indentation issues and update the 'Maintained by' field.

Signed-off-by: Arvind Kumar <arvindkumar@vmware.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Signed-off-by: Guangyu Sun <guangyu.sun@oracle.com>
13 years agodl2k: Clean up rio_ioctl
Jeff Mahoney [Thu, 2 Aug 2012 12:04:00 +0000 (05:04 -0700)]
dl2k: Clean up rio_ioctl

Orabug: 14126896
The dl2k driver's rio_ioctl call has a few issues:
- No permissions checking
- Implements SIOCGMIIREG and SIOCSMIIREG using the SIOCDEVPRIVATE numbers
- Has a few ioctls that may have been used for debugging at one point
  but have no place in the kernel proper.

This patch removes all but the MII ioctls, renumbers them to use the
standard ones, and adds the proper permission check for SIOCSMIIREG.

We can also get rid of the dl2k-specific struct mii_data in favor of
the generic struct mii_ioctl_data.

Since we have the phyid on hand, we can add the SIOCGMIIPHY ioctl too.

Most of the MII code for the driver could probably be converted to use
the generic MII library but I don't have a device to test the results.
This fixes: CVE-2012-2313

Reported-by: Stephan Mueller <stephan.mueller@atsec.com>
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Maxim Uvarov <maxim.uvarov@oracle.com>
13 years agoSPEC: v2.6.39-300.2.0
Maxim Uvarov [Thu, 2 Aug 2012 09:48:53 +0000 (02:48 -0700)]
SPEC: v2.6.39-300.2.0

Signed-off-by: Maxim Uvarov <maxim.uvarov@oracle.com>
13 years agoMerge branch 'uek2-merge' of git://ca-git.us.oracle.com/linux-konrad-public
Maxim Uvarov [Wed, 1 Aug 2012 15:37:03 +0000 (08:37 -0700)]
Merge branch 'uek2-merge' of git://ca-git.us.oracle.com/linux-konrad-public

13 years agoMerge branch 'stable/for-linus-3.7.rebased' into uek2-merge
Konrad Rzeszutek Wilk [Tue, 31 Jul 2012 19:33:39 +0000 (15:33 -0400)]
Merge branch 'stable/for-linus-3.7.rebased' into uek2-merge

* stable/for-linus-3.7.rebased:
  xen/mmu/enlighten: Fix memblock_x86_reserve_range downport.

13 years agoxen/mmu/enlighten: Fix memblock_x86_reserve_range downport.
Konrad Rzeszutek Wilk [Tue, 31 Jul 2012 19:21:02 +0000 (15:21 -0400)]
xen/mmu/enlighten: Fix memblock_x86_reserve_range downport.

This is not for upstream as memblock_x86_reserve_range is not
used upstream anymore.

When I back-ported the patches:
xen/x86: Use memblock_reserve for sensitive areas.
xen/mmu: Recycle the Xen provided L4, L3, and L2 pages

I simply used sed s/memblock_reserve/memblock_x86_reserve_range/.
That was incorrect as the parameters are different - memblock_reserve
expects the size as its second argument, while memblock_x86_reserve_range
expects the end physical address. This patch fixes those bugs.
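
The mismatch can be reproduced with two stand-in functions. Both are hypothetical models written for this sketch, not the kernel API: one takes (base, size), the other takes (start, end). A blind rename passes a size where an end address is expected, so the reserved range ends in the wrong place.

```c
#include <assert.h>
#include <stdint.h>

/* Half-open physical range [start, end). */
struct range {
    uint64_t start;
    uint64_t end;
};

/* Stand-in for an interface that takes (base, size). */
static struct range reserve_by_size(uint64_t base, uint64_t size)
{
    assert(base + size >= base);  /* no wrap-around */
    return (struct range){ base, base + size };
}

/* Stand-in for an interface that takes (start, end). */
static struct range reserve_by_range(uint64_t start, uint64_t end)
{
    return (struct range){ start, end };
}
```

A correct conversion of reserve_by_size(base, size) is reserve_by_range(base, base + size); the sed-style rename yields reserve_by_range(base, size) instead.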

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
13 years agoMerge branch 'stable/for-linus-3.7.rebased' into uek2-merge
Konrad Rzeszutek Wilk [Tue, 31 Jul 2012 18:44:59 +0000 (14:44 -0400)]
Merge branch 'stable/for-linus-3.7.rebased' into uek2-merge

* stable/for-linus-3.7.rebased:
  xen/p2m: Reserve 8MB of _brk space for P2M leafs when populating back.
  xen/mmu: Remove from __ka space PMD entries for pagetables.
  xen/mmu: Copy and revector the P2M tree.
  xen/p2m: Add logic to revector a P2M tree to use __va leafs.
  xen/mmu: Recycle the Xen provided L4, L3, and L2 pages
  xen/mmu: For 64-bit do not call xen_map_identity_early
  xen/mmu: use copy_page instead of memcpy.
  xen/mmu: Provide comments describing the _ka and _va aliasing issue
  xen/mmu: The xen_setup_kernel_pagetable doesn't need to return anything.
  xen/x86: Use memblock_reserve for sensitive areas.
  xen/p2m: Fix the comment describing the P2M tree.
  xen/perf: Define .glob for the different hypercalls.

13 years agoxen/p2m: Reserve 8MB of _brk space for P2M leafs when populating back.
Konrad Rzeszutek Wilk [Mon, 30 Jul 2012 14:18:05 +0000 (10:18 -0400)]
xen/p2m: Reserve 8MB of _brk space for P2M leafs when populating back.

When we release pages back during bootup:

Freeing  9d-100 pfn range: 99 pages freed
Freeing  9cf36-9d0d2 pfn range: 412 pages freed
Freeing  9f6bd-9f6bf pfn range: 2 pages freed
Freeing  9f714-9f7bf pfn range: 171 pages freed
Freeing  9f7e0-9f7ff pfn range: 31 pages freed
Freeing  9f800-100000 pfn range: 395264 pages freed
Released 395979 pages of unused memory

We then try to populate those pages back. In the P2M tree however
the space for those leafs must be reserved - as such we use extend_brk.
We reserve 8MB of _brk space, which means we can fit over
1048576 PFNs - which is more than we should ever need.
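
The numbers above can be checked with a few lines of C. The 8-byte entry size is an assumption (one unsigned long per PFN on 64-bit), and the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define BRK_RESERVE_BYTES (8ULL * 1024 * 1024) /* 8MB of _brk space */
#define P2M_ENTRY_BYTES   8ULL /* assumption: 64-bit unsigned long entries */

/* How many P2M entries fit in a region of the given size. */
static uint64_t p2m_entries_in(uint64_t bytes)
{
    return bytes / P2M_ENTRY_BYTES;
}

/* Pages in the half-open pfn range [start, end). */
static uint64_t pfn_range_pages(uint64_t start, uint64_t end)
{
    assert(end >= start);
    return end - start;
}
```

Summing pfn_range_pages() over the six ranges in the log gives the 395979 released pages, and p2m_entries_in(BRK_RESERVE_BYTES) gives 1048576, which covers that total with room to spare.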

[v1: Made it 8MB of _brk space instead of 4MB per Jan's suggestion]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
(cherry picked from commit 99266871de5006ba7ad0bfece6bb283ede4094b9)

13 years agoxen/mmu: Remove from __ka space PMD entries for pagetables.
Konrad Rzeszutek Wilk [Thu, 26 Jul 2012 20:57:19 +0000 (16:57 -0400)]
xen/mmu: Remove from __ka space PMD entries for pagetables.

Please first read the description in "xen/mmu: Copy and revector the
P2M tree."

At this stage, the __ka address space (which is what the old
P2M tree was using) is partially disassembled. The cleanup_highmap
has removed the PMD entries from 0-16MB and anything past _brk_end
up to the max_pfn_mapped (which is the end of the ramdisk).

The xen_remove_p2m_tree and code around has ripped out the __ka for
the old P2M array.

Here we continue on doing it to where the Xen page-tables were.
It is safe to do it, as the page-tables are addressed using __va.
For good measure we delete anything that is within MODULES_VADDR
and up to the end of the PMD.

At this point the __ka only contains PMD entries for the start
of the kernel up to __brk.

[v1: Per Stefano's suggestion wrapped the MODULES_VADDR in debug]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
(cherry picked from commit 4e928e1a48b6b76e0b8384160213a32d03197e4b)

13 years agoxen/mmu: Copy and revector the P2M tree.
Konrad Rzeszutek Wilk [Thu, 26 Jul 2012 16:47:40 +0000 (12:47 -0400)]
xen/mmu: Copy and revector the P2M tree.

Please first read the description in "xen/p2m: Add logic to revector a
P2M tree to use __va leafs" patch.

The 'xen_revector_p2m_tree()' function allocates a new P2M tree,
copies the contents of the old one into it, and returns the new one.

At this stage, the __ka address space (which is what the old
P2M tree was using) is partially disassembled. The cleanup_highmap
has removed the PMD entries from 0-16MB and anything past _brk_end
up to the max_pfn_mapped (which is the end of the ramdisk).

We have revectored the P2M tree (and the one for save/restore as well)
to use new shiny __va address to new MFNs. The xen_start_info
has been taken care of already in 'xen_setup_kernel_pagetable()' and
xen_start_info->shared_info in 'xen_setup_shared_info()', so
we are free to roam and delete PMD entries - which is exactly what
we are going to do. We rip out the __ka for the old P2M array.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Conflicts:

arch/x86/xen/mmu.c
[upstream git commit 3a06359601deaec046ce33008527edfa6731ef23]
[s/memblock_free/memblock_x86_free_range]

13 years agoxen/p2m: Add logic to revector a P2M tree to use __va leafs.
Konrad Rzeszutek Wilk [Thu, 19 Jul 2012 17:52:29 +0000 (13:52 -0400)]
xen/p2m: Add logic to revector a P2M tree to use __va leafs.

During bootup Xen supplies us with a P2M array. It sticks
it right after the ramdisk, as can be seen with a 128GB PV guest:

(certain parts removed for clarity):
xc_dom_build_image: called
xc_dom_alloc_segment:   kernel       : 0xffffffff81000000 -> 0xffffffff81e43000  (pfn 0x1000 + 0xe43 pages)
xc_dom_pfn_to_ptr: domU mapping: pfn 0x1000+0xe43 at 0x7f097d8bf000
xc_dom_alloc_segment:   ramdisk      : 0xffffffff81e43000 -> 0xffffffff925c7000  (pfn 0x1e43 + 0x10784 pages)
xc_dom_pfn_to_ptr: domU mapping: pfn 0x1e43+0x10784 at 0x7f0952dd2000
xc_dom_alloc_segment:   phys2mach    : 0xffffffff925c7000 -> 0xffffffffa25c7000  (pfn 0x125c7 + 0x10000 pages)
xc_dom_pfn_to_ptr: domU mapping: pfn 0x125c7+0x10000 at 0x7f0942dd2000
xc_dom_alloc_page   :   start info   : 0xffffffffa25c7000 (pfn 0x225c7)
xc_dom_alloc_page   :   xenstore     : 0xffffffffa25c8000 (pfn 0x225c8)
xc_dom_alloc_page   :   console      : 0xffffffffa25c9000 (pfn 0x225c9)
nr_page_tables: 0x0000ffffffffffff/48: 0xffff000000000000 -> 0xffffffffffffffff, 1 table(s)
nr_page_tables: 0x0000007fffffffff/39: 0xffffff8000000000 -> 0xffffffffffffffff, 1 table(s)
nr_page_tables: 0x000000003fffffff/30: 0xffffffff80000000 -> 0xffffffffbfffffff, 1 table(s)
nr_page_tables: 0x00000000001fffff/21: 0xffffffff80000000 -> 0xffffffffa27fffff, 276 table(s)
xc_dom_alloc_segment:   page tables  : 0xffffffffa25ca000 -> 0xffffffffa26e1000  (pfn 0x225ca + 0x117 pages)
xc_dom_pfn_to_ptr: domU mapping: pfn 0x225ca+0x117 at 0x7f097d7a8000
xc_dom_alloc_page   :   boot stack   : 0xffffffffa26e1000 (pfn 0x226e1)
xc_dom_build_image  : virt_alloc_end : 0xffffffffa26e2000
xc_dom_build_image  : virt_pgtab_end : 0xffffffffa2800000

So the physical memory and virtual (using __START_KERNEL_map addresses)
layout looks as so:

  phys                             __ka
/------------\                   /-------------------\
| 0          | empty             | 0xffffffff80000000|
| ..         |                   | ..                |
| 16MB       | <= kernel starts  | 0xffffffff81000000|
| ..         |                   |                   |
| 30MB       | <= kernel ends => | 0xffffffff81e43000|
| ..         |  & ramdisk starts | ..                |
| 293MB      | <= ramdisk ends=> | 0xffffffff925c7000|
| ..         |  & P2M starts     | ..                |
| ..         |                   | ..                |
| 549MB      | <= P2M ends    => | 0xffffffffa25c7000|
| ..         | start_info        | 0xffffffffa25c7000|
| ..         | xenstore          | 0xffffffffa25c8000|
| ..         | console           | 0xffffffffa25c9000|
| 549MB      | <= page tables => | 0xffffffffa25ca000|
| ..         |                   |                   |
| 550MB      | <= PGT end     => | 0xffffffffa26e1000|
| ..         | boot stack        |                   |
\------------/                   \-------------------/

As can be seen, the ramdisk, P2M and pagetables are taking
a bit of __ka address space. This is a problem since
MODULES_VADDR starts at 0xffffffffa0000000 - and P2M sits
right in there! During bootup this results in the inability to
load modules, with this error:

------------[ cut here ]------------
WARNING: at /home/konrad/ssd/linux/mm/vmalloc.c:106 vmap_page_range_noflush+0x2d9/0x370()
Call Trace:
 [<ffffffff810719fa>] warn_slowpath_common+0x7a/0xb0
 [<ffffffff81030279>] ? __raw_callee_save_xen_pmd_val+0x11/0x1e
 [<ffffffff81071a45>] warn_slowpath_null+0x15/0x20
 [<ffffffff81130b89>] vmap_page_range_noflush+0x2d9/0x370
 [<ffffffff81130c4d>] map_vm_area+0x2d/0x50
 [<ffffffff811326d0>] __vmalloc_node_range+0x160/0x250
 [<ffffffff810c5369>] ? module_alloc_update_bounds+0x19/0x80
 [<ffffffff810c6186>] ? load_module+0x66/0x19c0
 [<ffffffff8105cadc>] module_alloc+0x5c/0x60
 [<ffffffff810c5369>] ? module_alloc_update_bounds+0x19/0x80
 [<ffffffff810c5369>] module_alloc_update_bounds+0x19/0x80
 [<ffffffff810c70c3>] load_module+0xfa3/0x19c0
 [<ffffffff812491f6>] ? security_file_permission+0x86/0x90
 [<ffffffff810c7b3a>] sys_init_module+0x5a/0x220
 [<ffffffff815ce339>] system_call_fastpath+0x16/0x1b
---[ end trace fd8f7704fdea0291 ]---
vmalloc: allocation failure, allocated 16384 of 20480 bytes
modprobe: page allocation failure: order:0, mode:0xd2
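
The overlap behind this failure can be verified from the addresses quoted in the xc_dom output. The constant and function names below are hypothetical; the values are taken from the layout shown earlier in this message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_MODULES_VADDR 0xffffffffa0000000ULL /* start of module space */
#define DEMO_P2M_KA_START  0xffffffff925c7000ULL /* phys2mach segment start */
#define DEMO_P2M_KA_END    0xffffffffa25c7000ULL /* phys2mach segment end */

/* True if addr lies in the half-open range [start, end). */
static bool in_range(uint64_t addr, uint64_t start, uint64_t end)
{
    assert(start <= end);
    return addr >= start && addr < end;
}
```

The P2M segment spans 0x10000000 bytes (256MB), and its last 0x25c7000 bytes sit at or above the start of the module space, which is why module allocations collide with existing PMD entries.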

Since the __va and __ka are 1:1 up to MODULES_VADDR and
cleanup_highmap rids __ka of the ramdisk mapping, what
we want to do is similar - get rid of the P2M in the __ka
address space. There are two ways of fixing this:

 1) All P2M lookups instead of using the __ka address would
    use the __va address. This means we can safely erase from
    __ka space the PMD pointers that point to the PFNs for
    P2M array and be OK.
 2) Allocate a new array, copy the existing P2M into it,
    revector the P2M tree to use that, and return the old
    P2M to the memory allocator. This has the advantage that
    it sets the stage for using XEN_ELF_NOTE_INIT_P2M
    feature. That feature allows us to set the exact virtual
    address space we want for the P2M - and allows us to
    boot as initial domain on large machines.

So we pick option 2).

This patch only lays the groundwork in the P2M code. The patch
that modifies the MMU is called "xen/mmu: Copy and revector the P2M tree."

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
(cherry picked from commit 0b4d1198932f4204d74f6032dce6dbd095fa9531)

13 years agoxen/mmu: Recycle the Xen provided L4, L3, and L2 pages
Konrad Rzeszutek Wilk [Thu, 26 Jul 2012 16:00:56 +0000 (12:00 -0400)]
xen/mmu: Recycle the Xen provided L4, L3, and L2 pages

As we are not using them. We end up only using the L1 pagetables
and grafting those to our page-tables.

[v1: Per Stefano's suggestion squashed two commits]
[v2: Per Stefano's suggestion simplified loop]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Conflicts:

arch/x86/xen/mmu.c
[s/memblock_reserve/memblock_x86_reserve_range]
[cherry picked from d950a0fb6d64c4c9f160e3770cef0109e27627b0]

13 years agoxen/mmu: For 64-bit do not call xen_map_identity_early
Konrad Rzeszutek Wilk [Thu, 12 Jul 2012 17:59:36 +0000 (13:59 -0400)]
xen/mmu: For 64-bit do not call xen_map_identity_early

Because we do not need it. During startup Xen provides
us with all the mapped memory that we need to function.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
(cherry picked from commit d90be24669f2c39a29f821a654956f30cc9c4ed2)

13 years agoxen/mmu: use copy_page instead of memcpy.
Konrad Rzeszutek Wilk [Thu, 26 Jul 2012 15:57:04 +0000 (11:57 -0400)]
xen/mmu: use copy_page instead of memcpy.

After all, this is what it is there for.

Acked-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
(cherry picked from commit cbc09be35990fb3d15671507f11c3e90479ef816)

13 years agoxen/mmu: Provide comments describing the _ka and _va aliasing issue
Konrad Rzeszutek Wilk [Thu, 12 Jul 2012 17:55:25 +0000 (13:55 -0400)]
xen/mmu: Provide comments describing the _ka and _va aliasing issue

Which is that the level2_kernel_pgt (__ka virtual addresses)
and level2_ident_pgt (__va virtual address) contain the same
PMD entries. So if you modify a PTE in __ka, it will be reflected
in __va (and vice-versa).
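
The aliasing can be modelled in user space with two upper-level tables whose entries point at the same lower-level page. Everything here is a simplified, hypothetical model with made-up names; it does not use real page-table formats.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_PTRS_PER_TABLE 512

typedef uint64_t demo_entry_t;

/* One lower-level page shared by both mappings. */
static demo_entry_t shared_page[DEMO_PTRS_PER_TABLE];

/* Two distinct upper-level tables that reference the same page. */
static demo_entry_t *ka_table[1] = { shared_page };
static demo_entry_t *va_table[1] = { shared_page };

/* Write an entry through the given upper-level table. */
static void set_via(demo_entry_t **upper, unsigned int idx, demo_entry_t val)
{
    assert(idx < DEMO_PTRS_PER_TABLE);
    upper[0][idx] = val;
}

/* Read an entry through the given upper-level table. */
static demo_entry_t get_via(demo_entry_t **upper, unsigned int idx)
{
    assert(idx < DEMO_PTRS_PER_TABLE);
    return upper[0][idx];
}
```

A value written with set_via(ka_table, ...) is immediately visible through get_via(va_table, ...), and the reverse holds too, because both paths end at the same storage.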

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
(cherry picked from commit 26e694dc644c36641d6d73585400caa1f015e1fd)

13 years agoxen/mmu: The xen_setup_kernel_pagetable doesn't need to return anything.
Konrad Rzeszutek Wilk [Fri, 29 Jun 2012 02:47:35 +0000 (22:47 -0400)]
xen/mmu: The xen_setup_kernel_pagetable doesn't need to return anything.

We don't need to return the new PGD - as we do not use it.

Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
(cherry picked from commit a573e36a3641f268ee6215a7d7cf74610ca5e81a)

Conflicts:

arch/x86/xen/enlighten.c
arch/x86/xen/mmu.c

13 years agoxen/x86: Use memblock_reserve for sensitive areas.
Konrad Rzeszutek Wilk [Thu, 19 Jul 2012 14:23:47 +0000 (10:23 -0400)]
xen/x86: Use memblock_reserve for sensitive areas.

instead of a big memblock_reserve. This way we can be more
selective in freeing regions (and it also makes it easier
to understand what is where).

[v1: Move the auto_translate_physmap to proper line]
[v2: Per Stefano's suggestion add more comments]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
[upstream git commit 91addbf07abfdd109a9da4e02061e6ed3728b298]
Conflicts:

arch/x86/xen/setup.c
[s/memblock_reserve/memblock_x86_reserve_range]

13 years agoxen/p2m: Fix the comment describing the P2M tree.
Konrad Rzeszutek Wilk [Fri, 29 Jun 2012 02:12:36 +0000 (22:12 -0400)]
xen/p2m: Fix the comment describing the P2M tree.

It mixed up the p2m_mid_missing with p2m_missing. Also
remove some extra spaces.

[upstream git commit 800ea898bbd7f79ef99695f71538f204e24cbcf3]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
13 years agoxen/perf: Define .glob for the different hypercalls.
Konrad Rzeszutek Wilk [Wed, 11 Jul 2012 19:03:18 +0000 (15:03 -0400)]
xen/perf: Define .glob for the different hypercalls.

This allows us in perf to have this:

 99.67%  [kernel]             [k] xen_hypercall_sched_op
  0.11%  [kernel]             [k] xen_hypercall_xen_version

instead of the boring ever-encompassing:

 99.13%  [kernel]              [k] hypercall_page

[v2: Use a macro to define the name and skip]
[v3: Use balign per Jan's suggestion]

[upstream git commit 7d0642b93780a7309d2954de6f6126d6ceb526f0]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
13 years agoSPEC: v2.6.39-300.1.0
Maxim Uvarov [Thu, 26 Jul 2012 13:28:30 +0000 (06:28 -0700)]
SPEC: v2.6.39-300.1.0

Signed-off-by: Maxim Uvarov <maxim.uvarov@oracle.com>
13 years agoMerge branch 'uek2-merge' of git://ca-git.us.oracle.com/linux-konrad-public
Maxim Uvarov [Thu, 26 Jul 2012 13:11:26 +0000 (06:11 -0700)]
Merge branch 'uek2-merge' of git://ca-git.us.oracle.com/linux-konrad-public

13 years agoMerge branch 'stable/for-linus-3.6.rebased' into uek2-merge
Konrad Rzeszutek Wilk [Wed, 25 Jul 2012 17:13:12 +0000 (13:13 -0400)]
Merge branch 'stable/for-linus-3.6.rebased' into uek2-merge

* stable/for-linus-3.6.rebased:
  xen/p2m: Check __brk_limit before allocating.