www.infradead.org Git - users/jedix/linux-maple.git/log
13 years ago ixgbevf: Update the driver string
Greg Rose [Thu, 19 May 2011 02:11:45 +0000 (02:11 +0000)]
ixgbevf: Update the driver string

Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Tested-by: Evan Swanson <evan.swanson@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit 1057c42747ebf4d1cbaa2ab6125b92914b8ec622)

Signed-off-by: Joe Jin <joe.jin@oracle.com>
13 years ago e1000e: workaround invalid Tx/Rx tail descriptor register write
Bruce Allan [Fri, 29 Jul 2011 05:53:02 +0000 (05:53 +0000)]
e1000e: workaround invalid Tx/Rx tail descriptor register write

When the Manageability Engine (ME) is enabled on 82579, it periodically
accesses some MAC CSR registers.  There is an arbiter in hardware which
prevents simultaneous access of these registers by the host software, i.e.
the driver.  There is a hardware bug in the arbiter that signals a host
access of the registers later than it actually happens.  A write of the
Transmit or Receive Descriptor Tail register could result in an incorrect
value if the driver and ME perform simultaneous accesses which could result
in an access to an invalid memory address.  This would return an
Unsupported Request which could hang the hardware.  Work around the issue by
checking the FWSM register bit24 which is set by ME before it accesses the
MAC CSR registers.
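
The check described above can be sketched in plain C against a simulated register block. This is only an illustration: the macro name, the struct, and the polling limit are assumptions made for the sketch, not the driver's actual code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical name for FWSM bit 24, which ME sets before touching MAC CSRs. */
#define SKETCH_FWSM_ME_ACCESS (1u << 24)

/* Simulated registers; a real driver would use its MMIO accessors instead. */
struct sim_regs {
    uint32_t fwsm; /* firmware semaphore register */
    uint32_t tail; /* Tx or Rx descriptor tail register */
};

/*
 * Write the tail register only while ME is not signalling an access.
 * Returns true if the write was issued, false if the bit stayed set
 * for max_polls consecutive checks.
 */
static bool write_tail_checked(struct sim_regs *regs, uint32_t value,
                               unsigned int max_polls)
{
    for (unsigned int i = 0; i < max_polls; i++) {
        if (!(regs->fwsm & SKETCH_FWSM_ME_ACCESS)) {
            regs->tail = value;
            return true;
        }
        /* A real driver would delay briefly here before checking again. */
    }
    return false;
}
```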

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit c6e7f51e73c1bc6044bce989ec503ef2e4758d55)

Signed-off-by: Joe Jin <joe.jin@oracle.com>
13 years ago e1000e: Spurious interrupts & dropped packets with 82577/8/9 in half-duplex
Bruce Allan [Fri, 22 Jul 2011 06:21:56 +0000 (06:21 +0000)]
e1000e: Spurious interrupts & dropped packets with 82577/8/9 in half-duplex

On 82577/8/9 in half-duplex when a received packet is passed from the PHY
to the MAC, if too many preamble octets are stripped from the packet
before arriving at the MAC, it can be misinterpreted as an in-band message
rather than an actual frame.  For example, if the frame contents resembled
an interrupt request in-band message, it would trigger a false interrupt.
In most cases, the packet is just dropped.

By reducing the number of preamble octets stripped from the beginning of
the frame when passing it from the PHY to the MAC, the MAC will interpret
the frame properly.

Additional uses of the magic PHY_REG(770, 16) have been updated with a
define introduced with this patch.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit 1d2101a712b3b7281a19ff6d7bfc16c2ce9d3998)

Signed-off-by: Joe Jin <joe.jin@oracle.com>
13 years ago e1000e: increase driver version number
Bruce Allan [Fri, 22 Jul 2011 06:22:02 +0000 (06:22 +0000)]
e1000e: increase driver version number

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit 12440928dca77eccc8a793cf3cd83d017abbd7d6)

Signed-off-by: Joe Jin <joe.jin@oracle.com>
13 years ago e1000e: alternate MAC address update
Bruce Allan [Fri, 29 Jul 2011 05:53:07 +0000 (05:53 +0000)]
e1000e: alternate MAC address update

If word 0x37 in the EEPROM is 0xFFFF _or_ 0x0000, then there is no
alternate MAC address in the EEPROM.
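
A minimal sketch of that test, assuming the EEPROM contents are available as an array of 16-bit words. The function and macro names are made up for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* EEPROM word named in the commit text; the macro name is hypothetical. */
#define SKETCH_ALT_MAC_WORD 0x37

/* Returns true only if word 0x37 is neither 0xFFFF nor 0x0000. */
static bool alt_mac_present(const uint16_t *eeprom_words)
{
    uint16_t word = eeprom_words[SKETCH_ALT_MAC_WORD];

    return word != 0xFFFF && word != 0x0000;
}
```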

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit 244735f6ebccbf72a283db89472309f770e14c80)

Signed-off-by: Joe Jin <joe.jin@oracle.com>
13 years ago e1000e: do not disable receiver on 82574/82583
Bruce Allan [Fri, 22 Jul 2011 06:21:35 +0000 (06:21 +0000)]
e1000e: do not disable receiver on 82574/82583

Due to a hardware erratum, the receiver on 82574 and 82583 should not be
stopped once it has been started.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit 7f99ae633884043c70f4cc4a03f43dad0f0ecba2)

Signed-off-by: Joe Jin <joe.jin@oracle.com>
13 years ago e1000e: minor re-order of #include files
Bruce Allan [Fri, 29 Jul 2011 05:52:51 +0000 (05:52 +0000)]
e1000e: minor re-order of #include files

The recent commit a6b7a407 when back-ported to the out-of-tree e1000e
driver caused a compilation error on older kernels which required a
re-ordering of the #include files.  This cosmetic patch syncs the two
drivers for easier maintainability.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit 9fb7a5f77b26dedfcfa4e3a36fe207f818662bee)

Signed-off-by: Joe Jin <joe.jin@oracle.com>
13 years ago e1000e: remove unnecessary check for NULL pointer
Bruce Allan [Fri, 22 Jul 2011 06:21:41 +0000 (06:21 +0000)]
e1000e: remove unnecessary check for NULL pointer

The array shadow_ram is never NULL.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit b9e06f70dc186f8353cc593f2b4609383b3be7a9)

Signed-off-by: Joe Jin <joe.jin@oracle.com>
13 years ago intel drivers: repair missing flush operations
Jesse Brandeburg [Wed, 20 Jul 2011 00:56:21 +0000 (00:56 +0000)]
intel drivers: repair missing flush operations

after review of all intel drivers, found several instances where
drivers had the incorrect pattern of:
memory mapped write();
delay();

which should always be:
memory mapped write();
write flush(); /* aka memory mapped read */
delay();

explanation:
The reason for including the flush is that writes can be held
(posted) in PCI/PCIe bridges, but the read always has to complete
synchronously and therefore has to flush all pending writes to a
device.  If a write is held and followed by a delay, the delay
means nothing because the write may not have reached hardware
(maybe even not until the next read)
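
The effect can be modelled in user space with a simulated bridge that holds one posted write. Everything here (struct, function names) is an assumption for illustration; a real driver would use its own register accessors and delay primitives.

```c
#include <stdint.h>

/* Simulated device behind a bridge that may hold (post) one write. */
struct sim_dev {
    uint32_t reg;        /* value the hardware has actually received */
    uint32_t posted;     /* write still held in the bridge */
    int      has_posted; /* nonzero while a write is pending */
};

/* A posted write returns immediately and may not have reached the device. */
static void mmio_write(struct sim_dev *dev, uint32_t value)
{
    dev->posted = value;
    dev->has_posted = 1;
}

/* A read completes synchronously, so any pending write is pushed out first. */
static uint32_t mmio_read(struct sim_dev *dev)
{
    if (dev->has_posted) {
        dev->reg = dev->posted;
        dev->has_posted = 0;
    }
    return dev->reg;
}

/* Corrected pattern: write, flush with a read, and only then delay. */
static void write_flush(struct sim_dev *dev, uint32_t value)
{
    mmio_write(dev, value);
    (void)mmio_read(dev);
    /* delay() would follow here in a real driver */
}
```

In this model a bare mmio_write() leaves reg unchanged, which is why a delay placed directly after it measures nothing useful.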

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit 945a51517cc0bd9e461f2018624dfc1faef9ddee)

Signed-off-by: Joe Jin <joe.jin@oracle.com>
13 years ago e1000e: use GFP_KERNEL allocations at init time
Jeff Kirsher [Tue, 12 Jul 2011 16:10:12 +0000 (16:10 +0000)]
e1000e: use GFP_KERNEL allocations at init time

In process and sleep allowed context, favor GFP_KERNEL allocations over
GFP_ATOMIC ones.

-v2: fixed checkpatch.pl warnings

CC: Eric Dumazet <eric.dumazet@gmail.com>
CC: Ben Greear <greearb@candelatech.com>
CC: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit c2fed9965c60e1f989f57889357c557f7b907ab7)

Signed-off-by: Joe Jin <joe.jin@oracle.com>
13 years ago e1000e: Add Jumbo Frame support to 82583 devices
Carolyn Wyborny [Tue, 12 Jul 2011 16:10:11 +0000 (16:10 +0000)]
e1000e: Add Jumbo Frame support to 82583 devices

This patch adds support for the Jumbo Frames feature on 82583 devices.

Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit a3d72d5d01b82a86f3b16ca1918d2040b1acba8c)

Signed-off-by: Joe Jin <joe.jin@oracle.com>
13 years ago e1000e: remove e1000_queue_stats
Eric Dumazet [Mon, 11 Jul 2011 10:00:40 +0000 (10:00 +0000)]
e1000e: remove e1000_queue_stats

struct e1000_queue_stats is not used, let's remove it

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 995181b9529adbfecd6882c734ee702b5ed9226c)

Signed-off-by: Joe Jin <joe.jin@oracle.com>
13 years ago net: e1000e: Use is_multicast_ether_addr helper
Tobias Klauser [Sun, 3 Jul 2011 23:47:04 +0000 (23:47 +0000)]
net: e1000e: Use is_multicast_ether_addr helper

Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 3e714ad3c2a07ee120044b72222cc20c14959efb)

Signed-off-by: Joe Jin <joe.jin@oracle.com>
13 years ago e1000e: remove unnecessary reads of PCI_CAP_ID_EXP
Jon Mason [Mon, 27 Jun 2011 07:43:47 +0000 (07:43 +0000)]
e1000e: remove unnecessary reads of PCI_CAP_ID_EXP

The PCIE capability offset is saved during PCI bus walking.  It will
remove an unnecessary search in the PCI configuration space if this
value is referenced instead of reacquiring it.

Signed-off-by: Jon Mason <jdmason@kudzu.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 353064de8af3bf46757376db66c29fa87a9fda3a)

Signed-off-by: Joe Jin <joe.jin@oracle.com>
13 years ago e1000e: update driver version
Bruce Allan [Thu, 19 May 2011 01:53:41 +0000 (01:53 +0000)]
e1000e: update driver version

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit b3ccf26704f80cd9959c4d2a83c2278bcc93425c)

Signed-off-by: Joe Jin <joe.jin@oracle.com>
13 years ago e1000e: Clear host wakeup bit on 82577/8 without touching PHY page 800
Bruce Allan [Fri, 13 May 2011 07:20:14 +0000 (07:20 +0000)]
e1000e: Clear host wakeup bit on 82577/8 without touching PHY page 800

The Host Wakeup Active bit in the PHY Port General Configuration register
(page 769 register 17) must be cleared after every PHY reset to prevent an
unexpected wake signal from the PHY. Originally, this was accomplished by
simply reading the PHY Wakeup Control register on page 800 which clears the
Host Wakeup Active bit as a side-effect. Unfortunately, a hardware bug on
the 82577 and 82578 PHY can cause unexpected behavior when registers on
page 800 are accessed while in gigabit mode.

This patch changes the remaining instances when the Host Wakeup Active bit
needs to be cleared while possibly in gigabit mode by accessing the Port
General Configuration register directly instead of accessing any register
on page 800.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit 3ebfc7c9a6177794e0a1635483bd64268bed5d3c)

Signed-off-by: Joe Jin <joe.jin@oracle.com>
13 years ago e1000e: access multiple PHY registers on same page at the same time
Bruce Allan [Fri, 13 May 2011 07:20:09 +0000 (07:20 +0000)]
e1000e: access multiple PHY registers on same page at the same time

Doing a PHY page select can take a long time, relatively speaking. This
can cause a significant delay when updating a number of PHY registers on
the same page by unnecessarily setting the page for each PHY access. For
example when going to Sx, all the PHY wakeup registers (WUC, RAR[], MTA[],
SHRAR[], IP4AT[], IP6AT[], etc.) on 82577/8/9 need to be updated which
takes a long time which can cause issues when suspending.

This patch introduces new PHY ops function pointers to allow callers to
set the page directly and do any number of PHY accesses on that page.
This feature is currently only implemented for 82577, 82578 and 82579
PHYs for both the normally addressed registers as well as the special-
case addressing of the PHY wakeup registers on page 800. For the latter
registers, the existing function for accessing the wakeup registers has
been divided up into three: 1) enable access to the wakeup register page,
2) perform the register access and 3) disable access to the wakeup register
page. The two functions that enable/disable access to the wakeup register
page are necessarily available to the caller so that the caller can restore
the value of the Port Control (a.k.a. Wakeup Enable) register after the
wakeup register accesses are done.

All instances of writing to multiple PHY registers on the same page are
updated to use this new method and to acquire any PHY locking mechanism
before setting the page and performing the register accesses, and release
the locking mechanism afterward.
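
The saving can be illustrated with a simulated PHY that counts page selects. The types and helper names below are hypothetical and only model the idea; they are not the PHY ops introduced by the patch.

```c
#include <stddef.h>
#include <stdint.h>

#define SIM_PAGES 2
#define SIM_REGS  32

/* Simulated PHY that records how often the page was selected. */
struct sim_phy {
    uint16_t     page;
    unsigned int page_selects;
    uint16_t     regs[SIM_PAGES][SIM_REGS];
};

static void sim_set_page(struct sim_phy *phy, uint16_t page)
{
    phy->page = page;
    phy->page_selects++; /* stands in for the slow page-select operation */
}

static void sim_write_reg(struct sim_phy *phy, uint8_t reg, uint16_t val)
{
    phy->regs[phy->page][reg] = val;
}

/* Old pattern: every register access selects the page again. */
static void write_regs_per_access(struct sim_phy *phy, uint16_t page,
                                  const uint8_t *regs, const uint16_t *vals,
                                  size_t n)
{
    for (size_t i = 0; i < n; i++) {
        sim_set_page(phy, page);
        sim_write_reg(phy, regs[i], vals[i]);
    }
}

/* New pattern: select the page once, then do all accesses on it. */
static void write_regs_batched(struct sim_phy *phy, uint16_t page,
                               const uint8_t *regs, const uint16_t *vals,
                               size_t n)
{
    /* A real driver would acquire the PHY lock here... */
    sim_set_page(phy, page);
    for (size_t i = 0; i < n; i++)
        sim_write_reg(phy, regs[i], vals[i]);
    /* ...and release it here. */
}
```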

Some affiliated magic number cleanup is done as well.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit 2b6b168d52aa044363647cfff8bda5cef8068ca3)

Signed-off-by: Joe Jin <joe.jin@oracle.com>
13 years ago e1000e: do not schedule the Tx queue until ready
Bruce Allan [Fri, 13 May 2011 07:20:03 +0000 (07:20 +0000)]
e1000e: do not schedule the Tx queue until ready

Start the Tx queue when the interface is brought up in e1000e_up() but do
not schedule the queue until link is up as detected in the watchdog task
which sets netif_carrier_on.

Also flush the descriptors and clean the Tx and Rx rings before resetting
the hardware when bringing the interface down otherwise there is a small
window where the watchdog task can be triggered with netif_carrier_off
and the Tx ring not yet empty which causes an additional and unnecessary
reset.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit 400484fa65ead1bbc3e86ea79e7505182a31bce1)

Signed-off-by: Joe Jin <joe.jin@oracle.com>
13 years ago e1000e: log when swflag is cleared unexpectedly on ICH/PCH devices
Bruce Allan [Fri, 13 May 2011 07:19:53 +0000 (07:19 +0000)]
e1000e: log when swflag is cleared unexpectedly on ICH/PCH devices

Since EXTCNF_CTRL.SWFLAG (used in the ownership arbitration of shared
resources, e.g. the PHY shared between the s/w, f/w, and h/w clients)
can be cleared by any of those clients, log a debug message when
software attempts to clear it and it is already cleared unexpectedly.
And since the swflag is cleared by a hardware reset, the driver does
not need to do that, but the mutex acquired when the bit is set must
still be cleared.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit c5caf4825b22957e4ad70fd94316e91ce8cfb51c)

Signed-off-by: Joe Jin <joe.jin@oracle.com>
13 years ago e1000e: 82579 intermittently disabled during S0->Sx
Bruce Allan [Fri, 13 May 2011 07:19:48 +0000 (07:19 +0000)]
e1000e: 82579 intermittently disabled during S0->Sx

When repeatedly cycling Sx->S0 states with the network cable unplugged,
the 82579 PHY may not initialize as expected and may require a full power
cycle to recover functionality to the device.  Work around this by testing
access of the PHY registers after resuming; if that returns unexpected
results toggle the LANPHYPC signal to power cycle the PHY.

This is implemented in the new function e1000_resume_workarounds_pchlan()
which calls another new function, e1000_toggle_lanphypc_value_ich8lan(),
which has been created to reduce code duplication (same functionality
required by a previous workaround).  Also, e1000e_disable_gig_wol_ich8lan
is now e1000_suspend_workarounds_ich8lan to better reflect what it does.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit 99730e4c13c8344b02dd96108945b48d28c14c25)

Signed-off-by: Joe Jin <joe.jin@oracle.com>
13 years ago e1000e: disable far-end loopback mode on ESB2
Bruce Allan [Fri, 13 May 2011 07:19:42 +0000 (07:19 +0000)]
e1000e: disable far-end loopback mode on ESB2

The ESB2 LAN includes a debug feature that enables far-end loopback (FELB)
of the SerDes/Kumeran interface.  This feature is activated when receiving
a sequence of symbols that includes a reserved codeword.  On a perfect
link, FELB would never be activated.  In the presence of bit errors, there
is a very small, but non-zero, probability of FELB being activated.

If the FELB is activated, the SerDes link becomes non-functional and must
be reset.  It could also corrupt the switching tables in the switch since
the ESB2 is transmitting packets with a different source MAC address.

This patch disables the FELB feature.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit d9b24135b972ccdd5f5174fba06c730e895e6daf)

Signed-off-by: Joe Jin <joe.jin@oracle.com>
13 years ago net: introduce __netdev_alloc_skb_ip_align
Eric Dumazet [Tue, 12 Jul 2011 03:08:34 +0000 (20:08 -0700)]
net: introduce __netdev_alloc_skb_ip_align

RX rings should use GFP_KERNEL allocations if possible, add
__netdev_alloc_skb_ip_align() helper to ease this.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 4915a0de43c3e9aef92005c1f94a8ff3a6cfced5)

Signed-off-by: Joe Jin <joe.jin@oracle.com>
13 years ago SPEC: v2.6.39-100.0.22
Guru Anbalagane [Wed, 1 Feb 2012 07:34:33 +0000 (23:34 -0800)]
SPEC: v2.6.39-100.0.22
Signed-off-by: Guru Anbalagane <guru.anbalagane@oracle.com>
13 years ago xfs: fix acl count validation in xfs_acl_from_disk()
Dan Carpenter [Thu, 26 Jan 2012 13:55:16 +0000 (16:55 +0300)]
xfs: fix acl count validation in xfs_acl_from_disk()

We applied a fix for CVE-2012-0038 fa8b18edd7 "xfs: validate acl count",
but there was a follow on patch which is not in our kernel.  If count
was negative then we could get past the new check.

From 093019cf1b18dd31b2c3b77acce4e000e2cbc9ce Mon Sep 17 00:00:00 2001
From: Xi Wang <xi.wang@gmail.com>
Date: Mon, 12 Dec 2011 21:55:52 +0000
Subject: [PATCH] xfs: fix acl count validation in xfs_acl_from_disk()

Commit fa8b18ed didn't prevent the integer overflow and possible
memory corruption.  "count" can go negative and bypass the check.
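
The bypass can be reproduced in a few lines of standalone C. This is a sketch only: the limit value and function names are illustrative, and it assumes the remedy is to compare the count as an unsigned value; the patch itself is not shown on this page.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative upper bound for this sketch. */
#define SKETCH_ACL_MAX_ENTRIES 25

/* Signed comparison: a negative count is not "greater than" the limit. */
static bool count_accepted_signed(int32_t count)
{
    return !(count > SKETCH_ACL_MAX_ENTRIES);
}

/* Unsigned comparison: the same bit pattern is a huge value and is rejected. */
static bool count_accepted_unsigned(uint32_t count)
{
    return !(count > SKETCH_ACL_MAX_ENTRIES);
}
```

With a signed count, -1 slips past the upper-bound check and can then be used to size an allocation; read as unsigned, the same on-disk value exceeds the limit.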

Signed-off-by: Xi Wang <xi.wang@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ben Myers <bpm@sgi.com>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
13 years ago Updated driver version to 5.02.00.00.06.02-uek2
Tej Parkash [Tue, 31 Jan 2012 08:24:37 +0000 (13:54 +0530)]
Updated driver version to 5.02.00.00.06.02-uek2

Signed-off-by: Tej Parkash <tej.parkash@qlogic.com>
13 years ago ocfs2: use spinlock irqsave for downconvert lock.patch
Srinivas Eeda [Tue, 31 Jan 2012 22:37:19 +0000 (14:37 -0800)]
ocfs2: use spinlock irqsave for downconvert lock.patch

When the ocfs2dc thread holds the dc_task_lock spinlock and receives a soft IRQ,
it deadlocks itself trying to take the same spinlock in ocfs2_wake_downconvert_thread.
Below is the stack snippet.

The patch disables interrupts when acquiring dc_task_lock spinlock.

ocfs2_wake_downconvert_thread
ocfs2_rw_unlock
ocfs2_dio_end_io
dio_complete
.....
bio_endio
req_bio_endio
....
scsi_io_completion
blk_done_softirq
__do_softirq
do_softirq
irq_exit
do_IRQ
ocfs2_downconvert_thread
[kthread]

Signed-off-by: Srinivas Eeda <srinivas.eeda@oracle.com>
Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
13 years ago dm-nfs-for-uek2
Adnan Misherfi [Fri, 27 Jan 2012 19:29:57 +0000 (14:29 -0500)]
dm-nfs-for-uek2

13 years ago SPEC: v2.6.39-100.0.21
Guru Anbalagane [Fri, 27 Jan 2012 01:44:29 +0000 (17:44 -0800)]
SPEC: v2.6.39-100.0.21

Signed-off-by: Guru Anbalagane <guru.anbalagane@oracle.com>
13 years ago git-changelog: add Orabug and CVE
Maxim Uvarov [Fri, 13 Jan 2012 22:57:06 +0000 (14:57 -0800)]
git-changelog: add Orabug and CVE

Add parsing Orabug and CVE.
Signed-off-by: Maxim Uvarov <maxim.uvarov@oracle.com>
13 years ago qla2xxx: Update the driver version to 8.03.07.12.39.0-k.
Giridhar Malavali [Fri, 16 Dec 2011 11:09:59 +0000 (03:09 -0800)]
qla2xxx: Update the driver version to 8.03.07.12.39.0-k.

Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com>
13 years ago Add support for pv hugepages and support for huge balloon pages.
Dave McCracken [Fri, 20 Jan 2012 15:34:37 +0000 (09:34 -0600)]
Add support for pv hugepages and support for huge balloon pages.

Signed-off-by: Dave McCracken <dave.mccracken@oracle.com>
13 years ago Btrfs: remove some verbose warnings
Chris Mason [Wed, 25 Jan 2012 19:06:49 +0000 (14:06 -0500)]
Btrfs: remove some verbose warnings

Signed-off-by: Chris Mason <chris.mason@oracle.com>
13 years ago Btrfs: fix reservations in btrfs_page_mkwrite
Chris Mason [Wed, 25 Jan 2012 18:47:40 +0000 (13:47 -0500)]
Btrfs: fix reservations in btrfs_page_mkwrite

Josef fixed btrfs_page_mkwrite to properly release reserved
extents if there was an error.  But if we fail to get a reservation
and we fail to dirty the inode (for ENOSPC reasons), we'll end up
trying to release a reservation we never had.

This makes sure we only release if we were able to reserve.
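
The pattern can be sketched with a toy space accountant. The struct and function names are hypothetical, and the `op_fails` flag merely simulates a failure after the reservation step; this is not the btrfs code.

```c
#include <stdbool.h>

/* Toy space accounting used only for this sketch. */
struct space_info {
    long free_bytes;
    long reserved_bytes;
};

static int try_reserve(struct space_info *s, long bytes)
{
    if (bytes > s->free_bytes)
        return -1; /* stands in for -ENOSPC */
    s->free_bytes -= bytes;
    s->reserved_bytes += bytes;
    return 0;
}

static void release(struct space_info *s, long bytes)
{
    s->reserved_bytes -= bytes;
    s->free_bytes += bytes;
}

/* Release on the error path only if the reservation actually succeeded. */
static int do_mkwrite(struct space_info *s, long bytes, bool op_fails)
{
    bool reserved = false;
    int ret = try_reserve(s, bytes);

    if (ret)
        goto out;
    reserved = true;

    if (op_fails) {
        ret = -1;
        goto out;
    }
    return 0; /* success: the reservation stays in use */

out:
    if (reserved)
        release(s, bytes);
    return ret;
}
```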

Signed-off-by: Chris Mason <chris.mason@oracle.com>
13 years ago Btrfs: use larger system chunks
Chris Mason [Mon, 16 Jan 2012 13:13:11 +0000 (08:13 -0500)]
Btrfs: use larger system chunks

system chunks by default are very small.  This makes them slightly
larger and also fixes the conditional checks to make sure we don't
allocate a billion of them at once.

Signed-off-by: Chris Mason <chris.mason@oracle.com>
(cherry picked from commit 96bdc7dc61fb1b1e8e858dafb13abee8482ba064)

13 years ago Btrfs: add a delalloc mutex to inodes for delalloc reservations
Josef Bacik [Fri, 13 Jan 2012 17:09:22 +0000 (12:09 -0500)]
Btrfs: add a delalloc mutex to inodes for delalloc reservations

I was using i_mutex for this, but we're getting bogus lockdep warnings by doing
that and there's no real way to get rid of those, so just stop using i_mutex to
protect delalloc metadata reservations and use a delalloc mutex instead.  This
shouldn't be contended often at all, only if you are writing and mmap writing to
the file at the same time.  Thanks,

(cherry picked from commit f248679e86fead40cc78e724c7181d6bec1a2046)

Signed-off-by: Josef Bacik <josef@redhat.com>
13 years ago Btrfs: protect orphan block rsv with spin_lock
Josef Bacik [Fri, 2 Dec 2011 20:44:12 +0000 (15:44 -0500)]
Btrfs: protect orphan block rsv with spin_lock

We've been seeing warnings coming out of the orphan commit stuff forever from
ceph.  Turns out it's because we're racing with checking if the orphan block
reserve is set, because we clear it outside of the spin_lock.  So leave the
normal fastpath checks where they are, but take the spin_lock and _recheck_ to
make sure we haven't had an orphan block rsv added in the meantime.  Then clear
the root's orphan block rsv and release the lock.  With this patch a user said
the warnings went away and they usually showed up pretty soon after he started
ceph.  Thanks,

Signed-off-by: Josef Bacik <josef@redhat.com>
(cherry picked from commit 90290e19820e3323ce6b9c2888eeb68bf29c278b)

13 years ago Btrfs: don't call btrfs_throttle in file write
Josef Bacik [Fri, 13 Jan 2012 00:10:12 +0000 (19:10 -0500)]
Btrfs: don't call btrfs_throttle in file write

Btrfs_throttle will make us wait if there is a currently committing transaction
until we can open new transactions, which is ridiculous since we don't actually
start any transactions within the file write path anyway, so all this does is
introduce big latencies if we have a sync/fsync heavy workload going on while
somebody else is trying to do work.  Thanks,

Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
(cherry picked from commit 45a8090e626ab470c91142954431a93846030b0d)

13 years ago Btrfs: release space on error in page_mkwrite
Josef Bacik [Fri, 13 Jan 2012 00:10:12 +0000 (19:10 -0500)]
Btrfs: release space on error in page_mkwrite

If updating the inode gave us an ENOSPC we were just returning in page_mkwrite,
which is a problem since we make our reservation right before trying to update
the inode, so fix the out label so that we actually free our reservation.
Thanks,

Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
(cherry picked from commit ec39e180fd3188c983c94603634bfcd019f42ae7)

13 years ago Btrfs: fix btrfsck error 400 when truncating a compressed
Miao Xie [Fri, 13 Jan 2012 00:10:12 +0000 (19:10 -0500)]
Btrfs: fix btrfsck error 400 when truncating a compressed

Reproduce steps:
 # mkfs.btrfs /dev/sdb5
 # mount /dev/sdb5 -o compress=lzo /mnt
 # dd if=/dev/zero of=/mnt/tmpfile bs=128K count=1
 # sync
 # truncate -s 64K /mnt/tmpfile
 root 5 inode 257 errors 400

This is because of the wrong if condition, which is used to check if we should
subtract the bytes of the dropped range from i_blocks/i_bytes of i-node or not.
When we truncate a compressed extent, btrfs subtracts the bytes of the whole
extent, which is wrong. We should subtract the real size that we truncate,
whether or not it is a compressed extent. Fix it.

Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
(cherry picked from commit f70a9a6b94af86fca069a7552ab672c31b457786)

13 years ago Btrfs: do not use btrfs_end_transaction_throttle everywhere
Josef Bacik [Fri, 13 Jan 2012 00:10:12 +0000 (19:10 -0500)]
Btrfs: do not use btrfs_end_transaction_throttle everywhere

A user reported a problem where things like open with O_CREAT would take up to
30 seconds when he had nfs activity on the same mount.  This is because all of
our quick metadata operations, like create, symlink etc all do
btrfs_end_transaction_throttle, which if the transaction is blocked will wait
for the commit to complete before it returns.  This adds a ridiculous amount of
latency and isn't really needed.  The normal btrfs_end_transaction will mark the
transaction as blocked and wake the transaction kthread up if it thinks the
transaction needs to end (this being in the running out of global reserve space
scenario), and this is all that is really needed since we've already done
everything we're going to do, we just need to return.  This should help people
with the latency they were seeing when using synchronous heavy workloads.
Thanks,

Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
(cherry picked from commit 7ad85bb76a61801362701b77c5cee5aa09f35369)

13 years ago Btrfs: fix possible deadlock when opening a seed device
Li Zefan [Wed, 7 Dec 2011 03:38:24 +0000 (11:38 +0800)]
Btrfs: fix possible deadlock when opening a seed device

The correct lock order is uuid_mutex -> volume_mutex -> chunk_mutex,
but when we mount a filesystem which has backing seed devices, we have
this lock chain:

    open_ctree()
        lock(chunk_mutex);
        read_chunk_tree();
            read_one_dev();
                open_seed_devices();
                    lock(uuid_mutex);

and then we hit a lockdep splat.
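
The ordering rule can be expressed as a simple rank check. This is an illustrative model only (the enum and function are invented for the sketch); the kernel detects such violations with lockdep, not with code like this.

```c
#include <stdbool.h>

/* Ranks follow the order stated above: uuid_mutex, volume_mutex, chunk_mutex. */
enum lock_rank {
    RANK_NONE = 0,
    RANK_UUID_MUTEX = 1,
    RANK_VOLUME_MUTEX = 2,
    RANK_CHUNK_MUTEX = 3
};

/* A lock may be taken only if it ranks after every lock already held. */
static bool may_acquire(enum lock_rank highest_held, enum lock_rank wanted)
{
    return wanted > highest_held;
}
```

In the chain shown above, chunk_mutex is already held when uuid_mutex is requested, so the check fails.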

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
(cherry picked from commit b367e47fb3a70f5d24ebd6faf7d42436d485fb2d)

13 years ago Btrfs: update global block_rsv when creating a new block group
Li Zefan [Wed, 7 Dec 2011 02:39:22 +0000 (10:39 +0800)]
Btrfs: update global block_rsv when creating a new block group

A bug was triggered while using seed device:

    # mkfs.btrfs /dev/loop1
    # btrfstune -S 1 /dev/loop1
    # mount -o /dev/loop1 /mnt
    # btrfs dev add /dev/loop2 /mnt

btrfs: block rsv returned -28
------------[ cut here ]------------
WARNING: at fs/btrfs/extent-tree.c:5969 btrfs_alloc_free_block+0x166/0x396 [btrfs]()
...
Call Trace:
...
[<f7b7c31c>] btrfs_cow_block+0x101/0x147 [btrfs]
[<f7b7eaa6>] btrfs_search_slot+0x1b8/0x55f [btrfs]
[<f7b7f844>] btrfs_insert_empty_items+0x42/0x7f [btrfs]
[<f7b7f8c1>] btrfs_insert_item+0x40/0x7e [btrfs]
[<f7b8ac02>] btrfs_make_block_group+0x243/0x2aa [btrfs]
[<f7bb3f53>] __btrfs_alloc_chunk+0x672/0x70e [btrfs]
[<f7bb41ff>] init_first_rw_device+0x77/0x13c [btrfs]
[<f7bb5a62>] btrfs_init_new_device+0x664/0x9fd [btrfs]
[<f7bbb65a>] btrfs_ioctl+0x694/0xdbe [btrfs]
[<c04f55f7>] do_vfs_ioctl+0x496/0x4cc
[<c04f5660>] sys_ioctl+0x33/0x4f
[<c07b9edf>] sysenter_do_call+0x12/0x38
---[ end trace 906adac595facc7d ]---

Since the seed device is read-only, there's no usable space in the filesystem.
Afterwards we add a sprout device to it, and the kernel creates a METADATA
block group and a SYSTEM block group, which provide free space we can reserve,
but we still get a reservation failure because the global block_rsv hasn't
been updated accordingly.

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
(cherry picked from commit c7c144db531fda414e532adac56e965ce332e2a5)

13 years ago Btrfs: rewrite btrfs_trim_block_group()
Li Zefan [Thu, 29 Dec 2011 06:47:27 +0000 (14:47 +0800)]
Btrfs: rewrite btrfs_trim_block_group()

There are various bugs in block group trimming:

- It may trim from offset smaller than user-specified offset.
- It may trim beyond user-specified range.
- It may leak free space for extents smaller than specified minlen.
- It may truncate the last trimmed extent thus leak free space.
- With mixed extents+bitmaps, some extents may not be trimmed.
- With mixed extents+bitmaps, some bitmaps may not be trimmed (even
none will be trimmed). Even for those trimmed, not all the free space
in the bitmaps will be trimmed.

I rewrite btrfs_trim_block_group() and break it into two functions.
One is to trim extents only, and the other is to trim bitmaps only.

Before patching:

# fstrim -v /mnt/
/mnt/: 1496465408 bytes were trimmed

After patching:

# fstrim -v /mnt/
/mnt/: 2193768448 bytes were trimmed

And this matches the total free space:

# btrfs fi df /mnt
Data: total=3.58GB, used=1.79GB
System, DUP: total=8.00MB, used=4.00KB
System: total=4.00MB, used=0.00
Metadata, DUP: total=205.12MB, used=97.14MB
Metadata: total=8.00MB, used=0.00

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
(cherry picked from commit 7fe1e641502616220437079258506196bc4d8cbf)

13 years ago Btrfs: simplfy calculation of stripe length for discard operation
Li Zefan [Thu, 1 Dec 2011 06:06:42 +0000 (14:06 +0800)]
Btrfs: simplfy calculation of stripe length for discard operation

For btrfs raid, while discarding a range of space, we'll need to know
the start offset and length to discard for each device, and it's done
in btrfs_map_block().

However the calculation is a bit complex for raid0 and raid10, so I
reimplement it based on a fact that:

        dev1          dev2           dev3    (raid0)
        -----------------------------------
        s0 s3 s6      s1 s4 s7       s2 s5

Each device has (total_stripes / nr_dev) stripes, or plus one.
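
That observation can be written down directly. The sketch below assumes the range starts at stripe 0 on the first device, as in the diagram; the function name and parameters are made up for illustration and the real code also has to handle other starting offsets.

```c
#include <stdint.h>

/*
 * Number of stripes that land on device dev_index (0-based) when
 * total_stripes consecutive stripes are laid out round-robin over
 * nr_dev devices, starting on device 0.
 */
static uint64_t stripes_on_dev(uint64_t total_stripes, uint32_t nr_dev,
                               uint32_t dev_index)
{
    uint64_t base = total_stripes / nr_dev;  /* every device gets this many */
    uint64_t extra = total_stripes % nr_dev; /* the first 'extra' get one more */

    return base + (dev_index < extra ? 1 : 0);
}
```

For the diagram above (8 stripes, 3 devices) this gives 3, 3 and 2 stripes for dev1, dev2 and dev3.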

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
(cherry picked from commit ec9ef7a13be4dcce964c8503e8999087945e5b9e)

13 years ago Btrfs: don't pre-allocate btrfs bio
Li Zefan [Thu, 1 Dec 2011 04:55:47 +0000 (12:55 +0800)]
Btrfs: don't pre-allocate btrfs bio

We pre-allocate a btrfs bio with fixed size, and then may re-allocate
memory if we find stripes are bigger than the fixed size. But this
pre-allocation is not necessary.

Also we don't have to calculate the stripe number twice.

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
(cherry picked from commit de11cc12df17337979e0929d2831887432f236ca)

13 years ago Btrfs: don't pass a trans handle unnecessarily in volumes.c
Li Zefan [Thu, 8 Dec 2011 07:07:24 +0000 (15:07 +0800)]
Btrfs: don't pass a trans handle unnecessarily in volumes.c

Some functions never use the transaction handle passed to them.

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
(cherry picked from commit 125ccb0ae6806dbec31abf4a85448971df3b4e39)

13 years ago Btrfs: reserve metadata space in btrfs_ioctl_setflags()
Li Zefan [Thu, 29 Dec 2011 05:39:50 +0000 (13:39 +0800)]
Btrfs: reserve metadata space in btrfs_ioctl_setflags()

Check and reserve space for btrfs_update_inode().

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
(cherry picked from commit 4da6f1a332f6c16b6594c7892f13c31459b9b1c8)

13 years agoBtrfs: remove BUG_ON()s in btrfs_ioctl_setflags()
Li Zefan [Thu, 29 Dec 2011 05:36:45 +0000 (13:36 +0800)]
Btrfs: remove BUG_ON()s in btrfs_ioctl_setflags()

We can recover from errors and return -errno to user space.

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
(cherry picked from commit f062abf089ff705e09bbaa6fa1e2fd7688a0f2ea)

13 years agoBtrfs: check the return value of io_ctl_init()
Li Zefan [Mon, 9 Jan 2012 06:36:28 +0000 (14:36 +0800)]
Btrfs: check the return value of io_ctl_init()

It can return -ENOMEM.

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
(cherry picked from commit 706efc6630c2722602541a6a2fc5900a4e38456a)

13 years agoBtrfs: avoid possible NULL deref in io_ctl_drop_pages()
Li Zefan [Mon, 9 Jan 2012 06:27:42 +0000 (14:27 +0800)]
Btrfs: avoid possible NULL deref in io_ctl_drop_pages()

If we run into some failure path in io_ctl_prepare_pages(),
io_ctl->pages[] array may have some NULL pointers.

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
(cherry picked from commit a1ee5a45818acc7f9c13e560827cf3e8735ac919)

13 years agoBtrfs: add pinned extents to on-disk free space cache correctly
Li Zefan [Tue, 10 Jan 2012 08:41:01 +0000 (16:41 +0800)]
Btrfs: add pinned extents to on-disk free space cache correctly

I got this while running xfstests:

[24256.836098] block group 317849600 has an wrong amount of free space
[24256.836100] btrfs: failed to load free space cache for block group 317849600

We should clamp the extent returned by find_first_extent_bit(),
so the start of the extent won't be smaller than the start of the
block group.
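
As a rough illustration of the clamping described above, the following sketch uses a hypothetical `struct extent_range` and helper name; it is not the btrfs code, and it assumes the extent is an inclusive [start, end] range that overlaps the block group.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for an extent described by an inclusive [start, end] range. */
struct extent_range {
	uint64_t start;
	uint64_t end;
};

/*
 * Clamp an extent so that it does not begin before the block group.
 * Assumes the extent overlaps the block group, i.e. r.end >= bg_start.
 */
static struct extent_range clamp_to_block_group(struct extent_range r,
						uint64_t bg_start)
{
	assert(r.end >= bg_start);

	if (r.start < bg_start)
		r.start = bg_start;
	return r;
}
```

An extent that already starts inside the block group is returned unchanged.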

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
(cherry picked from commit db804f23a72bada58f083dfad6a65d019ddb3bd4)

13 years agoBtrfs: revamp clustered allocation logic
Alexandre Oliva [Fri, 14 Oct 2011 15:10:36 +0000 (12:10 -0300)]
Btrfs: revamp clustered allocation logic

Parameterize clusters on minimum total size, minimum chunk size and
minimum contiguous size for at least one chunk, without limits on
cluster, window or gap sizes.  Don't tolerate any fragmentation for
SSD_SPREAD; accept it for metadata, but try to keep data dense.

Signed-off-by: Alexandre Oliva <oliva@lsd.ic.unicamp.br>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
(cherry picked from commit 1bb91902dc90e25449893e693ad45605cb08fbe5)

13 years agoBtrfs: don't set up allocation result twice
Alexandre Oliva [Mon, 28 Nov 2011 14:36:17 +0000 (12:36 -0200)]
Btrfs: don't set up allocation result twice

We store the allocation start and length twice in ins, once right
after the other, but with intervening calls that may prevent the
duplicate from being optimized out by the compiler.  Remove one of the
assignments.

Signed-off-by: Alexandre Oliva <oliva@lsd.ic.unicamp.br>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
(cherry picked from commit fc7c1077ceb99c35e5f9d0ce03dc7740565bb2bf)

13 years agoBtrfs: test free space only for unclustered allocation
Alexandre Oliva [Mon, 12 Dec 2011 06:48:19 +0000 (04:48 -0200)]
Btrfs: test free space only for unclustered allocation

Since the clustered allocation may be taking extents from a different
block group, there's no point in spin-locking and testing the current
block group free space before attempting to allocate space from a
cluster, even more so when we might refrain from even trying the
cluster in the current block group because, after the cluster was set
up, not enough free space remained.  Furthermore, cluster creation
attempts fail fast when the block group doesn't have enough free
space, so the test was completely superfluous.

I've moved the free space test past the cluster allocation attempt,
where it is more useful, and arranged for a cluster in the current
block group to be released before trying an unclustered allocation,
when we reach the LOOP_NO_EMPTY_SIZE stage, so that the free space in
the cluster stands a chance of being combined with additional free
space in the block group so as to succeed in the allocation attempt.

Signed-off-by: Alexandre Oliva <oliva@lsd.ic.unicamp.br>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
(cherry picked from commit a5f6f719a5cd7caeee8ed8137cf3f94c3bbebc65)

13 years agoBtrfs: use bigger metadata chunks on bigger filesystems
Chris Mason [Fri, 6 Jan 2012 20:47:38 +0000 (15:47 -0500)]
Btrfs: use bigger metadata chunks on bigger filesystems

The 256MB chunk is a little small on a huge FS.  This scales up the
chunk size.

Signed-off-by: Chris Mason <chris.mason@oracle.com>
(cherry picked from commit 1100373f8aa69e377386499350496e3d8565605f)

13 years agoBtrfs: lower the bar for chunk allocation
Chris Mason [Fri, 6 Jan 2012 20:41:34 +0000 (15:41 -0500)]
Btrfs: lower the bar for chunk allocation

The chunk allocation code has tried to keep a pretty tight lid on creating new
metadata chunks.  This is partially because in the past the reservation
code didn't give us an accurate idea of how much space was being used.

The new code is much more accurate, so we're able to get rid of some of these
checks.

Signed-off-by: Chris Mason <chris.mason@oracle.com>
(cherry picked from commit cf1d72c9ceec391d34c48724da57282e97f01122)

13 years agoBtrfs: run chunk allocations while we do delayed refs
Chris Mason [Fri, 6 Jan 2012 20:23:57 +0000 (15:23 -0500)]
Btrfs: run chunk allocations while we do delayed refs

Btrfs tries to batch extent allocation tree changes to improve performance
and reduce metadata thrashing.  But it doesn't allocate new metadata chunks
while it is doing allocations for the extent allocation tree.

This commit changes the delayed reference code to do chunk allocations if we're
getting low on room.  It prevents crashes and improves performance.

Signed-off-by: Chris Mason <chris.mason@oracle.com>
(cherry picked from commit 203bf287cb01a5dc26c20bd3737cecf3aeba1d48)

13 years agoBtrfs: call d_instantiate after all ops are setup
Al Viro [Fri, 23 Dec 2011 12:58:13 +0000 (07:58 -0500)]
Btrfs: call d_instantiate after all ops are setup

This closes races where btrfs is calling d_instantiate too soon during
inode creation.  All of the callers of btrfs_add_nondir are updated to
instantiate after the inode is fully setup in memory.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
(cherry picked from commit 08c422c27f855d27b0b3d9fa30ebd938d4ae6f1f)

13 years agoBtrfs: fix worker lock misuse in find_worker
Chris Mason [Fri, 23 Dec 2011 12:53:00 +0000 (07:53 -0500)]
Btrfs: fix worker lock misuse in find_worker

Dan Carpenter noticed that we were doing a double unlock on the worker
lock, and sometimes picking a worker thread without the lock held.

This fixes both errors.

Signed-off-by: Chris Mason <chris.mason@oracle.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
(cherry picked from commit 8d532b2afb2eacc84588db709ec280a3d1219be3)

13 years agoxen/config: turn CONFIG_XEN_DEBUG_FS off.
Konrad Rzeszutek Wilk [Tue, 24 Jan 2012 21:55:29 +0000 (16:55 -0500)]
xen/config: turn CONFIG_XEN_DEBUG_FS off.

That option makes the Xen spinlock code (xen/spinlock.c) accumulate
statistics such as how many locks are taken and the time spent in the
slowpath.  That is good information during debugging but not in production.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
13 years agoproc: clean up and fix /proc/<pid>/mem handling
Maxim Uvarov [Mon, 23 Jan 2012 20:08:00 +0000 (12:08 -0800)]
proc: clean up and fix /proc/<pid>/mem handling

Orabug: 13618927
CVE-2012-0056
Jüri Aedla reported that the /proc/<pid>/mem handling really isn't very
robust, and it also doesn't match the permission checking of any of the
other related files.

This changes it to do the permission checks at open time, and instead of
tracking the process, it tracks the VM at the time of the open.  That
simplifies the code a lot, but does mean that if you hold the file
descriptor open over an execve(), you'll continue to read from the _old_
VM.

That is different from our previous behavior, but much simpler.  If
somebody actually finds a load where this matters, we'll need to revert
this commit.

I suspect that nobody will ever notice - because the process mapping
addresses will also have changed as part of the execve.  So you cannot
actually usefully access the fd across a VM change simply because all
the offsets for IO would have changed too.

Reported-by: Jüri Aedla <asd@ut.ee>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Conflicts:

fs/proc/base.c

Signed-off-by: Maxim Uvarov <maxim.uvarov@oracle.com>
13 years agoset XEN_MAX_DOMAIN_MEMORY for 512
Maxim Uvarov [Sat, 21 Jan 2012 01:49:55 +0000 (17:49 -0800)]
set XEN_MAX_DOMAIN_MEMORY for 512

Signed-off-by: Maxim Uvarov <maxim.uvarov@oracle.com>
13 years agoadd __init arguments to init functions
Maxim Uvarov [Sat, 21 Jan 2012 01:45:24 +0000 (17:45 -0800)]
add __init arguments to init functions

Fix following issues:
WARNING: vmlinux.o(.text+0x3aba): Section mismatch in reference from the function xen_align_and_add_e820_region() to the function .init.text:e820_add_region()
The function xen_align_and_add_e820_region() references
the function __init e820_add_region().
This is often because xen_align_and_add_e820_region lacks a __init
annotation or the annotation of e820_add_region is wrong.

WARNING: vmlinux.o(.text+0x2e9ec): Section mismatch in reference from the function acpi_map_cpu2node() to the variable .cpuinit.data:__apicid_to_node
The function acpi_map_cpu2node() references
the variable __cpuinitdata __apicid_to_node.
This is often because acpi_map_cpu2node lacks a __cpuinitdata
annotation or the annotation of __apicid_to_node is wrong.

WARNING: vmlinux.o(.text+0x2e9f1): Section mismatch in reference from the function acpi_map_cpu2node() to the function .cpuinit.text:numa_set_node()
The function acpi_map_cpu2node() references
the function __cpuinit numa_set_node().
This is often because acpi_map_cpu2node lacks a __cpuinit
annotation or the annotation of numa_set_node is wrong.

WARNING: vmlinux.o(.text+0x3f9b4): Section mismatch in reference from the function enable_iommus() to the function .init.text:iommu_set_device_table()
The function enable_iommus() references
the function __init iommu_set_device_table().
This is often because enable_iommus lacks a __init
annotation or the annotation of iommu_set_device_table is wrong.

Signed-off-by: Maxim Uvarov <maxim.uvarov@oracle.com>
13 years agohpwdt: clean up set_memory_x call for 32 bit
Maxim Uvarov [Sun, 15 Jan 2012 20:08:20 +0000 (12:08 -0800)]
hpwdt: clean up set_memory_x call for 32 bit

1. The address has to be page aligned.
2. set_memory_x takes a number of pages as its argument, not a size in bytes.
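
The two points above amount to rounding the start address down to a page boundary and converting the byte length into a page count before calling set_memory_x(addr, numpages). A minimal userspace sketch of that arithmetic follows; the names `page_span` and `span_for_region` are hypothetical, and a 4 KiB page size is assumed rather than read from the kernel.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096UL	/* assumption: 4 KiB pages */

struct page_span {
	uintptr_t start;	/* page-aligned start address */
	unsigned long npages;	/* number of pages covering the region */
};

/* Compute the page-aligned start and page count covering [addr, addr + len). */
static struct page_span span_for_region(uintptr_t addr, unsigned long len)
{
	assert(len > 0);

	struct page_span s;
	uintptr_t end = addr + len;	/* one past the last byte */

	s.start = addr & ~(uintptr_t)(SKETCH_PAGE_SIZE - 1);
	s.npages = (end - s.start + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
	return s;
}
```

A 32-byte region that straddles a page boundary needs two pages, which is why passing the raw size or an unaligned address gives the wrong result.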
The bug was introduced by the following commit:
commit da28179b4e90dda56912ee825c7eaa62fc103797
Author: Mingarelli, Thomas <Thomas.Mingarelli@hp.com>
Date:   Mon Nov 7 10:59:00 2011 +0100

     watchdog: hpwdt: Changes to handle NX secure bit in 32bit path

    commit e67d668e147c3b4fec638c9e0ace04319f5ceccd upstream.

    This patch makes use of the set_memory_x() kernel API in order
    to make necessary BIOS calls to source NMIs.

Signed-off-by: Maxim Uvarov <maxim.uvarov@oracle.com>
13 years agoSPEC: v2.6.39-100.0.20
Maxim Uvarov [Fri, 13 Jan 2012 02:16:53 +0000 (18:16 -0800)]
SPEC: v2.6.39-100.0.20

Signed-off-by: Maxim Uvarov <maxim.uvarov@oracle.com>
13 years agoEnable Kabi Check
Guru Anbalagane [Fri, 13 Jan 2012 00:10:14 +0000 (16:10 -0800)]
Enable Kabi Check
Signed-off-by: Guru Anbalagane <guru.anbalagane@oracle.com>
13 years agonet/bna driver update from 2.3.2.3 to 3.0.2.2
Maxim Uvarov [Thu, 12 Jan 2012 21:19:34 +0000 (13:19 -0800)]
net/bna driver update from 2.3.2.3 to 3.0.2.2

Orabug: 13255
Signed-off-by: Maxim Uvarov <maxim.uvarov@oracle.com>
13 years agoscsi/bfa driver update from 2.3.2.3 to 3.0.2.2
Maxim Uvarov [Thu, 12 Jan 2012 21:14:37 +0000 (13:14 -0800)]
scsi/bfa driver update from 2.3.2.3 to 3.0.2.2

Orabug: 13254
Signed-off-by: Maxim Uvarov <maxim.uvarov@oracle.com>
13 years agoUpdated version to 5.02.00.00.06.02-uek1
Tej Parkash [Tue, 20 Dec 2011 16:52:01 +0000 (22:22 +0530)]
Updated version to 5.02.00.00.06.02-uek1

Signed-off-by: Tej Parkash <tej.parkash@qlogic.com>
13 years agoqla4xxx: Added error logging for firmware abort
Nilesh Javali [Wed, 14 Dec 2011 10:47:50 +0000 (16:17 +0530)]
qla4xxx: Added error logging for firmware abort

JIRA Key: IUEKR2ISCSI-15

Signed-off-by: Nilesh Javali <nilesh.javali@qlogic.com>
Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Signed-off-by: Tej Parkash <tej.parkash@qlogic.com>
13 years agoqla4xxx: Disable generating pause frames in case of FW hung
Giridhar Malavali [Wed, 14 Dec 2011 10:45:10 +0000 (16:15 +0530)]
qla4xxx: Disable generating pause frames in case of FW hung

JIRA Key: IUEKR2ISCSI-14

Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com>
Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Signed-off-by: Nilesh Javali <nilesh.javali@qlogic.com>
Signed-off-by: Tej Parkash <tej.parkash@qlogic.com>
13 years agoqla4xxx: Temperature monitoring for ISP82XX core.
Mike Hernandez [Wed, 14 Dec 2011 10:16:12 +0000 (15:46 +0530)]
qla4xxx: Temperature monitoring for ISP82XX core.

During watchdog, need to monitor temperature of ISP82XX core
and set device state to FAILED when temperature reaches
"Panic" level.

JIRA Key: IUEKR2ISCSI-13

Signed-off-by: Mike Hernandez <michael.hernandez@qlogic.com>
Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Signed-off-by: Nilesh Javali <nilesh.javali@qlogic.com>
Signed-off-by: Tej Parkash <tej.parkash@qlogic.com>
13 years agoqla4xxx: check for FW alive before calling chip_reset
Shyam Sunder [Fri, 2 Dec 2011 06:42:13 +0000 (22:42 -0800)]
qla4xxx: check for FW alive before calling chip_reset

Check for firmware alive and do premature completion of
mbox commands in case of FW hung before doing chip_reset

Jira Key: IUEKR2ISCSI-16

Signed-off-by: Shyam Sunder <shyam.sunder@qlogic.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: Nilesh Javali <nilesh.javali@qlogic.com>
Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Signed-off-by: Tej Parkash <tej.parkash@qlogic.com>
13 years agoqla4xxx: Remove the unused macros
Tej Parkash [Tue, 20 Dec 2011 04:47:14 +0000 (10:17 +0530)]
qla4xxx: Remove the unused macros

Signed-off-by: Tej Parkash <tej.parkash@qlogic.com>
Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Signed-off-by: Lalit Chandivade <lalit.chadnivade@qlogic.com>
13 years agoqla4xxx: cleanup, make qla4xxx_build_ddb_list short
Lalit Chandivade [Tue, 13 Dec 2011 11:21:37 +0000 (16:51 +0530)]
qla4xxx: cleanup, make qla4xxx_build_ddb_list short

Make qla4xxx_build_ddb_list shorter by adding more helper functions.

JIRA Key: IUEKR2ISCSI-12

Signed-off-by: Lalit Chandivade <lalit.chandivade@qlogic.com>
Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Signed-off-by: Nilesh Javali <nilesh.javali@qlogic.com>
Signed-off-by: Tej Parkash <tej.parkash@qlogic.com>
13 years agoqla4xxx: clear the RISC interrupt bit during firmware init
Sarang Radke [Thu, 8 Dec 2011 06:45:31 +0000 (12:15 +0530)]
qla4xxx: clear the RISC interrupt bit during firmware init

Fix for the kdump kernel panic issue with 80xx adapters.

JIRA Key: IUEKR2ISCSI-11

Signed-off-by: Nilesh Javali <nilesh.javali@qlogic.com>
Signed-off-by: Sarang Radke <sarang.radke@qlogic.com>
Signed-off-by: Tej Parkash <tej.parkash@qlogic.com>
13 years agoqla4xxx: clear the SCSI COMPLETION INTERRUPT bit during firmware init
Prasanna Mumbai [Thu, 8 Dec 2011 05:29:35 +0000 (10:59 +0530)]
qla4xxx: clear the SCSI COMPLETION INTERRUPT bit during firmware init

Fix for the kdump kernel panic issue on 40xx adapters.

JIRA Key: IUEKR2ISCSI-10

Signed-off-by: Nilesh Javali <nilesh.javali@qlogic.com>
Signed-off-by: Prasanna Mumbai <prasanna.mumbai@qlogic.com>
Signed-off-by: Tej Parkash <tej.parkash@qlogic.com>
13 years agoqla4xxx: Fixed BFS with sendtargets as boot index.
Manish Rangankar [Fri, 2 Dec 2011 08:25:03 +0000 (13:55 +0530)]
qla4xxx: Fixed BFS with sendtargets as boot index.

If ql4xdisablesysfsboot = 0 and a sendtargets entry is the boot index, the
driver exports the sendtargets entries in sysfs but iscsistart does not
do discovery. So in this case let the driver do the discovery and
log in to the targets.

JIRA Key: IUEKR2ISCSI-9

Signed-off-by: Manish Rangankar <manish.rangankar@qlogic.com>
Signed-off-by: Nilesh Javali <nilesh.javali@qlogic.com>
Signed-off-by: Tej Parkash <tej.parkash@qlogic.com>
13 years agoqla4xxx: Correct the default relogin timeout value
Nilesh Javali [Wed, 7 Dec 2011 08:22:31 +0000 (13:52 +0530)]
qla4xxx: Correct the default relogin timeout value

The ACB default timeout value is used to set the default
relogin timeout value. For ISP4022 adapters where
the ACB default value is set to 2560s, limit the relogin
timeout to 12s.

JIRA Key: IUEKR2ISCSI-8

Signed-off-by: Nilesh Javali <nilesh.javali@qlogic.com>
Signed-off-by: Tej Parkash <tej.parkash@qlogic.com>
13 years agoqla4xxx: Limit the ACB Default Timeout value to 12s
Nilesh Javali [Thu, 1 Dec 2011 09:06:04 +0000 (14:36 +0530)]
qla4xxx: Limit the ACB Default Timeout value to 12s

The ACB default timeout value is set to 2560s in the
ISP4022 firmware. This caused the driver to loop
for 2560s. Hence limit the default timeout at the driver
level to min 12s.

Also break out from the loop if the sendtargets list was empty.

JIRA Key: IUEKR2ISCSI-7

Signed-off-by: Nilesh Javali <nilesh.javali@qlogic.com>
Signed-off-by: Tej Parkash <tej.parkash@qlogic.com>
13 years agobond_alb: don't disable softirq under bond_alb_xmit
Maxim Uvarov [Thu, 5 Jan 2012 19:59:19 +0000 (11:59 -0800)]
bond_alb: don't disable softirq under bond_alb_xmit

Orabug: 13323879
No need to lock soft irqs under bond_alb_xmit()
which already has softirq disabled.

Changes:
1. add non-bh/bh version to tlb_clear_slave()

2. represent BH and non BH hash table locks
_lock_rx_hashtbl_bh/_unlock_rx_hashtbl_bh
_lock_rx_hashtbl/_unlock_rx_hashtbl
_lock_tx_hashtbl_bh/_unlock_tx_hashtbl_bh
_lock_tx_hashtbl/_unlock_tx_hashtbl

Signed-off-by: Maxim Uvarov <maxim.uvarov@oracle.com>
Signed-off-by: Cong Wang <amwang@redhat.com>
13 years agofix kernel version
Guru Anbalagane [Thu, 12 Jan 2012 22:34:15 +0000 (14:34 -0800)]
fix kernel version

13 years agoSPEC: v2.6.39-100.0.19
Maxim Uvarov [Wed, 11 Jan 2012 01:33:02 +0000 (17:33 -0800)]
SPEC: v2.6.39-100.0.19

Signed-off-by: Maxim Uvarov <maxim.uvarov@oracle.com>
13 years agoscripts/git-changelog: generate rpm changelog script
Maxim Uvarov [Wed, 11 Jan 2012 00:54:36 +0000 (16:54 -0800)]
scripts/git-changelog: generate rpm changelog script

Signed-off-by: Maxim Uvarov <maxim.uvarov@oracle.com>
13 years agoRevert "hpwd watchdog mark page executable"
Maxim Uvarov [Wed, 11 Jan 2012 00:04:19 +0000 (16:04 -0800)]
Revert "hpwd watchdog mark page executable"

Orabug: 13115973
This reverts commit e7494242c42201128204e50b2a707de335dd6334.
Bug fix is better covered with following upstream commit:

commit da28179b4e90dda56912ee825c7eaa62fc103797
Author: Mingarelli, Thomas <Thomas.Mingarelli@hp.com>
Date:   Mon Nov 7 10:59:00 2011 +0100

watchdog: hpwdt: Changes to handle NX secure bit in 32bit path

commit e67d668e147c3b4fec638c9e0ace04319f5ceccd upstream.
This patch makes use of the set_memory_x() kernel API in order
to make necessary BIOS calls to source NMIs.

This is needed for SLES11 SP2 and the latest upstream kernel as it appears
the NX Execute Disable has grown in its control.

Signed-off-by: Thomas Mingarelli <thomas.mingarelli@hp.com>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years agoPartial revert of mainline removal of deprecated sysfs interface for 13568528
Chuck Anderson [Sat, 7 Jan 2012 00:12:49 +0000 (16:12 -0800)]
Partial revert of mainline removal of deprecated sysfs interface for 13568528

Jan. 06, 2012
Oracle bug 13568528
Patch written by Andrew Thomas
Ported by Chuck Anderson

This patch partially reverts the removal in mainline of a deprecated sysfs
interface needed by the OVM3.0.4 UEK2 based dom0 kernel when it is used
to install OVM3.0.4

Comments from Andrew:

The problem is that in newer kernels, even with the
CONFIG_SYSFS_DEPRECATED[_V2] flags set, some nodes have been removed so to
tools looking in sysfs, pieces are missing.  This breaks anaconda (actually
kudzu) for us.  For OVM3 we use the dom0 kernel as the install kernel, so we
need UEK2 to provide the right "shape" sysfs.  This isn't an issue for OL
because you use the old RHEL kernel to install UEK1/2].  That said, this
issue affects more than us.  As Joe Jin points out, bug 13100678, required
kudzu fixes for eth devices.  Arguably the OVM3 anaconda issue can also be
fixed in kudzu, but what no one knows is if the missing sysfs nodes are
symptoms of a wider set of tools related problems and therefore whether the
correct fix is to revert sysfs changes in UEK2 so that the sysfs it presents
is isomorphic to what 2.6.18 based kernels provide.

A "better" set of tools would be from 6uX, but in order to get those
installed/upgraded on OVM3 is not a trivial task because the system customers
have already installed is 5u5 based.  We have other tools in dom0 [eg our
agent] which "know" about the old flavour of sysfs and these would need
porting.  You either change the kernel OR you change all the tools that rely
on sysfs... the problem is that customers can install their own tools on OL5.

Signed-off-by: Chuck Anderson <chuck.anderson@oracle.com>
13 years agoscsi:lpfc update to 8.3.5.58
Maxim Uvarov [Fri, 6 Jan 2012 21:30:21 +0000 (13:30 -0800)]
scsi:lpfc update to 8.3.5.58

Signed-off-by: Maxim Uvarov <maxim.uvarov@oracle.com>
13 years agoLet KERNEL_VERSION be 3.0.x, and override UTSNAME
Nelson Elhage [Tue, 10 Jan 2012 23:04:08 +0000 (15:04 -0800)]
Let KERNEL_VERSION be 3.0.x, and override UTSNAME

This will let out-of-tree modules correctly detect the kernel version
when building against it, but it will still identify as 2.6.39
everywhere in userspace.
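
For context, out-of-tree modules usually compare LINUX_VERSION_CODE against KERNEL_VERSION(a, b, c) at build time. The sketch below reproduces that encoding in userspace under a `SKETCH_` prefix; the helper `has_3_0_api` is a hypothetical example of such a check and is not part of this patch.

```c
#include <assert.h>

/* Same encoding as the kernel's KERNEL_VERSION() macro in <linux/version.h>. */
#define SKETCH_KERNEL_VERSION(a, b, c) (((a) << 16) + ((b) << 8) + (c))

/* A 3.0.x version code must compare newer than 2.6.39. */
static_assert(SKETCH_KERNEL_VERSION(3, 0, 0) > SKETCH_KERNEL_VERSION(2, 6, 39),
	      "3.0.x must compare newer than 2.6.39");

/* Hypothetical feature check of the kind an out-of-tree module might use. */
static int has_3_0_api(unsigned int linux_version_code)
{
	return linux_version_code >= SKETCH_KERNEL_VERSION(3, 0, 0);
}
```

With the version code reported as 3.0.x, such a check selects the newer code path even though the UTSNAME string still says 2.6.39.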

Signed-off-by: Nelson Elhage <nelson.elhage@oracle.com>
Signed-off-by: Maxim Uvarov <maxim.uvarov@oracle.com>
13 years agoqla4xxx: Fix qla4xxx_dump_buffer to dump buffer correctly
Vikas Chaudhary [Fri, 2 Dec 2011 06:42:12 +0000 (22:42 -0800)]
qla4xxx: Fix qla4xxx_dump_buffer to dump buffer correctly

KERN_INFO in printk adds a newline character that messes up the
dump print format. Remove KERN_INFO to fix the dump format.

Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Signed-off-by: Tej Parkash <tej.parkash@qlogic.com>
Signed-off-by: Chuck Anderson <chuck.anderson@oracle.com>
13 years agoqla4xxx: Fix the IDC locking mechanism
Nilesh Javali [Fri, 2 Dec 2011 06:42:11 +0000 (22:42 -0800)]
qla4xxx: Fix the IDC locking mechanism

This ensures the transition of dev_state from COLD to
INITIALIZING is within lock and atomic.

Jira Key: IUEKR2ISCSI-6

Signed-off-by: Nilesh Javali <nilesh.javali@qlogic.com>
Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Signed-off-by: Tej Parkash <tej.parkash@qlogic.com>
Signed-off-by: Chuck Anderson <chuck.anderson@oracle.com>
13 years agoqla4xxx: Wait for disable_acb before doing set_acb
Vikas Chaudhary [Fri, 2 Dec 2011 06:42:10 +0000 (22:42 -0800)]
qla4xxx: Wait for disable_acb before doing set_acb

In function qla4xxx_iface_set_param wait for disable_acb to
complete so that set_acb will not fail.

Jira Key: IUEKR2ISCSI-5

Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Signed-off-by: Tej Parkash <tej.parkash@qlogic.com>
Signed-off-by: Chuck Anderson <chuck.anderson@oracle.com>
13 years agoqla4xxx: Don't recover adapter if device state is FAILED
Sarang Radke [Fri, 2 Dec 2011 06:42:09 +0000 (22:42 -0800)]
qla4xxx: Don't recover adapter if device state is FAILED

Multiple reset requests are not handled correctly because
the driver tries to recover an adapter that is in the FAILED state.

Jira Key: IUEKR2ISCSI-4

Signed-off-by: Sarang Radke <sarang.radke@qlogic.com>
Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Signed-off-by: Tej Parkash <tej.parkash@qlogic.com>
Signed-off-by: Chuck Anderson <chuck.anderson@oracle.com>
13 years agoqla4xxx: fix call trace on rmmod with ql4xdontresethba=1
Sarang Radke [Tue, 6 Dec 2011 10:34:10 +0000 (02:34 -0800)]
qla4xxx: fix call trace on rmmod with ql4xdontresethba=1

Abort all active commands from eh_host_reset in case
of ql4xdontresethba=1.

Fix the following call trace:
Nov 21 14:50:47 172.17.140.111 qla4xxx 0000:13:00.4: qla4_8xxx_disable_msix: qla4xxx (rsp_q)
Nov 21 14:50:47 172.17.140.111 qla4xxx 0000:13:00.4: PCI INT A disabled
Nov 21 14:50:47 172.17.140.111 slab error in kmem_cache_destroy(): cache `qla4xxx_srbs': Can't free all objects
Nov 21 14:50:47 172.17.140.111 Pid: 9154, comm: rmmod Tainted: G           O 3.2.0-rc2+ #2
Nov 21 14:50:47 172.17.140.111 Call Trace:
Nov 21 14:50:47 172.17.140.111  [<c051231a>] ? kmem_cache_destroy+0x9a/0xb0
Nov 21 14:50:47 172.17.140.111  [<c0489c4a>] ? sys_delete_module+0x14a/0x210
Nov 21 14:50:47 172.17.140.111  [<c04fd552>] ? do_munmap+0x202/0x280
Nov 21 14:50:47 172.17.140.111  [<c04a6d4e>] ? audit_syscall_entry+0x1ae/0x1d0
Nov 21 14:50:47 172.17.140.111  [<c083019f>] ? sysenter_do_call+0x12/0x28
Nov 21 14:51:50 172.17.140.111 SLAB: cache with size 64 has lost its name
Nov 21 14:51:50 172.17.140.111 iscsi: registered transport (qla4xxx)
Nov 21 14:51:50 172.17.140.111 qla4xxx 0000:13:00.4: PCI INT A -> GSI 28 (level, low) -> IRQ 28

Jira Key: IUEKR2ISCSI-3

Signed-off-by: Sarang Radke <sarang.radke@qlogic.com>
Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Signed-off-by: Tej Parkash <tej.parkash@qlogic.com>
Signed-off-by: Chuck Anderson <chuck.anderson@oracle.com>
13 years agoqla4xxx: Fix CPU lockups when ql4xdontresethba set
Mike Hernandez [Fri, 2 Dec 2011 06:42:07 +0000 (22:42 -0800)]
qla4xxx: Fix CPU lockups when ql4xdontresethba set

Fix issue where CPU lockup is seen when ql4xdontresethba is set and
driver is "stuck" in NEED_RESET state handler.

Jira Key: IUEKR2ISCSI-2

Signed-off-by: Mike Hernandez <michael.hernandez@qlogic.com>
Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Signed-off-by: Tej Parkash <tej.parkash@qlogic.com>
Signed-off-by: Chuck Anderson <chuck.anderson@oracle.com>
13 years agoqla4xxx: Perform context resets in case of context failures.
Vikas Chaudhary [Fri, 2 Dec 2011 06:42:06 +0000 (22:42 -0800)]
qla4xxx: Perform context resets in case of context failures.

For 4032, context reset was the same as chip reset, and any firmware
issue was recovered by performing a chip reset.
For 82xx, the iSCSI firmware runs along with FCoE and the NIC
firmware contexts, and an error encountered does not necessarily mean
that a chip reset is necessary.

Perform Chip resets only in the following cases:
1. Mailbox system error.
2. Mailbox command timeout.
3. fw_heartbeat_counter counter stops incrementing.

For all other cases, only perform a context reset.
1. Command Completion with an invalid srb.
2. Other mailbox failures.
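
The policy in the two lists above can be summarised as a small decision function. This is a sketch only: the enum and function names are hypothetical and do not correspond to identifiers in the qla4xxx driver.

```c
#include <assert.h>

/* Hypothetical names; the driver's own identifiers may differ. */
enum fw_error {
	FW_ERR_MBOX_SYSTEM_ERROR,
	FW_ERR_MBOX_TIMEOUT,
	FW_ERR_HEARTBEAT_STALLED,
	FW_ERR_INVALID_SRB_COMPLETION,
	FW_ERR_MBOX_OTHER,
};

enum reset_kind {
	RESET_CONTEXT,
	RESET_CHIP,
};

/* Choose the reset type for an error, following the lists above. */
static enum reset_kind pick_reset(enum fw_error err)
{
	switch (err) {
	case FW_ERR_MBOX_SYSTEM_ERROR:
	case FW_ERR_MBOX_TIMEOUT:
	case FW_ERR_HEARTBEAT_STALLED:
		return RESET_CHIP;
	case FW_ERR_INVALID_SRB_COMPLETION:
	case FW_ERR_MBOX_OTHER:
		return RESET_CONTEXT;
	}

	/* Values outside the enum are a caller bug; fall back to the milder reset. */
	assert(0 && "unknown fw_error value");
	return RESET_CONTEXT;
}
```

Only the three conditions in the first list escalate to a chip reset; everything else stays within the iSCSI context.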

Jira Key: IUEKR2ISCSI-1

Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Signed-off-by: Shyam Sunder <shyam.sunder@qlogic.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Signed-off-by: Tej Parkash <tej.parkash@qlogic.com>
Signed-off-by: Chuck Anderson <chuck.anderson@oracle.com>
13 years agodo not obsolete firmware
Maxim Uvarov [Fri, 6 Jan 2012 00:27:43 +0000 (16:27 -0800)]
do not obsolete firmware

Orabug: 13535055
Do not remove firmware for other kernels.
Signed-off-by: Maxim Uvarov <maxim.uvarov@oracle.com>
13 years agoRevert "xen/pv-on-hvm kexec: add xs_reset_watches to shutdown watches from old kernel"
Konrad Rzeszutek Wilk [Tue, 10 Jan 2012 20:43:47 +0000 (15:43 -0500)]
Revert "xen/pv-on-hvm kexec: add xs_reset_watches to shutdown watches from old kernel"

This reverts commit ddacf5ef684a655abe2bb50c4b2a5b72ae0d5e05.

We piggyback on upstream git commit 12275dd4b747f5d87fa36229774d76bca8e63068
which says:

commit 12275dd4b747f5d87fa36229774d76bca8e63068
Author: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Date:   Mon Dec 19 09:30:35 2011 -0500

    Revert "xen/pv-on-hvm kexec: add xs_reset_watches to shutdown watches from old kernel"

    This reverts commit ddacf5ef684a655abe2bb50c4b2a5b72ae0d5e05.
    As when booting the kernel under Amazon EC2 as an HVM guest it ends up
    hanging during startup. By reverting this we lose the fix for kexec
    booting to the crash kernels.

    Fixes Canonical BZ #901305 (http://bugs.launchpad.net/bugs/901305)

Tested-by: Alessandro Salvatori <sandr8@gmail.com>
Reported-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Ian Campbell <Ian.Campbell@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
13 years agoRevert "xen-blkback: convert hole punching to discard request on loop devices"
Maxim Uvarov [Tue, 10 Jan 2012 22:35:52 +0000 (14:35 -0800)]
Revert "xen-blkback: convert hole punching to discard request on loop devices"

This reverts commit 7139323c5ef7b342d770efef6ae463ff201e3d83.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Maxim Uvarov <maxim.uvarov@oracle.com>
13 years agoath9k: Fix kernel panic in AR2427 in AP mode
Mohammed Shafi Shajakhan [Mon, 26 Dec 2011 05:12:15 +0000 (10:42 +0530)]
ath9k: Fix kernel panic in AR2427 in AP mode

commit b25bfda38236f349cde0d1b28952f4eea2148d3f upstream.

Don't do aggregation-related work for 'AP mode client power save
handling' if aggregation is not enabled in the driver; otherwise it
will lead to a panic, because those data structures are never
initialized in 'ath_tx_node_init' when aggregation is disabled.

EIP is at ath_tx_aggr_wakeup+0x37/0x80 [ath9k]
EAX: e8c09a20 EBX: f2a304e8 ECX: 00000001 EDX: 00000000
ESI: e8c085e0 EDI: f2a304ac EBP: f40e1ca4 ESP: f40e1c8c
DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
Process swapper/1 (pid: 0, ti=f40e0000 task=f408e860
task.ti=f40dc000)
Stack:
0001e966 e8c09a20 00000000 f2a304ac e8c085e0 f2a304ac
f40e1cb0 f8186741
f8186700 f40e1d2c f922988d f2a304ac 00000202 00000001
c0b4ba43 00000000
0000000f e8eb75c0 e8c085e0 205b0001 34383220 f2a304ac
f2a30000 00010020
Call Trace:
[<f8186741>] ath9k_sta_notify+0x41/0x50 [ath9k]
[<f8186700>] ? ath9k_get_survey+0x110/0x110 [ath9k]
[<f922988d>] ieee80211_sta_ps_deliver_wakeup+0x9d/0x350
[mac80211]
[<c018dc75>] ? __module_address+0x95/0xb0
[<f92465b3>] ap_sta_ps_end+0x63/0xa0 [mac80211]
[<f9246746>] ieee80211_rx_h_sta_process+0x156/0x2b0
[mac80211]
[<f9247d1e>] ieee80211_rx_handlers+0xce/0x510 [mac80211]
[<c018440b>] ? trace_hardirqs_on+0xb/0x10
[<c056936e>] ? skb_queue_tail+0x3e/0x50
[<f9248271>] ieee80211_prepare_and_rx_handle+0x111/0x750
[mac80211]
[<f9248bf9>] ieee80211_rx+0x349/0xb20 [mac80211]
[<f9248949>] ? ieee80211_rx+0x99/0xb20 [mac80211]
[<f818b0b8>] ath_rx_tasklet+0x818/0x1d00 [ath9k]
[<f8187a75>] ? ath9k_tasklet+0x35/0x1c0 [ath9k]
[<f8187a75>] ? ath9k_tasklet+0x35/0x1c0 [ath9k]
[<f8187b33>] ath9k_tasklet+0xf3/0x1c0 [ath9k]
[<c0151b7e>] tasklet_action+0xbe/0x180

Cc: Senthil Balasubramanian <senthilb@qca.qualcomm.com>
Cc: Rajkumar Manoharan <rmanohar@qca.qualcomm.com>
Reported-by: Ashwin Mendonca <ashwinloyal@gmail.com>
Tested-by: Ashwin Mendonca <ashwinloyal@gmail.com>
Signed-off-by: Mohammed Shafi Shajakhan <mohammed@qca.qualcomm.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
13 years ago ptrace: partially fix the do_wait(WEXITED) vs EXIT_DEAD->EXIT_ZOMBIE race
Oleg Nesterov [Wed, 4 Jan 2012 16:29:02 +0000 (17:29 +0100)]
ptrace: partially fix the do_wait(WEXITED) vs EXIT_DEAD->EXIT_ZOMBIE race

commit 50b8d257486a45cba7b65ca978986ed216bbcc10 upstream.

Test-case:

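	#include <assert.h>
	#include <stdio.h>
	#include <unistd.h>
	#include <sys/types.h>
	#include <sys/ptrace.h>
	#include <sys/wait.h>

	int main(void)
	{
		int pid, status;

		pid = fork();
		if (!pid) {
			/* child: fork and reap grandchildren forever */
			for (;;) {
				if (!fork())
					return 0;
				if (waitpid(-1, &status, 0) < 0) {
					printf("ERR!! wait: %m\n");
					return 0;
				}
			}
		}

		/* parent: attach to the child and trace its forks */
		assert(ptrace(PTRACE_ATTACH, pid, 0, 0) == 0);
		assert(waitpid(-1, NULL, 0) == pid);

		assert(ptrace(PTRACE_SETOPTIONS, pid, 0,
					PTRACE_O_TRACEFORK) == 0);

		do {
			ptrace(PTRACE_CONT, pid, 0, 0);
			pid = waitpid(-1, NULL, 0);
		} while (pid > 0);

		return 1;
	}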

It fails because ->real_parent sees its child in EXIT_DEAD state
while the tracer is going to change the state back to EXIT_ZOMBIE
in wait_task_zombie().

The offending commit is 823b018e, which moved the EXIT_DEAD check,
but in fact we should not blame it. The original code was not
correct either, because it didn't take ptrace_reparented() into
account and because we can't really trust ->ptrace.

This patch adds an additional check to close this particular
race, but it doesn't solve the whole problem. We simply can't
rely on ->ptrace in this case: it can be cleared by the exiting
->parent if the tracer is multithreaded.
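
The additional check can be illustrated with a simplified user-space model.
This is a sketch under assumptions, not the kernel change itself: the types
and names (task_model, reparented_by_tracer, consider_child, the verdict
values) are hypothetical, and the real wait path in the kernel handles many
more cases.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum exit_state_model { ALIVE, ZOMBIE, DEAD };

/* Hypothetical, reduced view of a task as seen by a waiting parent. */
struct task_model {
	enum exit_state_model exit_state;
	const struct task_model *parent;	/* tracer while traced */
	const struct task_model *real_parent;	/* original parent */
};

/* Models the idea behind ptrace_reparented(): a tracer is the parent. */
static bool reparented_by_tracer(const struct task_model *p)
{
	return p->parent != p->real_parent;
}

enum verdict { SKIP, KEEP_WAITING, REAPABLE };

/*
 * Decide what a waiter should do with child p.
 * scanning_as_tracer is true when the waiter looks at p as its tracee.
 */
static enum verdict consider_child(const struct task_model *p,
				   bool scanning_as_tracer)
{
	assert(p != NULL);

	if (p->exit_state == DEAD) {
		/*
		 * The added check: the real parent must not ignore a DEAD
		 * child that a tracer may still turn back into a ZOMBIE.
		 */
		if (!scanning_as_tracer && reparented_by_tracer(p))
			return KEEP_WAITING;
		return SKIP;
	}

	return p->exit_state == ZOMBIE ? REAPABLE : KEEP_WAITING;
}
```

In this model the real parent keeps waiting while its DEAD child is still
attached to a tracer, instead of concluding that it has no children left.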

I think we should kill EXIT_DEAD altogether, we should always
remove the soon-to-be-reaped child from ->children or at least
we should never do the DEAD->ZOMBIE transition. But this is too
complex for 3.2.

Reported-and-tested-by: Denys Vlasenko <vda.linux@googlemail.com>
Tested-by: Lukasz Michalik <lmi@ift.uni.wroc.pl>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>