Moger, Babu [Thu, 2 Feb 2012 15:21:54 +0000 (15:21 +0000)]
[SCSI] scsi_dh_rdac: Fix for unbalanced reference count
Orabug: 14059970
This patch fixes an unbalanced refcount issue.
Elevating the lock for both kref_put and also for controller node deletion.
Previously, controller deletion was protected but not the kref_put. This
was causing the other thread to pick up the controller structure which was
already kref'd zero.
This was causing a WARN_ON and sometimes also a panic.
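The locking change can be sketched in user-space C. This is only an illustration, not the scsi_dh_rdac code: struct ctlr, ctlr_get(), ctlr_put() and the pthread mutex standing in for the kernel spinlock are all hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the controller structure. */
struct ctlr {
    int refcount;
    int index;
    struct ctlr *next;
};

static pthread_mutex_t list_lock = PTHREAD_MUTEX_INITIALIZER;
static struct ctlr *ctlr_list;

/* Find (or create) a controller and take a reference, all under the lock. */
struct ctlr *ctlr_get(int index)
{
    struct ctlr *c;

    pthread_mutex_lock(&list_lock);
    for (c = ctlr_list; c; c = c->next) {
        if (c->index == index) {
            c->refcount++;
            goto out;
        }
    }
    c = calloc(1, sizeof(*c));
    if (c) {
        c->refcount = 1;
        c->index = index;
        c->next = ctlr_list;
        ctlr_list = c;
    }
out:
    pthread_mutex_unlock(&list_lock);
    return c;
}

/*
 * Drop a reference. The decrement and the list removal share one critical
 * section, so a concurrent ctlr_get() can never find an entry whose count
 * has already reached zero. Returns 1 if the controller was freed.
 */
int ctlr_put(struct ctlr *c)
{
    struct ctlr **pp;
    int freed = 0;

    pthread_mutex_lock(&list_lock);
    assert(c->refcount > 0);
    if (--c->refcount == 0) {
        for (pp = &ctlr_list; *pp; pp = &(*pp)->next) {
            if (*pp == c) {
                *pp = c->next;
                break;
            }
        }
        freed = 1;
    }
    pthread_mutex_unlock(&list_lock);
    if (freed)
        free(c);
    return freed;
}
```

If only the list removal were locked, a second thread could look up the controller between the count reaching zero and the entry leaving the list.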
[SCSI] dh_rdac: Associate HBA and storage in rdac_controller to support partitions in storage
Orabug: 14059970
rdac hardware handler assumes that there is one-to-one relationship
between the host and the controller w.r.t lun. IOW, it does not
support "multiple storage partitions" within a storage.
Example:
HBA1 and HBA2 see lun 0 and 1 in storage A (1)
HBA3 and HBA4 see lun 0 and 1 in storage A (2)
HBA5 and HBA6 see lun 0 and 1 in storage A (3)
luns 0 and 1 in (1), (2) and (3) are totally different.
But, rdac handler treats the lun 0s (and lun 1s) as the same when
sending a mode select to the controller, which is wrong.
This patch makes the rdac hardware handler associate HBA and the
storage w.r.t lun (and not the host itself).
Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
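The idea of keying a controller on the HBA as well as the storage can be sketched as below. The struct ctlr_key type, its field names and ARRAY_ID_LEN are assumptions made for illustration; the actual handler's fields may differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define ARRAY_ID_LEN 16 /* assumed identifier length, for illustration */

/* Hypothetical lookup key for a controller entry. */
struct ctlr_key {
    unsigned char array_id[ARRAY_ID_LEN]; /* which storage array */
    int index;                            /* which controller in the array */
    int host_no;                          /* which HBA sees it */
};

/* Two entries are the same controller only if the HBA matches as well. */
bool ctlr_key_equal(const struct ctlr_key *a, const struct ctlr_key *b)
{
    assert(a && b);
    return a->host_no == b->host_no &&
           a->index == b->index &&
           memcmp(a->array_id, b->array_id, ARRAY_ID_LEN) == 0;
}
```

With the host in the key, lun 0 seen through HBA1 and lun 0 seen through HBA3 resolve to different controller entries even though the storage and controller index are the same.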
[SCSI] dh_rdac: Use WWID from C8 page instead of Subsystem id from C4 page to identify storage
Orabug: 14059970
rdac hardware handler uses "Subsystem Identifier" from C4 inquiry page
to uniquely identify a storage. The problem with that is that if
any of the bytes are non-ascii, subsys_id will be all spaces (hex
0x20). This creates a lot of problems, especially when multiple
rdac storages are connected to the server.
Use "Storage Array Unique Identifier" from C8 inquiry page, which is the
world wide unique identifier for the storage array, to uniquely identify
the storage.
Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
root [Wed, 2 May 2012 14:25:07 +0000 (19:55 +0530)]
be2iscsi: Check ASYNC PDU Handle corresponds to HDR/DATA Handle
For each ASYNC PDU received there is an HDR and DATA handle for it.
There will be only 1 HDR ASYNC Handle, but DATA Handle can be more
than 1 for each ASYNC PDU received. Checking if the ASYNC Handle
corresponds to HDR or DATA while returning the Handle to the free list.
hwi_free_async_msg just returns the handles to the free list. No return
values are needed so changing the return type to void.
root [Wed, 2 May 2012 14:17:43 +0000 (19:47 +0530)]
be2iscsi:Fix the function return values.
This patch fixes the return value
Signed-off-by: John Soni Jose <sony.john-n@emulex.com> Signed-off-by: Jayamohan Kallickal <jayamohan.kallickal@emulex.com> Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: root <root@localhost.(none)>
root [Wed, 2 May 2012 14:16:38 +0000 (19:46 +0530)]
be2iscsi:Code cleanup, removing the goto statement
Signed-off-by: John Soni Jose <sony.john-n@emulex.com> Signed-off-by: Jayamohan Kallickal <jayamohan.kallickal@emulex.com> Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: root <root@localhost.(none)>
root [Wed, 2 May 2012 14:14:51 +0000 (19:44 +0530)]
be2iscsi:Fix double free of MCCQ info memory.
If MCC_Q creation failed, the MCCQ info memory was freed from both
be_mcc_queues_destroy and be_mcc_queues_create. This caused the
kernel to panic because of a double free.
Signed-off-by: John Soni Jose <sony.john-n@emulex.com> Signed-off-by: Jayamohan Kallickal <jayamohan.kallickal@emulex.com> Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: root <root@localhost.(none)>
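One common way to make such a teardown safe against being reached twice is sketched below. This is not necessarily how the patch fixes it; struct mcc_info and mcc_info_free() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical holder for the queue info memory. */
struct mcc_info {
    void *mem;
};

/*
 * Release the memory at most once. Clearing the pointer after free() turns
 * a second call into a no-op instead of a double free.
 * Returns 1 if this call released the memory, 0 if it was already gone.
 */
int mcc_info_free(struct mcc_info *info)
{
    assert(info != NULL);
    if (!info->mem)
        return 0;
    free(info->mem);
    info->mem = NULL;
    return 1;
}
```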
root [Wed, 2 May 2012 14:13:21 +0000 (19:43 +0530)]
be2iscsi:Set num_cpu = 1 if pci_enable_msix fails
This patch sets the num_cpu to 1 if msix is not supported.
Signed-off-by: Jayamohan Kallickal <jayamohan.kallickal@emulex.com> Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: root <root@localhost.(none)>
root [Wed, 2 May 2012 14:09:28 +0000 (19:39 +0530)]
be2iscsi:Freeing of WRB and SGL Handle in cleanup task
The WRB and SGL Handle allocated for Login task were not freed
back to the pool after the login process was done. This code
releases the WRB and SGL Handle after the login process.
v2:
- Fix up locking so bh calls are not done when not needed.
- Make beiscsi_cleanup_task static.
root [Wed, 2 May 2012 14:01:03 +0000 (19:31 +0530)]
be2iscsi: Fix in the Asynchronous Code Path
Set the ASYNC PDU Handle pBuffer for Data ring with the VA/PA
of the allocated memory for it.
To get the correct ASYNC PDU Handle iterate the list and compare
the PA set during initialization with the passed PHY Address.
The buffer_size and num_enteries are common for HDR and Data ring
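The lookup described above (walk the handle list and compare the stored physical address) might look roughly like this. struct async_handle and find_handle_by_pa() are hypothetical names, not the be2iscsi definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical handle: the PA and VA are recorded once at initialization. */
struct async_handle {
    uint64_t pa; /* physical address of the buffer */
    void *va;    /* virtual address of the same buffer */
    struct async_handle *next;
};

/* Return the handle whose buffer starts at pa, or NULL if none matches. */
struct async_handle *find_handle_by_pa(struct async_handle *head, uint64_t pa)
{
    struct async_handle *h;

    for (h = head; h != NULL; h = h->next) {
        if (h->pa == pa)
            return h;
    }
    return NULL;
}
```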
root [Wed, 2 May 2012 13:58:04 +0000 (19:28 +0530)]
be2iscsi: cleanup a min_t() call
"sense_len" was declared as int type but actually it only stores a
u16 value that comes from hardware. The cast to u16 in min_t()
confuses static analysis because it truncates the int to u16 so I've
fixed the declaration to reflect that "sense_len" is just a u16.
Also there was a call to cpu_to_be16() which I've changed to
be16_to_cpu(). The functions are equivalent, but obviously the
hardware is big endian and we're doing the min_t() comparison on CPU
endian values.
This whole patch is just a cleanup and doesn't affect how the code
works.
upstream commit id : 4053a4be525d3441cad6cd1ae207177f03eb9ce7
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: root <root@localhost.(none)>
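Why the two conversions are interchangeable in effect, and why the name still matters for readability, can be seen in a user-space sketch. The my_* helpers and clamp_sense_len() below are illustrative stand-ins, not the kernel macros or the driver code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static int host_is_little_endian(void)
{
    uint16_t x = 1;
    unsigned char first;

    memcpy(&first, &x, 1);
    return first == 1;
}

static uint16_t swap16(uint16_t v)
{
    return (uint16_t)((v << 8) | (v >> 8));
}

/* Both directions are the same operation: swap on little endian, else no-op. */
uint16_t my_cpu_to_be16(uint16_t v)
{
    return host_is_little_endian() ? swap16(v) : v;
}

uint16_t my_be16_to_cpu(uint16_t v)
{
    return host_is_little_endian() ? swap16(v) : v;
}

/* Convert the big-endian hardware value first, then compare in CPU order. */
uint16_t clamp_sense_len(uint16_t wire_len, uint16_t buf_size)
{
    uint16_t sense_len = my_be16_to_cpu(wire_len);

    return sense_len < buf_size ? sense_len : buf_size;
}
```

Keeping sense_len as a 16-bit type means the comparison needs no cast, which is the point of the cleanup.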
qla2xxx: Handle device mapping changes due to device logout.
A device logout sent in the delete path of a fcport would clear the
port handle binding inside the firmware. This could lead to queued
work items for the fcport, if any, getting incorrect results. This
patch fixes the issue by checking for device name changes after a
call to get port database.
Chad Dupuis [Fri, 13 Jan 2012 15:08:35 +0000 (09:08 -0600)]
qla2xxx: Hard code the number of loop entries at 128.
Do not use ha->max_fibre_devices in loop topology since the maximum number of
entries will always be 128 and so we don't have to worry about changing
ha->max_fibre_devices back.
Giridhar Malavali [Wed, 14 Dec 2011 01:17:47 +0000 (17:17 -0800)]
qla2xxx: Complete mailbox command timedout to avoid initialization failures during next reset cycle.
Complete the mailbox command timed out before initiating another abort cycle
to recover so that mailbox commands issued during next reset cycle don't fail
due to pending mailbox access timeout.
Andrew Vasquez [Tue, 10 Apr 2012 11:52:14 +0000 (17:22 +0530)]
qla2xxx: Cache swl during fabric discovery.
Rather than continuously allocating and freeing swl within the discovery
process, simply pre-allocate it the first time that it's needed, cache it
through the rest of the lifecycle of the driver and free it at module unload.
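The allocate-once pattern can be sketched as follows. struct host, host_get_swl() and host_unload() are hypothetical names used only to show the lifetime, not the qla2xxx code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical per-host state holding the cached buffer. */
struct host {
    void *swl;           /* cached buffer, NULL until first use */
    unsigned int allocs; /* how many times the buffer was allocated */
};

/* Allocate on first use; every later call returns the cached buffer. */
void *host_get_swl(struct host *h, size_t size)
{
    assert(h != NULL);
    if (!h->swl) {
        h->swl = calloc(1, size);
        if (h->swl)
            h->allocs++;
    }
    return h->swl;
}

/* Called once at unload: the only place the buffer is freed. */
void host_unload(struct host *h)
{
    assert(h != NULL);
    free(h->swl);
    h->swl = NULL;
}
```

A caller that reuses the buffer across discovery passes would be expected to clear it before each use.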
Joe Carnuccio [Tue, 10 Apr 2012 11:50:36 +0000 (17:20 +0530)]
qla2xxx: Remove EDC sysfs interface.
Since the new firmware periodically resets the EDC, the EDC is now
not able to be flashed while the firmware is running, so the user
applications must be prevented from flashing the EDC, and this is
achieved by removing the EDC sysfs interface.
Michael Christie [Tue, 10 Apr 2012 11:20:23 +0000 (16:50 +0530)]
qla2xxx: Remove check for null fcport from host reset handler.
Remove the check for a NULL fcport so that the host reset will run
unconditionally to unwedge any commands before the device is offlined and to
prevent a quick runthrough of the SCSI error handling.
Andrew Vasquez [Fri, 28 Oct 2011 21:40:44 +0000 (14:40 -0700)]
qla2xxx: Correct out of bounds read of ISP2200 mailbox registers.
From Olatunji:
A tool that I'm building for finding memory faults in
Linux drivers is reporting that the following loop, in
qla2x00_mbx_completion(), reads outside the allocated io memory
while reading ISP2200 mailbox registers. I would appreciate your
help in confirming this bug.
During isp2200 initialization (qla2x00_probe_one), ha->mbx_count
is set to 32, even though isp2200 has 24 mailbox registers
(mailbox0 ... mailbox23). Therefore the loop runs for
cnt=[1..31], wptr walks off the allocated mailbox register region
at cnt==24, and results in out-of-bounds reads.
Although I observed this problem in linux2.6.17.1, I
confirmed that it also exists in 2.6.37 and 3.1-rc4.
Fortunately, the reads outside the 24 mailbox registers are
benign. For correctness, limit the driver's read to 24.
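The bound can be illustrated with a small user-space sketch. read_mailboxes() and ISP2200_MBX_REGS are hypothetical names, and the plain array stands in for the mapped register region.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ISP2200_MBX_REGS 24 /* mailbox0 ... mailbox23, per the report above */

/*
 * Copy mailbox registers 1..limit-1 into out, where limit never exceeds the
 * number of registers that really exist. Returns the limit that was used.
 */
size_t read_mailboxes(const uint16_t *regs, size_t nregs,
                      uint16_t *out, size_t mbx_count)
{
    size_t limit = mbx_count < nregs ? mbx_count : nregs;
    size_t cnt;

    assert(regs && out);
    for (cnt = 1; cnt < limit; cnt++)
        out[cnt] = regs[cnt];
    return limit;
}
```

With mbx_count left at 32 and nregs at 24, the loop stops at cnt == 23 instead of walking past the register region.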
Andrew Vasquez [Thu, 20 Oct 2011 17:14:16 +0000 (10:14 -0700)]
qla2xxx: Clear options-flags while issuing stop-firmware mbx command.
Not clearing the options flags in mbx1 could lead the firmware
into interpreting old data in mbx1 through mbx8. This could
lead to inadvertent DMA read/write operations to stale memory.
During command failure/non-recognition, the upper-layer
FC-transport expects the drivers to set
job-reply->reply_payload_rcv_len. Do this in a consistent manner
to avoid duplication.
Andrew Vasquez [Fri, 4 Nov 2011 14:31:51 +0000 (09:31 -0500)]
qla2xxx: Perform implicit logout during rport tear-down.
During rport tear-down, make sure we do an implicit LOGO of the fcport in our
firmware to try to clear any residual commands associated with that fcport.
Rework the structures related to SRB processing to minimize the memory
allocations per I/O and manage resources associated with and completions
from common routines.
Nagalakshmi Nandigama [Mon, 7 May 2012 20:40:00 +0000 (13:40 -0700)]
[mpt2sas] fix NULL pointer at ioc->pfacts
Orabug: 14040678
The ioc->pfacts member in the IOC structure is getting set to zero
following a call to _base_get_ioc_facts due to the memset in that routine.
So if the ioc->pfacts was read after a host reset, there would be a NULL
pointer dereference. The routine _base_get_ioc_facts is called from context
of host reset. The problem in _base_get_ioc_facts is the size of
Mpi2IOCFactsReply is 64, whereas the sizeof "struct mpt2sas_facts" is 60,
so there is a four byte overflow resulting from the memset.
Also, there is a memset in _base_get_port_facts using the incorrect structure;
it should be "struct mpt2sas_port_facts" instead of Mpi2PortFactsReply.
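The overflow can be reproduced in miniature. The structs below are stand-ins sized to match the 60 and 64 bytes quoted above; their names and layout are assumptions, not the driver's definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-ins sized like the two structures named in the message. */
struct facts {
    unsigned char bytes[60]; /* plays the role of struct mpt2sas_facts */
};

struct facts_reply {
    unsigned char bytes[64]; /* plays the role of Mpi2IOCFactsReply */
};

/* Hypothetical container: the pointer sits right after the facts. */
struct ioc {
    struct facts facts;
    void *pfacts;
};

/* Size the memset from the destination object, never from another type. */
void clear_facts(struct ioc *ioc)
{
    assert(ioc != NULL);
    memset(&ioc->facts, 0, sizeof(ioc->facts));
}

/* How far a memset sized by the reply type would run past the facts. */
size_t overflow_bytes(void)
{
    return sizeof(struct facts_reply) - sizeof(struct facts);
}
```

Using sizeof on the destination member keeps the size correct even if either structure changes later.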
Nagalakshmi Nandigama [Mon, 7 May 2012 20:39:25 +0000 (13:39 -0700)]
[mpt2sas] A hard drive is going OFFLINE when there is a hard reset issued
and simultaneously another hard drive is hot unplugged
Orabug: 14040678
Following the host reset, the firmware discovery is reassigning another
hard drive in the topology to the same device handle as that device is
getting hot removed. Until the driver device removal routine is called,
there will be two hard drives with the matching device handle in the
internal device link list. In the device removal routine, there is a separate
function which moves the device from the BLOCKED into the OFFLINE state.
Since this function is passed the device handle as its input parameter,
it traverses the internal device link list searching for the
matching device handle. This finds two devices with the matching
device handle, therefore both devices go OFFLINE.
To fix this issue, the input parameter is changed from device handle to
SAS address, therefore only the device that is hot unplugged will be placed
in OFFLINE state.
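A minimal sketch of matching on SAS address instead of device handle follows. struct sas_dev, the state enum and offline_by_sas_address() are hypothetical, and an array stands in for the driver's link list.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum dev_state { DEV_RUNNING, DEV_BLOCKED, DEV_OFFLINE };

/* Hypothetical device entry. */
struct sas_dev {
    uint16_t handle;      /* may be reassigned by firmware after a reset */
    uint64_t sas_address; /* unique per device */
    enum dev_state state;
};

/*
 * Move BLOCKED devices with the given SAS address to OFFLINE.
 * Returns how many devices changed state.
 */
size_t offline_by_sas_address(struct sas_dev *devs, size_t n,
                              uint64_t sas_address)
{
    size_t i, moved = 0;

    assert(devs != NULL);
    for (i = 0; i < n; i++) {
        if (devs[i].sas_address == sas_address &&
            devs[i].state == DEV_BLOCKED) {
            devs[i].state = DEV_OFFLINE;
            moved++;
        }
    }
    return moved;
}
```

Two entries can share a handle for a short time after the reset, but they cannot share a SAS address, so only the removed drive matches.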
Nagalakshmi Nandigama [Mon, 7 May 2012 20:38:49 +0000 (13:38 -0700)]
[mpt2sas] Set the phy identifier of the end device to the phy number of the parent device
it is linked to
Orabug: 14040678
The phy_identifier inside the routine _transport_set_identify()
is set to sas_device_page_zero->PhyNum. This returns the
phy number of the parent device this device is linked to.
Nagalakshmi Nandigama [Mon, 7 May 2012 20:37:52 +0000 (13:37 -0700)]
[mpt2sas] While enabling phy, read the current port number from sas iounit page 0
instead of page 1
Orabug: 14040678
The port number is changing after disabling/enabling phys using the SysFS interface.
This is because the firmware behavior changed where it would read the port number
then set it to some different value even though Auto Port Config is turned on.
With this change of behavior in FW, it is possible that the expanders are moved
from one port to another after disabling/enabling phys. This is occurring because
the port number in sas iounit page 1 is not matching up to the current port in
page 0. In order to fix this the driver is modified to read the current
port number from sas iounit page 0 instead of page 1. Also copy the
port and phy flags over from page 0 to page 1.
Nagalakshmi Nandigama [Mon, 7 May 2012 20:35:49 +0000 (13:35 -0700)]
[mpt2sas] Modify the source code as per the findings reported by the source
code analysis tool
Orabug: 14040678
Modified the source code as per the findings reported by the source
code analysis tool. Source code for the following functionalities
has been touched. None of the driver functionalities has changed.
- SMP Passthrough IOCTL
- Debug messages for MPT Replies (i.e. bit 9 of Logging Level)
- Task Management using sysfs
- Device removal, i.e. when a target device (including any PD within a volume) is removed, and Volume Deletion.
- Trace Buffer
Nagalakshmi Nandigama [Mon, 7 May 2012 20:29:11 +0000 (13:29 -0700)]
[mpt2sas] Improvements were made to better protect the sas_device, raid_device,
and expander_device lists
There were possible race conditions surrounding reading an object
from the link list while another context in the driver was
removing it. The nature of this enhancement is to rearrange locking
so the link lists are better protected.
Change set:
(1) numerous routines were rearranged so spin locks are held through
the entire time a link list object is being read from or written to.
(2) added new routines for object deletion from link list, thus ensuring the
lock was held during the deletion of the link list object, and then the memory
for the object was freed outside the lock. The memory was freed outside the lock
so driver had access to device object info which was required for
notifying the scsi mid layer that a device was getting deleted.
(3) added the ioc->blocking_handles parameter. This is a bitmask used
to identify which devices need blocking when there is device loss. This was
introduced so that lock can be held for the entire time traversing the link
list objects, and the bitmask was set to indicate which device handles need
blocking. Outside the lock the ioc->blocking_handles bitmask is traversed,
and for each respective device handle the scsi mid layer is called to move
devices into blocking state.
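The two-phase use of the bitmask in item (3) can be sketched in user-space C. Everything named here (struct dev, mark_blocking(), drain_blocking(), the 64-handle limit and the pthread mutex in place of the spin lock) is an assumption for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_HANDLES 64 /* assumed small handle space so one word suffices */

/* Hypothetical device entry. */
struct dev {
    uint16_t handle;
    int lost; /* nonzero when the device was reported missing */
};

static pthread_mutex_t dev_lock = PTHREAD_MUTEX_INITIALIZER;

/* Phase 1: walk the whole list under the lock and only record handles. */
void mark_blocking(const struct dev *devs, size_t n, uint64_t *mask)
{
    size_t i;

    assert(devs && mask);
    pthread_mutex_lock(&dev_lock);
    for (i = 0; i < n; i++) {
        if (devs[i].lost && devs[i].handle < MAX_HANDLES)
            *mask |= (uint64_t)1 << devs[i].handle;
    }
    pthread_mutex_unlock(&dev_lock);
}

/*
 * Phase 2: with the lock dropped, visit each marked handle and clear it.
 * Handles are written to out in ascending order; returns the count written.
 */
size_t drain_blocking(uint64_t *mask, uint16_t *out, size_t out_len)
{
    size_t count = 0;
    uint16_t h;

    assert(mask && out);
    for (h = 0; h < MAX_HANDLES && count < out_len; h++) {
        if (*mask & ((uint64_t)1 << h)) {
            out[count++] = h; /* a driver would call the mid layer here */
            *mask &= ~((uint64_t)1 << h);
        }
    }
    return count;
}
```

Splitting the work this way lets the list walk stay fully locked while the slower per-device calls run without the lock held.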
Nagalakshmi Nandigama [Mon, 7 May 2012 20:27:04 +0000 (13:27 -0700)]
[mpt2sas] Added multisegment mode support for Linux BSG Driver
Orabug: 14040678
Added support for Block IO requests with multiple segments (vectors) in
the SMP handler of the SAS Transport Class. This is required by the
BSG driver. Multisegment support was added for both Request and Response.