Matt Carlson [Mon, 13 Feb 2012 15:20:11 +0000 (15:20 +0000)]
tg3: Fix NVRAM page writes on newer devices
On newer devices, the hardware expects the NVRAM address register
to be written only once per NVRAM page. To do otherwise causes NVRAM
corruption. This patch fixes the problem.
(cherry picked from commit 422782249927e887a4c032d1d7e1f59de281ecbb) Signed-off-by: Matt Carlson <mcarlson@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Joe Jin <joe.jin@oracle.com>
Matt Carlson [Mon, 13 Feb 2012 15:20:10 +0000 (15:20 +0000)]
tg3: Fix copper autoneg adv checks
When checking the autoneg advertisements, the driver failed to include
the master and master enable bits for the bcm5701. This patch fixes the
problem.
(cherry picked from commit 3198e07fd64aa8c3a38dda33bcc0f44265eb581e) Signed-off-by: Matt Carlson <mcarlson@broadcom.com> Reviewed-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Joe Jin <joe.jin@oracle.com>
Matt Carlson [Mon, 13 Feb 2012 15:20:09 +0000 (15:20 +0000)]
tg3: Fix stats while interface is down
If the tg3 interface is down, the driver will return ethtool stats
uninitialized. This patch zeroes out the destination stat buffer in
such a case.
(cherry picked from commit b546e46f5c9f19a921d6f62db80f2e9371bc0558) Signed-off-by: Matt Carlson <mcarlson@broadcom.com> Reviewed-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Joe Jin <joe.jin@oracle.com>
Matt Carlson [Mon, 13 Feb 2012 15:20:08 +0000 (15:20 +0000)]
tg3: Disable new DMA engine for 57766
A bug was found in the new DMA engine for the 57766. This patch
disables it, which causes the device to fallback to the old DMA engine.
(cherry picked from commit 3906969189a409e590a51b18c86a92d0506c9372) Signed-off-by: Matt Carlson <mcarlson@broadcom.com> Reviewed-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Joe Jin <joe.jin@oracle.com>
Matt Carlson [Mon, 13 Feb 2012 10:20:12 +0000 (10:20 +0000)]
tg3: Move transmit comment to a better location
This patch moves a comment in the transmit path to a better location.
(cherry picked from commit c5665a538da6b887a5096358a12527785506e5ac) Signed-off-by: Matt Carlson <mcarlson@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Joe Jin <joe.jin@oracle.com>
Matt Carlson [Mon, 13 Feb 2012 10:20:11 +0000 (10:20 +0000)]
tg3: Eliminate unneeded prototype
This patch eliminates the unneeded tg3_halt_cpu() prototype and moves
the tg3_setup_phy() prototype closer to where it is needed.
(cherry picked from commit 4b40952268c2390a71b9728194eb4b4dd514c087) Signed-off-by: Matt Carlson <mcarlson@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Joe Jin <joe.jin@oracle.com>
Matt Carlson [Mon, 13 Feb 2012 10:20:10 +0000 (10:20 +0000)]
tg3: Relocate tg3_find_peer
This patch relocates tg3_find_peer to eliminate a prototype.
(cherry picked from commit 16c7fa7dfe9a304ca6d5bdf39aa4a8d315336a07) Signed-off-by: Matt Carlson <mcarlson@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Joe Jin <joe.jin@oracle.com>
Matt Carlson [Mon, 13 Feb 2012 10:20:09 +0000 (10:20 +0000)]
tg3: Move tg3_nvram_write_block functions
This patch moves the tg3_nvram_write_block functions higher in the file
to eliminate a prototype.
(cherry picked from commit dbe9b92a602b203174a3227a05e69c019eebf91d) Signed-off-by: Matt Carlson <mcarlson@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Joe Jin <joe.jin@oracle.com>
Matt Carlson [Mon, 13 Feb 2012 10:20:08 +0000 (10:20 +0000)]
tg3: Move tg3_set_rx_mode
This patch moves __tg3_set_rx_mode above its first use and moves
tg3_set_rx_mode down closer to where the netdev_ops functions should be.
(cherry picked from commit ccd5ba9db50d09d6d6e2d91d76d3b27fff8f6ed5) Signed-off-by: Matt Carlson <mcarlson@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Joe Jin <joe.jin@oracle.com>
Joe Jin [Mon, 27 Aug 2012 05:14:37 +0000 (13:14 +0800)]
tg3: Move tg3_change_mtu to a better location
This patch moves tg3_change_mtu to a better location.
(cherry picked from commit faf1627ac15d25687259844575a128e9f06f3e6f) Signed-off-by: Matt Carlson <mcarlson@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Joe Jin <joe.jin@oracle.com>
Matt Carlson [Mon, 13 Feb 2012 10:20:06 +0000 (10:20 +0000)]
tg3: Relocate tg3_reset_task
This patch moves tg3_reset_task further down in the file where it makes
more sense to be.
(cherry picked from commit 9a21fb8fc3a74429a4eee2d53df5a3167c290f2a) Signed-off-by: Matt Carlson <mcarlson@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Joe Jin <joe.jin@oracle.com>
Matt Carlson [Mon, 13 Feb 2012 10:20:05 +0000 (10:20 +0000)]
tg3: Move tg3_restart_hw to a better location
This patch relocates tg3_restart_hw() to a better location.
(cherry picked from commit ebf3312e781aee1bc22bb3a42cb324eb973a7c02) Signed-off-by: Matt Carlson <mcarlson@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Joe Jin <joe.jin@oracle.com>
Matt Carlson [Tue, 17 Jan 2012 15:27:23 +0000 (15:27 +0000)]
tg3: Fix single-vector MSI-X code
Kdump kernels leave MSI-X interrupts (as setup by the crashed kernel)
enabled. However, kdump only enables one CPU in the new environment,
thus causing tg3 to abort MSI-X setup. When the driver attempts to
enable INTA or MSI interrupt modes on a kdump kernel, interrupt
delivery fails.
This patch attempts to workaround the problem by forcing the driver
to enable a single MSI-X interrupt. In such a configuration, the
device's multivector interrupt mode must be disabled.
(cherry picked from commit c3b5003b628d8e373262bee42c7260d6a799c73e) Signed-off-by: Matt Carlson <mcarlson@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Joe Jin <joe.jin@oracle.com>
Matt Carlson [Fri, 16 Dec 2011 13:33:23 +0000 (13:33 +0000)]
tg3: Make the RSS indir tbl admin configurable
This patch adds the ethtool callbacks necessary to change the rss
indirection table from userspace. Should the number of interrupts
change (e.g. across a close / open call, or through a reset) and
any one of the indirection table values fall out-of-range, the driver
will reset the indirection table to a default layout.
[Integrated many suggestions made by Ben Hutchings.]
Changes since v3
* Removed TG3_FLAG_SUPPORT_MSIX checks at the start of
tg3_get_rxfh_indir() and tg3_set_rxfh_indir().
(cherry picked from commit 90415477bf1356f72acc34063ff52441fc10a754) Signed-off-by: Matt Carlson <mcarlson@broadcom.com> Signed-off-by: Michael Chan <mchan@broadcom.com> Reviewed-by: Benjamin Li <benli@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Joe Jin <joe.jin@oracle.com>
Joe Jin [Mon, 27 Aug 2012 12:54:45 +0000 (20:54 +0800)]
ethtool: Define and apply a default policy for RX flow hash indirection
All drivers that support modification of the RX flow hash indirection
table initialise it in the same way: RX rings are assigned to table
entries in rotation. Make that default policy explicit by having them
call a ethtool_rxfh_indir_default() function.
In the ethtool core, add support for a zero size value for
ETHTOOL_SRXFHINDIR, which resets the table to this default.
(backported from upstream commit 278bc4296bd64ffd1d3913b487dc8a520e423a7a) Partly-suggested-by: Matt Carlson <mcarlson@broadcom.com> Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Acked-by: Shreyas N Bhatewara <sbhatewara@vmware.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Joe Jin <joe.jin@oracle.com>
Ben Hutchings [Thu, 15 Dec 2011 13:51:16 +0000 (13:51 +0000)]
ethtool: Clarify use of size field for ETHTOOL_GRXFHINDIR
In order to find out the device's RX flow hash table size, ethtool
initially uses ETHTOOL_GRXFHINDIR with a buffer size of zero. This
must be supported, but it is not necessary to support any other user
buffer size less than the device table size.
(cherry picked from commit 14596f7006297b67516e2b6a2b26bcb11fe08fb3) Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Joe Jin <joe.jin@oracle.com>
Joe Jin [Mon, 27 Aug 2012 12:38:36 +0000 (20:38 +0800)]
ethtool: Centralise validation of ETHTOOL_{G, S}RXFHINDIR parameters
Add a new ethtool operation (get_rxfh_indir_size) to get the
indirectional table size. Use this to validate the user buffer size
before calling get_rxfh_indir or set_rxfh_indir. Use get_rxnfc to get
the number of RX rings, and validate the contents of the new
indirection table before calling set_rxfh_indir. Remove this
validation from drivers.
(backported from upstream commit 7850f63f1620512631445b901ae11cd149e7375c) Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Acked-by: Dimitris Michailidis <dm@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Joe Jin <joe.jin@oracle.com>
Joe Jin [Mon, 27 Aug 2012 06:35:28 +0000 (14:35 +0800)]
bonding: comparing a u8 with -1 is always false
slave->duplex is a u8 type so the in bond_info_show_slave() when we
check "if (slave->duplex == -1)", it's always false.
(backported from upstream commit 589665f5a6008dbce1d0af2cb93e94a80bf78151) Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Joe Jin <joe.jin@oracle.com>
Joe Jin [Mon, 27 Aug 2012 06:34:02 +0000 (14:34 +0800)]
bonding:update speed/duplex for NETDEV_CHANGE
Zheng Liang(lzheng@redhat.com) found a bug that if we config bonding with
arp monitor, sometimes bonding driver cannot get the speed and duplex from
its slaves, it will assume them to be 100Mb/sec and Full, please see
/proc/net/bonding/bond0.
But there is no such problem when uses miimon.
(Take igb for example)
I find that the reason is that after dev_open() in bond_enslave(),
bond_update_speed_duplex() will call igb_get_settings()
, but in that function,
it runs ethtool_cmd_speed_set(ecmd, -1); ecmd->duplex = -1;
because igb get an error value of status.
So even dev_open() is called, but the device is not really ready to get its
settings.
Maybe it is safe for us to call igb_get_settings() only after
this message shows up, that is "igb: p4p1 NIC Link is Up 1000 Mbps Full Duplex,
Flow Control: RX".
So I prefer to update the speed and duplex for a slave when reseices
NETDEV_CHANGE/NETDEV_UP event.
Changelog
V2:
1 remove the "fake 100/Full" logic in bond_update_speed_duplex(),
set speed and duplex to -1 when it gets error value of speed and duplex.
2 delete the warning in bond_enslave() if bond_update_speed_duplex() returns
error.
3 make bond_info_show_slave() handle bad values of speed and duplex.
(backport from upstream commit 98f41f694f46085fda475cdee8cc0b6d2c5e6f1f) Signed-off-by: Weiping Pan <wpan@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Abhijeet Joglekar [Tue, 14 Jun 2011 04:21:01 +0000 (21:21 -0700)]
fnic: fix incorrect use of SLAB_CACHE_DMA flag
Driver was incorrectly using the SLAB_CACHE_DMA flag when creating a cache
for SGLs. fnic device does not have 24-bit DMA restrictions. Remove the flag
and allocations from ZONE_DMA.
Thanks to Roland Dreier and David Rientjes for pointing out the bug.
Signed-off-by: Abhijeet Joglekar <abjoglek@cisco.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
(cherry picked from commit 0c79c74272b25bbcfb0149d5a4627ed768795d2d)
commit 9ac32e1b firmware: convert e100 driver to request_firmware()
did a straight conversion of the in-driver ucode to external
files. This introduced the possibility of the driver failing
to enable an interface due to missing ucode. There was no
evaluation of the importance of the ucode at the time.
Based on comments in earlier versions of this driver, and in
the source code for the FreeBSD fxp driver, we can assume that
the ucode implements the "CPU Cycle Saver" feature on supported
adapters. Although generally wanted, this is an optional
feature. The ucode source is not available, preventing it from
being included in free distributions. This creates unnecessary
problems for the end users. Doing a network install based on a
free distribution installer requires the user to download and
insert the ucode into the installer.
Making the ucode optional when possible improves the user
experience and driver usability.
The ucode for some adapters include a bugfix, making it
essential. We continue to fail for these adapters unless the
ucode is available.
(cherry picked from commit 8b0d2f9ed3d8e92feada7c5d70fa85be46e6f948) Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Joe Jin <joe.jin@oracle.com>
Richard Cochran [Sun, 8 Apr 2012 14:38:10 +0000 (14:38 +0000)]
e100: enable transmit time stamping.
This patch enables software (and phy device) transmit time stamping.
Tested on an old PIII laptop with built in NIC.
(cherry picked from commit 48425b14929ef185377e3827ba7135d88343face) Signed-off-by: Richard Cochran <richardcochran@gmail.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Joe Jin <joe.jin@oracle.com>
Maxim Uvarov [Thu, 9 Aug 2012 15:14:24 +0000 (08:14 -0700)]
x86/nmi: Add new NMI queues to deal with IO_CHK and SERR
In discussions with Thomas Mingarelli about hpwdt, he explained
to me some issues they were some when using their virtual NMI
button to test the hpwdt driver.
It turns out the virtual NMI button used on HP's machines do no
send unknown NMIs but instead send IO_CHK NMIs. The way the
kernel code is written, the hpwdt driver can not register itself
against that type of NMI and therefore can not successfully
capture system information before panic'ing.
To solve this I created two new NMI queues to allow driver to
register against the IO_CHK and SERR NMIs. Or in the hpwdt all
three (if you include unknown NMIs too).
The change is straightforward and just mimics what the unknown
NMI does.
Reported-and-tested-by: Thomas Mingarelli <thomas.mingarelli@hp.com> Signed-off-by: Don Zickus <dzickus@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Link: http://lkml.kernel.org/r/1333051877-15755-3-git-send-email-dzickus@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
Conflicts:
Don Zickus [Fri, 30 Sep 2011 19:06:20 +0000 (15:06 -0400)]
x86, nmi: Create new NMI handler routines
The NMI handlers used to rely on the notifier infrastructure. This worked
great until we wanted to support handling multiple events better.
One of the key ideas to the nmi handling is to process _all_ the handlers for
each NMI. The reason behind this switch is because NMIs are edge triggered.
If enough NMIs are triggered, then they could be lost because the cpu can
only latch at most one NMI (besides the one currently being processed).
In order to deal with this we have decided to process all the NMI handlers
for each NMI. This allows the handlers to determine if they recieved an
event or not (the ones that can not determine this will be left to fend
for themselves on the unknown NMI list).
As a result of this change it is now possible to have an extra NMI that
was destined to be received for an already processed event. Because the
event was processed in the previous NMI, this NMI gets dropped and becomes
an 'unknown' NMI. This of course will cause printks that scare people.
However, we prefer to have extra NMIs as opposed to losing NMIs and as such
are have developed a basic mechanism to catch most of them. That will be
a later patch.
To accomplish this idea, I unhooked the nmi handlers from the notifier
routines and created a new mechanism loosely based on doIRQ. The reason
for this is the notifier routines have a couple of shortcomings. One we
could't guarantee all future NMI handlers used NOTIFY_OK instead of
NOTIFY_STOP. Second, we couldn't keep track of the number of events being
handled in each routine (most only handle one, perf can handle more than one).
Third, I wanted to eventually display which nmi handlers are registered in
the system in /proc/interrupts to help see who is generating NMIs.
The patch below just implements the new infrastructure but doesn't wire it up
yet (that is the next patch). Its design is based on doIRQ structs and the
atomic notifier routines. So the rcu stuff in the patch isn't entirely untested
(as the notifier routines have soaked it) but it should be double checked in
case I copied the code wrong.
"Historically, Linux has tried to make the regular timer tick on the
various CPUs not happen at the same time, to avoid contention on
xtime_lock.
Nowadays, with the tickless kernel, this contention no longer happens
since time keeping and updating are done differently. In addition,
this skew is actually hurting power consumption in a measurable way on
many-core systems."
Problems:
- Contrary to the above, systems do encounter contention on both
xtime_lock and RCU structure locks when the tick is synchronized.
- Moderate sized RT systems suffer intolerable jitter due to the tick
being synchronized.
- SGI reports the same for their large systems.
- Fully utilized systems reap no power saving benefit from skew removal,
but do suffer from resulting induced lock contention.
- 0209f649 rcu: limit rcu_node leaf-level fanout
This patch was born to combat lock contention which testing showed
to have been _induced by_ skew removal. Skew the tick, contention
disappeared virtually completely.
Dimitri Sivanich [Fri, 10 Aug 2012 11:46:56 +0000 (04:46 -0700)]
vfs: fix panic in __d_lookup() with high dentry hashtable counts
When the number of dentry cache hash table entries gets too high
(2147483648 entries), as happens by default on a 16TB system, use of a
signed integer in the dcache_init() initialization loop prevents the
dentry_hashtable from getting initialized, causing a panic in
__d_lookup(). Fix this in dcache_init() and similar areas.
Signed-off-by: Dimitri Sivanich <sivanich@sgi.com> Acked-by: David S. Miller <davem@davemloft.net> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Conflicts:
Jack Steiner [Fri, 10 Aug 2012 11:45:33 +0000 (04:45 -0700)]
cpusets: randomize node rotor used in cpuset_mem_spread_node()
Some workloads that create a large number of small files tend to assign
too many pages to node 0 (multi-node systems). Part of the reason is that
the rotor (in cpuset_mem_spread_node()) used to assign nodes starts at
node 0 for newly created tasks.
This patch changes the rotor to be initialized to a random node number of
the cpuset.
[akpm@linux-foundation.org: fix layout]
[Lee.Schermerhorn@hp.com: Define stub numa_random() for !NUMA configuration] Signed-off-by: Jack Steiner <steiner@sgi.com> Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Paul Menage <menage@google.com> Cc: Jack Steiner <steiner@sgi.com> Cc: Robin Holt <holt@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Conflicts:
Jack Steiner [Fri, 10 Aug 2012 11:44:37 +0000 (04:44 -0700)]
x86: Reduce clock calibration time during slave cpu startup
Reduce the startup time for slave cpus.
Adds hooks for an arch-specific function for clock calibration.
These hooks are used on x86. If a newly started cpu has the
same phys_proc_id as a core already active, uses the TSC for the
delay loop and has a CONSTANT_TSC, use the already-calculated
value of loops_per_jiffy.
This patch reduces the time required to start slave cpus on a
4096 cpu system from: 465 sec OLD 62 sec NEW
This reduces boot time on a 4096p system by almost 7 minutes.
Nice...
Signed-off-by: Jack Steiner <steiner@sgi.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: John Stultz <john.stultz@linaro.org>
[fix CONFIG_SMP=n build] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Conflicts:
Mike Travis [Fri, 10 Aug 2012 11:40:36 +0000 (04:40 -0700)]
x86, pci: Increase the number of iommus supported to be MAX_IO_APICS
The number of IOMMUs supported should be the same as the number of IO APICS.
This limit comes into play when the IOMMUs are identity mapped, thus the
number of possible IOMMUs in the "static identity" (si) domain should be
this same number.
Signed-off-by: Mike Travis <travis@sgi.com> Signed-off-by: Jack Steiner <steiner@sgi.com>
Mike Travis [Fri, 10 Aug 2012 11:37:48 +0000 (04:37 -0700)]
x86 PCI: Fix identity mapping for sandy bridge
With SandyBridge, Intel has changed these Socket PCI devices to have a class
type of "System Peripheral" & "Performance counter", rather than "HostBridge".
So instead of using a "special" case to detect which devices will not be
doing DMA, use the fact that a device that is not associated with an IOMMU,
will not need an identity map.
Signed-off-by: Mike Travis <travis@sgi.com> Signed-off-by: Mike Habeck <habeck@sgi.com>
Joe Jin [Fri, 20 Jul 2012 13:30:51 +0000 (21:30 +0800)]
[scsi] hpsa: add all support devices for ol5
Orabug: 14106006
To support uek2 on ol5, commit 29a8828 disable some devices from support list,
this made ovm3 upgrade from 3.0.3 to 3.1.1 failed to addressed local disk for
disk device name changed.
If kernel run as ovm3.1.1 dom0 kernel, please pass cciss_allow_hpsa=1 when
load cciss driver, for ol5.
Adnan Misherfi [Thu, 2 Aug 2012 20:17:44 +0000 (16:17 -0400)]
Disable VLAN 0 tagging for none VLAN traffic
Orabug: 14406424
Cisco enic driver on UCS blades tags a None VLAN traffic with VLAN 0, this causes VMs
that do not have the kernel patch " VLAN 0 should be treated as no vlan tag" to drop all
receive traffic as these VMs do not know how to deal with the VLAN 0 tag.
This is also a problem for older VMs that can not take the mentioned patch.
This fix disables the enic driver from tagging a None VLAN traffic with VLAN 0.This
fix is controlled by a driver parameters " disable_vlan0". the default value is disable_vlan0=1
which to disable the driver from tagging traffic with VLAN 0. To revert to original behavior
add "options enic disable_vlan0=0" to /etc/modprobe.con
Jeff Mahoney [Thu, 2 Aug 2012 12:04:00 +0000 (05:04 -0700)]
dl2k: Clean up rio_ioctl
Orabug: 14126896
The dl2k driver's rio_ioctl call has a few issues:
- No permissions checking
- Implements SIOCGMIIREG and SIOCGMIIREG using the SIOCDEVPRIVATE numbers
- Has a few ioctls that may have been used for debugging at one point
but have no place in the kernel proper.
This patch removes all but the MII ioctls, renumbers them to use the
standard ones, and adds the proper permission check for SIOCSMIIREG.
We can also get rid of the dl2k-specific struct mii_data in favor of
the generic struct mii_ioctl_data.
Since we have the phyid on hand, we can add the SIOCGMIIPHY ioctl too.
Most of the MII code for the driver could probably be converted to use
the generic MII library but I don't have a device to test the results.
This fixes: CVE-2012-2313
Reported-by: Stephan Mueller <stephan.mueller@atsec.com> Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Maxim Uvarov <maxim.uvarov@oracle.com> Signed-off-by: Guangyu Sun <guangyu.sun@oracle.com>
Jeff Mahoney [Thu, 2 Aug 2012 12:04:00 +0000 (05:04 -0700)]
dl2k: Clean up rio_ioctl
Orabug: 14126896
The dl2k driver's rio_ioctl call has a few issues:
- No permissions checking
- Implements SIOCGMIIREG and SIOCGMIIREG using the SIOCDEVPRIVATE numbers
- Has a few ioctls that may have been used for debugging at one point
but have no place in the kernel proper.
This patch removes all but the MII ioctls, renumbers them to use the
standard ones, and adds the proper permission check for SIOCSMIIREG.
We can also get rid of the dl2k-specific struct mii_data in favor of
the generic struct mii_ioctl_data.
Since we have the phyid on hand, we can add the SIOCGMIIPHY ioctl too.
Most of the MII code for the driver could probably be converted to use
the generic MII library but I don't have a device to test the results.
This fixes: CVE-2012-2313
Reported-by: Stephan Mueller <stephan.mueller@atsec.com> Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Maxim Uvarov <maxim.uvarov@oracle.com>
This is not for upstream as it memblock_x86_reserve_range is not
used upstream anymore.
When I back-ported the patches:
xen/x86: Use memblock_reserve for sensitive areas.
xen/mmu: Recycle the Xen provided L4, L3, and L2 pages
I simply used sed s/memblock_reserve/memblock_x86_reserve_range/.
That was incorrect as the parameters are different - memblock_reserve
as second expects the size, while memblock_x86_reserve_range expects
the physical address. This patch fixes those bugs.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Merge branch 'stable/for-linus-3.7.rebased' into uek2-merge
* stable/for-linus-3.7.rebased:
xen/p2m: Reserve 8MB of _brk space for P2M leafs when populating back.
xen/mmu: Remove from __ka space PMD entries for pagetables.
xen/mmu: Copy and revector the P2M tree.
xen/p2m: Add logic to revector a P2M tree to use __va leafs.
xen/mmu: Recycle the Xen provided L4, L3, and L2 pages
xen/mmu: For 64-bit do not call xen_map_identity_early
xen/mmu: use copy_page instead of memcpy.
xen/mmu: Provide comments describing the _ka and _va aliasing issue
xen/mmu: The xen_setup_kernel_pagetable doesn't need to return anything.
xen/x86: Use memblock_reserve for sensitive areas.
xen/p2m: Fix the comment describing the P2M tree.
xen/perf: Define .glob for the different hypercalls.
We then try to populate those pages back. In the P2M tree however
the space for those leafs must be reserved - as such we use extend_brk.
We reserve 8MB of _brk space, which means we can fit over 1048576 PFNs - which is more than we should ever need.
[v1: Made it 8MB of _brk space instead of 4MB per Jan's suggestion] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
(cherry picked from commit 99266871de5006ba7ad0bfece6bb283ede4094b9)
xen/mmu: Remove from __ka space PMD entries for pagetables.
Please first read the description in "xen/mmu: Copy and revector the
P2M tree."
At this stage, the __ka address space (which is what the old
P2M tree was using) is partially disassembled. The cleanup_highmap
has removed the PMD entries from 0-16MB and anything past _brk_end
up to the max_pfn_mapped (which is the end of the ramdisk).
The xen_remove_p2m_tree and code around has ripped out the __ka for
the old P2M array.
Here we continue on doing it to where the Xen page-tables were.
It is safe to do it, as the page-tables are addressed using __va.
For good measure we delete anything that is within MODULES_VADDR
and up to the end of the PMD.
At this point the __ka only contains PMD entries for the start
of the kernel up to __brk.
[v1: Per Stefano's suggestion wrapped the MODULES_VADDR in debug] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
(cherry picked from commit 4e928e1a48b6b76e0b8384160213a32d03197e4b)
Please first read the description in "xen/p2m: Add logic to revector a
P2M tree to use __va leafs" patch.
The 'xen_revector_p2m_tree()' function allocates a new P2M tree
copies the contents of the old one in it, and returns the new one.
At this stage, the __ka address space (which is what the old
P2M tree was using) is partially disassembled. The cleanup_highmap
has removed the PMD entries from 0-16MB and anything past _brk_end
up to the max_pfn_mapped (which is the end of the ramdisk).
We have revectored the P2M tree (and the one for save/restore as well)
to use new shiny __va address to new MFNs. The xen_start_info
has been taken care of already in 'xen_setup_kernel_pagetable()' and
xen_start_info->shared_info in 'xen_setup_shared_info()', so
we are free to roam and delete PMD entries - which is exactly what
we are going to do. We rip out the __ka for the old P2M array.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Conflicts:
As can be seen, the ramdisk, P2M and pagetables are taking
a bit of __ka addresses space. Which is a problem since the
MODULES_VADDR starts at 0xffffffffa0000000 - and P2M sits
right in there! This results during bootup with the inability to
load modules, with this error:
Since the __va and __ka are 1:1 up to MODULES_VADDR and
cleanup_highmap rids __ka of the ramdisk mapping, what
we want to do is similar - get rid of the P2M in the __ka
address space. There are two ways of fixing this:
1) All P2M lookups instead of using the __ka address would
use the __va address. This means we can safely erase from
__ka space the PMD pointers that point to the PFNs for
P2M array and be OK.
2). Allocate a new array, copy the existing P2M into it,
revector the P2M tree to use that, and return the old
P2M to the memory allocate. This has the advantage that
it sets the stage for using XEN_ELF_NOTE_INIT_P2M
feature. That feature allows us to set the exact virtual
address space we want for the P2M - and allows us to
boot as initial domain on large machines.
So we pick option 2).
This patch only lays the groundwork in the P2M code. The patch
that modifies the MMU is called "xen/mmu: Copy and revector the P2M tree."
xen/mmu: Recycle the Xen provided L4, L3, and L2 pages
As we are not using them. We end up only using the L1 pagetables
and grafting those to our page-tables.
[v1: Per Stefano's suggestion squashed two commits]
[v2: Per Stefano's suggestion simplified loop] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Conflicts:
Acked-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
(cherry picked from commit cbc09be35990fb3d15671507f11c3e90479ef816)
xen/mmu: Provide comments describing the _ka and _va aliasing issue
Which is that the level2_kernel_pgt (__ka virtual addresses)
and level2_ident_pgt (__va virtual address) contain the same
PMD entries. So if you modify a PTE in __ka, it will be reflected
in __va (and vice-versa).
xen/x86: Use memblock_reserve for sensitive areas.
instead of a big memblock_reserve. This way we can be more
selective in freeing regions (and it also makes it easier
to understand where is what).
[v1: Move the auto_translate_physmap to proper line]
[v2: Per Stefano suggestion add more comments] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
[upstream git commit 91addbf07abfdd109a9da4e02061e6ed3728b298]
Conflicts:
The P2M code is smart enough to return false (which means that it
cannot allocate anymore) and the error can perculate up the calling
stack without trouble - with the error logic doing the proper thing.
So check the __brk_limit values before allocating from extend_brk.
This allows us to boot on machines where we do not have enough
__brk space, and we would get this:
Interestingly enough, most of the time we are not going to hit this
b/c the _brk space is quite large (v3.5): ffffffff81a25000 B __brk_base ffffffff81e43000 B __brk_limit
= ~4MB.
vs earlier kernels (with this back-ported), the space is smaller: ffffffff81a25000 B __brk_base ffffffff81a7b000 B __brk_limit
= 344 kBytes.
With this patch, we would get now a limited amount of pages populated back:
Freeing 9f-100 pfn range: 97 pages freed
Freeing b7ee0-ecd9b pfn range: 216763 pages freed
Released 216860 pages of unused memory
Set 295297 page(s) to 1-1 mapping
Populating 100000-134f1c pfn range: 30720 pages added
[while it was instructed to populate 216860 pages back
on this particular machine]
qla2xxx: Perform ROM mbx cmd access only after ISP soft-reset during f/w recovery.
Initial assumption by driver was that the ROM mbx cmds will be accessible
even when FCoE operational f/w is in reset recovery. However it seems that
in case of "ISP System error" (i.e. 0x8002) there is a period when the ISP
ISP is not operational and firmware waits in tight loop for either the driver
to take a dump or perform soft-reset. During this time none of the ROM mbx
cmds will get serviced by f/w.
Hence the patch makes sure driver sends mbx only after soft reset is complete.
Arun Easi [Tue, 19 Jun 2012 23:56:27 +0000 (16:56 -0700)]
qla2xxx: Fix for continuous rescan attempts in arbitrated loop topology.
Stale information in the temporary fcport created in
qla2x00_configure_local_loop() causes qla2x00_get_port_database() call
to fail. This reschedules scan, which gets stuck continuously in the
rescheduling-of-scan loop due to the failure.
Chad Dupuis [Thu, 19 Jul 2012 09:13:50 +0000 (14:43 +0530)]
qla2xxx: Use bitmap to store loop_id's for fcports.
Store used fcport loop_id's in a bitmap so that as opposed to looping through
all fcports to find the next free loop_id, new loop_id lookup can be just be
done via bitops.
Joe Carnuccio [Wed, 18 Apr 2012 20:19:06 +0000 (13:19 -0700)]
qla2xxx: Add I2C BSG interface.
Add BSG interface to generically access I2C attached devices.
The transferred data limitations:
- the address must be on an even byte boundary,
- the chunk size must be no more than 64 bytes,
(these being limitations of the firmware).
So, to transfer more than 64 bytes, the caller must chunk up
the data into 64 byte chunks and perform multiple accesses.
The caller is responsible for setting the device address,
the data offset, the option bits, and the data length.