Chris Mason [Wed, 1 Feb 2012 01:27:41 +0000 (20:27 -0500)]
Btrfs: don't reserve data with extents locked in btrfs_fallocate
btrfs_fallocate tries to allocate space only if ranges in the file don't
already exist. But the enospc checks it does are not allowed with
extents locked.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
On platforms with no iCRU support, don't print two (possibly conflicting)
"NMI occurred" messages when the firmware is unable to source the NMI.
Please note that one of the enhancements to the v1.3.0 hpwdt driver is to panic and allow
KDUMP to succeed even on NMIs that are unknown to the platform firmware.
Signed-off-by: Naga Chumbalkar <nagananda.chumbalkar@hp.com> Reviewed-by: Thomas Mingarelli <thomas.mingarelli@hp.com> Signed-off-by: Wim Van Sebroeck <wim@iguana.be> Signed-off-by: Maxim Uvarov <maxim.uvarov@oracle.com>
Alexander Duyck [Thu, 2 Jun 2011 04:29:23 +0000 (04:29 +0000)]
ixgbe: Fix FCOE memory leak for DDP packets
This patch is meant to fix a memory leak found via code review for FCOE.
Specifically on DDP flows the SKBs were being dropped without being
recycled, freed, or given to the stack.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit 63d635b21c00069b5ade7640bcbe8ab912dc65d1)
Emil Tantilov [Thu, 28 Jul 2011 06:17:04 +0000 (06:17 +0000)]
ixgbe: fix PHY link setup for 82599
Fix pointer to setup_link for 82599.
This resolves some link issues when advertising modes unsupported
by the link partner.
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit b57e35bd0e545181c94405ce35b89000aed56cc5)
Don Skidmore [Wed, 20 Jul 2011 02:27:05 +0000 (02:27 +0000)]
ixgbe: fix __ixgbe_notify_dca() bail out code
The way __ixgbe_notify_dca() was set up, it was not
possible to add a requester. Both cases of the IXGBE_FLAG_DCA_ENABLED
bit being on and off would lead to the function exiting for a
DCA_PROVIDER_ADD.
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit 2a72c31ee4aa31b6a762390e4811a0edf5eefcef)
Don Skidmore [Thu, 21 Jul 2011 05:55:00 +0000 (05:55 +0000)]
ixgbe: convert to ndo_fix_features
Private rx_csum flags are now duplicate of netdev->features &
NETIF_F_RXCSUM. We remove those duplicates and now use the net_device_ops
ndo_set_features. This was based on the original patch submitted by
Michal Miroslaw <mirq-linux@rere.qmqm.pl>. I also removed the special
case not requiring a reset for X540 hardware. It is needed just as it is
in 82599 hardware.
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com> Cc: Michal Miroslaw <mirq-linux@rere.qmqm.pl> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit 082757afcf7d6e44b24c4927ce5b158196d63e84)
Andy Gospodarek [Sat, 16 Jul 2011 07:31:33 +0000 (07:31 +0000)]
ixgbe: only enable WoL for magic packet by default
Martin Wilck <martin.wilck@ts.fujitsu.com> reported that systems using
the ixgbe-driver that were capable of WoL were rebooting almost as soon
as they were shut down. This is because the default WoL settings
enabled magic packet, broadcast, unicast, and multicast.
Other Intel devices seem to use the stored eeprom value for initial WoL
capabilities. The 82578DM (e1000e) and 82576 (igb) devices I looked
at had only the magic packet enabled in the eeprom, so that seems
appropriate on ixgbe-based devices as well. I set the WoL options on my
82578DM to be the same default as the ixgbe devices (umbg) and saw the
same as Martin -- almost as soon as my box shut down, it booted again.
This patch changes the default to only be the magic packet. This is the
same as the default for most Intel and non-Intel hardware currently
upstream.
Signed-off-by: Andy Gospodarek <andy@greyhouse.net> CC: Martin Wilck <martin.wilck@ts.fujitsu.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit 9417c464ba834ae29982fcd71bc59dc8e734327c)
Emil Tantilov [Tue, 12 Jul 2011 08:13:41 +0000 (08:13 +0000)]
ixgbe: remove ifdef check for non-existent define
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit 34c6ee8181d6ec1ed761ed2fdac4f1b0358e5447)
Alexander Duyck [Sat, 11 Jun 2011 01:45:13 +0000 (01:45 +0000)]
ixgbe: Pass staterr instead of re-reading status and error bits from descriptor
This change is meant to address possible race conditions from the status
and error bits on the RX descriptors being re-read by multiple functions in
the RX cleanup path. To resolve this I have added code that will pass the
staterr value to those functions.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit ff886dfce2bdacbe71583ec973cec117973d8859)
Alexander Duyck [Sat, 11 Jun 2011 01:45:08 +0000 (01:45 +0000)]
ixgbe: Move interrupt related values out of ring and into q_vector
This change moves work_limit, total_packets, and total_bytes into the ring
container struct of the q_vector. The advantage of this is that it should
reduce the size of memory used in the event of multiple rings being
assigned to a single q_vector. In addition it should help to reduce the
total workload for calculating itr since now total_packets and total_bytes
will be the total work done of the interrupt instead of for the ring.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit bd19805803a954415ec36a559fd3b8a0a3647d7c)
Alexander Duyck [Sat, 11 Jun 2011 01:45:03 +0000 (01:45 +0000)]
ixgbe: add structure for containing RX/TX rings to q_vector
This patch adds support for a ring container structure to be used within
the q_vector. The basic idea is to provide a means of separating the RX
and TX rings while maintaining a common structure for their containment.
The advantage to this is that later we should be able to pass this
structure to the update_itr functions without needing to pass individual
rings.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit 08c8833b29cfa4343ff132eebc5648b234eb3f85)
Alexander Duyck [Sat, 11 Jun 2011 01:44:58 +0000 (01:44 +0000)]
ixgbe: inline the ixgbe_maybe_stop_tx function
The ixgbe_maybe_stop_tx function is only a few lines long and is called
multiple times through the xmit hotpath. In order to streamline things it
makes sense to just inline it.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit 82d4e46e2a398154273044dd9813206f0d85bc09)
Alexander Duyck [Sat, 11 Jun 2011 01:44:53 +0000 (01:44 +0000)]
ixgbe: Update ATR to use recorded TX queues instead of CPU for routing
This change is meant to update ATR so that it will use the recorded RX
queue instead of the CPU in the case of routing. This change is meant to
help ixgbe default behavior to more closely match that of the kernel.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit 6440752c872e418452a2cbbf5e73d546affe2b28)
Alexander Duyck [Thu, 2 Jun 2011 04:28:39 +0000 (04:28 +0000)]
ixgbe: Make certain to initialize the fdir_perfect_lock in all cases
This fix makes it so that the fdir_perfect_lock is initialized in all
cases. This is necessary as the fdir_filter_exit routine will always
attempt to take the lock before inspecting the filter table.
Reported-by: Ben Greear <greearb@candelatech.com> Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit 1fc5f0386461dcb3fcbda7a8bcac87f7a822bc56)
Nicolas Schichan [Sat, 9 Jul 2011 00:24:18 +0000 (00:24 +0000)]
e1000: always call e1000_check_for_link() on e1000_ce4100 MACs.
Interrupts about link lost or rx sequence errors are not reported by
the ce4100 hardware, leading to transitions from link UP to link DOWN
never being reported.
Signed-off-by: Nicolas Schichan <nschichan@freebox.fr> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit 6d9e5130b96daa1656966f7271e8a0928b3906c4)
- unify vlan and nonvlan rx path
- kill adapter->vlgrp and e1000_vlan_rx_register
- allow to turn on/off rx/tx vlan accel via ethtool (set_features)
Signed-off-by: Jiri Pirko <jpirko@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 5622e4044a916de1af84bfcc4d437ce0c799d531)
Michał Mirosław [Wed, 8 Jun 2011 08:36:42 +0000 (08:36 +0000)]
e1000: convert to ndo_fix_features
Private rx_csum flags are now duplicate of netdev->features & NETIF_F_RXCSUM.
Removing this needs deeper surgery.
Things noticed:
- RX csum disabled by default
- HW VLAN acceleration probably can be toggled, but it's left as is
- the resets on RX csum offload change can probably be avoided
- there is A LOT of copy-and-pasted code here
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit e97d3207c5e697e3d86cc67483f9cdcce16cc0bf)
Greg Dietsche [Thu, 16 Jun 2011 07:09:30 +0000 (07:09 +0000)]
e1000: remove unnecessary code
Compile tested.
Remove unnecessary code that matches this coccinelle pattern:
    if (...)
        return ret;
    return ret;
Signed-off-by: Greg Dietsche <Gregory.Dietsche@cuw.edu> Signed-off-by: David S. Miller <davem@conan.davemloft.net>
(cherry picked from commit c4dc4d108ace27cc0c594b67bd6bd945deaac8c2)
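Note on "e1000: remove unnecessary code" above: a minimal sketch of the pattern the commit describes, assuming nothing about the actual call sites. The function names and the stub helper are hypothetical and are not taken from the e1000 source. Both variants return the same value for every input, which is why the conditional can be dropped.

```c
#include <stdint.h>

/* Hypothetical stand-in for a hardware helper; not an e1000 function. */
static int32_t sketch_hw_op(int32_t code)
{
	return code;
}

/* Before: the branch and the fall-through both return the same variable. */
static int32_t sketch_setup_before(int32_t code)
{
	int32_t ret = sketch_hw_op(code);

	if (ret)
		return ret;
	return ret;
}

/* After: the redundant test is removed; the result is unchanged. */
static int32_t sketch_setup_after(int32_t code)
{
	return sketch_hw_op(code);
}
```

Calling sketch_setup_before() and sketch_setup_after() with the same argument yields the same result for success (0) and for any error code.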
- unify vlan and nonvlan rx path
- kill adapter->vlgrp and igbvf_vlan_rx_register
Signed-off-by: Jiri Pirko <jpirko@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit a0f1d603ee4186081fda0ad48ccf6163c2009f10)
Lior Levy [Sat, 25 Jun 2011 07:09:08 +0000 (00:09 -0700)]
ixgbe: A fix to VF TX rate limit
There is a need to configure MMW_SIZE in register RTTBCNRM with a correct
value (0x4 for non jumbo frames and 0x14 for jumbo frames support).
For 82599 the value is 0x4 and for X540 the value is 0x14.
Signed-off-by: Lior Levy <lior.levy@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit 7555e83df399ef35e031b137442eac2b1894b993)
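Note on "ixgbe: A fix to VF TX rate limit" above: a minimal sketch of selecting MMW_SIZE per device, assuming only the two values given in the commit message (0x4 for 82599, 0x14 for X540). The enum, macro, and function names are hypothetical; the driver uses its own MAC type enum and register macros, which are not reproduced here.

```c
#include <stdint.h>

/* Hypothetical MAC type tags for this sketch only. */
enum sketch_mac_type {
	SKETCH_MAC_82599,
	SKETCH_MAC_X540
};

/* Values as stated in the commit message. */
#define SKETCH_MMW_SIZE_82599 UINT32_C(0x4)   /* no jumbo frame support assumed */
#define SKETCH_MMW_SIZE_X540  UINT32_C(0x14)  /* jumbo frame support */

/* Pick the MMW_SIZE value to program into RTTBCNRM for a given MAC. */
static uint32_t sketch_mmw_size(enum sketch_mac_type mac)
{
	switch (mac) {
	case SKETCH_MAC_X540:
		return SKETCH_MMW_SIZE_X540;
	case SKETCH_MAC_82599:
	default:
		return SKETCH_MMW_SIZE_82599;
	}
}
```

Keeping the choice in one helper means the rate limiter setup path has a single place to extend if another MAC type needs a different value.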
Alexander Duyck [Fri, 27 May 2011 05:31:52 +0000 (05:31 +0000)]
ixgbe: Update method used for determining descriptor count for an skb
This patch updates the current methods used for determining if we have
enough space to transmit a given skb. The current method is quite wasteful
as it has us go through and determine how each page is going to be broken
up. That only needs to be done if pages are larger than our maximum data
per TXD. As such I have wrapped that in a page size check.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit a535c30e9e98d201089503a0ffa0093cba16e796)
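Note on "ixgbe: Update method used for determining descriptor count for an skb" above: a rough sketch of the idea, not the driver code. It assumes a 16 KiB maximum of data per TX descriptor and a 4 KiB page size; both constants, and all names, are assumptions made for illustration. When a page can never exceed the per-descriptor limit, each fragment needs exactly one descriptor and the per-fragment division can be skipped.

```c
#include <stddef.h>

#define SKETCH_MAX_DATA_PER_TXD (1u << 14) /* assumed limit: 16 KiB */
#define SKETCH_PAGE_SIZE        4096u      /* assumed page size */

/* Descriptors needed for one contiguous buffer of the given size. */
static unsigned int sketch_txd_use_count(unsigned int size)
{
	return (size + SKETCH_MAX_DATA_PER_TXD - 1) / SKETCH_MAX_DATA_PER_TXD;
}

/* Descriptors needed for a packet: linear head plus page fragments. */
static unsigned int sketch_desc_needed(unsigned int head_len,
				       const unsigned int *frag_sizes,
				       size_t nr_frags)
{
	unsigned int count = sketch_txd_use_count(head_len);

#if SKETCH_PAGE_SIZE > SKETCH_MAX_DATA_PER_TXD
	/* Pages may be split across descriptors: inspect each fragment. */
	for (size_t f = 0; f < nr_frags; f++)
		count += sketch_txd_use_count(frag_sizes[f]);
#else
	/* A page always fits in one descriptor: no per-fragment work. */
	(void)frag_sizes;
	count += (unsigned int)nr_frags;
#endif
	return count;
}
```

With the assumed constants the cheap branch is compiled in, so a packet with a 100-byte head and two page fragments needs three descriptors.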
Alexander Duyck [Fri, 27 May 2011 05:31:47 +0000 (05:31 +0000)]
ixgbe: Add one function that handles most of context descriptor setup
There is a significant amount of functionality shared between the checksum
and TSO offload configuration in how they set up the context
descriptors. It therefore makes sense to move the shared functionality
into a single function and just call that function from the two
context-descriptor-specific routines.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit 897ab15606ce896b6a574a263beb51cbfb43f041)
Alexander Duyck [Fri, 27 May 2011 05:31:42 +0000 (05:31 +0000)]
ixgbe: Move all values that deal with count, next_to_use, next_to_clean to u16
This change updates all values dealing with count, next_to_use, and
next_to_clean so that they stay u16 values. The advantage of this is that
there is no re-casting of type during the propagation through the stack.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit 63544e9c0055316d0397cb671f2ff99d85c77293)
Alexander Duyck [Fri, 27 May 2011 05:31:37 +0000 (05:31 +0000)]
ixgbe: Convert IXGBE_DESC_UNUSED from macro to static inline function
This change is a minor cleanup that converts the IXGBE_DESC_UNUSED macro
into a static inline function just for the case of the code being a bit
cleaner.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit 7d4987de752a94772ad1ae85ad5c702bbcf73305)
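Note on the two commits above ("Move all values that deal with count, next_to_use, next_to_clean to u16" and "Convert IXGBE_DESC_UNUSED from macro to static inline function"): a userspace sketch of what such an inline helper can look like. The struct is a stand-in with only the three fields named in the commit titles, and the formula is the common ring computation that keeps one slot reserved; treat both as assumptions rather than the driver's exact code.

```c
#include <stdint.h>

/* Stand-in for the ring structure; only the fields this sketch needs. */
struct sketch_ring {
	uint16_t count;         /* number of descriptors in the ring */
	uint16_t next_to_use;   /* producer index */
	uint16_t next_to_clean; /* consumer index */
};

/*
 * Free descriptors in the ring. One slot is always left unused so that
 * next_to_use == next_to_clean unambiguously means "empty".
 */
static inline uint16_t sketch_desc_unused(const struct sketch_ring *ring)
{
	uint16_t ntc = ring->next_to_clean;
	uint16_t ntu = ring->next_to_use;

	return (uint16_t)(((ntc > ntu) ? 0 : ring->count) + ntc - ntu - 1);
}
```

Compared with a macro, the inline function evaluates its argument once and gives the compiler a typed parameter to check, while the u16 fields avoid conversions at each call site.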
Alexander Duyck [Fri, 27 May 2011 05:31:32 +0000 (05:31 +0000)]
ixgbe: pass adapter struct instead of netdev for interrupt data
This change makes it so that we pass the adapter struct instead of the
netdev for most of the basic interrupts that are not associated with
q_vectors.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit a65151ba201fe56ac146767e018674a84bfef1a6)
Don Skidmore [Fri, 20 May 2011 03:05:14 +0000 (03:05 +0000)]
ixgbe: update driver version string
Update the ixgbe driver version string to better match the Source Driver
with similar device support. Likewise update to the current LAD Linux
versioning scheme.
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com> Tested-by: Evan Swanson <evan.swanson@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit a38a104d7af27b7697bf7c4272f4be5d1ec6ef4c)
Alexander Duyck [Sat, 14 May 2011 01:16:02 +0000 (01:16 +0000)]
ixgbe: fix ring assignment issues for SR-IOV and drop cases
This change fixes the fact that we would trigger a null pointer dereference
or specify the wrong ring if the rings were restored. This change makes
certain that the DROP queue is a static value, and all other rings are
based on the ring offsets for the PF.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit 1f4d51836f5e49f2e5201f1daf90239c04b3faf2)
Emil Tantilov [Fri, 13 May 2011 02:22:45 +0000 (02:22 +0000)]
ixgbe: move reset code into a separate function
Move reset code into a separate function to allow for reuse in other
parts of the code.
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit c988ee829074073d3cd80090ef56a6e370b5c9b4)
Emil Tantilov [Fri, 13 May 2011 02:22:40 +0000 (02:22 +0000)]
ixgbe: move setting RSC into a separate function
Move setting RSC into a separate function to allow for reuse in other
parts of the code.
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit 3a28926451a22a2b699962e738c8540da642c319)
Alexander Duyck [Wed, 11 May 2011 07:18:52 +0000 (07:18 +0000)]
ixgbe: add support for nfc addition and removal of filters
This change is meant to allow for nfc to insert and remove filters in order
to test the ethtool interface, which includes its own rules manager.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit e4911d57a45ca30771c64b56e552891fcd105070)
Alexander Duyck [Wed, 11 May 2011 07:18:47 +0000 (07:18 +0000)]
ixgbe: add support for displaying ntuple filters via the nfc interface
This code adds support for displaying the filters that were added via the
nfc interface. This is primarily to test the interface for now, but I am
also looking into the feasibility of moving all of the ntuple filter code
in ixgbe over to the nfc interface since it seems to be better implemented.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit 3e05334f8be83e8529f1cbf4f4dea06a4d51d676)
Alexander Duyck [Wed, 11 May 2011 07:18:41 +0000 (07:18 +0000)]
ixgbe: add basic support for setting and getting nfc controls
This change adds basic support for the obtaining of RSS ring counts.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit 91cd94bfe4f00fccf692e32dfa86a9fad0d61280)
Alexander Duyck [Wed, 11 May 2011 07:18:36 +0000 (07:18 +0000)]
ixgbe: update perfect filter framework to support retaining filters
This change is meant to update the internal framework of ixgbe so that
perfect filters can be stored and tracked via software.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit c04f6ca84866ef207e009a08e4c34ca241df7aa2)
Alexander Duyck [Fri, 20 May 2011 07:36:17 +0000 (07:36 +0000)]
ixgbe: fix flags relating to perfect filters to support coexistence
I am removing the requirement that Ntuple filters have the same
number of queues and requirements as ATR. As a result this change will
make it so that all the Ntuple flag does is disable ATR for now.
This change fixes an issue in which we were incorrectly re-enabling ATR
when we exited perfect filter mode. This was due to the fact that the
logic assumed RSS and DCB were mutually exclusive which is no longer the
case.
To correct this we just need to add a check to guarantee DCB is disabled
before re-enabling ATR.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit 03ecf91aae757eeb70763a3393227c4597c87b23)
Alexander Duyck [Wed, 11 May 2011 07:18:26 +0000 (07:18 +0000)]
ixgbe: remove ntuple filtering
Due to numerous issues in ntuple filters it has been decided to move the
interface over to the network flow classification interface. As a first
step to achieving this I first need to remove the old ntuple interface.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit b29a21694f7d12e40537e1e587ec47725849769b)
Vasu Dev [Wed, 11 May 2011 05:41:46 +0000 (05:41 +0000)]
ixgbe: setup per CPU PCI pool for FCoE DDP
Currently a single PCI pool is used across all CPUs, and that
does not scale as the number of CPUs increases, so this
patch adds a per-CPU PCI pool to set up the udl. That aligns
well with the FCoE stack, as it already has per-CPU exch locking.
Adds per-CPU PCI pool alloc setup and free in
ixgbe_fcoe_ddp_pools_alloc and ixgbe_fcoe_ddp_pools_free, and
uses the CPU-specific pool during DDP setup.
Re-arranged the ixgbe_fcoe struct to have fewer holes
along with adding the pools ptr, using pahole.
Signed-off-by: Vasu Dev <vasu.dev@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit dadbe85ac47f180fa1e3ef93b276ab7938b1a98b)
Emil Tantilov [Sat, 7 May 2011 07:40:20 +0000 (07:40 +0000)]
ixgbe: add support for Dell CEM
This patch adds support for Dell CEM (Comprehensive Embedded Management).
This consists of informing the management firmware of the driver version
during probe on 82599 and X540 HW.
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Evan Swanson <evan.swanson@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit 9612de92e023bff0d1cd5725ee65293accc70c56)
John Fastabend [Tue, 26 Apr 2011 07:26:30 +0000 (07:26 +0000)]
ixgbe: DCB and perfect filters can coexist
The Flow Director perfect filters feature can now coexist with DCB.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit b1bbdb206a52b7eb13c2e57ee794b90618f61002)
John Fastabend [Tue, 26 Apr 2011 07:26:25 +0000 (07:26 +0000)]
ixgbe: fix bit mask for DCB version
This bit mask is wrong: DCBX_HOST is always set. It was missed up
until now because lldpad reprograms the device on a link
event. However, this is still wrong, and it is best not to be
misconfigured for some time immediately following ixgbe_up().
Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit 6f70f6acc7321f9f6501157b29d2db8c475cf3b2)
John Fastabend [Tue, 26 Apr 2011 07:26:19 +0000 (07:26 +0000)]
ixgbe: setup redirection table for multiple packet buffers
Setup RSS redirection table to be compatible with multiple packet
buffers. Currently, this works on 82599 devices because the RSS
redirection index is masked by the number of queues per packet
buffer.
This sets the cap on the RSS table to maxq.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit 86b4db3bcce714d6bdd8c056158821301624bf00)
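Note on "ixgbe: setup redirection table for multiple packet buffers" above: a simplified sketch of capping redirection table entries at maxq. It assumes a 128-entry table held in a plain byte array and leaves out how entries are packed into registers; the names and the array representation are hypothetical and only illustrate the wrap-around at maxq.

```c
#include <stdint.h>

#define SKETCH_RETA_ENTRIES 128u /* assumed table size */

/*
 * Fill the table so that every entry is a queue index below maxq,
 * cycling 0, 1, ..., maxq - 1, 0, 1, ...
 */
static void sketch_fill_reta(uint8_t reta[SKETCH_RETA_ENTRIES],
			     unsigned int maxq)
{
	unsigned int i, j;

	if (maxq == 0)
		maxq = 1; /* guard: always leave at least queue 0 usable */

	for (i = 0, j = 0; i < SKETCH_RETA_ENTRIES; i++, j++) {
		if (j == maxq)
			j = 0;
		reta[i] = (uint8_t)j;
	}
}
```

Because no entry can reach maxq, the table stays valid for the number of queues available per packet buffer rather than relying on the hardware to mask the index.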
John Fastabend [Tue, 26 Apr 2011 07:26:14 +0000 (07:26 +0000)]
ixgbe: DCB 82598 devices, tx_idx and rx_idx swapped
The tx_idx and rx_idx values are swapped on 82598 devices
with DCB enabled.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit aba70d5e6c1c188fe45c1a782060440b6c2ea759)
John Fastabend [Tue, 26 Apr 2011 07:26:08 +0000 (07:26 +0000)]
ixgbe: DCB use existing TX and RX queues
The number of TX and RX queues allocated depends on the device
type, the current features set, online CPUs, and various
compile flags.
To enable DCB with multiple queues and allow it to coexist with
all the features currently implemented it has to setup a valid
queue count. This is done at init time using the FDIR and RSS
max queue counts and allowing each TC to allocate a queue per
CPU.
DCB will now use available queues up to (8 x TCs). This is a somewhat
arbitrary cap, but it allows DCB to use up to 64 queues. It's easy to
increase this later if that is needed.
This is prep work to enable Flow Director with DCB. After this
DCB can easily coexist with existing features and no longer
needs its own DCB feature ring.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit e901acd6fa5538436e08e8a862dd2c080297f852)
John Fastabend [Tue, 3 May 2011 02:26:48 +0000 (02:26 +0000)]
ixgbe: configure minimal packet buffers to support TC
ixgbe devices support different numbers of packet buffers, either
8 or 4. Here we only allocate the minimal number of packet
buffers required to implement the net_device's number of traffic
classes.
Fewer traffic classes allow for larger packet buffers in
hardware. Also, more Tx/Rx queues can be given to each
traffic class.
This patch is mostly about propagating the number of traffic
classes through the init path. Specifically this adds the 4TC
cases to the MRQC and MTQC setup routines. Also ixgbe_setup_tc()
was sanitized to handle other traffic class values.
Finally changing the number of packet buffers in the hardware
requires the device to reinit. So this moves the reinit work
from DCB into the main ixgbe_setup_tc() routine to consolidate
the reset code. Now dcbnl_xxx ops call ixgbe_setup_tc() to
configure packet buffers if needed.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit 8b1c0b24d9afd4a59a8aa9c778253bcff949395a)
John Fastabend [Tue, 26 Apr 2011 07:25:58 +0000 (07:25 +0000)]
ixgbe: consolidate MRQC and MTQC handling
The MRQC and MTQC registers are configured in the main
setup path but are also reconfigured in the DCB setup
path. The DCB path fixes the DCB configuration by configuring
the SECTXMINIFG gap which is required for DCB pause
to operate correctly.
This patch reduces the duplicate code and does all setup
in ixgbe_setup_mtqc() and ixgbe_setup_mrqc().
Additionally, this removes the IXGBE_QDE write. This write never
set the WRITE bit in the register, so the write was not
actually doing anything. Also, this was meant to clear the register,
but it is never set and defaults to zero. If this is
needed for SRIOV it should be added correctly in a follow
up patch. But it's never been working, so removing it here
should be OK.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit 72a32f1f3f68b7d95e7151b5f88831fb9906416e)
John Fastabend [Mon, 2 May 2011 12:34:10 +0000 (12:34 +0000)]
ixgbe: consolidate packet buffer allocation
Consolidate packet buffer allocation currently being
done in the DCB path and main path. This allows the
feature set and packet buffer requirements to be done
once.
This is prep work to allow DCB to coexist with other
features namely, flow director.
CC: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit 80605c6513207344d00b32e8d1e64bd34fdf1358)
John Fastabend [Tue, 26 Apr 2011 10:05:14 +0000 (10:05 +0000)]
ixgbe: dcbnl reduce duplicated code and indentation
Replace duplicated code in if/else branches with single
check and ixgbe_init_interrupt_scheme().
Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit 1fcd86b51179518f7e69164e37353fb59cd6301e)
- unify vlan and nonvlan rx path
- kill adapter->vlgrp and ixgbevf_vlan_rx_register
Signed-off-by: Jiri Pirko <jpirko@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit dadcd65f52921456112183fde543fc214bb0a227)
Stephen Hemminger [Thu, 9 Jun 2011 02:58:39 +0000 (02:58 +0000)]
ixgbevf: remove unnecessary ampersands
Use standard format for net_device_ops (without &)
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Acked-by: Greg Rose <Gregory.v.rose@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit c12db7695e1f52b27108c3dfde3cf655a0368c58)
When the Manageability Engine (ME) is enabled on 82579, it periodically
accesses some MAC CSR registers. There is an arbiter in hardware which
prevents simultaneous access of these registers by the host software, i.e.
the driver. There is a hardware bug in the arbiter that signals a host
access of the registers later than it actually happens. A write of the
Transmit or Receive Descriptor Tail register could result in an incorrect
value if the driver and ME perform simultaneous accesses which could result
in an access to an invalid memory address. This would return an
Unsupported Request which could hang the hardware. Workaround the issue by
checking the FWSM register bit24 which is set by ME before it accesses the
MAC CSR registers.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit c6e7f51e73c1bc6044bce989ec503ef2e4758d55)
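Note on the 82579 workaround above: a minimal sketch of the bit test the message describes, assuming only that FWSM bit 24 is set while ME is about to access the MAC CSR registers. The macro and function names are hypothetical, and the array of samples stands in for repeated reads of the register; the real driver's polling interval and retry limit are not shown here.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bit 24 of FWSM, per the commit message; the macro name is hypothetical. */
#define SKETCH_FWSM_ME_ACCESS (UINT32_C(1) << 24)

/* True while ME has signalled an access to the MAC CSR registers. */
static bool sketch_me_accessing(uint32_t fwsm)
{
	return (fwsm & SKETCH_FWSM_ME_ACCESS) != 0;
}

/*
 * Walk a series of FWSM samples and return the index of the first one
 * with bit 24 clear, or -1 if the bit never clears within n samples.
 */
static int sketch_first_idle_sample(const uint32_t *samples, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (!sketch_me_accessing(samples[i]))
			return (int)i;
	return -1;
}
```

A caller would hold off the tail register write until an idle sample is seen, and treat a result of -1 as a timeout to be handled separately.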
Bruce Allan [Fri, 22 Jul 2011 06:21:56 +0000 (06:21 +0000)]
e1000e: Spurious interrupts & dropped packets with 82577/8/9 in half-duplex
On 82577/8/9 in half-duplex, when a received packet is passed from the PHY
to the MAC, if too many preamble octets are stripped from the packet
before arriving at the MAC, it can be misinterpreted as an in-band message
rather than an actual frame. For example, if the frame contents resembled
an interrupt request in-band message, it would trigger a false interrupt.
In most cases, the packet is just dropped.
By reducing the number of preamble octets stripped from the beginning of
the frame when passing it from the PHY to the MAC, the MAC will interpret
the frame properly.
Additional uses of the magic PHY_REG(770, 16) have been updated with a
define introduced with this patch.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit 1d2101a712b3b7281a19ff6d7bfc16c2ce9d3998)
Bruce Allan [Fri, 22 Jul 2011 06:22:02 +0000 (06:22 +0000)]
e1000e: increase driver version number
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit 12440928dca77eccc8a793cf3cd83d017abbd7d6)
Bruce Allan [Fri, 29 Jul 2011 05:53:07 +0000 (05:53 +0000)]
e1000e: alternate MAC address update
If word 0x37 in the EEPROM is 0xFFFF _or_ 0x0000, then there is no
alternate MAC address in the EEPROM.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit 244735f6ebccbf72a283db89472309f770e14c80)
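Note on "e1000e: alternate MAC address update" above: a minimal sketch of the check stated in the message, where EEPROM word 0x37 equal to 0xFFFF or 0x0000 means there is no alternate MAC address. The function name is hypothetical and the caller is assumed to have already read the word from the EEPROM.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Decide whether EEPROM word 0x37 indicates an alternate MAC address.
 * Both 0xFFFF and 0x0000 are treated as "not present".
 */
static bool sketch_has_alt_mac_addr(uint16_t word_0x37)
{
	return word_0x37 != 0xFFFF && word_0x37 != 0x0000;
}
```

Only when this returns true would the driver go on to read an alternate address.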
Bruce Allan [Fri, 22 Jul 2011 06:21:35 +0000 (06:21 +0000)]
e1000e: do not disable receiver on 82574/82583
Due to a hardware erratum, the receiver on 82574 and 82583 should not be
stopped once it has been started.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit 7f99ae633884043c70f4cc4a03f43dad0f0ecba2)
Bruce Allan [Fri, 29 Jul 2011 05:52:51 +0000 (05:52 +0000)]
e1000e: minor re-order of #include files
The recent commit a6b7a407 when back-ported to the out-of-tree e1000e
driver caused a compilation error on older kernels which required a
re-ordering of the #include files. This cosmetic patch syncs the two
drivers for easier maintainability.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit 9fb7a5f77b26dedfcfa4e3a36fe207f818662bee)
Bruce Allan [Fri, 22 Jul 2011 06:21:41 +0000 (06:21 +0000)]
e1000e: remove unnecessary check for NULL pointer
The array shadow_ram is never NULL.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit b9e06f70dc186f8353cc593f2b4609383b3be7a9)
After review of all Intel drivers, several instances were found where
drivers had the incorrect pattern of:
memory mapped write();
delay();
which should always be:
memory mapped write();
write flush(); /* aka memory mapped read */
delay();
explanation:
The reason for including the flush is that writes can be held
(posted) in PCI/PCIe bridges, but the read always has to complete
synchronously and therefore has to flush all pending writes to a
device. If a write is held and followed by a delay, the delay
means nothing because the write may not have reached hardware
(possibly not even until the next read).
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit 945a51517cc0bd9e461f2018624dfc1faef9ddee)
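The write/flush/delay pattern from the entry above can be illustrated with a toy user-space model. Everything here is an assumption made for the example: the toy_bridge structure and the toy_* functions are hypothetical, and the model always holds a write until the next read, whereas a real bridge only may do so.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model: one device register behind a bridge that posts (buffers) writes. */
struct toy_bridge {
	uint32_t device_reg;  /* value the hardware has actually seen */
	uint32_t posted_val;  /* write still held in the bridge */
	bool has_posted;      /* true while a write is pending */
};

/* A memory-mapped write is held in the bridge instead of reaching the device. */
void toy_write(struct toy_bridge *b, uint32_t val)
{
	b->posted_val = val;
	b->has_posted = true;
}

/* A read completes synchronously, so it pushes any pending write out first. */
uint32_t toy_read(struct toy_bridge *b)
{
	if (b->has_posted) {
		b->device_reg = b->posted_val;
		b->has_posted = false;
	}
	return b->device_reg;
}

/* Correct pattern: write, flush by reading back, and only then delay. */
void toy_write_flush(struct toy_bridge *b, uint32_t val)
{
	toy_write(b, val);
	(void)toy_read(b);
	/* A delay placed here starts after the device has seen the write. */
}
```

In this model, a delay that follows toy_write() alone would run while device_reg still holds the old value, which is the problem the commit describes.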
Jeff Kirsher [Tue, 12 Jul 2011 16:10:12 +0000 (16:10 +0000)]
e1000e: use GFP_KERNEL allocations at init time
In process and sleep allowed context, favor GFP_KERNEL allocations over
GFP_ATOMIC ones.
-v2: fixed checkpatch.pl warnings
CC: Eric Dumazet <eric.dumazet@gmail.com> CC: Ben Greear <greearb@candelatech.com> CC: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit c2fed9965c60e1f989f57889357c557f7b907ab7)
This patch adds support for the Jumbo Frames feature on 82583 devices.
Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit a3d72d5d01b82a86f3b16ca1918d2040b1acba8c)
Eric Dumazet [Mon, 11 Jul 2011 10:00:40 +0000 (10:00 +0000)]
e1000e: remove e1000_queue_stats
struct e1000_queue_stats is not used, so let's remove it.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 995181b9529adbfecd6882c734ee702b5ed9226c)
Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 3e714ad3c2a07ee120044b72222cc20c14959efb)
Jon Mason [Mon, 27 Jun 2011 07:43:47 +0000 (07:43 +0000)]
e1000e: remove unnecessary reads of PCI_CAP_ID_EXP
The PCIE capability offset is saved during PCI bus walking. It will
remove an unnecessary search in the PCI configuration space if this
value is referenced instead of reacquiring it.
Signed-off-by: Jon Mason <jdmason@kudzu.us> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 353064de8af3bf46757376db66c29fa87a9fda3a)
Bruce Allan [Thu, 19 May 2011 01:53:41 +0000 (01:53 +0000)]
e1000e: update driver version
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit b3ccf26704f80cd9959c4d2a83c2278bcc93425c)
Bruce Allan [Fri, 13 May 2011 07:20:14 +0000 (07:20 +0000)]
e1000e: Clear host wakeup bit on 82577/8 without touching PHY page 800
The Host Wakeup Active bit in the PHY Port General Configuration register
(page 769 register 17) must be cleared after every PHY reset to prevent an
unexpected wake signal from the PHY. Originally, this was accomplished by
simply reading the PHY Wakeup Control register on page 800 which clears the
Host Wakeup Active bit as a side-effect. Unfortunately, a hardware bug on
the 82577 and 82578 PHY can cause unexpected behavior when registers on
page 800 are accessed while in gigabit mode.
This patch changes the remaining instances when the Host Wakeup Active bit
needs to be cleared while possibly in gigabit mode by accessing the Port
General Configuration register directly instead of accessing any register
on page 800.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit 3ebfc7c9a6177794e0a1635483bd64268bed5d3c)
Bruce Allan [Fri, 13 May 2011 07:20:09 +0000 (07:20 +0000)]
e1000e: access multiple PHY registers on same page at the same time
Doing a PHY page select can take a long time, relatively speaking. This
can cause a significant delay when updating a number of PHY registers on
the same page by unnecessarily setting the page for each PHY access. For
example when going to Sx, all the PHY wakeup registers (WUC, RAR[], MTA[],
SHRAR[], IP4AT[], IP6AT[], etc.) on 82577/8/9 need to be updated which
takes a long time which can cause issues when suspending.
This patch introduces new PHY ops function pointers to allow callers to
set the page directly and do any number of PHY accesses on that page.
This feature is currently only implemented for 82577, 82578 and 82579
PHYs for both the normally addressed registers as well as the special-
case addressing of the PHY wakeup registers on page 800. For the latter
registers, the existing function for accessing the wakeup registers has
been divided into three: 1) enable access to the wakeup register page,
2) perform the register access and 3) disable access to the wakeup register
page. The two functions that enable/disable access to the wakeup register
page are necessarily available to the caller so that the caller can restore
the value of the Port Control (a.k.a. Wakeup Enable) register after the
wakeup register accesses are done.
All instances of writing to multiple PHY registers on the same page are
updated to use this new method and to acquire any PHY locking mechanism
before setting the page and performing the register accesses, and release
the locking mechanism afterward.
Some affiliated magic number cleanup is done as well.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit 2b6b168d52aa044363647cfff8bda5cef8068ca3)
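The benefit of setting the page once, as described in the entry above, can be shown with a small model that counts page selects. This is only a sketch: struct toy_phy and the toy_* functions are hypothetical names, the register file is a plain array, and the PHY locking that the commit requires around the whole sequence is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy PHY: two pages of 32 registers. Callers must keep page < 2 and reg < 32. */
struct toy_phy {
	uint16_t cur_page;
	unsigned int page_selects;  /* number of (slow) page selects issued */
	uint16_t regs[2][32];
};

/* Select a page; this stands in for the expensive operation. */
void toy_set_page(struct toy_phy *phy, uint16_t page)
{
	phy->cur_page = page;
	phy->page_selects++;
}

/* Write a register on the currently selected page without a page select. */
void toy_write_reg_page(struct toy_phy *phy, uint16_t reg, uint16_t val)
{
	phy->regs[phy->cur_page][reg] = val;
}

/* Old style: every access selects the page again. */
void toy_write_reg(struct toy_phy *phy, uint16_t page, uint16_t reg, uint16_t val)
{
	toy_set_page(phy, page);
	toy_write_reg_page(phy, reg, val);
}

/* New style: select the page once, then write n registers on it. */
void toy_write_many(struct toy_phy *phy, uint16_t page,
		    const uint16_t *regs, const uint16_t *vals, size_t n)
{
	toy_set_page(phy, page);
	for (size_t i = 0; i < n; i++)
		toy_write_reg_page(phy, regs[i], vals[i]);
}
```

Writing three registers through toy_write_reg() issues three page selects, while toy_write_many() issues one for the same result.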
Bruce Allan [Fri, 13 May 2011 07:20:03 +0000 (07:20 +0000)]
e1000e: do not schedule the Tx queue until ready
Start the Tx queue when the interface is brought up in e1000e_up() but do
not schedule the queue until link is up as detected in the watchdog task
which sets netif_carrier_on.
Also flush the descriptors and clean the Tx and Rx rings before resetting
the hardware when bringing the interface down otherwise there is a small
window where the watchdog task can be triggered with netif_carrier_off
and the Tx ring not yet empty which causes an additional and unnecessary
reset.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit 400484fa65ead1bbc3e86ea79e7505182a31bce1)
Bruce Allan [Fri, 13 May 2011 07:19:53 +0000 (07:19 +0000)]
e1000e: log when swflag is cleared unexpectedly on ICH/PCH devices
Since EXTCNF_CTRL.SWFLAG (used in the ownership arbitration of shared
resources, e.g. the PHY shared between the s/w, f/w, and h/w clients)
can be cleared by any of those clients, log a debug message when
software attempts to clear it and it is already cleared unexpectedly.
And since the swflag is cleared by a hardware reset, the driver does
not need to do that, but the mutex acquired when the bit is set must
still be cleared.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit c5caf4825b22957e4ad70fd94316e91ce8cfb51c)
Bruce Allan [Fri, 13 May 2011 07:19:48 +0000 (07:19 +0000)]
e1000e: 82579 intermittently disabled during S0->Sx
When repeatedly cycling Sx->S0 states with the network cable unplugged,
the 82579 PHY may not initialize as expected and may require a full power
cycle to recover functionality to the device. Work around this by testing
access of the PHY registers after resuming; if that returns unexpected
results toggle the LANPHYPC signal to power cycle the PHY.
This is implemented in the new function e1000_resume_workarounds_pchlan()
which calls another new function, e1000_toggle_lanphypc_value_ich8lan(),
which has been created to reduce code duplication (same functionality
required by a previous workaround). Also, e1000e_disable_gig_wol_ich8lan
is now e1000_suspend_workarounds_ich8lan to better reflect what it does.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit 99730e4c13c8344b02dd96108945b48d28c14c25)
Bruce Allan [Fri, 13 May 2011 07:19:42 +0000 (07:19 +0000)]
e1000e: disable far-end loopback mode on ESB2
The ESB2 LAN includes a debug feature that enables far-end loopback (FELB)
of the SerDes/Kumeran interface. This feature is activated when receiving
a sequence of symbols that includes a reserved codeword. On a perfect
link, FELB would never be activated. In the presence of bit errors, there
is a very small, but non-zero, probability of FELB being activated.
If the FELB is activated, the SerDes link becomes non-functional and must
be reset. It could also corrupt the switching tables in the switch since
the ESB2 is transmitting packets with a different source MAC address.
This patch disables the FELB feature.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit d9b24135b972ccdd5f5174fba06c730e895e6daf)
Eric Dumazet [Tue, 12 Jul 2011 03:08:34 +0000 (20:08 -0700)]
net: introduce __netdev_alloc_skb_ip_align
RX rings should use GFP_KERNEL allocations if possible, add
__netdev_alloc_skb_ip_align() helper to ease this.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 4915a0de43c3e9aef92005c1f94a8ff3a6cfced5)
Dan Carpenter [Thu, 26 Jan 2012 13:55:16 +0000 (16:55 +0300)]
xfs: fix acl count validation in xfs_acl_from_disk()
We applied a fix for CVE-2012-0038 fa8b18edd7 "xfs: validate acl count",
but there was a follow-on patch which is not in our kernel. If count
was negative, it could get past the new check.
From 093019cf1b18dd31b2c3b77acce4e000e2cbc9ce Mon Sep 17 00:00:00 2001
From: Xi Wang <xi.wang@gmail.com>
Date: Mon, 12 Dec 2011 21:55:52 +0000
Subject: [PATCH] xfs: fix acl count validation in xfs_acl_from_disk()
Commit fa8b18ed didn't prevent the integer overflow and possible
memory corruption. "count" can go negative and bypass the check.
Signed-off-by: Xi Wang <xi.wang@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Ben Myers <bpm@sgi.com> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
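The class of bug described in the entry above, a negative count slipping past an upper-bound check, can be sketched as follows. This is not the XFS code: the function names and the EXAMPLE_MAX_ENTRIES limit are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical limit, used only in this example. */
#define EXAMPLE_MAX_ENTRIES 25

/* Flawed: with a signed count, a negative value is never "too large". */
bool count_ok_signed(int32_t count)
{
	return !(count > EXAMPLE_MAX_ENTRIES);
}

/*
 * Safer: with an unsigned count, the bit pattern of a negative number
 * becomes a very large value and fails the same check.
 */
bool count_ok_unsigned(uint32_t count)
{
	return !(count > EXAMPLE_MAX_ENTRIES);
}
```

Here count_ok_signed(-1) accepts the value, while count_ok_unsigned((uint32_t)-1) rejects it. A count that passes the signed check could then feed a size calculation and overflow it.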
Srinivas Eeda [Tue, 31 Jan 2012 22:37:19 +0000 (14:37 -0800)]
ocfs2: use spinlock irqsave for downconvert lock.patch
When the ocfs2dc thread holds the dc_task_lock spinlock and receives a soft
IRQ, it deadlocks itself trying to take the same spinlock in
ocfs2_wake_downconvert_thread.
Below is the stack snippet.
The patch disables interrupts when acquiring dc_task_lock spinlock.
Chris Mason [Wed, 25 Jan 2012 18:47:40 +0000 (13:47 -0500)]
Btrfs: fix reservations in btrfs_page_mkwrite
Josef fixed btrfs_page_mkwrite to properly release reserved
extents if there was an error. But if we fail to get a reservation
and we fail to dirty the inode (for ENOSPC reasons), we'll end up
trying to release a reservation we never had.
This makes sure we only release if we were able to reserve.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
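The rule "only release if we were able to reserve" from the entry above can be sketched with a toy accounting model. This is an illustration under stated assumptions, not the Btrfs code: struct toy_space, the toy_* functions, and the dirty_fails parameter (which simulates a failure to dirty the inode) are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy space accounting: bytes currently reserved. */
struct toy_space {
	long reserved;
};

/* Reserve bytes; returns -1 (standing in for ENOSPC) if the limit is exceeded. */
int toy_reserve(struct toy_space *s, long bytes, long limit)
{
	if (s->reserved + bytes > limit)
		return -1;
	s->reserved += bytes;
	return 0;
}

void toy_release(struct toy_space *s, long bytes)
{
	s->reserved -= bytes;
}

/* Returns 0 on success; on failure, releases only what this call reserved. */
int toy_page_mkwrite(struct toy_space *s, long bytes, long limit, bool dirty_fails)
{
	bool reserved = false;
	int ret = toy_reserve(s, bytes, limit);

	if (ret == 0) {
		reserved = true;
		if (dirty_fails)
			ret = -1;  /* simulated failure after the reservation */
	}
	if (ret != 0 && reserved)
		toy_release(s, bytes);
	return ret;
}
```

Without the reserved flag, a failed toy_reserve() would be followed by a release of bytes that were never reserved, and the accounting would go negative.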
Chris Mason [Mon, 16 Jan 2012 13:13:11 +0000 (08:13 -0500)]
Btrfs: use larger system chunks
System chunks by default are very small. This makes them slightly
larger and also fixes the conditional checks to make sure we don't
allocate a billion of them at once.
Josef Bacik [Fri, 13 Jan 2012 17:09:22 +0000 (12:09 -0500)]
Btrfs: add a delalloc mutex to inodes for delalloc reservations
I was using i_mutex for this, but we're getting bogus lockdep warnings by doing
that and there's no real way to get rid of those, so just stop using i_mutex to
protect delalloc metadata reservations and use a delalloc mutex instead. This
shouldn't be contended often at all, only if you are writing and mmap writing to
the file at the same time. Thanks,
Josef Bacik [Fri, 2 Dec 2011 20:44:12 +0000 (15:44 -0500)]
Btrfs: protect orphan block rsv with spin_lock
We've been seeing warnings coming out of the orphan commit stuff forever from
ceph. Turns out it's because we're racing with checking if the orphan block
reserve is set, because we clear it outside of the spin_lock. So leave the
normal fastpath checks where they are, but take the spin_lock and _recheck_ to
make sure we haven't had an orphan block rsv added in the meantime. Then clear
the root's orphan block rsv and release the lock. With this patch a user said
the warnings went away and they usually showed up pretty soon after he started
ceph. Thanks,
Josef Bacik [Fri, 13 Jan 2012 00:10:12 +0000 (19:10 -0500)]
Btrfs: don't call btrfs_throttle in file write
btrfs_throttle will make us wait if there is a currently committing transaction
until we can open new transactions, which is ridiculous since we don't actually
start any transactions within the file write path anyway, so all this does is
introduce big latencies if we have a sync/fsync heavy workload going on while
somebody else is trying to do work. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
(cherry picked from commit 45a8090e626ab470c91142954431a93846030b0d)
Josef Bacik [Fri, 13 Jan 2012 00:10:12 +0000 (19:10 -0500)]
Btrfs: release space on error in page_mkwrite
If updating the inode gave us an ENOSPC we were just returning in page_mkwrite,
which is a problem since we make our reservation right before trying to update
the inode, so fix the out label so that we actually free our reservation.
Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
(cherry picked from commit ec39e180fd3188c983c94603634bfcd019f42ae7)
This is because of the wrong if condition, which is used to check if we should
subtract the bytes of the dropped range from i_blocks/i_bytes of i-node or not.
When we truncate a compressed extent, btrfs subtracts the bytes of the whole
extent, which is wrong. We should subtract the real size that we truncate,
whether or not it is a compressed extent. Fix it.
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
(cherry picked from commit f70a9a6b94af86fca069a7552ab672c31b457786)
Josef Bacik [Fri, 13 Jan 2012 00:10:12 +0000 (19:10 -0500)]
Btrfs: do not use btrfs_end_transaction_throttle everywhere
A user reported a problem where things like open with O_CREAT would take up to
30 seconds when he had nfs activity on the same mount. This is because all of
our quick metadata operations, like create, symlink etc all do
btrfs_end_transaction_throttle, which if the transaction is blocked will wait
for the commit to complete before it returns. This adds a ridiculous amount of
latency and isn't really needed. The normal btrfs_end_transaction will mark the
transaction as blocked and wake the transaction kthread up if it thinks the
transaction needs to end (this being in the running out of global reserve space
scenario), and this is all that is really needed since we've already done
everything we're going to do, we just need to return. This should help people
with the latency they were seeing when using synchronous heavy workloads.
Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
(cherry picked from commit 7ad85bb76a61801362701b77c5cee5aa09f35369)
Li Zefan [Wed, 7 Dec 2011 03:38:24 +0000 (11:38 +0800)]
Btrfs: fix possible deadlock when opening a seed device
The correct lock order is uuid_mutex -> volume_mutex -> chunk_mutex,
but when we mount a filesystem which has backing seed devices, we have
this lock chain:
Since a seed device is read-only, there's no usable space in the filesystem.
Afterwards we add a sprout device to it, and the kernel creates a METADATA
block group and a SYSTEM block group, which provide free space we can reserve,
but we still get a reservation failure because the global block_rsv hasn't
been updated accordingly.