www.infradead.org Git - users/jedix/linux-maple.git/log
9 years ago ixgbe: Fix ATR so that it correctly handles IPv6 extension headers
Alexander Duyck [Tue, 26 Jan 2016 03:39:40 +0000 (19:39 -0800)]
ixgbe: Fix ATR so that it correctly handles IPv6 extension headers

Orabug: 23177316

The ATR code was assuming that it would be able to use tcp_hdr for
every TCP frame that came through.  However this isn't the case as it
is possible for a frame to arrive that is TCP but sent through something
like a raw socket.  As a result the driver was setting up bad filters in
which tcp_hdr was really pointing to the network header so the data was
all invalid.

In order to correct this I have added a bit of parsing logic that will
determine the TCP header location based off of the network header and
either the offset in the case of the IPv4 header, or a walk through the
IPv6 extension headers until it encounters the header that indicates
IPPROTO_TCP.  In addition I have added checks to verify that the lowest
protocol provided is recognized as IPv4 or IPv6 to help mitigate raw
sockets using ETH_P_ALL from having ATR applied to them.
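
The parsing described above can be sketched in userspace C roughly as follows. This is not the driver's code: find_tcp_offset() and the PROTO_* names are made up for illustration, the buffer is assumed to start at the network header, and only the hop-by-hop, routing and destination-options extension headers are stepped over.

```c
#include <stddef.h>
#include <stdint.h>

/* IANA protocol numbers used below. */
enum {
    PROTO_HOPOPTS = 0,
    PROTO_TCP     = 6,
    PROTO_ROUTING = 43,
    PROTO_DSTOPTS = 60,
};

/*
 * Return the offset of the TCP header from the start of the network
 * header, or -1 if the packet is not IPv4/IPv6 TCP or is too short.
 */
static long find_tcp_offset(const uint8_t *pkt, size_t len)
{
    size_t off;
    uint8_t next;

    if (len < 1)
        return -1;

    switch (pkt[0] >> 4) {
    case 4:
        if (len < 20)
            return -1;
        /* IPv4: the IHL field gives the header length in 32-bit words. */
        off = (size_t)(pkt[0] & 0x0f) * 4;
        if (off < 20)
            return -1;
        next = pkt[9];
        break;
    case 6:
        if (len < 40)
            return -1;
        off = 40;
        next = pkt[6];
        /* IPv6: step over extension headers until something else shows up. */
        while (next == PROTO_HOPOPTS || next == PROTO_ROUTING ||
               next == PROTO_DSTOPTS) {
            if (len < off + 8)
                return -1;
            next = pkt[off];
            /* Length field counts 8-byte units beyond the first 8 bytes. */
            off += ((size_t)pkt[off + 1] + 1) * 8;
        }
        break;
    default:
        /* Neither IPv4 nor IPv6: leave the frame alone. */
        return -1;
    }

    /* Require TCP and room for a minimal 20-byte TCP header. */
    if (next != PROTO_TCP || len < off + 20)
        return -1;
    return (long)off;
}
```

A frame whose first nibble is neither 4 nor 6 is rejected outright, which mirrors the intent of the protocol check for raw sockets.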

Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit e2873d43f9c607e9d855b8ae120d5990ba1722df)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
9 years ago ixgbe: Store VXLAN port number in network order
Alexander Duyck [Tue, 26 Jan 2016 03:36:29 +0000 (19:36 -0800)]
ixgbe: Store VXLAN port number in network order

Orabug: 23177316

The VXLAN port number should be stored in network order instead of in host
order as it is accessed from the hot-path in ATR.  This way we can avoid
having to do any byte swaps in order to validate the port number.
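
A minimal userspace sketch of the idea, assuming a made-up struct adapter_sketch and helper names (none of these are the driver's identifiers): the swap happens once when the port is configured, and the hot path compares the raw header bytes directly.

```c
#include <arpa/inet.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

struct adapter_sketch {
    uint16_t vxlan_port;    /* kept in network byte order */
};

/* Slow path: convert once when the port is configured. */
static void set_vxlan_port(struct adapter_sketch *a, uint16_t host_port)
{
    a->vxlan_port = htons(host_port);
}

/* Hot path: compare the UDP destination port exactly as it sits on the wire. */
static bool is_vxlan_dest(const struct adapter_sketch *a,
                          const uint8_t *udp_dest)
{
    uint16_t dest;

    memcpy(&dest, udp_dest, sizeof(dest));
    /* A zero port means no VXLAN port has been configured. */
    return a->vxlan_port && dest == a->vxlan_port;
}
```

With the port stored this way, no ntohs() call is needed per packet.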

I moved the vxlan_port value into a hole in the read-mostly region of the
adapter struct.  This way it should be in a warm cache-line instead of in
some isolated region in memory when it needs to be accessed.

In addition I went through and stripped a bunch of unneeded ifdef flags
since having an extra variable present doesn't really hurt anything and
makes the code easier to read.  I also went through and dropped the
NETIF_F_RXCSUM flag which was being set in hw_encap_features but provides
no value as the flag is not evaluated in the Rx path.

Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit 9f12df906cd807a05d71aa53a951532d1dd3b888)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
9 years ago ixgbe: Fix for RAR0 not being set to default MAC addr
Tushar Dave [Thu, 7 Jan 2016 22:17:03 +0000 (14:17 -0800)]
ixgbe: Fix for RAR0 not being set to default MAC addr

Orabug: 23177316

commit c9f53e63c208 ("ixgbe: Refactor MAC address configuration code")
introduced code that sets HW register RAR0 to FF:FF:FF:FF:FF:FF instead of
the default MAC address. As a result, ixgbe HW discards all incoming packets
whose destination MAC address is not FF:FF:FF:FF:FF:FF.

This commit sets RAR0 correctly to the default HW MAC address.

Signed-off-by: Tushar Dave <tushar.n.dave@oracle.com>
Tested-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit 56768045186c183f1d6e5cd916dd07751a777a8d)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
9 years ago ixgbe: fix dates on header of ixgbe_model.h
John Fastabend [Wed, 17 Feb 2016 22:35:23 +0000 (14:35 -0800)]
ixgbe: fix dates on header of ixgbe_model.h

Orabug: 23177316

Fixes: 9d35cf062e05 ("net: ixgbe: add minimal parser details for ixgbe")
Reported-by: Mark Rustad <mark.d.rustad@intel.com>
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit a92265ce1cea3832a47103ae16afa328a396e9af)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
9 years ago ixgbe: use u32 instead of __u32 in model header
John Fastabend [Wed, 17 Feb 2016 22:34:53 +0000 (14:34 -0800)]
ixgbe: use u32 instead of __u32 in model header

Orabug: 23177316

I incorrectly used __u32 types where we should be using u32 types when
I added the ixgbe_model.h file.

Fixes: 9d35cf062e05 ("net: ixgbe: add minimal parser details for ixgbe")
Suggested-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit fa477f4cb3de7bdd3899029803ebfcf269ba8c85)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
9 years ago net: ixgbe: add minimal parser details for ixgbe
John Fastabend [Wed, 17 Feb 2016 05:18:28 +0000 (21:18 -0800)]
net: ixgbe: add minimal parser details for ixgbe

Orabug: 23177316

This adds an ixgbe data structure that is used to determine what
headers:fields can be matched and in what order they are supported.

For hardware devices this can be a bit tricky because typically
only pre-programmed (firmware, ucode, rtl) parse graphs will be
supported and we don't yet have an interface to change these from
the OS. So it's sort of a "you get whatever your friendly vendor
provides" affair at the moment.

In the future we can add the get routines and set routines to
update this data structure. One interesting thing to note here
is that the data structure identifies ethernet, ip, and tcp
fields without having to hardcode them as enumerations or use
other identifiers.

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 9d35cf062e05be8b8b2b7dbc943cd95352cd90cb)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
9 years ago ixgbe: Make ATR recognize IPv6 extended headers
Mark Rustad [Wed, 9 Dec 2015 22:55:37 +0000 (14:55 -0800)]
ixgbe: Make ATR recognize IPv6 extended headers

Orabug: 23177316

Right now ATR is not handling IPv6 extended headers, so ATR is not
being performed on such packets. Fix that by skipping extended
headers when they are present. This also fixes a problem where
the ATR code was not checking that the inner protocol was actually
TCP before setting up the signature rules. Since the protocol check
is intimately involved with the extended header processing as well,
this all gets fixed together.

Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit e19dcdeb3527e996a96ea49d86cccce768b1079a)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
9 years ago ixgbe: Fix MDD events generated when FCoE+SRIOV are enabled
Neerav Parikh [Wed, 9 Dec 2015 06:13:58 +0000 (22:13 -0800)]
ixgbe: Fix MDD events generated when FCoE+SRIOV are enabled

Orabug: 23177316

When FCoE is enabled with SR-IOV on the X550 NIC the hardware
generates MDD events.

This patch fixes these by setting the expected values in the
Tx context descriptors for FCoE/FIP frames and adding a flush
after writing the RDLEN register.

Signed-off-by: Neerav Parikh <neerav.parikh@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit 8b75451be1fc05b6ee3f9d0eaea0006d60caff89)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
9 years ago ixgbe: Fix to get FDMI HBA attributes information with X550
Usha Ketineni [Tue, 8 Dec 2015 12:01:18 +0000 (04:01 -0800)]
ixgbe: Fix to get FDMI HBA attributes information with X550

Orabug: 23177316

Check whether FCoE support is enabled for the device to get the
FDMI HBA attributes information instead of checking each device ID.
Also, add Model string information for X550.

Signed-off-by: Usha Ketineni <usha.k.ketineni@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit b262a9a772eae649159fd2480992713a2dd2b3d3)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
9 years ago ixgbe: Correct handling of any outer UDP checksum setting
Mark Rustad [Fri, 4 Dec 2015 19:26:43 +0000 (11:26 -0800)]
ixgbe: Correct handling of any outer UDP checksum setting

Orabug: 23177316

If an outer UDP checksum is set, pass the skb up with CHECKSUM_NONE
so that the stack will check the checksum. Do not increment an
error counter, because we don't know that there is an actual error.

Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit d469251bfd06d15289c9dd5dd60b8ebf65785b03)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
9 years ago ixgbe: do not call check_link for ethtool in ixgbe_get_settings()
Emil Tantilov [Thu, 3 Dec 2015 23:20:06 +0000 (15:20 -0800)]
ixgbe: do not call check_link for ethtool in ixgbe_get_settings()

Orabug: 23177316

In ixgbe_get_settings() the link status and speed of the interface
are determined based on a read from the LINKS register via the call
to mac.ops.check_link(). This can cause issues where external drivers
may end up with unknown speed when calling ethtool_get_settings().

Instead of calling the mac.ops.check_link() we can report the speed
from the adapter structure which is populated by the driver.

Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit 0e4d422f5f7249324ac8d1b8e12772e530787a66)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
9 years ago ixgbe: fix broken PFC with X550
Vasu Dev [Mon, 23 Nov 2015 18:31:25 +0000 (10:31 -0800)]
ixgbe: fix broken PFC with X550

Orabug: 23177316

PFC configuration is skipped for X550 devices due to an incorrect
device ID check. Fix the check so that it includes X550 PFC configuration.

Signed-off-by: Vasu Dev <vasu.dev@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit cb78cf12d6e90f57f6e7d090867ef19b6a189dde)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
9 years ago ixgbe: use correct FCoE DDP max check
Vasu Dev [Mon, 23 Nov 2015 18:31:01 +0000 (10:31 -0800)]
ixgbe: use correct FCoE DDP max check

Orabug: 23177316

Use fcoe_ddp_xid from netdev as this is correctly set for different
device IDs to avoid DDP skip error on X550 as "xid=0x20b out-of-range"

Signed-off-by: Vasu Dev <vasu.dev@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit f10166aba2def9bc6443290231c60f7e2f70129b)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
9 years ago ixgbe: Fill at least min credits to a TC credit refills
Vasu Dev [Mon, 23 Nov 2015 18:30:51 +0000 (10:30 -0800)]
ixgbe: Fill at least min credits to a TC credit refills

Orabug: 23177316

Currently credit_refill and credit_max could be zero for a TC, which
causes a Tx hang in CEE mode configuration. To fix that, assign at least
the minimum credit to each TC, as IEEE mode already does.
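
The clamp can be sketched as below. The function name, its parameters and the percentage-based calculation are assumptions made for illustration, not the driver's credit formula.

```c
#include <stdint.h>

/*
 * Compute the refill credits for one traffic class from its bandwidth
 * share, but never hand back less than min_credit.
 */
static uint32_t tc_credit_refill(uint32_t total_credits,
                                 uint32_t bw_percent,
                                 uint32_t min_credit)
{
    uint32_t refill = total_credits * bw_percent / 100;

    /* A TC with a 0% share would otherwise get zero credits. */
    return refill < min_credit ? min_credit : refill;
}
```

A TC configured with a 0% share then still receives min_credit instead of zero.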

Change-ID: If652c133093a21e530f4e9eab09097976f57fb12
Signed-off-by: Vasu Dev <vasu.dev@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit 3efcb86e2da69989827066c231edb30ec10de932)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
9 years ago ixgbe: Fix bugs in ixgbe_clear_vf_vlans()
Alexander Duyck [Wed, 23 Dec 2015 17:00:35 +0000 (09:00 -0800)]
ixgbe: Fix bugs in ixgbe_clear_vf_vlans()

Orabug: 23177316

When I had rewritten the code for ixgbe_clear_vf_vlans() it looks like I
had transitioned back and forth between using word as an offset and using
word as a register offset.  As a result I honestly don't see how the code
was working before other than the fact that resetting the VLANs on the VF
likely didn't do much to clear them.

Another issue found is that the mask was using a divide instead of a
modulus.  As a result the mask bit was incorrectly being set to either bit
0 or 1 based on the value of the VF being tested.  As a result the wrong
VFs were having their VLANs cleared if they were enabled.

I have updated the code so that word represents the offset in the array.
This way we can use the modulus and xor operations and they will make sense
instead of being performed on a 4 byte aligned value.

I replaced the statement "(word % 2) ^ 1" with "~word % 2" in order to
reduce the line length as the line exceeded 80 characters with the register
name inserted.  The two should be equivalent so the change should be safe.
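
Both points can be checked with a small standalone sketch; the function names are made up for illustration. With a divide, "1 << (vf / 32)" can only select bit 0 or bit 1 for VFs 0..63, whereas the modulus selects the VF's own bit. The "~word % 2" form relies on word being unsigned; for a signed int, ~word is negative and the remainder can come out as -1.

```c
#include <stdint.h>

/* Bit for a VF within its 32-bit register: modulus, not divide. */
static uint32_t vf_mask(uint32_t vf)
{
    return UINT32_C(1) << (vf % 32);
}

/* The longer form mentioned in the message. */
static uint32_t other_word_xor(uint32_t word)
{
    return (word % 2) ^ 1;
}

/* The shorter form; unary ~ binds tighter than %, so this is (~word) % 2. */
static uint32_t other_word_not(uint32_t word)
{
    return ~word % 2;
}
```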

Reported-by: Emil Tantilov <emil.s.tantilov@intel.com>
Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit ab3a3b7b0cf88021376d565c526aa27b1e105148)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
9 years ago ixgbe: Correct X550EM_x revision check
Mark Rustad [Fri, 20 Nov 2015 21:12:17 +0000 (13:12 -0800)]
ixgbe: Correct X550EM_x revision check

Orabug: 23177316

The X550EM_x revision check needs to check a value, not just a bit.
Use a mask and check the value. Also remove the redundant check
inside the ixgbe_enter_lplu_t_x550em, because it can only be called
when both the mac type and revision check pass.
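
The difference between testing a bit and comparing a masked value can be sketched as follows. The field layout and the REV_* constants are hypothetical and do not come from the X550EM_x register definitions.

```c
#include <stdbool.h>
#include <stdint.h>

#define REV_MASK   UINT32_C(0x0f)   /* hypothetical revision field */
#define REV_WANTED UINT32_C(0x01)   /* hypothetical revision value */

/* Bit test: also passes for any other revision that has this bit set. */
static bool rev_check_bit(uint32_t reg)
{
    return (reg & REV_WANTED) != 0;
}

/* Value test: isolate the field, then compare it as a whole. */
static bool rev_check_value(uint32_t reg)
{
    return (reg & REV_MASK) == REV_WANTED;
}
```

With these made-up values, a field value of 0x3 passes the bit test but fails the value test.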

Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit 3ca2b2506ec9a3b1615930a6810d30ec9aba10a1)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
9 years ago ixgbe: fix RSS limit for X550
Emil Tantilov [Fri, 20 Nov 2015 21:02:16 +0000 (13:02 -0800)]
ixgbe: fix RSS limit for X550

Orabug: 23177316

X550 allows for up to 64 RSS queues, but the driver can have max
of 63 (-1 MSIX vector for link).

On systems with >= 64 CPUs the driver will set the redirection table
for all 64 queues which will result in packets being dropped.
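
In effect the queue count is clamped to 63, which can be sketched as below. The macro and function names are made up for illustration; only the 63-queue limit comes from the text above.

```c
/* 64 hardware RSS queues minus the one MSI-X vector kept for link. */
#define X550_MAX_RSS_QUEUES_SKETCH 63u

/* Pick the number of RSS queues for a given number of online CPUs. */
static unsigned int rss_queue_count(unsigned int online_cpus)
{
    return online_cpus < X550_MAX_RSS_QUEUES_SKETCH ?
           online_cpus : X550_MAX_RSS_QUEUES_SKETCH;
}
```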

Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit e9ee3238f8a480bbca58e51d02a93628d7c1f265)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
9 years ago ixgbe: Clean up redundancy in hw_enc_features
Mark Rustad [Wed, 18 Nov 2015 23:37:04 +0000 (15:37 -0800)]
ixgbe: Clean up redundancy in hw_enc_features

Orabug: 23177316

Clean up minor redundancy in the setting of hw_enc_features that
makes it appear that X550 uniquely has more encapsulation features
than other devices. The driver only supports one more feature, so
make it look that way. No longer set NETIF_F_SG since that is set
by the register_netdev call. Thanks to Alex Duyck for noticing this
slight confusion.

Reported-by: Alexander Duyck <aduyck@mirantis.com>
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit fb8ad4a592c627783dc18cc147c7f4de55cf318d)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
9 years ago ixgbe: report correct media type for KR, KX and KX4 interfaces
Veola Nazareth [Wed, 11 Nov 2015 23:22:59 +0000 (16:22 -0700)]
ixgbe: report correct media type for KR, KX and KX4 interfaces

Orabug: 23177316

Ethtool reports backplane type interfaces as 1000/10000baseT link modes.
This has been corrected to report the media as KR, KX or KX4 based on the
backplane interface present.

Signed-off-by: Veola Nazareth <veola.nazareth@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit 695b816d1aeb09505f499ec7cc5e90657c8c11ac)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
9 years ago ixgbe: add support for QSFP PHY types in ixgbe_get_settings()
Emil Tantilov [Mon, 9 Nov 2015 23:07:12 +0000 (15:07 -0800)]
ixgbe: add support for QSFP PHY types in ixgbe_get_settings()

Orabug: 23177316

Add missing QSFP PHY types to allow for more accurate reporting of
port settings.

Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit af56b4d865bf40e031df9118b0663ebf406ff121)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
9 years ago ixgbevf: minor cleanups for ixgbevf_set_itr()
Emil Tantilov [Thu, 5 Nov 2015 00:02:21 +0000 (16:02 -0800)]
ixgbevf: minor cleanups for ixgbevf_set_itr()

Orabug: 23177316

adapter->rx_itr_setting is not a mask, so check it with == instead of &.
Do not default to 12K interrupts in ixgbevf_set_itr().

There should be no functional effect from these changes.
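
Why == is the right comparison for a non-mask setting can be sketched as below, assuming (as an illustration, not taken from the driver) that the value 1 selects dynamic moderation and larger values are fixed rates. The two checks only diverge for odd values other than 1, which fits the note that no functional effect is expected.

```c
#include <stdbool.h>
#include <stdint.h>

/* Mask-style test: true for every odd setting, not just 1. */
static bool itr_is_dynamic_mask(uint32_t setting)
{
    return (setting & 1) != 0;
}

/* Value-style test: true only for the one value that means "dynamic". */
static bool itr_is_dynamic_eq(uint32_t setting)
{
    return setting == 1;
}
```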

Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit 9ad3d6f7eb300d464bfce2c80e7b1594f5e5eff9)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
Conflicts:
drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c

9 years ago ixgbevf: Fix handling of NAPI budget when multiple queues are enabled per vector
William Dauchy [Fri, 30 Oct 2015 17:16:30 +0000 (18:16 +0100)]
ixgbevf: Fix handling of NAPI budget when multiple queues are enabled per vector

Orabug: 23177316

This is the same patch as for ixgbe but applied differently according to
busy polling.  See commit 5d6002b7b822c74 ("ixgbe: Fix handling of NAPI
budget when multiple queues are enabled per vector")

Signed-off-by: William Dauchy <william@gandi.net>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit d0f71afffa1c3d5a36a4a278f1dbbd2643176dc3)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
9 years ago ixgbe: do not report 2.5 Gbps as supported
Emil Tantilov [Sat, 7 Nov 2015 00:34:33 +0000 (16:34 -0800)]
ixgbe: do not report 2.5 Gbps as supported

Orabug: 23177316

Some X550 devices can connect at 2.5Gbps during fail-over, but only
with certain link partners. Also setting the advertised speed will
not work so we do not report it as supported to avoid confusion.

Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit d3428001c58dce10af624e889667c7862320390a)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
9 years ago ixgbe: Clean stale VLANs when changing port VLAN or resetting
Alexander Duyck [Tue, 3 Nov 2015 01:10:32 +0000 (17:10 -0800)]
ixgbe: Clean stale VLANs when changing port VLAN or resetting

Orabug: 23177316

This patch guarantees that the VFs do not have access to VLANs that they
were not supposed to.  What this patch does is add code so that we delete
the previous port VLAN after adding a new one, and if we reset the VF we
clear all of the filters associated with it.

Previously the code was leaving all previous VLANs mapped to the VF and
they didn't get deleted unless the VF specifically requested it or if the
PF itself was reset.

Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit 4c7f35f679f592804736f9303051257de2c9f021)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
9 years ago ixgbe: Clear stale pool mappings
Alexander Duyck [Tue, 3 Nov 2015 01:10:26 +0000 (17:10 -0800)]
ixgbe: Clear stale pool mappings

Orabug: 23177316

This patch makes certain that we clear the pool mappings added when we
configure default MAC addresses for the interface.  Without this we run the
risk of leaking an address into pool 0 which really belongs to VF 0 when
SR-IOV is enabled.

Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit 6e982aeae5779a67fc02c5f6873654c49af97e70)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
9 years ago ixgbe: Fix VLAN promisc in relation to SR-IOV
Alexander Duyck [Tue, 3 Nov 2015 01:10:19 +0000 (17:10 -0800)]
ixgbe: Fix VLAN promisc in relation to SR-IOV

Orabug: 23177316

This patch is a follow-on for enabling VLAN promiscuous and allowing the PF
to add VLANs without adding a VLVF entry.  What this patch does is go
through and free the VLVF registers if they are not needed as the VLAN
belongs only to the PF which is the default pool.

Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit e1d0a2af2b30f5f0cbce2e4dd438d4da2433b226)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
9 years ago ixgbe: Add support for VLAN promiscuous with SR-IOV
Alexander Duyck [Tue, 3 Nov 2015 01:10:13 +0000 (17:10 -0800)]
ixgbe: Add support for VLAN promiscuous with SR-IOV

Orabug: 23177316

This patch adds support for VLAN promiscuous with SR-IOV enabled.

The code prior to this patch was only adding the PF to VLANs that the VF
had added.  As such enabling promiscuous mode would actually not add any
additional VLAN filters so visibility was limited.  This led to a number
of issues as the bridge and OVS would expect us to accept all VLAN tagged
packets when promiscuous mode was enabled, and instead we would filter out
most if not all depending on the configuration of the PF.

With this patch what we do is set all the bits in the VFTA and all of the
VLVF bits associated with the pool belonging to the PF.  By doing this the
PF is guaranteed to receive all VLAN tagged traffic associated with the RAR
filters assigned to the PF.  In addition we will clean up those same bits
in the event of promiscuous mode being disabled.

Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit 16369564915a9777217244678ee6160f8f1acac7)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
9 years ago ixgbe: Reorder search to work from the top down instead of bottom up
Alexander Duyck [Tue, 3 Nov 2015 01:10:07 +0000 (17:10 -0800)]
ixgbe: Reorder search to work from the top down instead of bottom up

Orabug: 23177316

This patch is meant to reduce the complexity of the search function used
for finding a VLVF entry associated with a given VLAN ID.  The previous
code was searching from bottom to top.  I reordered it to search from top
to bottom.  In addition I pulled an AND statement out of the loop and
instead replaced it with an OR statement outside the loop.  This should
help to reduce the overall size and complexity of the function.
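
A rough sketch of such a search over a plain array standing in for the VLVF registers. The table size, the enable-bit position, the convention that VLAN 0 maps to slot 0, and all names are assumptions for illustration rather than the driver's definitions.

```c
#include <stdint.h>

#define VLVF_ENTRIES 64                     /* assumed table size */
#define VLVF_VIEN    UINT32_C(0x80000000)   /* assumed enable bit */

/*
 * Return the slot holding vlan, else the first empty slot seen while
 * walking from the top down, else -1 if the table is full.
 */
static int find_vlvf_slot(const uint32_t *vlvf, uint32_t vlan)
{
    int first_empty = 0;
    int idx;

    /* Assumption: VLAN 0 always maps to slot 0. */
    if (vlan == 0)
        return 0;

    /* OR the enable bit in once instead of masking every entry. */
    vlan |= VLVF_VIEN;

    /* Walk from the top entry down; slot 0 is never visited. */
    for (idx = VLVF_ENTRIES; --idx;) {
        if (vlvf[idx] == vlan)
            return idx;
        if (!first_empty && !vlvf[idx])
            first_empty = idx;
    }

    return first_empty ? first_empty : -1;
}
```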

There was also some formatting I cleaned up in regards to whitespace and
such.

Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit c2bc9ce91c31cc214667b9e1a150cd3000856c1c)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
9 years ago ixgbe: Add support for adding/removing VLAN on PF bypassing the VLVF
Alexander Duyck [Tue, 3 Nov 2015 01:10:01 +0000 (17:10 -0800)]
ixgbe: Add support for adding/removing VLAN on PF bypassing the VLVF

Orabug: 23177316

This patch adds support for bypassing the VLVF entry creation when the PF
is adding a new VLAN.  The advantage to doing this is that we can then save
the VLVF entries for the VFs which must have them in order to function,
versus the PF which can fall back on the default pool entry.

Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit b6488b662b5011a3640033a266886603892dfed1)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
9 years ago ixgbe: Simplify configuration of setting VLVF and VLVFB
Alexander Duyck [Tue, 3 Nov 2015 01:09:54 +0000 (17:09 -0800)]
ixgbe: Simplify configuration of setting VLVF and VLVFB

Orabug: 23177316

This patch addresses several issues within the VLVF and VLVFB
configuration.

First was the fact that the code was overly complicated, with multiple
conditional paths depending on whether we were adding or removing and which
bit we were going to add or remove.  Instead of messing with all that I have
simplified it by using (vid / 32) and (1 - vid / 32) to identify our
register and the other vlvfb register.

Second was the fact that we were likely leaking a few packets into the PF
in cases where we were deleting an entry and the VFTA filter for that entry
as the ordering was such that we deleted the pool and then the VLAN filter
instead of the other way around.  I have updated that by adding a check for
no bits being set and if that occurs we clear things up in the proper
order.
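
The register-pair arithmetic can be sketched as below. This assumes that each VLVF entry owns two consecutive 32-bit VLVFB registers and that the value being divided is an index in the range 0..63 (the message writes it as vid); the function names are made up for illustration.

```c
/* Register of the pair that holds the bit for this index. */
static unsigned int vlvfb_reg(unsigned int vlvf_index, unsigned int index)
{
    return vlvf_index * 2 + index / 32;
}

/* The other register of the same pair. */
static unsigned int vlvfb_other_reg(unsigned int vlvf_index,
                                    unsigned int index)
{
    return vlvf_index * 2 + 1 - index / 32;
}
```

Because index / 32 is either 0 or 1 in that range, the two expressions always name the two halves of the same pair, with no branch needed.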

Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit 5ac736a65ac131e76edb5bbe75f7f9acef7a8a7b)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
9 years ago ixgbe: Reduce VT code indent in set_vfta by introducing jump label
Alexander Duyck [Tue, 3 Nov 2015 01:09:48 +0000 (17:09 -0800)]
ixgbe: Reduce VT code indent in set_vfta by introducing jump label

Orabug: 23177316

In order to clear the way for upcoming work I thought it best to drop the
level of indent in the ixgbe_set_vfta_generic function.  Most of the code
is held in the virtualization specific section.  So the easiest approach is
to just add a jump label and jump past the bulk of the code if it is not
enabled.

Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit 63d9379a598ed9fbb887b8679623f8a328ee394e)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
9 years ago ixgbe: Simplify definitions for regidx and bit in set_vfta
Alexander Duyck [Tue, 3 Nov 2015 01:09:42 +0000 (17:09 -0800)]
ixgbe: Simplify definitions for regidx and bit in set_vfta

Orabug: 23177316

This patch simplifies the logic for setting the VFTA register by reducing
the number of conditional checks needed.  Instead we just use some boolean
logic to generate vfta_delta, and if that is set then we xor the vfta by
that value and write it back.
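
The delta-and-xor approach can be sketched over an in-memory table standing in for the VFTA registers. The function name, the table and the return convention are assumptions for illustration; the caller is expected to pass a VLAN ID below 4096.

```c
#include <stdbool.h>
#include <stdint.h>

#define VFTA_WORDS 128   /* 4096 VLAN IDs / 32 bits per word */

/* Set or clear one VLAN bit; return true if a write was needed. */
static bool set_vfta_sketch(uint32_t *vfta_table, uint32_t vlan, bool vlan_on)
{
    uint32_t regidx = vlan / 32;
    uint32_t vfta = vfta_table[regidx];
    uint32_t vfta_delta = UINT32_C(1) << (vlan % 32);

    /* Keep the bit in the delta only if it differs from what is wanted. */
    vfta_delta &= vlan_on ? ~vfta : vfta;
    if (!vfta_delta)
        return false;

    /* Flipping exactly the differing bit yields the wanted value. */
    vfta_table[regidx] = vfta ^ vfta_delta;
    return true;
}
```

When the bit already has the requested state the delta is zero and the write is skipped.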

Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit c18fbd5f024e47897a120f42d128c04fa708692c)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
9 years ago ixgbe: Fix SR-IOV VLAN pool configuration
Alexander Duyck [Tue, 3 Nov 2015 01:09:35 +0000 (17:09 -0800)]
ixgbe: Fix SR-IOV VLAN pool configuration

Orabug: 23177316

The code for checking the PF bit in ixgbe_set_vf_vlan_msg was using the
wrong offset and as a result it was pulling the VLAN off of the PF even if
there were VFs numbered greater than 40 that still had the VLAN enabled.

Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit 8e8e9a0b7df0194e95bb1d657f9edbdc6363f082)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
9 years ago ixgbe: Return error on failure to allocate mac_table
Alexander Duyck [Tue, 3 Nov 2015 01:09:29 +0000 (17:09 -0800)]
ixgbe: Return error on failure to allocate mac_table

Orabug: 23177316

Add a check to make certain mac_table was actually allocated and is not
NULL.  If it is NULL return -ENOMEM and allow the probe routine to fail
rather than causing a NULL pointer dereference further down the line.
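
The pattern looks roughly like this in userspace C, with calloc() standing in for the kernel allocator. The struct layouts and the function name are made up for illustration.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

struct mac_entry_sketch {
    uint8_t  addr[6];
    uint16_t pool;
};

struct mac_adapter_sketch {
    struct mac_entry_sketch *mac_table;
};

/* Allocate the MAC table; fail the init instead of carrying a NULL. */
static int sw_init_sketch(struct mac_adapter_sketch *a, size_t entries)
{
    a->mac_table = calloc(entries, sizeof(*a->mac_table));
    if (!a->mac_table)
        return -ENOMEM;

    return 0;
}
```

The caller can then propagate the negative errno and abort the probe.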

Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit 530fd82a9fea5bba8e044bdf6fdf2ddc495e3807)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
9 years ago ixgbevf: Handle extended IPv6 headers in Tx path
Mark Rustad [Thu, 19 Nov 2015 21:56:30 +0000 (13:56 -0800)]
ixgbevf: Handle extended IPv6 headers in Tx path

Orabug: 23177316

Check for and handle IPv6 extended headers so that Tx checksum
offload can be done. Also use skb_checksum_help for unexpected
cases. Thanks to Tom Herbert for noticing these problems. Thanks
to Alexander Duyck for seeing how to coalesce the error handling
into one location.

Reported-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Darin Miller <darin.j.miller@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit d34a614adfb16a560ddb6759d532eb32b6651eae)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
9 years ago ixgbe: Always turn PHY power on when requested
Mark Rustad [Thu, 5 Nov 2015 19:02:14 +0000 (11:02 -0800)]
ixgbe: Always turn PHY power on when requested

Orabug: 23177316

Instead of inhibiting PHY power control when manageability is
present, only inhibit turning PHY power off when manageability
is present. Consequently, PHY power will always be turned on when
requested. Without this patch, some systems with X540 or X550
devices in some conditions will never get link.
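
The changed condition amounts to the following predicate. The function and parameter names are hypothetical; only the rule itself comes from the text above.

```c
#include <stdbool.h>

/* Decide whether a requested PHY power change may be carried out. */
static bool phy_power_change_allowed(bool turn_on, bool manageability_present)
{
    /* Powering on is always honoured; only power-off is inhibited. */
    return turn_on || !manageability_present;
}
```

The earlier behaviour described above corresponds to returning !manageability_present regardless of the direction of the change.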

Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit 3c2f2b77a917488b56b2676b99adb5d3c07d6e68)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
9 years ago ixgbe: Handle extended IPv6 headers in Tx path
Mark Rustad [Wed, 18 Nov 2015 17:21:28 +0000 (09:21 -0800)]
ixgbe: Handle extended IPv6 headers in Tx path

Orabug: 23177316

Check for and handle IPv6 extended headers so that Tx checksum
offload can be done. Also use skb_checksum_help for unexpected
cases. Thanks to Tom Herbert for noticing these problems. Thanks
to Alexander Duyck for recognizing problems with the first version
of this patch and recognizing how to coalesce error conditions
into a single location.

Reported-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit 36a92d7190e68e9387347695fe4625eb2c9e7e1c)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
9 years ago ixgbe: Save VF info and take references
Mark Rustad [Fri, 30 Oct 2015 22:29:34 +0000 (15:29 -0700)]
ixgbe: Save VF info and take references

Orabug: 23177316

Save VF device pointers and take references to speed accesses used
to monitor the device behavior to avoid slot resets. The saved
information avoids lock contention during the search used to access
each of the VFs.

Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Darin Miller <darin.j.miller@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit 988d13073fe122f0b6a2b80b5f2aa1b0717f9edb)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
9 years ago ixgbe: Wait for master disable to be set
Mark Rustad [Tue, 27 Oct 2015 20:23:23 +0000 (13:23 -0700)]
ixgbe: Wait for master disable to be set

Orabug: 23177316

According to the datasheets, the driver should wait for the master
disable bit to read as being set before checking the status
register for master disable.

Reported-by: Dan Streetman <dan.streetman@canonical.com>
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Darin Miller <darin.j.miller@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit 48b44612738793252c97c548f3d0bd56543d5273)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
9 years agoixgbe: Correct spec violations by waiting after reset
Mark Rustad [Tue, 27 Oct 2015 20:23:14 +0000 (13:23 -0700)]
ixgbe: Correct spec violations by waiting after reset

Orabug: 23177316

The ixgbe driver was violating the specification in the datasheet
by not waiting 1ms before checking for the reset bit clearing. This
is called out for devices supported by ixgbe, so implement the
required delay.

Reported-by: Dan Streetman <dan.streetman@canonical.com>
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Darin Miller <darin.j.miller@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit efff2e027758fd5cc739d500397f729591f32a94)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
9 years agoixgbe: Update PTP to support X550EM_x devices
Mark Rustad [Tue, 27 Oct 2015 16:58:07 +0000 (09:58 -0700)]
ixgbe: Update PTP to support X550EM_x devices

Orabug: 23177316

The X550EM_x devices handle clocking differently, so update the
PTP implementation to accommodate them. This involves significant
changes to ixgbe's PTP code to accommodate the new range of
behaviors including things like non-power-of-2 clock wrapping.

Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Darin Miller <darin.j.miller@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit a9763f3cb54c7f1c6a47962c814935654476d09f)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
9 years agoixgbe: Allow FDB entries access to more RAR filters
Alexander Duyck [Thu, 22 Oct 2015 23:26:42 +0000 (16:26 -0700)]
ixgbe: Allow FDB entries access to more RAR filters

Orabug: 23177316

This change makes it so that we allow the PF to make use of all free RAR
entries for FDB use if needed.

Previously the code limited us to 16 unicast entries; however, this limit
was shared between MACVLAN, which wasn't limited, and the FDB code, which
was.  So instead of treating the FDB code as a second class citizen I have
updated it so that it has access to just as many entries as the MACVLAN
filters.

Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Tested-by: Darin Miller <darin.j.miller@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit 2f9be1665585a3757a00a6d1b8201d0ede937a34)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
9 years agoixgbe: Use __dev_uc_sync and __dev_uc_unsync for unicast addresses
Alexander Duyck [Thu, 22 Oct 2015 23:26:36 +0000 (16:26 -0700)]
ixgbe: Use __dev_uc_sync and __dev_uc_unsync for unicast addresses

Orabug: 23177316

This change replaces the ixgbe_write_uc_addr_list call in ixgbe_set_rx_mode
with a call to __dev_uc_sync instead.  This works much better with the MAC
addr list code that was already in place and solves an issue in which you
couldn't remove an FDB address without having to reset the port.

Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Tested-by: Darin Miller <darin.j.miller@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit 0f079d22834ac0529413bdee5b5aa52485942162)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
9 years agoixgbe: Refactor MAC address configuration code
Alexander Duyck [Thu, 22 Oct 2015 23:26:30 +0000 (16:26 -0700)]
ixgbe: Refactor MAC address configuration code

Orabug: 23177316

In the process of tracking down a memory leak when adding/removing FDB
entries I had to go through the MAC address configuration code for ixgbe.
In the process of doing so I found a number of issues that impacted
readability and performance.  This change updates the code in general to
clean it up so it becomes clear what each step is doing.  From what I can
tell there are a couple of bugs cleaned up in this code.

First is the fact that the MAC addresses were being double counted for the
PF.  As a result once entries up to 63 had been used you could no longer
add additional filters.

A simple test case for this:
  for i in `seq 0 96`
  do
    ip link add link ens8 name mv$i type macvlan
    ip link set dev mv$i up
  done

Test script:
  ethregs -s 0:8.0 | grep -e "RAH" | grep 8000....$

When things are working correctly RAL/H registers 1 - 97 will be consumed.
In the failing case it will stop at 63 and prevent any further filters from
being added.

Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Tested-by: Darin Miller <darin.j.miller@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit c9f53e63c2089d8154900ed06da0aa7be9f74201)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
9 years agoixgbevf: Minor cleanups
Mark Rustad [Thu, 22 Oct 2015 00:21:20 +0000 (17:21 -0700)]
ixgbevf: Minor cleanups

Orabug: 23177316

Make some minor cleanups, such as simplifying return paths, deleting
unneeded initializations, returning values more directly, and so forth.

Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Darin Miller <darin.j.miller@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit 50985b5f62cc74e9e222f0ddf890e1ba87be371a)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
9 years agoixgbevf: Use a private workqueue to avoid certain possible hangs
Mark Rustad [Thu, 22 Oct 2015 00:21:15 +0000 (17:21 -0700)]
ixgbevf: Use a private workqueue to avoid certain possible hangs

Orabug: 23177316

Use a private workqueue to avoid hangs that were otherwise possible
when performing stress tests, such as creating and destroying many
VFs repeatedly.

Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Darin Miller <darin.j.miller@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit 40a13e2493c9882cb4d09054d81a5063cd1589a2)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
9 years agoixgbe: Use private workqueue to avoid certain possible hangs
Mark Rustad [Thu, 22 Oct 2015 00:21:10 +0000 (17:21 -0700)]
ixgbe: Use private workqueue to avoid certain possible hangs

Orabug: 23177316

Use a private workqueue to avoid hangs that were otherwise possible
when performing stress tests, such as creating and destroying many
VFs repeatedly.

Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Darin Miller <darin.j.miller@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit 780484d853d096b4253b966e1789c4f338dd7301)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
9 years agoixgbe: Add support for newer thermal alarm
Mark Rustad [Mon, 19 Oct 2015 16:22:14 +0000 (09:22 -0700)]
ixgbe: Add support for newer thermal alarm

Orabug: 23177316

The newer copper PHY implementation used with newer X550EM_x
devices uses a different thermal alarm type than the earlier
one. Make changes to support both types.

Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Darin Miller <darin.j.miller@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit 83a9fb20ecc4bb8b36a610ab833962fed52db64c)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
9 years agoixgbe: Prevent KR PHY reset in ixgbe_init_phy_ops_x550em
Mark Rustad [Fri, 16 Oct 2015 20:27:49 +0000 (13:27 -0700)]
ixgbe: Prevent KR PHY reset in ixgbe_init_phy_ops_x550em

Orabug: 23177316

This patch removes KR PHY reset from ixgbe_init_phy_ops_x550em,
since this function is meant to initialize function pointers for
the detected PHY type. Internal PHY reset was moved to
ixgbe_setup_internal_phy_t_x550em which will now detect which
mode the internal PHY operates in and set it up as required.

Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Darin Miller <darin.j.miller@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit f164b84529e3bf9ae43882fd3ac84bef94d104cf)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
9 years agoixgbevf: fix spoofed packets with random MAC
Emil Tantilov [Mon, 12 Oct 2015 17:56:00 +0000 (10:56 -0700)]
ixgbevf: fix spoofed packets with random MAC

Orabug: 23177316

If ixgbevf is loaded while the corresponding PF interface is down
and the driver assigns a random MAC address, that address can be
overwritten with the value of hw->mac.perm_addr, which would be 0 at
that point.

To avoid this case we init hw->mac.perm_addr to the randomly generated
address and do not set it unless we receive an ACK from ixgbe.

Reported-by: John Greene <jogreene@redhat.com>
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit 465fc643c2dcbe08e0debac80c225f6750b40d3c)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
9 years agoixgbevf: use ether_addr_copy instead of memcpy
Emil Tantilov [Mon, 12 Oct 2015 17:55:51 +0000 (10:55 -0700)]
ixgbevf: use ether_addr_copy instead of memcpy

Orabug: 23177316

Replace some instances of memcpy for setting up the MAC address with
ether_addr_copy().

Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Darin Miller <darin.j.miller@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit 91a76baadec1f30e8441c3d52c2559468a4da693)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
9 years agoixgbe: Remove CS4227 diagnostic code
Mark Rustad [Fri, 2 Oct 2015 16:23:53 +0000 (09:23 -0700)]
ixgbe: Remove CS4227 diagnostic code

Orabug: 23177316

Testing has now shown that the diagnostic code used with the CS4227
is no longer needed, so remove it.

Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Darin Miller <darin.j.miller@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit d206563ad8f6fb41943366cf22f1aabc19d2b1a7)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
9 years agoixgbe/ixgbevf: use napi_schedule_irqoff()
Alexander Duyck [Tue, 29 Sep 2015 22:19:43 +0000 (15:19 -0700)]
ixgbe/ixgbevf: use napi_schedule_irqoff()

Orabug: 23177316

The ixgbe_intr and ixgbe/ixgbevf_msix_clean_rings functions run from hard
interrupt context or with interrupts already disabled in netpoll.

They can use napi_schedule_irqoff() instead of napi_schedule().

Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Tested-by: Darin Miller <darin.j.miller@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit ef2662b2a820aaca4c147b91659bf57c06688ede)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
9 years agoixgbevf: Limit lowest interrupt rate for adaptive interrupt moderation to 12K
Alexander Duyck [Tue, 29 Sep 2015 20:11:15 +0000 (13:11 -0700)]
ixgbevf: Limit lowest interrupt rate for adaptive interrupt moderation to 12K

Orabug: 23177316

This patch is the ixgbevf version of commit 8ac34f10a5ea4 "ixgbe: Limit
lowest interrupt rate for adaptive interrupt moderation to 12K"

The same logic and the same results apply here, since a netperf test
will starve for memory in the time from one Tx interrupt to the next.
As a result the ixgbevf driver underperformed when compared to vhost_net.

Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Tested-by: Darin Miller <darin.j.miller@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit 8a9ca1104da0de6dd8551237e7d0e50eeeea4e80)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
9 years agoixgbe: Add KR mode support for CS4227 chip
Mark Rustad [Mon, 28 Sep 2015 21:37:47 +0000 (14:37 -0700)]
ixgbe: Add KR mode support for CS4227 chip

Orabug: 23177316

KR auto-neg mode is what we will be using going forward. The SW
interface for this mode is different from what was used for iXFI.

Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit d91e3a7d624590220e31ccb80a6fb5247cbfa64a)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
9 years agoixgbe: Fix handling of NAPI budget when multiple queues are enabled per vector
Alexander Duyck [Tue, 22 Sep 2015 21:35:41 +0000 (14:35 -0700)]
ixgbe: Fix handling of NAPI budget when multiple queues are enabled per vector

Orabug: 23177316

This patch corrects an issue in which the polling routine would increase
the budget for Rx to at least 1 per queue if multiple queues were present.
This would result in Rx packets being processed when the budget was 0 which
is meant to indicate that no Rx can be handled.
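
The budget split can be sketched as follows. This is a simplified
illustration, not the driver code: both function names are hypothetical,
and the "old" variant is an assumption about the pre-patch behavior based
on the description above (clamping each ring's share to at least 1).

```c
/*
 * Assumed pre-patch behavior: every ring gets at least 1, so a budget
 * of 0 still lets Rx packets through. Expects rx_ring_count >= 1.
 */
static int per_ring_budget_old(int budget, int rx_ring_count)
{
    int share = budget / rx_ring_count;

    return share > 1 ? share : 1;
}

/*
 * Sketch of the corrected behavior: a budget of 0 means no Rx work at
 * all, and the "at least 1" clamp only applies when several rings share
 * a non-zero budget.
 */
static int per_ring_budget_fixed(int budget, int rx_ring_count)
{
    if (budget <= 0)
        return 0;

    if (rx_ring_count > 1) {
        int share = budget / rx_ring_count;

        return share > 1 ? share : 1;
    }

    return budget;
}
```

With a budget of 0 and four rings, the old variant hands each ring a
budget of 1 while the fixed variant hands out 0.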

Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Tested-by: Darin Miller <darin.j.miller@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit 5d6002b7b822c7423e75d4651e6790bfb5642b1b)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
9 years agoixgbe: fix multiple kernel-doc errors
Jean Sacren [Sat, 19 Sep 2015 11:08:44 +0000 (05:08 -0600)]
ixgbe: fix multiple kernel-doc errors

Orabug: 23177316

The commit dfaf891dd3e1 ("ixgbe: Refactor the RSS configuration code")
introduced a few kernel-doc errors:

1) The function name is missing;
2) The format is wrong;
3) The short description is redundant.

Fix all of the above so that kernel-doc runs correctly.

Signed-off-by: Jean Sacren <sakiwit@gmail.com>
Tested-by: Darin Miller <darin.j.miller@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit a897a2adb602fe3d9223aa59393be07341d3a124)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
9 years agoixgbe: Delete redundant include file
Mark Rustad [Fri, 18 Sep 2015 17:08:00 +0000 (10:08 -0700)]
ixgbe: Delete redundant include file

Orabug: 23177316

Delete a redundant include of net/vxlan.h.

Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Darin Miller <darin.j.miller@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit cc1f88ba16fa5cc4769cf25dca9fafeb1546be50)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
9 years agoixgbe: drop null test before destroy functions
Julia Lawall [Sun, 13 Sep 2015 12:15:13 +0000 (14:15 +0200)]
ixgbe: drop null test before destroy functions

Orabug: 23177316

Remove unneeded NULL test.

The semantic patch that makes this change is as follows:
(http://coccinelle.lip6.fr/)

// <smpl>
@@ expression x; @@
-if (x != NULL)
  \(kmem_cache_destroy\|mempool_destroy\|dma_pool_destroy\)(x);
// </smpl>

Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Tested-by: Darin Miller <darin.j.miller@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit edab421a57fbdb7f7b83fb494a48c47bc719a7f0)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
9 years agoixgbe: Reset interface after enabling SR-IOV
Alexander Duyck [Tue, 20 Oct 2015 20:28:17 +0000 (13:28 -0700)]
ixgbe: Reset interface after enabling SR-IOV

Orabug: 23177316

Enabling SR-IOV and then bringing the interface up was resulting in the PF
MAC addresses getting into a bad state.  Specifically the MAC address was
enabled for both VF 0 and the PF.  This resulted in some odd behaviors such
as VF 0 receiving a copy of the PF's traffic, which in turn enables the
ability for VF 0 to spoof the PF.

A workaround for this issue appears to be to bring up the interface first
and then enable SR-IOV as this way the reset is then triggered in the
existing code.

In order to correct this I have added a change to ixgbe_setup_tc where if
the interface is down we still will at least call ixgbe_reset so that the
MAC addresses for the device are reset to the correct pools.

Steps to reproduce issue:
modprobe ixgbe
echo 7 > /sys/bus/pci/devices/0000\:01\:00.1/sriov_numvfs
ifconfig enp1s0f1 up
ethregs -s 1:00.1 | grep MPSAR | grep -v 00000000

Result:
MPSAR[0]               00000081
MPSAR[254]             00000001

Expected Result, behavior after patch:
MPSAR[0]               00000080
MPSAR[254]             00000080

Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Tested-by: Darin Miller <darin.j.miller@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit bf4d67d94c842edf57e3cac2c4dff58a9ce7ac41)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
9 years agoixgbe, ixgbevf: Add new mbox API xcast mode
Hiroshi Shimamoto [Fri, 28 Aug 2015 06:59:03 +0000 (06:59 +0000)]
ixgbe, ixgbevf: Add new mbox API xcast mode

Orabug: 23177316

The limit on the number of multicast addresses for a VF is too low for
large-scale servers using SR-IOV. IPv6 requires a multicast MAC address
for each IP address in order to handle Neighbor Solicitation messages, so
we couldn't assign more than 30 IPv6 addresses to a single VF.

This patch introduces the new mailbox API, IXGBE_VF_UPDATE_XCAST_MODE,
to update the multicast mode of a VF. This adds 3 modes:
  - NONE     only L2 exact match addresses or Flow Director enabled
  - MULTI    BAM and ROMPE set
  - ALLMULTI BAM, ROMPE and MPE set

If a guest VF user wants more than 30 multicast MAC addresses, they can
set IFF_ALLMULTI to request that the PF update the xcast mode to enable VF
multicast promiscuous mode.

On the other hand, enabling VF multicast promiscuous mode may affect
security and performance in the network of the NIC. Only a trusted VF can
enable multicast promiscuous mode. The behavior of an untrusted VF is the
same as in the previous version.
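
The three modes and the trust check can be sketched like this. It is an
illustration only: the enum, the flag bit positions, and the function name
are hypothetical, and holding an untrusted VF at MULTI when it asks for
ALLMULTI is an assumption about how "only trusted VFs" is enforced.

```c
#include <stdbool.h>
#include <stdint.h>

enum xcast_mode {
    XCAST_MODE_NONE = 0,
    XCAST_MODE_MULTI,
    XCAST_MODE_ALLMULTI,
};

/* Illustrative bit positions, not the hardware's register layout. */
#define FLAG_BAM   (1u << 0)  /* accept broadcast */
#define FLAG_ROMPE (1u << 1)  /* accept multicast table matches */
#define FLAG_MPE   (1u << 2)  /* multicast promiscuous */

/* Map a VF's requested xcast mode to the receive flags the PF would set. */
static uint32_t xcast_mode_to_flags(enum xcast_mode requested, bool trusted)
{
    /* Assumption: an untrusted VF asking for ALLMULTI is held at MULTI. */
    if (requested == XCAST_MODE_ALLMULTI && !trusted)
        requested = XCAST_MODE_MULTI;

    switch (requested) {
    case XCAST_MODE_MULTI:
        return FLAG_BAM | FLAG_ROMPE;
    case XCAST_MODE_ALLMULTI:
        return FLAG_BAM | FLAG_ROMPE | FLAG_MPE;
    case XCAST_MODE_NONE:
    default:
        return 0;
    }
}
```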

Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit 8443c1a4b192089e62642d847ebac3e4d15134c3)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
Conflicts:
drivers/net/ethernet/intel/ixgbe/ixgbe.h
drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c

9 years agoixgbe: Check for setup_internal_link method
Mark Rustad [Wed, 9 Sep 2015 20:37:33 +0000 (13:37 -0700)]
ixgbe: Check for setup_internal_link method

Orabug: 23177316

Only call the setup_internal_link method when it is provided. This
check is required for newer version parts.

Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Darin Miller <darin.j.miller@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit a85ce532f28efabda030d9065a0c2023a2003f36)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
9 years agoixgbe: Fix CS4227-related semaphore error on reset failure
Mark Rustad [Wed, 26 Aug 2015 21:10:22 +0000 (14:10 -0700)]
ixgbe: Fix CS4227-related semaphore error on reset failure

Orabug: 23177316

If the reset never completes, it is necessary to retake the
semaphore before returning, because the caller will release
the semaphore.

Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Darin Miller <darin.j.miller@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit 8bf7a7b879985321c63e3ae46fee4e7f0d654ab1)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
9 years agoixgbe: disable LRO by default
Emil Tantilov [Tue, 25 Aug 2015 01:08:31 +0000 (18:08 -0700)]
ixgbe: disable LRO by default

Orabug: 23177316

This patch disables LRO by default in favor of GRO.

LRO is incompatible with forwarding and is disabled when forwarding
is turned on which makes the default offloads of the driver
inconsistent. LRO can still be enabled via ethtool.

Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Darin Miller <darin.j.miller@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit 72bfd32d2f84d26aa132dd74a8eef14d039d326f)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
9 years agoixgbe: add flow control ethertype to the anti-spoofing filter
Emil Tantilov [Thu, 20 Aug 2015 22:31:20 +0000 (15:31 -0700)]
ixgbe: add flow control ethertype to the anti-spoofing filter

Orabug: 23177316

This patch makes sure that flow control packets initiated by the VF are
dropped and reported as spoofed.

Flow control packets can be used to limit throughput or as a DoS
attack when generated from a VF. Flow control is not supported per VF,
hence any pause frames generated from a VF are considered malicious.
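
As a software illustration of the match the filter performs, the check
below looks at the EtherType of an untagged Ethernet frame. The function
name is hypothetical, and the real patch configures a filter in hardware
rather than inspecting frames in code; only the EtherType value 0x8808
(MAC control, used by pause frames) is taken as given.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ETH_HLEN    14      /* dst MAC + src MAC + EtherType */
#define ETH_P_PAUSE 0x8808  /* IEEE 802.3 MAC control (pause frames) */

/* Hypothetical helper: true if a VF-originated frame is flow control. */
static bool vf_frame_is_flow_control(const uint8_t *frame, size_t len)
{
    if (len < ETH_HLEN)
        return false;

    /* EtherType is big-endian at bytes 12..13 of an untagged frame. */
    uint16_t ethertype = (uint16_t)((frame[12] << 8) | frame[13]);

    return ethertype == ETH_P_PAUSE;
}
```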

Also cleaned up indentation and some redundant comments.

Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit f079fa005aae08ee0e1bc32699874ff4f02e11c1)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
9 years agoi40e: queue-specific settings for interrupt moderation
Kan Liang [Fri, 19 Feb 2016 14:24:04 +0000 (09:24 -0500)]
i40e: queue-specific settings for interrupt moderation

Orabug: 23176970

For the i40e driver, each vector has its own ITR register. However, there
is no concept of queue-specific settings in the driver proper. Only a
global variable is used to store ITR values. That causes problems,
especially when resetting the vector: the specific ITR values could be
lost.
This patch moves rx_itr_setting and tx_itr_setting to i40e_ring to store
the specific ITR register value for each queue.
i40e_get_coalesce and i40e_set_coalesce are also modified accordingly to
support queue-specific settings. To stay compatible with old ethtool, if
the user doesn't specify the queue number, i40e_get_coalesce will return
queue 0's value, while i40e_set_coalesce will apply the value to all queues.
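
The get/set convention for an unspecified queue can be sketched as below.
This is a simplified model, not the driver code: struct ring_itr,
NUM_QUEUES, QUEUE_UNSPECIFIED, and both function names are hypothetical,
and the sketch keeps both settings in one struct per queue for brevity.

```c
#include <stdint.h>

#define NUM_QUEUES        4     /* hypothetical queue count */
#define QUEUE_UNSPECIFIED (-1)  /* caller did not name a queue */

/* Per-queue ITR settings, standing in for the fields moved into the ring. */
struct ring_itr {
    uint16_t rx_itr_setting;
    uint16_t tx_itr_setting;
};

/* Read one queue's settings; an unspecified queue reads queue 0. */
static int get_coalesce(const struct ring_itr *rings, int queue,
                        struct ring_itr *out)
{
    if (queue == QUEUE_UNSPECIFIED)
        queue = 0;
    if (queue < 0 || queue >= NUM_QUEUES)
        return -1;

    *out = rings[queue];
    return 0;
}

/* Write one queue's settings; an unspecified queue writes all queues. */
static int set_coalesce(struct ring_itr *rings, int queue,
                        struct ring_itr val)
{
    if (queue == QUEUE_UNSPECIFIED) {
        for (int i = 0; i < NUM_QUEUES; i++)
            rings[i] = val;
        return 0;
    }
    if (queue < 0 || queue >= NUM_QUEUES)
        return -1;

    rings[queue] = val;
    return 0;
}
```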

Signed-off-by: Kan Liang <kan.liang@intel.com>
Acked-by: Shannon Nelson <shannon.nelson@intel.com>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit a75e8005d506f374554b17383c39aa82db0ea860)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
9 years agoNVMe: IO ending fixes on surprise removal
Keith Busch [Fri, 11 Dec 2015 20:14:28 +0000 (13:14 -0700)]
NVMe: IO ending fixes on surprise removal

This patch fixes a lost request discovered during IO + hot removal.

The driver's pci removal deletes gendisks prior to shutting down the
controller to allow dirty data to sync. Dirty data can not be synced on
a surprise removal, though, and would potentially block indefinitely.

The driver previously had marked the queue as dying in this scenario
to prevent new requests from being attempted; however, it would still
block for requests that had already entered the queue. This patch fixes
this by quiescing IO first, then aborting the requeued requests before
deleting disks.

Reported-by: Sujith Pandel <sujith_pandel@dell.com>
Signed-off-by: Keith Busch <keith.busch@intel.com>
Tested-by: Sujith Pandel <sujith_pandel@dell.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
(cherry picked from commit b5875222de2fb91339db79a753677ba4f68120d0)

Orabug: 22620486
Signed-off-by: Jason Luo <zhangqing.luo@oracle.com>
Conflicts:
drivers/nvme/host/pci.c

9 years agonvme: temporary fix for Apple controller reset
Stephan Günther [Tue, 1 Dec 2015 20:23:22 +0000 (13:23 -0700)]
nvme: temporary fix for Apple controller reset

Recent patches added basic support for the Apple NVMe controller but
still cause resets and data corruption on that particular controller
when a specific pattern of read/flush commands occurs. Limiting the
queue depth to 2 works around that issue.

This patch enforces that limit only for the Apple controller and is
considered a temporary fix until we find the root source of that
problem.

Signed-off-by: Stephan Günther <guenther@tum.de>
Signed-off-by: Maurice Leclaire <leclaire@in.tum.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
(cherry picked from commit 1f390c1fde3a96974784be53cb3a645da3e4849c)

Orabug: 22620486
Signed-off-by: Jason Luo <zhangqing.luo@oracle.com>
9 years agonvme: add missing unmaps in nvme_queue_rq
Christoph Hellwig [Fri, 16 Oct 2015 05:58:31 +0000 (07:58 +0200)]
nvme: add missing unmaps in nvme_queue_rq

When we fail various metadata related operations in nvme_queue_rq we
need to unmap the data SGL.

Cc: stable@vger.kernel.org
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
(cherry picked from commit bf508e910b02a6107a5aa054e03c6fc8a65dae1e)

Orabug: 22620486
Signed-off-by: Jason Luo <zhangqing.luo@oracle.com>
9 years agoNVMe: default to 4k device page size
Nishanth Aravamudan [Tue, 24 Nov 2015 16:55:05 +0000 (09:55 -0700)]
NVMe: default to 4k device page size

We received a bug report recently when DDW (64-bit direct DMA on Power)
is not enabled for NVMe devices. In that case, we fall back to 32-bit
DMA via the IOMMU, which is always done via 4K TCEs (Translation Control
Entries).

The NVMe device driver, though, assumes that the DMA alignment for the
PRP entries will match the device's page size, and that the DMA alignment
matches the kernel's page alignment. On Power, the IOMMU page size,
as mentioned above, can be 4K, while the device can have a page size of
8K and the kernel a page size of 64K. This eventually trips the
BUG_ON in nvme_setup_prps(), as we have a 'dma_len' that is a multiple
of 4K but not 8K (e.g., 0xF000).
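
The arithmetic behind the 0xF000 example can be checked with a few lines
of C. The helper name is hypothetical and the sketch assumes power-of-two
page sizes; it only illustrates why a length aligned to the IOMMU page
size can still fail a check against a larger device page size.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if len is a whole number of pages; page_size must be a power of 2. */
static bool is_multiple_of_page(uint32_t len, uint32_t page_size)
{
    return (len & (page_size - 1)) == 0;
}
```

0xF000 is 15 * 4096, so it is a whole number of 4K pages, but 15 is odd,
so it is 7.5 pages of 8K and the 8K check fails.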

In this particular case of page sizes, we clearly want to use the
IOMMU's page size in the driver. And generally, the NVMe driver in this
function should be using the IOMMU's page size for the default device
page size, rather than the kernel's page size. There is not currently an
API to obtain the IOMMU's page size across all architectures and in the
interest of a stop-gap fix to this functional issue, default the NVMe
device page size to 4K, with the intent of adding such an API and
implementation across all architectures in the next merge window.

With the functionally equivalent v3 of this patch, our hardware test
exerciser survives when using 32-bit DMA; without the patch, the kernel
will BUG within a few minutes.

Signed-off-by: Nishanth Aravamudan <nacc at linux.vnet.ibm.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
(cherry picked from commit c5c9f25b98a568451d665afe4aeefe17bf9f2995)

Orabug: 22620486
Signed-off-by: Jason Luo <zhangqing.luo@oracle.com>
9 years agoNVMe: reap completion entries when deleting queue
Keith Busch [Fri, 20 Nov 2015 15:38:13 +0000 (08:38 -0700)]
NVMe: reap completion entries when deleting queue

Make sure that there are no unprocessed entries on a completion
queue before deleting it, and check for validity of the CQ
doorbell before writing completions to it.

This fixes problems with doing a sysfs reset of the device while
it's handling IO.

Tested-by: Jon Derrick <jonathan.derrick@intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
(cherry picked from commit 604e8c8da8854351496215d269c3fa93859e3fee)

Orabug: 22620486
Signed-off-by: Jason Luo <zhangqing.luo@oracle.com>
9 years agoNVMe: Fix possible arithmetic overflow for max segments
Keith Busch [Wed, 18 Nov 2015 23:33:08 +0000 (16:33 -0700)]
NVMe: Fix possible arithmetic overflow for max segments

Reported-by: Paul Grabinar <paul.grabinar@ranbarg.com>
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
(cherry picked from commit 6824c5ef5e8900e61ce8ed40885cacc1c9301c14)

Orabug: 22620486
Signed-off-by: Jason Luo <zhangqing.luo@oracle.com>
9 years agoNVMe: add support for Apple NVMe controller
Stephan Günther [Tue, 3 Nov 2015 23:49:45 +0000 (00:49 +0100)]
NVMe: add support for Apple NVMe controller

Add PCI ID of Apple's NVMe controller.

Signed-off-by: Stephan Guenther <guenther@tum.de>
Signed-off-by: Maurice Leclaire <leclaire@in.tum.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
(cherry picked from commit c74dc7801d515d01847fd5cf2b472489fa5717b1)

Orabug: 22620486
Signed-off-by: Jason Luo <zhangqing.luo@oracle.com>
9 years agoNVMe: use split lo_hi_{read,write}q
Stephan Günther [Sun, 8 Nov 2015 01:07:02 +0000 (18:07 -0700)]
NVMe: use split lo_hi_{read,write}q

Some controllers may require ordered split transfers even on 64bit
machines, e.g. Apple's NVMe controller as found in the MacBook8,1 and
MacBookAir7,1 (256/512GB models).

This patch enforces ordered split transfers on 64bit platforms, which
works around that issue for all controllers. As pointed out by Christoph
[1] there should be no performance impact due to that modification.

[1] http://lists.infradead.org/pipermail/linux-nvme/2015-November/002965.html
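
The idea of an ordered split transfer can be sketched in userspace C as
follows. The function names are hypothetical stand-ins modelled on the
kernel's lo_hi_readq/lo_hi_writeq helpers, and the sketch assumes the low
32 bits live at the lower address; a volatile array stands in for a
memory-mapped 64-bit register.

```c
#include <stdint.h>

/* Read a 64-bit register as two 32-bit accesses, low half first. */
static uint64_t lo_hi_readq_sketch(const volatile uint32_t *addr)
{
    uint32_t low = addr[0];
    uint32_t high = addr[1];

    return ((uint64_t)high << 32) | low;
}

/* Write a 64-bit register as two 32-bit accesses, low half first. */
static void lo_hi_writeq_sketch(uint64_t val, volatile uint32_t *addr)
{
    addr[0] = (uint32_t)val;
    addr[1] = (uint32_t)(val >> 32);
}
```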

Signed-off-by: Stephan Guenther <guenther@tum.de>
Signed-off-by: Maurice Leclaire <leclaire@in.tum.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Updated by me to explicitly use lo_hi_read/writeq instead of playing
define tricks.

Signed-off-by: Jens Axboe <axboe@fb.com>
(cherry picked from commit a310acd7a7ea53533886c11bb7edd11ffd61a036)

Orabug: 22620486
Signed-off-by: Jason Luo <zhangqing.luo@oracle.com>
9 years agoNVMe: Increase the max transfer size when mdts is 0
Sathyavathi M [Thu, 5 Nov 2015 19:52:28 +0000 (12:52 -0700)]
NVMe: Increase the max transfer size when mdts is 0

This patch addresses the issue where a 128KB IO from FIO is split into
two parts, 124KB and 4KB, due to the max transfer size (127KB). This
degrades device performance.

Signed-off-by: Sathyavathi M <sathya.m@samsung.com>
Acked-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
(cherry picked from commit b12363d0a5da00c422641f3d926fffb713192ea3)

Orabug: 22620486
Signed-off-by: Jason Luo <zhangqing.luo@oracle.com>
Conflicts:
drivers/nvme/host/pci.c

9 years agonvme: fix 32-bit build warning
Arnd Bergmann [Tue, 6 Oct 2015 20:29:48 +0000 (22:29 +0200)]
nvme: fix 32-bit build warning

Compiling the nvme driver on 32-bit warns about a cast from a __u64
variable to a pointer:

drivers/block/nvme-core.c: In function 'nvme_submit_io':
drivers/block/nvme-core.c:1847:4: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
    (void __user *)io.addr, length, NULL, 0);

The cast here is intentional and safe, so we can shut up the
gcc warning by adding an intermediate cast to 'uintptr_t'.
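
A minimal sketch of the intermediate cast, outside the kernel: the helper
name is hypothetical, and plain void * is used where the driver has a
__user-annotated pointer.

```c
#include <stdint.h>

/*
 * Convert a 64-bit integer carrying an address into a pointer. Going
 * through uintptr_t makes the narrowing explicit, so a 32-bit build
 * does not warn about a cast from an integer of a different size.
 */
static void *u64_to_ptr(uint64_t addr)
{
    return (void *)(uintptr_t)addr;
}
```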

I had previously submitted a patch to fix this problem in the
nvme driver, but it was accepted on the same day that two new
warnings got added.

For clarification, I also change the third instance of this cast
to use uintptr_t instead of unsigned long now.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: d29ec8241c10e ("nvme: submit internal commands through the block layer")
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
(cherry picked from commit 3d42e67fe5ebc1e5c3aae9b1037e38ec99a362cc)

Orabug: 22620486
Signed-off-by: Jason Luo <zhangqing.luo@oracle.com>
Conflicts:
drivers/nvme/host/pci.c

9 years agoNVMe: Add explicit block config dependency
Keith Busch [Mon, 12 Oct 2015 17:37:38 +0000 (11:37 -0600)]
NVMe: Add explicit block config dependency

The nvme driver was moved from drivers/block, losing our implicit
dependency on CONFIG_BLOCK. This makes it an explicit driver dependency.

Reported-by: Jim Davis <jim.epost@gmail.com>
Signed-off-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
(cherry picked from commit 11feb18f4edb1423ed6091908c45de7ade30d5b7)

Orabug: 22620486
Signed-off-by: Jason Luo <zhangqing.luo@oracle.com>
9 years agonvme: move to a new drivers/nvme/host directory
Jay Sternberg [Fri, 9 Oct 2015 16:17:06 +0000 (18:17 +0200)]
nvme: move to a new drivers/nvme/host directory

This patch moves the NVMe driver from drivers/block/ to its own new
drivers/nvme/host/ directory.  This is in preparation for splitting the
current monolithic driver up and adding support for the upcoming NVMe
over Fabrics standard.  The drivers/nvme/host/ name is chosen to leave
space for an NVMe target implementation in addition to this host side
driver.

Signed-off-by: Jay Sternberg <jay.e.sternberg@intel.com>
[hch: rebased, renamed core.c to pci.c, slight tweaks]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
(cherry picked from commit 57dacad5f2288e3de91f99b29f07b4a2793446d2)

Orabug: 22620486
Signed-off-by: Jason Luo <zhangqing.luo@oracle.com>
9 years agoblock: export blkdev_reread_part() and __blkdev_reread_part()
Jarod Wilson [Wed, 6 May 2015 04:26:22 +0000 (12:26 +0800)]
block: export blkdev_reread_part() and __blkdev_reread_part()

This patch exports blkdev_reread_part() for block drivers and also
introduces __blkdev_reread_part().

For some drivers, such as loop, reread of partitions can be run
from the release path, and bd_mutex may already be held prior to
calling ioctl_by_bdev(bdev, BLKRRPART, 0), so introduce
__blkdev_reread_part for use in such cases.

CC: Christoph Hellwig <hch@lst.de>
CC: Jens Axboe <axboe@kernel.dk>
CC: Tejun Heo <tj@kernel.org>
CC: Alexander Viro <viro@zeniv.linux.org.uk>
CC: Markus Pargmann <mpa@pengutronix.de>
CC: Stefan Weinhuber <wein@de.ibm.com>
CC: Stefan Haberland <stefan.haberland@de.ibm.com>
CC: Sebastian Ott <sebott@linux.vnet.ibm.com>
CC: Fabian Frederick <fabf@skynet.be>
CC: Ming Lei <ming.lei@canonical.com>
CC: David Herrmann <dh.herrmann@gmail.com>
CC: Andrew Morton <akpm@linux-foundation.org>
CC: Peter Zijlstra <peterz@infradead.org>
CC: nbd-general@lists.sourceforge.net
CC: linux-s390@vger.kernel.org
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jarod Wilson <jarod@redhat.com>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
(cherry picked from commit be32417796c2b8a83fe4cbece83bea96ab9e378f)

Needed by Nvme driver update

Orabug: 22620486

Signed-off-by: Jason Luo <zhangqing.luo@oracle.com>
9 years agoNvme: fix build error
Jason Luo [Wed, 3 Feb 2016 04:38:16 +0000 (12:38 +0800)]
Nvme: fix build error

error: implicit declaration of function 'nvme_identify'
which is introduced by b4145035630715ee821528f209f098

Orabug: 22620486
Signed-off-by: Jason Luo <zhangqing.luo@oracle.com>
9 years agoNvme: fix several build errors
Jason Luo [Wed, 3 Feb 2016 03:15:05 +0000 (11:15 +0800)]
Nvme: fix several build errors

'struct nvme_queue' has no member named 'hctx', which was introduced
by a55c4be4f9afc516368d3a45360f735887cf72fe.

'REQ_TYPE_DRV_PRIV' undeclared (first use in this function)
while it's redefined as REQ_TYPE_SPECIAL in uek4.

Orabug: 22620486

Signed-off-by: Jason Luo <zhangqing.luo@oracle.com>
9 years agonvme: properly handle partially initialized queues in nvme_create_io_queues
Christoph Hellwig [Fri, 2 Oct 2015 16:51:31 +0000 (18:51 +0200)]
nvme: properly handle partially initialized queues in nvme_create_io_queues

This avoids having to clean up later in a seemingly unrelated place.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
(cherry picked from commit 2659e57b906562bb020fb093b0c1b670b9700314)

Orabug: 22620486
Signed-off-by: Jason Luo <zhangqing.luo@oracle.com>
9 years agonvme: merge nvme_dev_start, nvme_dev_resume and nvme_async_probe
Christoph Hellwig [Sat, 3 Oct 2015 07:49:23 +0000 (09:49 +0200)]
nvme: merge nvme_dev_start, nvme_dev_resume and nvme_async_probe

And give the resulting function a sensible name.  This keeps all the
error handling in a single place and will allow for further improvements
to it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
(cherry picked from commit 3cf519b5a8d4d067e3de19736283c9414402d3a2)

Orabug: 22620486
Signed-off-by: Jason Luo <zhangqing.luo@oracle.com>
9 years agonvme: factor reset code into a common helper
Christoph Hellwig [Fri, 2 Oct 2015 16:49:23 +0000 (18:49 +0200)]
nvme: factor reset code into a common helper

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
(cherry picked from commit 90667892c5a78b47080359883a569a260e9e87ed)

Orabug: 22620486
Signed-off-by: Jason Luo <zhangqing.luo@oracle.com>
9 years agonvme: delete dev from dev_list in nvme_reset
Christoph Hellwig [Fri, 2 Oct 2015 16:48:36 +0000 (18:48 +0200)]
nvme: delete dev from dev_list in nvme_reset

Device resets need to delete the device from the device list before
kicking off the reset and re-probe, otherwise we get the device added
to the list twice.  nvme_reset is the only site missing this deletion
at the moment, and this patch adds it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
(cherry picked from commit 201cf1ecdfe5ea2774cbb21d4214c98ec8b418de)

Orabug: 22620486
Signed-off-by: Jason Luo <zhangqing.luo@oracle.com>
9 years agoNVMe: Namespace removal simplifications
Keith Busch [Fri, 2 Oct 2015 16:37:28 +0000 (10:37 -0600)]
NVMe: Namespace removal simplifications

This liberates namespace removal from the device, allowing gendisk
references to be closed independent of the nvme controller reference
count.

Signed-off-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
(cherry picked from commit 5105aa555c1c681ae281ea0d6108efd0a5d8a5e8)

Orabug: 22620486
Signed-off-by: Jason Luo <zhangqing.luo@oracle.com>
9 years agonvme: merge nvme_dev_reset into nvme_reset_failed_dev
Christoph Hellwig [Fri, 2 Oct 2015 15:41:18 +0000 (17:41 +0200)]
nvme: merge nvme_dev_reset into nvme_reset_failed_dev

And give the resulting function a more descriptive name.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
(cherry picked from commit 77b50d9e15e113fdb871218aa0f2e3bed12ee731)

Orabug: 22620486
Signed-off-by: Jason Luo <zhangqing.luo@oracle.com>
9 years agonvme.h: add missing nvme_id_ctrl endianness annotations
Christoph Hellwig [Fri, 2 Oct 2015 13:27:16 +0000 (15:27 +0200)]
nvme.h: add missing nvme_id_ctrl endianness annotations

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
(cherry picked from commit 08c69640cfcbdcc7aaed31c05bbfaf03bb60611c)

Orabug: 22620486
Signed-off-by: Jason Luo <zhangqing.luo@oracle.com>
9 years agonvme: move hardware structures out of the uapi version of nvme.h
Christoph Hellwig [Fri, 2 Oct 2015 13:25:49 +0000 (15:25 +0200)]
nvme: move hardware structures out of the uapi version of nvme.h

Currently all NVMe command and completion structures are exposed to userspace
through the uapi version of nvme.h.  They are not an ABI between the kernel
and userspace, and will change in a C-incompatible way for future versions of
the spec.  Move them to the kernel version of the file and rename the uapi
header to nvme_ioctl.h so that userspace can easily detect the presence of
the new clean header.  Nvme-cli already carries a local copy of the header,
so it won't be affected by this move.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
(cherry picked from commit 9d99a8dda154f38307d43d9c9aa504bd3703d596)

Orabug: 22620486
Signed-off-by: Jason Luo <zhangqing.luo@oracle.com>
9 years agonvme: add a local nvme.h header
Christoph Hellwig [Sat, 3 Oct 2015 13:46:41 +0000 (15:46 +0200)]
nvme: add a local nvme.h header

Add a new drivers/block/nvme.h which contains all the driver-internal
interfaces.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
(cherry picked from commit f11bb3e244c4b14e2d0a3b9d7e41895752997170)

Orabug: 22620486
Signed-off-by: Jason Luo <zhangqing.luo@oracle.com>
9 years agoNVMe: Simplify device resume on io queue failure
Keith Busch [Fri, 2 Oct 2015 16:37:29 +0000 (10:37 -0600)]
NVMe: Simplify device resume on io queue failure

Releasing IO queues and disks was done in a work queue outside the
controller resume context to delete namespaces if the controller failed
after a resume from suspend. This is unnecessary since we can resume
a device asynchronously.

This patch makes resume use probe_work so it can directly remove
namespaces if the device is manageable but not IO capable. Since
deleting disks was the only reason we had the convoluted "reset_workfn",
this patch removes that unnecessary indirection.

Signed-off-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
(cherry picked from commit 0a7385ad69f0f210c5cfbfd334b42423a6e05e5a)

Orabug: 22620486
Signed-off-by: Jason Luo <zhangqing.luo@oracle.com>
9 years agoNVMe: Reference count open namespaces
Keith Busch [Thu, 1 Oct 2015 23:14:10 +0000 (17:14 -0600)]
NVMe: Reference count open namespaces

Dynamic namespace attachment means the namespace may be removed at any
time, so the namespace reference count cannot be tied to the device
reference count. This fixes a NULL dereference if an opened namespace
is detached from a controller.

Signed-off-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
(cherry picked from commit 188c3568f814fea965947ed24739987ba9c5a87e)

Orabug: 22620486
Signed-off-by: Jason Luo <zhangqing.luo@oracle.com>
9 years agoNVMe: Using PRACT bit to generate and verify PI by controller
Alok Pandey [Wed, 26 Aug 2015 14:56:14 +0000 (08:56 -0600)]
NVMe: Using PRACT bit to generate and verify PI by controller

This patch enables the PRCHK and reftag support when PRACT bit is set, and
block layer integrity is disabled.

Signed-off-by: Alok Pandey <pandey.alok@samsung.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
(cherry picked from commit e19b127f5b76ec03b9c52b64f117dc75bb39eda1)

Orabug: 22620486
Signed-off-by: Jason Luo <zhangqing.luo@oracle.com>
9 years agoNVMe: removed unused nn var from nvme_dev_add
Matias Bjørling [Tue, 18 Aug 2015 16:13:41 +0000 (10:13 -0600)]
NVMe: removed unused nn var from nvme_dev_add

The logic in nvme_dev_add to enumerate namespaces was moved to
nvme_dev_scan. After the move, the nn variable is no longer used. This patch
removes it.

Fixes: a5768aai ("NVMe: Automatic namespace rescan")
Signed-off-by: Matias Bjørling <m@bjorling.me>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
(cherry picked from commit b2b1ec9b55ed0840956db15f823c4a73383c08be)

Orabug: 22620486
Signed-off-by: Jason Luo <zhangqing.luo@oracle.com>
9 years agoNVMe: Set queue max segments
Keith Busch [Wed, 12 Aug 2015 22:17:54 +0000 (16:17 -0600)]
NVMe: Set queue max segments

This sets the queue's maximum number of segments to match the device's
capabilities. The default of 128 is usable until a device's transfer
capability exceeds 512k, assuming a device page size of 4k. Many nvme
devices exceed that transfer limit, so this lets the block layer know what
kind of commands it is allowed to form rather than unnecessarily splitting them.

One additional segment is added to account for a transfer that may start
in the middle of a page.
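
The arithmetic above can be sketched as follows. This is an illustrative userspace helper under the assumptions stated in the message (one segment per device page, plus one for an unaligned start); the function name is hypothetical and it is not the driver's code.

```c
#include <stdint.h>

/* Number of segments needed to cover the device's largest transfer when
 * each segment maps one device page. The extra segment covers a transfer
 * that starts in the middle of a page and therefore spills into one more
 * page than an aligned transfer of the same length. */
static uint32_t demo_max_segments(uint32_t max_transfer_bytes,
				  uint32_t page_size)
{
	return max_transfer_bytes / page_size + 1;
}
```

With a 4k page, the default of 128 segments covers 128 * 4096 = 512k, which is where the limit quoted in the message comes from. A device that can transfer 1M would get 257 segments from this helper.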

Signed-off-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
(cherry picked from commit e824410ffcf4b245296b56c6fdf7b9797fce8c3e)

Orabug: 22620486
Signed-off-by: Jason Luo <zhangqing.luo@oracle.com>
9 years agoNVMe: Add nvme subsystem reset IOCTL
Jon Derrick [Mon, 10 Aug 2015 21:20:41 +0000 (15:20 -0600)]
NVMe: Add nvme subsystem reset IOCTL

Controllers can perform optional subsystem resets as introduced in NVMe
1.1. This patch adds an IOCTL to trigger the subsystem reset by writing
"NVMe" to the NSSR register.
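
For illustration, the four ASCII characters "NVMe" pack into a single 32-bit value as sketched below. The helper name is hypothetical, and the sketch only shows how the value is formed, not how the driver writes the register.

```c
#include <stdint.h>

/* Pack the ASCII characters 'N', 'V', 'M', 'e' into one 32-bit word,
 * most significant byte first: 0x4E, 0x56, 0x4D, 0x65. */
static uint32_t demo_nssr_magic(void)
{
	return ((uint32_t)'N' << 24) | ((uint32_t)'V' << 16) |
	       ((uint32_t)'M' << 8) | (uint32_t)'e';
}
```

The result is 0x4E564D65, the value the NVMe specification defines for triggering a subsystem reset through NSSR.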

Signed-off-by: Jon Derrick <jonathan.derrick@intel.com>
Acked-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
(cherry picked from commit 81f03fedcce7ee7e83c37237ecaa2f68aad236fd)

Orabug: 22620486
Signed-off-by: Jason Luo <zhangqing.luo@oracle.com>
9 years agoNVMe: Add nvme subsystem reset support
Keith Busch [Mon, 10 Aug 2015 21:20:40 +0000 (15:20 -0600)]
NVMe: Add nvme subsystem reset support

Controllers that are part of an NVMe subsystem may be reset by any other controller
in the subsystem. If the device is capable of subsystem resets, this
patch adds detection for such events and performs appropriate controller
initialization upon subsystem reset detection.

The register bit is a RW1C type, so the driver needs to write a 1 to the
status bit to clear the subsystem reset occurred bit during initialization.
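
A small software model of RW1C (read, write-1-to-clear) semantics may make the paragraph above easier to follow. Everything here is an assumption for illustration: the register is simulated in memory, the names are hypothetical, and every bit is treated as RW1C.

```c
#include <stdint.h>

/* Illustrative bit for "subsystem reset occurred"; treat the position
 * as an assumption of this sketch. */
#define DEMO_STATUS_RESET_OCCURRED (1u << 4)

/* In-memory model of a status register whose bits are all RW1C. */
struct demo_reg {
	uint32_t value;
};

/* Writing a 1 to a bit clears it; writing a 0 leaves it unchanged. */
static void demo_rw1c_write(struct demo_reg *reg, uint32_t written)
{
	reg->value &= ~written;
}
```

Writing DEMO_STATUS_RESET_OCCURRED clears only that bit, while writing 0 changes nothing. This is why the bit has to be cleared with an explicit write of 1 rather than by writing back a zeroed value.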

Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
(cherry picked from commit dfbac8c7ac5f58448b2216fe42ff52aaf175421d)

Orabug: 22620486
Signed-off-by: Jason Luo <zhangqing.luo@oracle.com>
9 years agoNVMe:Remove unreachable code in nvme_abort_req
Sunad Bhandary [Fri, 31 Jul 2015 13:26:58 +0000 (18:56 +0530)]
NVMe:Remove unreachable code in nvme_abort_req

Removing unreachable code from nvme_abort_req as nvme_submit_cmd has no
failure status to return.

Signed-off-by: Sunad Bhandary <sunad.s@samsung.com>
Acked-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
(cherry picked from commit e3f879bf1ea3e03f433d292b0114807785f0754b)

Orabug: 22620486
Signed-off-by: Jason Luo <zhangqing.luo@oracle.com>
9 years agonvme: Fixes u64 division which breaks i386 builds
Jon Derrick [Tue, 21 Jul 2015 21:08:13 +0000 (15:08 -0600)]
nvme: Fixes u64 division which breaks i386 builds

Uses div_u64 for u64 division and round_down, a bitwise operation,
instead of rounddown, which uses a modulus.
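
The difference between the two rounding approaches can be sketched in userspace C. These are simplified stand-ins with hypothetical names, not the kernel macros themselves. On i386 a plain 64-bit division or modulus makes the compiler emit calls to helper routines that the kernel does not link against, which is why the mask form is preferred when the alignment is a power of two.

```c
#include <assert.h>
#include <stdint.h>

/* Round down to a power-of-two boundary with a mask: no division or
 * modulus is involved, so no 64-bit divide helper is needed. */
static uint64_t demo_round_down_pow2(uint64_t x, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return x & ~(align - 1);
}

/* General round-down for any non-zero y: the % operator on 64-bit
 * operands is what requires a division helper on 32-bit targets. */
static uint64_t demo_rounddown_mod(uint64_t x, uint64_t y)
{
	return x - (x % y);
}
```

Both helpers agree whenever the second argument is a power of two, for example rounding 10000 down to a multiple of 4096 gives 8192 either way.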

Signed-off-by: Jon Derrick <jonathan.derrick@intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
(cherry picked from commit c45f5c9943ce0b16b299b543c2aae12408039027)

Orabug: 22620486
Signed-off-by: Jason Luo <zhangqing.luo@oracle.com>
9 years agoNVMe: Use CMB for the IO SQes if available
Jon Derrick [Mon, 20 Jul 2015 16:14:09 +0000 (10:14 -0600)]
NVMe: Use CMB for the IO SQes if available

Some controllers have a controller-side memory buffer available for use
for submissions, completions, lists, or data.

If a CMB is available, the entire CMB will be ioremapped and the driver will
attempt to map the IO SQes onto the CMB. The queues will be shrunk as
needed. The CMB will not be used if the queue depth is shrunk below some
threshold where it may have reduced performance over a larger queue
in system memory.

Signed-off-by: Jon Derrick <jonathan.derrick@intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
(cherry picked from commit 8ffaadf7429270914b8f146ec13cf305e01df20d)

Orabug: 22620486
Signed-off-by: Jason Luo <zhangqing.luo@oracle.com>