www.infradead.org Git - users/jedix/linux-maple.git/log

timekeeping: Add missing update call in timekeeping_resume()

This is a backport of 3e997130bd2e8c6f5aaa49d6e3161d4d29b43ab0

The leap second rework unearthed another issue of inconsistent data.

On timekeeping_resume() the timekeeper data is updated, but nothing
calls timekeeping_update(), so now the update code in the timer
interrupt sees stale values.

This has been the case before those changes, but then the timer
interrupt was using stale data as well so this went unnoticed for quite
some time.

Add the missing update call, so all the data is consistent everywhere.

Reported-by: Andreas Schwab <schwab@linux-m68k.org>
Reported-and-tested-by: "Rafael J. Wysocki" <rjw@sisk.pl>
Reported-and-tested-by: Martin Steigerwald <Martin@lichtvoll.de>
Cc: John Stultz <johnstul@us.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Prarit Bhargava <prarit@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <johnstul@us.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <johnstul@us.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 0851978b661f25192ff763289698f3175b1bab42)

Signed-off-by: Joe Jin <joe.jin@oracle.com>

hrtimer: Update hrtimer base offsets each hrtimer_interrupt

This is a backport of 5baefd6d84163443215f4a99f6a20f054ef11236

The update of the hrtimer base offsets on all cpus cannot be made
atomically from the timekeeper.lock held and interrupt disabled region
as smp function calls are not allowed there.

clock_was_set(), which enforces the update on all cpus, is called
either from preemptible process context in case of do_settimeofday()
or from the softirq context when the offset modification happened in
the timer interrupt itself due to a leap second.

In both cases there is a race window for an hrtimer interrupt between
dropping timekeeper lock, enabling interrupts and clock_was_set()
issuing the updates. Any interrupt which arrives in that window will
see the new time but operate on stale offsets.

So we need to make sure that an hrtimer interrupt always sees a
consistent state of time and offsets.

ktime_get_update_offsets() allows us to get the current monotonic time
and update the per cpu hrtimer base offsets from hrtimer_interrupt()
to capture a consistent state of monotonic time and the offsets. The
function replaces the existing ktime_get() calls in hrtimer_interrupt().

The overhead of the new function vs. ktime_get() is minimal as it just
adds two store operations.

This ensures that any changes to realtime or boottime offsets are
noticed and stored into the per-cpu hrtimer base structures, prior to
any hrtimer expiration and guarantees that timers are not expired early.
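
A minimal user-space sketch of the idea, assuming a simplified timekeeper that stores plain nanosecond values. `struct tk_model` and `tk_get_update_offsets()` are hypothetical names used only for illustration; the kernel function does its reads under the timekeeper lock, which is omitted here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the timekeeper state. */
struct tk_model {
	int64_t mono_ns;      /* current CLOCK_MONOTONIC time */
	int64_t offs_real_ns; /* CLOCK_REALTIME minus CLOCK_MONOTONIC */
	int64_t offs_boot_ns; /* CLOCK_BOOTTIME minus CLOCK_MONOTONIC */
};

/*
 * Return the monotonic time and copy both offsets in the same step, so
 * the caller never pairs a new time with old offsets. Compared with a
 * plain "get time" call this only adds the two stores below.
 */
int64_t tk_get_update_offsets(const struct tk_model *tk,
			      int64_t *offs_real, int64_t *offs_boot)
{
	*offs_real = tk->offs_real_ns;
	*offs_boot = tk->offs_boot_ns;
	return tk->mono_ns;
}
```

In this model the interrupt handler would call the function once at entry and use the returned time together with the freshly stored offsets for every expiry check.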

Signed-off-by: John Stultz <johnstul@us.ibm.com>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Prarit Bhargava <prarit@redhat.com>
Link: http://lkml.kernel.org/r/1341960205-56738-8-git-send-email-johnstul@us.ibm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <johnstul@us.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit bb6ed34f2a6eeb40608b8ca91f3ec90ec9dca26f)

Signed-off-by: Joe Jin <joe.jin@oracle.com>

timekeeping: Provide hrtimer update function

This is a backport of f6c06abfb3972ad4914cef57d8348fcb2932bc3b

To finally fix the infamous leap second issue and other race windows
caused by functions which change the offsets between the various time
bases (CLOCK_MONOTONIC, CLOCK_REALTIME and CLOCK_BOOTTIME) we need a
function which atomically gets the current monotonic time and updates
the offsets of CLOCK_REALTIME and CLOCK_BOOTTIME with minimalistic
overhead. The previous patch which provides ktime_t offsets allows us
to make this function almost as cheap as ktime_get() which is going to
be replaced in hrtimer_interrupt().

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Prarit Bhargava <prarit@redhat.com>
Signed-off-by: John Stultz <johnstul@us.ibm.com>
Link: http://lkml.kernel.org/r/1341960205-56738-7-git-send-email-johnstul@us.ibm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <johnstul@us.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 22f4bbcfb131e2392c78ad67af35fdd436d4dd54)

Signed-off-by: Joe Jin <joe.jin@oracle.com>

hrtimers: Move lock held region in hrtimer_interrupt()

This is a backport of 196951e91262fccda81147d2bcf7fdab08668b40

We need to update the base offsets from this code and we need to do
that under base->lock. Move the lock held region around the
ktime_get() calls. The ktime_get() calls are going to be replaced with
a function which gets the time and the offsets atomically.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Prarit Bhargava <prarit@redhat.com>
Signed-off-by: John Stultz <johnstul@us.ibm.com>
Link: http://lkml.kernel.org/r/1341960205-56738-6-git-send-email-johnstul@us.ibm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <johnstul@us.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 6c89f2ce05ea7e26a7580ad9eb950f2c4f10891b)

Signed-off-by: Joe Jin <joe.jin@oracle.com>

timekeeping: Maintain ktime_t based offsets for hrtimers

This is a backport of 5b9fe759a678e05be4937ddf03d50e950207c1c0

We need to update the hrtimer clock offsets from the hrtimer interrupt
context. To avoid conversions from timespec to ktime_t maintain a
ktime_t based representation of those offsets in the timekeeper. This
puts the conversion overhead into the code which updates the
underlying offsets and provides fast accessible values in the hrtimer
interrupt.
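
A rough sketch of the trade-off, assuming a seconds/nanoseconds pair as the primary representation and a cached scalar nanosecond copy. All names (`struct ts_model`, `struct tk_offsets`, `tk_set_wall_to_mono()`, `tk_offs_real()`) are hypothetical, and the sign convention (realtime offset is the negated wall-to-monotonic value) is an assumption of the sketch.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000LL

/* Hypothetical seconds/nanoseconds pair, mirroring a timespec. */
struct ts_model {
	int64_t sec;
	int64_t nsec;
};

/* Hypothetical timekeeper slice: the pair plus a cached scalar copy. */
struct tk_offsets {
	struct ts_model wall_to_mono; /* added to wall time to get monotonic */
	int64_t offs_real_ns;         /* cached: realtime minus monotonic, in ns */
};

/* Slow path: convert once, whenever the underlying offset changes. */
void tk_set_wall_to_mono(struct tk_offsets *tk, struct ts_model wtm)
{
	tk->wall_to_mono = wtm;
	tk->offs_real_ns = -(wtm.sec * NSEC_PER_SEC + wtm.nsec);
}

/* Fast path (interrupt): a plain load, no multiplication. */
int64_t tk_offs_real(const struct tk_offsets *tk)
{
	return tk->offs_real_ns;
}
```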

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <johnstul@us.ibm.com>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Prarit Bhargava <prarit@redhat.com>
Link: http://lkml.kernel.org/r/1341960205-56738-4-git-send-email-johnstul@us.ibm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <johnstul@us.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 03a90b9a6f7eec70edde4eb1f88fa8a5c058d85e)

Signed-off-by: Joe Jin <joe.jin@oracle.com>

timekeeping: Fix leapsecond triggered load spike issue

This is a backport of 4873fa070ae84a4115f0b3c9dfabc224f1bc7c51

The timekeeping code misses an update of the hrtimer subsystem after a
leap second happened. Due to that timers based on CLOCK_REALTIME are
either expiring a second early or late depending on whether a leap
second has been inserted or deleted until an operation is initiated
which causes that update. Unless the update happens by some other
means this discrepancy between the timekeeping and the hrtimer data
stays forever and timers are expired either early or late.
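
The effect of a stale offset can be illustrated with a small model. It assumes that a CLOCK_REALTIME timer is checked against monotonic time plus a cached realtime offset; `rt_timer_expired()` is a hypothetical helper, not a kernel function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000LL

/*
 * A CLOCK_REALTIME timer counts as expired once the monotonic time plus
 * the cached realtime offset reaches its absolute expiry time.
 */
bool rt_timer_expired(int64_t mono_ns, int64_t offs_real_ns,
		      int64_t expires_real_ns)
{
	return mono_ns + offs_real_ns >= expires_real_ns;
}
```

With an inserted leap second the true offset shrinks by one second. A caller that still passes the old, larger offset sees the timer as expired one second before the real clock reaches the expiry time.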

The reported immediate workaround - $ date -s "`date`" - is causing a
call to clock_was_set() which updates the hrtimer data structures.
See: http://www.sheeri.com/content/mysql-and-leap-second-high-cpu-and-fix

Add the missing clock_was_set() call to update_wall_time() in case of
a leap second event. The actual update is deferred to softirq context
as the necessary smp function call cannot be invoked from hard
interrupt context.

Signed-off-by: John Stultz <johnstul@us.ibm.com>
Reported-by: Jan Engelhardt <jengelh@inai.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Prarit Bhargava <prarit@redhat.com>
Link: http://lkml.kernel.org/r/1341960205-56738-3-git-send-email-johnstul@us.ibm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <johnstul@us.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit d21e4baf4523fec26e3c70cb78b013ad3b245c83)

Signed-off-by: Joe Jin <joe.jin@oracle.com>

hrtimer: Provide clock_was_set_delayed()

This is a backport of f55a6faa384304c89cfef162768e88374d3312cb

clock_was_set() cannot be called from hard interrupt context because
it calls on_each_cpu().

For fixing the widely reported leap seconds issue it is necessary to
call it from hard interrupt context, i.e. the timer tick code, which
does the timekeeping updates.

Provide a new function which denotes it in the hrtimer cpu base
structure of the cpu on which it is called and raise the hrtimer
softirq. We then execute the clock_was_set() notification from
softirq context in run_hrtimer_softirq(). The hrtimer softirq is
rarely used, so polling the flag there is not a performance issue.
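
The deferral pattern described above can be sketched in user space as follows. `struct cpu_base_model` and both functions are hypothetical stand-ins; the counter replaces the real clock_was_set() call and a boolean replaces raising the softirq.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-CPU state; the real one is the hrtimer cpu base. */
struct cpu_base_model {
	bool clock_was_set;  /* set in hard irq, consumed in softirq */
	bool softirq_raised; /* stands in for raising the hrtimer softirq */
	int updates_done;    /* counts simulated clock_was_set() runs */
};

/* Hard interrupt side: cheap, only records that an update is needed. */
void model_clock_was_set_delayed(struct cpu_base_model *b)
{
	b->clock_was_set = true;
	b->softirq_raised = true;
}

/* Softirq side: poll the flag and do the expensive notification here. */
void model_run_hrtimer_softirq(struct cpu_base_model *b)
{
	b->softirq_raised = false;
	if (b->clock_was_set) {
		b->clock_was_set = false;
		b->updates_done++; /* stands in for clock_was_set() */
	}
}
```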

[ tglx: Made it depend on CONFIG_HIGH_RES_TIMERS. We really should get
rid of all this ifdeffery ASAP ]

Signed-off-by: John Stultz <johnstul@us.ibm.com>
Reported-by: Jan Engelhardt <jengelh@inai.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Prarit Bhargava <prarit@redhat.com>
Link: http://lkml.kernel.org/r/1341960205-56738-2-git-send-email-johnstul@us.ibm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <johnstul@us.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 62b787f886e2d96cc7c5428aeee05dbe32a9531b)

Signed-off-by: Joe Jin <joe.jin@oracle.com>

time: Move common updates to a function

This is a backport of cc06268c6a87db156af2daed6e96a936b955cc82

While not a bugfix itself, it allows following fixes to backport
in a more straightforward manner.

CC: Thomas Gleixner <tglx@linutronix.de>
CC: Eric Dumazet <eric.dumazet@gmail.com>
CC: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit c7e2580578671c4d19a1a83e6fdb2482cc136283)

Signed-off-by: Joe Jin <joe.jin@oracle.com>

timekeeping: Fix CLOCK_MONOTONIC inconsistency during leapsecond

This is a backport of fad0c66c4bb836d57a5f125ecd38bed653ca863a
which resolves a bug in the previous commit.

Commit 6b43ae8a61 (ntp: Fix leap-second hrtimer livelock) broke the
leapsecond update of CLOCK_MONOTONIC. The missing leapsecond update to
wall_to_monotonic causes discontinuities in CLOCK_MONOTONIC.

Adjust wall_to_monotonic when NTP inserted a leapsecond.
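
A small model of why the adjustment is needed, assuming whole seconds and the relation monotonic = wall + wall_to_monotonic. The struct, the function names and the sign convention of `leap` are chosen for readability here and are not taken from the kernel source.

```c
#include <assert.h>
#include <stdint.h>

struct clk_model {
	int64_t wall_sec;         /* CLOCK_REALTIME seconds */
	int64_t wall_to_mono_sec; /* monotonic = wall + wall_to_mono */
};

int64_t clk_monotonic(const struct clk_model *c)
{
	return c->wall_sec + c->wall_to_mono_sec;
}

/* leap = +1 for an inserted second (wall repeats), -1 for a deleted one. */
void clk_apply_leap(struct clk_model *c, int leap)
{
	c->wall_sec -= leap;
	c->wall_to_mono_sec += leap; /* without this, monotonic would jump */
}
```

Because the two fields move in opposite directions, the monotonic value is the same before and after the leap second is applied.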

Reported-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Tested-by: Richard Cochran <richardcochran@gmail.com>
Link: http://lkml.kernel.org/r/1338400497-12420-1-git-send-email-john.stultz@linaro.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <johnstul@us.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit c33f2424c3941986d402c81d380d4e805870a20f)
Conflicts:
kernel/time/timekeeping.c

Signed-off-by: Joe Jin <joe.jin@oracle.com>

ntp: Correct TAI offset during leap second

This is a backport of dd48d708ff3e917f6d6b6c2b696c3f18c019feed

When repeating a UTC time value during a leap second (when the UTC
time should be 23:59:60), the TAI timescale should not stop. The kernel
NTP code increments the TAI offset one second too late. This patch fixes
the issue by incrementing the offset during the leap second itself.
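
The intended behaviour can be sketched with a one-second tick model. It assumes TAI is derived as UTC plus an integer offset; `struct tai_model`, `tai_now()` and `tick_second()` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

struct tai_model {
	int64_t utc_sec; /* UTC seconds counter */
	int tai_offset;  /* TAI minus UTC, in seconds */
};

int64_t tai_now(const struct tai_model *t)
{
	return t->utc_sec + t->tai_offset;
}

/*
 * Advance by one second. During an inserted leap second the UTC value
 * repeats, so the offset is bumped in that same step and TAI keeps
 * counting.
 */
void tick_second(struct tai_model *t, int insert_leap_now)
{
	if (insert_leap_now)
		t->tai_offset++;
	else
		t->utc_sec++;
}
```

If the offset were bumped one tick later instead, `tai_now()` would return the same value twice in a row, which is the stall the patch describes.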

Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 96bab736bad82423c2b312d602689a9078481fa9)

Signed-off-by: Joe Jin <joe.jin@oracle.com>

Revert "3.0.x: hrtimer: Fix clock_was_set so it is safe to call from irq context"

This reverts commit c51e012012e48ca262d4b489e33bc113bb5ac74d.

Revert "3.0.x: time: Fix leapsecond triggered hrtimer/futex load spike issue"

This reverts commit aac67aba83c32bd03f4b59bdd932a076afbee089.

Revert "3.0.x: hrtimer: Update hrtimer base offsets each hrtimer_interrupt"

This reverts commit 54b16ee687c86dfd6c94e49bdaa1535a3bf3cc9f.

SPEC: v2.6.39-300.6.0

Signed-off-by: Maxim Uvarov <maxim.uvarov@oracle.com>

Merge branch 'uek2-merge' of git://ca-git.us.oracle.com/linux-konrad-public

[kabi] update kabi

Signed-off-by: Maxim Uvarov <maxim.uvarov@oracle.com>

[config] clean up NBD settings in kernel config

Signed-off-by: Maxim Uvarov <maxim.uvarov@oracle.com>

Merge branch 'stable/for-linus-3.7.rebased' into uek2-merge

* stable/for-linus-3.7.rebased:
xen/p2m: When revectoring deal with holes in the P2M array.
xen/mmu: Recycle the Xen provided L4, L3, and L2 pages

Fixes Oracle-Bug: 14577662

xen/p2m: When revectoring deal with holes in the P2M array.

When we free the PFNs and then subsequently populate them back
during bootup:

Freeing 20000-20200 pfn range: 512 pages freed
1-1 mapping on 20000->20200
Freeing 40000-40200 pfn range: 512 pages freed
1-1 mapping on 40000->40200
Freeing bad80-badf4 pfn range: 116 pages freed
1-1 mapping on bad80->badf4
Freeing badf6-bae7f pfn range: 137 pages freed
1-1 mapping on badf6->bae7f
Freeing bb000-100000 pfn range: 282624 pages freed
1-1 mapping on bb000->100000
Released 283999 pages of unused memory
Set 283999 page(s) to 1-1 mapping
Populating 1acb8a-1f20e9 pfn range: 283999 pages added

We end up having the P2M array (that is the one that was
grafted on the P2M tree) filled with IDENTITY_FRAME or
INVALID_P2M_ENTRY entries. The patch titled

"xen/p2m: Reuse existing P2M leafs if they are filled with 1:1 PFNs or INVALID."
recycles said slots and replaces the P2M tree leaves that point to
&mfn_list[xx] with p2m_identity or p2m_missing.

It also re-uses the P2M array sections for other P2M tree leaves.
For the above mentioned bootup excerpt, the PFNs at
0x20000->0x20200 are going to be IDENTITY based:

P2M[0][256][0] -> P2M[0][257][0] get turned into IDENTITY_FRAME.

We can re-use that and replace P2M[0][256] to point to p2m_identity.
The "old" page (the grafted P2M array provided by Xen) that was at
P2M[0][256] gets put somewhere else. Specifically at P2M[6][358],
b/c when we populate back:

Populating 1acb8a-1f20e9 pfn range: 283999 pages added

we fill P2M[6][358][0] (and P2M[6][358], P2M[6][359], ...) with
the new MFNs.

That is all OK, except when we revector we assume that the PFN
count would be the same in the grafted P2M array and in the
newly allocated. Since that is no longer the case, as we have
holes in the P2M that point to p2m_missing or p2m_identity we
have to take that into account.
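
The P2M[top][mid][idx] indices quoted above follow from simple arithmetic. The sketch below assumes a three-level tree with 512 entries per level (4 KiB pages, 8-byte entries); `P2M_PER_PAGE`, `struct p2m_index` and `p2m_index_of()` are illustrative names, not the kernel's.

```c
#include <assert.h>

/* Assumption: 4 KiB pages and 8-byte entries, so 512 entries per level. */
#define P2M_PER_PAGE 512UL

struct p2m_index {
	unsigned long top, mid, idx;
};

/* Split a PFN into the three indices of the P2M tree. */
struct p2m_index p2m_index_of(unsigned long pfn)
{
	struct p2m_index i;

	i.top = pfn / (P2M_PER_PAGE * P2M_PER_PAGE);
	i.mid = (pfn / P2M_PER_PAGE) % P2M_PER_PAGE;
	i.idx = pfn % P2M_PER_PAGE;
	return i;
}
```

Under these assumptions PFN 0x20000 maps to P2M[0][256][0] and PFN 0x20200 to P2M[0][257][0], and P2M[6][358][0] corresponds to PFN 0x1acc00, the first leaf boundary above the start of the populated range at 0x1acb8a.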

[v2: Check for overflow]
[v3: Move within the __va check]
[v4: Fix the computation]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
(cherry picked from commit 3fc509fc0c590900568ef516a37101d88f3476f5)

Conflicts:

arch/x86/xen/p2m.c

xen/mmu: Recycle the Xen provided L4, L3, and L2 pages

As we are not using them. We end up only using the L1 pagetables
and grafting those to our page-tables.

[v1: Per Stefano's suggestion squashed two commits]
[v2: Per Stefano's suggestion simplified loop]
[v3: Fix smatch warnings]
[v4: Add more comments]
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
(cherry picked from commit 488f046df922af992c1a718eff276529c0510885)

Conflicts:

arch/x86/xen/mmu.c

Merge branch 'uek2-2.6.39-300-be2iscsi-update' of git://ca-git.us.oracle.com/linux-snits-public

qla2xxx: Update the driver version to 8.04.00.08.39.0-k.

Bugdb: 13653
Signed-off-by: Saurav Kashyap <saurav.kashyap@qlogic.com>

qla2xxx: Correct loop_id_map allocation-size and usage.

Bugdb: 13653
Original code incorrectly assigned LOOPID_MAP_SIZE to be the
allocation size in bytes rather than total bit size.
Additionally corrected code to check for bit-allocation failure
in qla2x00_find_new_loop_id().
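
The bits-versus-bytes distinction can be sketched generically. This is not the driver code: `bitmap_bytes()` and `bitmap_claim_first_zero()` are hypothetical helpers showing how a map sized in bits translates into an allocation size in bytes, and how a failed bit allocation is reported.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define BITS_PER_ULONG (sizeof(unsigned long) * CHAR_BIT)

/* Bytes needed to hold nbits bits, rounded up to whole unsigned longs. */
size_t bitmap_bytes(size_t nbits)
{
	return ((nbits + BITS_PER_ULONG - 1) / BITS_PER_ULONG) *
	       sizeof(unsigned long);
}

/* Find and claim the first clear bit; returns -1 if the map is full. */
long bitmap_claim_first_zero(unsigned long *map, size_t nbits)
{
	for (size_t i = 0; i < nbits; i++) {
		unsigned long mask = 1UL << (i % BITS_PER_ULONG);

		if (!(map[i / BITS_PER_ULONG] & mask)) {
			map[i / BITS_PER_ULONG] |= mask;
			return (long)i;
		}
	}
	return -1;
}
```

A caller has to check for the -1 return before using the result as an ID, which corresponds to the bit-allocation failure check mentioned above.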

JIRA Key: V2632FC-270

dm mpath: delay retry of bypassed pg

Orabug: 14478983
If I/O needs retrying and only bypassed priority groups are available,
set the pg_init_delay_retry flag to wait before retrying.

If, for example, the reason for the bypass is that the controller is
getting reset or there is a firmware upgrade happening, retrying right
away would cause a flood of log messages and retries for what could be a
few seconds or even several minutes.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Acked-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

[kabi] update kabi for ASM and ACFS

Orabug: 14547312
Bugz: 13687, 13688, 13689
Signed-off-by: Maxim Uvarov <maxim.uvarov@oracle.com>

Merge branch 'uek-2.6.39-300-nic-drv-update' of git://ca-git.us.oracle.com/linux-joejin-public

Network drivers update.

Signed-off-by: Maxim Uvarov <maxim.uvarov@oracle.com>

be2iscsi: Bump the driver version.

Patch sent to upstream kernel.

Signed-off-by: John Soni Jose <sony.john-n@emulex.com>
Signed-off-by: Jayamohan Kallickal <jayamohan.kallickal@emulex.com>

be2iscsi: Fix a kernel panic because of TCP RST/FIN received.

A TCP RST/FIN can be received even before the connection-specific
structures are initialized. This fix checks whether the conn structure
is initialized when a RST/FIN is received.

Signed-off-by: John Soni Jose <sony.john-n@emulex.com>
Signed-off-by: Jayamohan Kallickal <jayamohan.kallickal@emulex.com>

be2iscsi: Configure the VLAN settings on the adapter.

Configure the VLAN parameters on the adapter using the iscsiadm
interface.

Patch submitted to upstream kernel.

Signed-off-by: John Soni Jose <sony.john-n@emulex.com>
Signed-off-by: Jayamohan Kallickal <jayamohan.kallickal@emulex.com>

be2iscsi: Format the MAC_ADDR with sysfs_format_mac.

The MAC_ADDR stored in the driver private structure is of
unsigned char data type, but the strlcpy parameters are of
signed char data type. This conversion of data types
led to a change in the value. The changed value was passed
to the upper layer, and junk characters were displayed
when the "iscsiadm -m iface" command was run.

In the case of iSCSI boot, since the MAC_ADDR came out as
junk, the boot did not work either.
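
For illustration, a user-space analogue of formatting raw address bytes as text instead of copying them as a string. `format_mac()` is a hypothetical helper built on snprintf(); it is not the kernel's sysfs_format_mac() and makes no claim about that function's exact output.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Render six raw address bytes as "aa:bb:cc:dd:ee:ff". Returns the
 * number of characters written, excluding the terminating NUL.
 */
int format_mac(char *buf, size_t len, const unsigned char mac[6])
{
	return snprintf(buf, len, "%02x:%02x:%02x:%02x:%02x:%02x",
			mac[0], mac[1], mac[2], mac[3], mac[4], mac[5]);
}
```

The output buffer needs at least 18 bytes: 17 characters plus the terminator.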

Patch submitted to upstream kernel

Signed-off-by: John Soni Jose <sony.john-n@emulex.com>
Signed-off-by: Jayamohan Kallickal <jayamohan.kallickal@emulex.com>

be2iscsi: Logging mechanism for the driver.

Enable a log level mechanism for different events. These
log levels can be set at driver load time or at run time. The
log level is set for each Scsi_host.

Fixed a few multi-line print statements to satisfy the new checkpatch.pl
warnings on multi-line strings.

Patch Submitted to upstream kernel

Signed-off-by: John Soni Jose <sony.john-n@emulex.com>
Signed-off-by: Jayamohan Kallickal <jayamohan.kallickal@emulex.com>

be2iscsi: Issue MBX Cmd for login to boot target in crashdump mode

When the driver comes up in crashdump mode, it has to explicitly
issue a command to the FW to log in to the boot target. This fix issues
the MBX Cmd to log in to the boot target in crashdump mode.

Patch Submitted to upstream kernel.

Signed-off-by: John Soni Jose <sony.john-n@emulex.com>
Signed-off-by: Jayamohan Kallickal <jayamohan.kallickal@emulex.com>

be2iscsi: Removing the iscsi_data_pdu setting.

The setting of iscsi_data_pdu is no longer required,
as it was needed for BE1 adapters only. BE1 adapters
were not supported in any previous version of the kernel.

Patch Submitted to upstream kernel.

Signed-off-by: John Soni Jose <sony.john-n@emulex.com>
Signed-off-by: Jayamohan Kallickal <jayamohan.kallickal@emulex.com>

be2iscsi: fix dma free size mismatch regression

Upstream kernel commit details are below.

commit b83d543fd934d565fb243ef348b06a61d794b31d
Author: Mike Christie <michaelc@cs.wisc.edu>
Date: Wed May 23 20:40:54 2012 -0500

[SCSI] be2iscsi: fix dma free size mismatch regression

This patch should go into 3.5 fixes. The bug was added in the
patches for the 3.5 feature window.

As you can see from the patch I made a mistake. During
development I switched from passing a struct to the size of
the struct, but left the sizeof. This results in us allocating
4 bytes (sizeof(int)) but then calling pci_free_consistent
with the size of the struct.
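
The class of mistake described here can be reproduced in isolation. The sketch assumes a helper that receives a byte count as an int; `struct cmd_buf`, `alloc_len_buggy()` and `alloc_len_fixed()` are hypothetical and only model which length ends up being used.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical command structure standing in for the driver's struct. */
struct cmd_buf {
	char payload[64];
};

/* Buggy: 'size' is already a byte count, so this yields sizeof(int). */
size_t alloc_len_buggy(int size)
{
	return sizeof(size);
}

/* Fixed: use the byte count that was passed in. */
size_t alloc_len_fixed(int size)
{
	return (size_t)size;
}
```

Allocating with the first length and freeing with the struct size gives exactly the allocate/free mismatch the commit message describes.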

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Signed-off-by: John Soni Jose <sony.john-n@emulex.com>
Signed-off-by: Jayamohan Kallickal <jayamohan.kallickal@emulex.com>

Merge branch 'v2.6.39-300.5.0.ol5_bug13910619#ocfs2_for_uek_v5' into master_igb

Orabug: 13910619
Merge git://ca-git.us.oracle.com/linux-xiaowhu-public.git v2.6.39-300.5.0.ol5_bug13910619#ocfs2_for_uek_v5

x86/nmi: Clean up register_nmi_handler() usage

Implement a cleaner and easier to maintain version for the section
warning fixes implemented in commit eeaaa96a3a21
("x86/nmi: Fix section mismatch warnings on 32-bit").

Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
Signed-off-by: Don Zickus <dzickus@redhat.com>
Cc: Jan Beulich <JBeulich@suse.com>
Link: http://lkml.kernel.org/r/1340049393-17771-1-git-send-email-dzickus@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Conflicts:

arch/x86/include/asm/nmi.h
arch/x86/kernel/nmi_selftest.c

x86/nmi: Fix page faults by nmiaction if kmemcheck is enabled

This patch tries to fix the problem of page fault exception
caused by accessing nmiaction structure in nmi if kmemcheck
is enabled.

If kmemcheck is enabled, the memory allocated through slab is
in pages that are marked non-present, so that some checks can
be done in the page fault handling code (e.g. whether the
memory is read before it is written to).

As nmiaction is allocated in this way, it resides in a
non-present page. A page fault then occurs while the nmi code
is accessing the nmiaction structure, which in turn causes a
warning by WARN_ON_ONCE(in_nmi()) in kmemcheck_fault(), called
by do_page_fault().

This significantly simplifies the code as well, as the whole
dynamic allocation dance goes away.

v2: as Peter suggested, changed the nmiaction to use static
    storage.

v3: as Peter suggested, use a macro to shorten the code. Also
    keep the original usage of register_nmi_handler, so users of
    this call don't need to change.
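
A sketch of the static-storage approach in portable C. Every name here (`struct action_model`, `register_action_model()`, `REGISTER_ACTION`, `sample_handler()`) is hypothetical, and the macro uses a do/while block rather than whatever construct the kernel macro uses; it only shows how a macro can give each call site its own statically allocated descriptor.

```c
#include <assert.h>
#include <stddef.h>

struct action_model {
	int (*handler)(int);
	const char *name;
	struct action_model *next;
};

static struct action_model *action_list;

/* Link a caller-provided descriptor into the list; no allocation. */
int register_action_model(struct action_model *a)
{
	a->next = action_list;
	action_list = a;
	return 0;
}

int action_count(void)
{
	int n = 0;

	for (struct action_model *a = action_list; a; a = a->next)
		n++;
	return n;
}

/* Each expansion declares its own static descriptor for 'fn'. */
#define REGISTER_ACTION(fn, nm)					\
	do {							\
		static struct action_model fn##_na = {		\
			.handler = (fn),			\
			.name = (nm),				\
		};						\
		register_action_model(&fn##_na);		\
	} while (0)

/* Example handler used to exercise the macro. */
int sample_handler(int x)
{
	return x + 1;
}
```

Calling `REGISTER_ACTION(sample_handler, "sample");` from a function registers a descriptor that lives in static storage, so it is never placed in dynamically allocated memory.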

Tested-by: Seiji Aguchi <seiji.aguchi@hds.com>
Fixes: https://lkml.org/lkml/2012/3/2/356
Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
[ simplified the wrappers ]
Signed-off-by: Don Zickus <dzickus@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: thomas.mingarelli@hp.com
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/1333051877-15755-4-git-send-email-dzickus@redhat.com
[ tidied the patch a bit ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>

[hpwdt] add include NMI

Commit:
x86/nmi: Add new NMI queues to deal with IO_CHK and SERR
added new NMI calls but did not add the nmi.h include. Fix the
merge commit here.
Signed-off-by: Maxim Uvarov <maxim.uvarov@oracle.com>

be2net: Add functionality to support RoCE driver

- Increase MSI-X vectors by 5 for RoCE traffic.
- Add macro to check roce support on a device.
- Add device-specific doorbell and MSI-X vector fields shared with NIC
functionality.
- Provide RoCE driver registration and deregistration functions.
- Add support functions which will be invoked on adapter add/remove
and port up/down events.
- Traverse through the list of adapters to invoke callback functions.

Signed-off-by: Parav Pandit <parav.pandit@emulex.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Roland Dreier <roland@purestorage.com>

[igb] uek2 fix driver merge

Signed-off-by: Maxim Uvarov <maxim.uvarov@oracle.com>

[igb] update to version 3.4.8

Update driver from e1000.sf.net

ocfs2: use list_for_each_entry in ocfs2_find_local_alias()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
(cherry picked from commit a614a092bf28d58c742b9ec43209f3f78c3d9fb3)

Signed-off-by: Xiaowei.Hu <xiaowei.hu@oracle.com>

ocfs2: fix NULL pointer dereference in __ocfs2_change_file_space()

As ocfs2_fallocate() will invoke __ocfs2_change_file_space() with a NULL
as the first parameter (file), it may trigger a NULL pointer dereference
due to a missing check.

Addresses http://bugs.launchpad.net/bugs/1006012

Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Reported-by: Bret Towe <magnade@gmail.com>
Tested-by: Bret Towe <magnade@gmail.com>
Cc: Sunil Mushran <sunil.mushran@oracle.com>
Acked-by: Joel Becker <jlbec@evilplan.org>
Acked-by: Mark Fasheh <mfasheh@suse.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit a4e08d001f2e50bb8b3c4eebadcf08e5535f02ee)

Signed-off-by: Xiaowei.Hu <xiaowei.hu@oracle.com>

ocfs2: Fix bogus error message from ocfs2_global_read_info

'status' variable in ocfs2_global_read_info() is always != 0 when leaving the
function because it happens to contain number of read bytes. Thus we always log
error message although everything is OK. Since all error cases properly call
mlog_errno() before jumping to out_err, there's no reason to call mlog_errno()
on exit at all. This is a fallout of c1e8d35e (conversion of mlog_exit()
calls).

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Joel Becker <jlbec@evilplan.org>
(cherry picked from commit a4564ead763a9264edbec6d4e72aa273f05eb39c)

Signed-off-by: Xiaowei.Hu <xiaowei.hu@oracle.com>

ocfs2: Misplaced parens in unlikely

Fix misplaced parentheses

Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: Joel Becker <jlbec@evilplan.org>
(cherry picked from commit 16865b7c42fbce8a4d2b278460e387e719e289cb)

Signed-off-by: Xiaowei.Hu <xiaowei.hu@oracle.com>

ocfs: simplify symlink handling

seeing that "fast" symlinks still get allocation + copy, we might as
well simply switch them to pagecache-based variant of ->follow_link();
just need an appropriate ->readpage() for them...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
(cherry picked from commit ea022dfb3c2a4680483b00eb2fecc9fc4f6091d1)

Signed-off-by: Xiaowei.Hu <xiaowei.hu@oracle.com>

ocfs2: kill endianness abuses in blockcheck.c

ocfs2_block_check is for little-endian contents; if we just want
its fields converted to host-endian in a couple of functions, just
put those values into local u32 and u16...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
(cherry picked from commit 1db5df98faaf7aa6c25bc7d9703342d13678452a)

Signed-off-by: Xiaowei.Hu <xiaowei.hu@oracle.com>

ocfs2: deal with __user misannotations

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
(cherry picked from commit f6a5690324d5ab9c33bbc0a6b4cc59c7fa34eeec)

Signed-off-by: Xiaowei.Hu <xiaowei.hu@oracle.com>

ocfs2: trivial endianness misannotations

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
(cherry picked from commit 8515841086d14594b24cdc8febdcc7fd1bbc313e)

Signed-off-by: Xiaowei.Hu <xiaowei.hu@oracle.com>

ocfs2: ->rl_count endianness breakage

le16, not le32...
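
Why the width of the conversion matters can be shown with plain byte swaps. The sketch assumes a big-endian host, where converting from little-endian amounts to a byte swap of the given width; `swab16_model()` and `swab32_model()` are illustrative helpers, not the kernel macros.

```c
#include <assert.h>
#include <stdint.h>

/* 16-bit byte swap: what a le16 conversion does on a big-endian host. */
uint16_t swab16_model(uint16_t v)
{
	return (uint16_t)((v << 8) | (v >> 8));
}

/* 32-bit byte swap: what a le32 conversion does on a big-endian host. */
uint32_t swab32_model(uint32_t v)
{
	return ((v & 0x000000ffu) << 24) | ((v & 0x0000ff00u) << 8) |
	       ((v >> 8) & 0x0000ff00u) | (v >> 24);
}
```

The on-disk bytes 05 00 load as 0x0500 on such a host. The 16-bit swap recovers 5, while the 32-bit swap of the same value yields 0x00050000.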

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
(cherry picked from commit 28748b325dc2d730ccc312830a91c4ae0c0d9379)

Signed-off-by: Xiaowei.Hu <xiaowei.hu@oracle.com>

ocfs: ->rl_used breakage on big-endian

it's le16, not le32 or le64...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Xiaowei.Hu <xiaowei.hu@oracle.com>

ocfs2: fix leaks on failure exits in module_init

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
(cherry picked from commit 342827d7d19cb52b562bb3efeb4d4b672d008c35)

Signed-off-by: Xiaowei.Hu <xiaowei.hu@oracle.com>

... and the same failure exits cleanup for ocfs2

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
(cherry picked from commit be0d93f0aa5682a24a2a9ec0dd26fffaad608cce)

Signed-off-by: Xiaowei.Hu <xiaowei.hu@oracle.com>

ocfs2: remove the second argument of k[un]map_atomic()

Acked-by: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Cong Wang <amwang@redhat.com>
(cherry picked from commit c4bc8dcbbe7a7876d76e3f3e129a2ccec46d7cdb)

Signed-off-by: Xiaowei.Hu <xiaowei.hu@oracle.com>

ocfs2: deal with wraparounds of i_nlink in ocfs2_rename()

unfortunately, nlink_t may be smaller than 32 bits and ->i_nlink
on ocfs2 can grow up to 0xffffffff; storing it in nlink_t variable
will lose upper bits on such architectures. Needs to be made u32,
until we get kernel-side nlink_t uniformly 32bit...
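
The truncation can be demonstrated with a deliberately narrow type. `narrow_nlink_t` is a hypothetical 16-bit stand-in for an nlink_t that is smaller than 32 bits; the two functions only show what survives a round trip through each type.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical narrow link-count type (assumed 16 bits for the sketch). */
typedef uint16_t narrow_nlink_t;

/* Storing a 32-bit link count in the narrow type drops the upper bits. */
uint32_t nlink_via_narrow(uint32_t i_nlink)
{
	narrow_nlink_t n = (narrow_nlink_t)i_nlink;

	return n;
}

/* Storing it in a u32 keeps the full range. */
uint32_t nlink_via_u32(uint32_t i_nlink)
{
	uint32_t n = i_nlink;

	return n;
}
```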

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
(cherry picked from commit 847c9db5cb50841589b8ebd3da0769b1b02fb3b2)

Signed-off-by: Xiaowei.Hu <xiaowei.hu@oracle.com>

ocfs2: propagate umode_t

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
(cherry picked from commit 67697cbdccb8b63eae892a9437bcc79d08b79578)

Signed-off-by: Xiaowei.Hu <xiaowei.hu@oracle.com>

dlmfs: use inode_init_owner()

don't open-code it...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
(cherry picked from commit 2b15ad068418a91687c2d5819c6c03c227d391f2)

Signed-off-by: Xiaowei.Hu <xiaowei.hu@oracle.com>

ocfs2: avoid unaligned access to dqc_bitmap

The dqc_bitmap field of struct ocfs2_local_disk_chunk is 32-bit aligned,
but not 64-bit aligned.  The dqc_bitmap is accessed by ocfs2_set_bit(),
ocfs2_clear_bit(), ocfs2_test_bit(), or ocfs2_find_next_zero_bit().  These
are wrapper macros for ext2_*_bit() which need to take an unsigned long
aligned address (though some architectures are able to handle unaligned
address correctly)

So some 64bit architectures may not be able to access the dqc_bitmap
correctly.

This avoids such unaligned access by using other wrapper functions for
ext2_*_bit().  The code is taken from fs/ext4/mballoc.c, which also needs to
handle unaligned bitmap access.
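
The general technique of such wrappers can be sketched as follows: round the address down to an unsigned long boundary and fold the skipped bytes into the bit number. `correct_addr_and_bit()` is a hypothetical user-space helper that assumes little-endian bit numbering within the bitmap; it is not a copy of the kernel code.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Round addr down to an unsigned long boundary and add the skipped
 * bytes, converted to bits, to *bit. The returned pointer together
 * with the adjusted bit addresses the same bit as before.
 */
void *correct_addr_and_bit(int *bit, void *addr)
{
	uintptr_t a = (uintptr_t)addr;
	uintptr_t misalign = a & (sizeof(unsigned long) - 1);

	*bit += (int)(misalign * 8);
	return (void *)(a - misalign);
}
```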

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Acked-by: Joel Becker <jlbec@evilplan.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Joel Becker <jlbec@evilplan.org>
(cherry picked from commit 939255798a468e1a92f03546de6e87be7b491e57)

Signed-off-by: Xiaowei.Hu <xiaowei.hu@oracle.com>

ocfs2: Use filemap_write_and_wait() instead of write_inode_now()

Since ocfs2 has no ->write_inode method, there's no point in calling
write_inode_now() from ocfs2_cleanup_delete_inode(). Use
filemap_write_and_wait() instead. This helps us to cleanup inode writing
interfaces...

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Joel Becker <jlbec@evilplan.org>
(cherry picked from commit 249ec93c01db8898058899a80ffb537c8d27f86f)

Signed-off-by: Xiaowei.Hu <xiaowei.hu@oracle.com>

ocfs2: honor O_(D)SYNC flag in fallocate

We need to sync the transaction which updates i_size if the file is marked
as needing sync semantics.

Signed-off-by: Mark Fasheh <mfasheh@suse.de>
Signed-off-by: Joel Becker <jlbec@evilplan.org>
(cherry picked from commit df295d4a4b3c98af1a2445a82aef169e7e5d96b8)

Signed-off-by: Xiaowei.Hu <xiaowei.hu@oracle.com>

ocfs2: Add a missing journal credit in ocfs2_link_credits() -v2

With indexed_dir enabled, ocfs2 maintains a list of dirblocks having
space.

The credit calculation in ocfs2_link_credits() did not correctly account
for adding an entry that exactly fills a dirblock. Filling the dirblock
triggers its removal from the list, which is done by changing the pointer
in the previous block in the list. The credit calculation did not account
for that previous block.

To expose, do:

mkfs.ocfs2 -b 512 -M local /dev/sdX
mount /dev/sdX /ocfs2
mkdir /ocfs2/linkdir
touch /ocfs2/linkdir/file1
for i in `seq 1 29` ; do link /ocfs2/linkdir/file1 \
/ocfs2/linkdir/linklinklinklinklinklink$i; done
rm -f /ocfs2/linkdir/linklinklinklinklinklink10
sleep 8
link /ocfs2/linkdir/file1 \
/ocfs2/linkdir/linklinklinklinklinklinkaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa

Note:
The link names have been crafted for a 512 byte blocksize. Reproducing
with a larger blocksize will require longer (or more) links. The sleep
is important. We want jbd2 to commit the transaction so that the missing
block does not piggy back on account of the previous transaction.

Signed-off-by: XiaoweiHu <xiaowei.hu at oracle.com>
Reviewed-by: WengangWang <wen.gang.wang at oracle.com>
Reviewed-by: Sunil.Mushran <sunil.mushran at oracle.com>
Signed-off-by: Joel Becker <jlbec@evilplan.org>
(cherry picked from commit 0393afea31874947b1d149b82d17b7dccac4f210)

Signed-off-by: Xiaowei.Hu <xiaowei.hu@oracle.com>

ocfs2: Commit transactions in error cases -v2

In three error cases, journal transactions are neither committed nor
aborted. We should handle these cases by committing the transactions.
Otherwise a journal handle is left open, and the next ocfs2_start_trans()
in the same process context gets the wrong credits.
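
Why a leaked handle leads to wrong credits can be modeled in userspace. The
sketch below assumes that a nested start in the same context reuses the
already open handle and ignores the newly requested credits; every demo_*
name is hypothetical and none of this is the ocfs2 code itself:

```c
#include <stddef.h>

struct demo_handle {
	int credits;	/* credits reserved when the handle was opened */
	int refs;	/* nesting depth */
};

static struct demo_handle demo_slot;		/* storage for the one handle */
static struct demo_handle *demo_current;	/* models the per-task handle */

static struct demo_handle *demo_start_trans(int credits)
{
	if (demo_current) {
		/* Nested start: reuse the open handle, credits are ignored. */
		demo_current->refs++;
		return demo_current;
	}
	demo_slot.credits = credits;
	demo_slot.refs = 1;
	demo_current = &demo_slot;
	return demo_current;
}

static void demo_commit_trans(struct demo_handle *h)
{
	if (--h->refs == 0)
		demo_current = NULL;
}

/* fail != 0 simulates an error that occurs after the transaction started. */
static int demo_op(int credits, int fail, int commit_on_error)
{
	struct demo_handle *h = demo_start_trans(credits);
	int ret = 0;

	if (fail) {
		ret = -1;
		if (!commit_on_error)
			return ret;	/* buggy path: handle stays open */
	}
	demo_commit_trans(h);
	return ret;
}
```

With commit_on_error set to 0, a later demo_start_trans(10) returns the stale
handle that still carries the credits of the failed operation.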

Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com>
Signed-off-by: Joel Becker <jlbec@evilplan.org>
(cherry picked from commit b8a0ae579fb8d9b21008ac386be08b9428902455)

Signed-off-by: Xiaowei.Hu <xiaowei.hu@oracle.com>

ocfs2: make direntry invalid when deleting it

When we delete a direntry from a directory, if it is the first in a block we
invalidate it by setting inode to 0; otherwise, we merge the deleted entry into
the prior, contiguous direntry. We don't truncate directories.

There is a problem in the latter case since inode is not set to 0.
The problem occurs when the caller passes a file position as a parameter to
ocfs2_dir_foreach_blk(). If the position happens to point to a stale direntry
(not the first in its block, and deleted between calls to
ocfs2_dir_foreach_blk()), we cannot recognize its staleness and wrongly treat
it as a live one.

The fix is to set inode to 0 in both cases, indicating that the direntry is
stale. This won't introduce additional I/O.
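
A simplified model of the fixed behavior, for illustration only. The struct
layout, the slot-based rec_len, and the demo_* names are assumptions made to
keep the sketch short; the real on-disk direntry differs:

```c
#include <stddef.h>
#include <stdint.h>

struct demo_dirent {
	uint64_t inode;		/* 0 marks an unused or stale entry */
	uint16_t rec_len;	/* distance to the next entry, in slots here */
};

/* Delete the entry at slot victim; returns 0 on success, -1 otherwise. */
static int demo_delete_entry(struct demo_dirent *blk, size_t nslots,
			     size_t victim)
{
	struct demo_dirent *prev = NULL;
	size_t i = 0;

	while (i < nslots) {
		struct demo_dirent *de = &blk[i];

		if (de->rec_len == 0)
			return -1;	/* corrupt block, avoid looping forever */
		if (i == victim) {
			if (prev)
				prev->rec_len += de->rec_len;	/* merge into prior */
			de->inode = 0;	/* the fix: always mark as stale */
			return 0;
		}
		prev = de;
		i += de->rec_len;
	}
	return -1;
}

/* A reader resuming at a saved position can now detect staleness. */
static int demo_entry_is_live(const struct demo_dirent *blk, size_t pos)
{
	return blk[pos].inode != 0;
}
```

Since the block holding the deleted entry is modified either way, clearing
inode in the merge case touches no additional block.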

Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com>
Signed-off-by: Joel Becker <jlbec@evilplan.org>
(cherry picked from commit 8298524803339a9a8df053ebdfebc2975ec55be9)

Signed-off-by: Xiaowei.Hu <xiaowei.hu@oracle.com>

fs/ocfs2/dlm/dlmlock.c: free kmem_cache_zalloc'd data using kmem_cache_free

Memory allocated using kmem_cache_zalloc should be freed using
kmem_cache_free, not kfree.

The semantic patch that fixes this problem is as follows:
(http://coccinelle.lip6.fr/)

// <smpl>
@@
expression x,e,e1,e2;
@@

x = kmem_cache_zalloc(e1,e2)
... when != x = e
?-kfree(x)
+kmem_cache_free(e1,x)
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Joel Becker <jlbec@evilplan.org>
(cherry picked from commit fc9f899483435935c1cd7005df29681929d1c99b)

Signed-off-by: Xiaowei.Hu <xiaowei.hu@oracle.com>

ocfs2: remove unnecessary nlink setting

alloc_inode() initializes i_nlink to 1. Remove unnecessary
re-initialization.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
CC: Joel Becker <jlbec@evilplan.org>
CC: Mark Fasheh <mfasheh@suse.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
(cherry picked from commit a7732b05f775a5575baac34c03bb0e8d16950edf)

Signed-off-by: Xiaowei.Hu <xiaowei.hu@oracle.com>

ocfs2: Fix ocfs2_page_mkwrite()

This patch addresses two shortcomings in ocfs2_page_mkwrite():
1. It makes the function return better VM_FAULT_* errors.
2. It handles an error that is triggered when a page is dropped from the mapping
due to memory pressure. This patch locks the page to prevent that.

[Patch was cleaned up by Sunil Mushran.]

Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com>
Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
(cherry picked from commit 5cffff9e29866a3de98c2c25135b3199491f93b0)

Signed-off-by: Xiaowei.Hu <xiaowei.hu@oracle.com>

ocfs2: Add comment about orphan scanning

Add a comment that explains why the orphan scan scans all the slots.

Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
(cherry picked from commit a035bff6b82aca89c1223e2c614adc2d17ec8aa2)

Signed-off-by: Xiaowei.Hu <xiaowei.hu@oracle.com>

ocfs2: Clean up messages in the fs

Convert useful messages from ML_NOTICE to KERN_NOTICE to improve readability.

Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
(cherry picked from commit 619c200de144b44f5e405305241bcd7edbb8c6cf)

Signed-off-by: Xiaowei.Hu <xiaowei.hu@oracle.com>

ocfs2: Clean up messages in stack_o2cb.c

o2cb messages needed a facelift.

Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
(cherry picked from commit 394eb3d38a3ecc549cc34a3040103a9164be516b)

Signed-off-by: Xiaowei.Hu <xiaowei.hu@oracle.com>

ocfs2_init_acl(): fix a leak

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
(cherry picked from commit c0d960f038bdfe0fa73c9f698ba836ed20b672c9)

Signed-off-by: Xiaowei.Hu <xiaowei.hu@oracle.com>

ocfs2: use proper little-endian bitops

Calls to __test_and_{set,clear}_bit_le() that ignore the return value
can be replaced with __{set,clear}_bit_le().

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: ocfs2-devel@oss.oracle.com
Signed-off-by: Joel Becker <jlbec@evilplan.org>
(cherry picked from commit 730e663bd82c1a10a85ff00728d34152a5a67ec8)

Signed-off-by: Xiaowei.Hu <xiaowei.hu@oracle.com>

ocfs2: checking the wrong variable in ocfs2_move_extent()

"new_phys_cpos" is always a valid pointer here.
ocfs2_probe_alloc_group() allocates "*new_phys_cpos".

Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Joel Becker <jlbec@evilplan.org>
(cherry picked from commit 3d75be7c4771c7e4d5b5fa586a599af8473de32c)

Signed-off-by: Xiaowei.Hu <xiaowei.hu@oracle.com>

e1000e: disable rxhash when enabling jumbo frames while rxhash and rxcsum are both enabled

Commit ffd3d6 checks whether both rxhash and rxcsum are enabled when jumbo
frames are being enabled, and disallows having all three enabled at the
same time. Since jumbo frames are widely used in the real world, and el5
did not support enabling/disabling rxhash, we changed the default behavior
to disable rxhash when jumbo frames are being enabled while both rxhash and
rxcsum are enabled.

Signed-off-by: Joe Jin <joe.jin@oracle.com>
Signed-off-by: Guru Anbalagane <guru.anbalagane@oracle.com>
Acked-by: Adnan Misherfi <adnan.misherfi@oracle.com>

r8169: verbose error message.

(cherry picked from commit 82e316efbd1c68946c8760f930b81d73e9c4425a)
Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Cc: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: Joe Jin <joe.jin@oracle.com>

r8169: remove rtl_ocpdr_cond.

It is not needed for mac_ocp_{write / read}. Actually bit 31 of OCPDR
does not change and r8168_mac_ocp_read always returns ~0.

(cherry picked from commit 3a83ad12b850c3c5b89fa9008bdd0c0782f0cf68)
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Tested-by: Francois Romieu <romieu@fr.zoreil.com>
Signed-off-by: Joe Jin <joe.jin@oracle.com>

r8169: fix argument in rtl_hw_init_8168g.

(cherry picked from commit 5f8bcce99e83b1155954b1ae7291dc754ad9025e)
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: Joe Jin <joe.jin@oracle.com>

r8169: support RTL8168G

For RTL8111G, the settings of phy and firmware are replaced with
ocp functions. r8168g_mdio_{write / read} redirects the relative
settings to suitable ocp functions. A per-device variable is needed
to evaluate the real address of ocp functions.
rtl_writephy(tp, 0x1f, xxxx) is dedicated to keeping said variable
up-to-date.

(backported from upstream commit c558386b836ee97762e12495101c6e373f20e69d)
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Acked-by: Francois Romieu <romieu@fr.zoreil.com>
Signed-off-by: Joe Jin <joe.jin@oracle.com>

r8169: abstract out loop conditions.

Twelve functions can fail silently. Now they have a chance to complain.

Macro and pasting abuse has been kept at a level where tags and
friends should not be hurt.
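
The general shape of such an abstraction might look like the sketch below.
It is a userspace illustration under stated assumptions, not the driver code:
struct demo_dev, struct demo_cond, demo_loop_wait() and the ready bit are all
hypothetical, and the delay between polls is left out:

```c
#include <stdbool.h>
#include <stdio.h>

struct demo_dev {
	unsigned int reg;	/* stands in for a device register */
	int polls;		/* how often the register was read */
};

/* A named condition: a check function plus a label for diagnostics. */
struct demo_cond {
	bool (*check)(struct demo_dev *dev);
	const char *msg;
};

/* Poll up to n times until the condition equals high; complain on timeout. */
static bool demo_loop_wait(struct demo_dev *dev, const struct demo_cond *c,
			   int n, bool high)
{
	int i;

	for (i = 0; i < n; i++) {
		if (c->check(dev) == high)
			return true;
	}
	fprintf(stderr, "%s == %d (loop: %d)\n", c->msg, !high, n);
	return false;
}

static bool demo_ready_check(struct demo_dev *dev)
{
	dev->polls++;
	return (dev->reg & 0x80000000u) != 0;
}

static const struct demo_cond demo_ready_cond = {
	demo_ready_check, "demo_ready_cond"
};
```

A caller that used to spin silently can then write
demo_loop_wait(&dev, &demo_ready_cond, 100, true) and gets a message naming
the condition when the wait times out.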

(cherry picked from commit ffc46952b313ff037debca1b4e3da9472ff4b441)
Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Signed-off-by: Joe Jin <joe.jin@oracle.com>

r8169: ephy, eri and efuse functions signature changes.

(cherry picked from commit fdf6fc067aaeb13aba89d1b56aa39d3bf06fde43)
Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Signed-off-by: Joe Jin <joe.jin@oracle.com>

r8169: csi_ops signature change.

(cherry picked from commit 52989f0e429a97bc9075245e2e14ece2a4ebca5c)
Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Signed-off-by: Joe Jin <joe.jin@oracle.com>

r8169: mdio_ops signature change.

Further changes need more context down in the call stack.

(cherry picked from commit 24192210a57a24a45b29dc3519dc42e073ea7b0a)
Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Signed-off-by: Joe Jin <joe.jin@oracle.com>

r8169: add RTL8106E support.

(cherry picked from commit 5598bfe5191d09cdd622aeac39badc42508b227f)
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Acked-by: Francois Romieu <romieu@fr.zoreil.com>
Signed-off-by: Joe Jin <joe.jin@oracle.com>

r8169: RxConfig hack for the 8168evl.

The 8168evl (RTL_GIGA_MAC_VER_34) based Gigabyte GA-990FXA motherboards
are very prone to NETDEV watchdog problems without this change. See
https://bugzilla.kernel.org/show_bug.cgi?id=42899 for instance.

I don't know why it *works*. It's depressingly effective though.

For the record:
- the problem may go along IOMMU (AMD-Vi) errors but it really looks
  like a red herring.
- the patch sets the RX_MULTI_EN bit. If the 8168c doc is any guide,
  the chipset now fetches several Rx descriptors at a time.
- long ago the driver ignored the RX_MULTI_EN bit.
  e542a2269f232d61270ceddd42b73a4348dee2bb changed the RxConfig
  settings. Whatever the problem it's now labeled a regression.
- Realtek's own driver can identify two different 8168evl devices
  (CFG_METHOD_16 and CFG_METHOD_17) where the r8169 driver only
  sees one. It sucks.

(cherry picked from commit eb2dc35d99028b698cdedba4f5522bc43e576bd2)
Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Joe Jin <joe.jin@oracle.com>

r8169: avoid NAPI scheduling delay.

While reworking the r8169 driver a few months ago to perform the
smallest amount of work in the irq handler, I took care of avoiding
any irq mask register operation in the slow work dedicated user
context thread. The slow work thread scheduled an extra round of NAPI
work which would ultimately set the irq mask register as required,
thus keeping such irq mask operations in the NAPI handler.
It would eventually race with the irq handler and delay NAPI execution
for - assuming no further irq - a whole ksoftirqd period. Mildly a
problem for rare link changes or corner case PCI events.

The race was always lost after the last bh disabling lock had been
removed from the work thread and people started wondering where those
pesky "NOHZ: local_softirq_pending 08" messages came from.

Actually the irq mask register _can_ be set up directly in the slow
work thread.

(cherry picked from commit 7dbb491878a2c51d372a8890fa45a8ff80358af1)
Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Reported-by: Dave Jones <davej@redhat.com>
Tested-by: Marc Dionne <marc.c.dionne@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Joe Jin <joe.jin@oracle.com>

r8169: call netif_napi_del at errpaths and at driver unload

When register_netdev fails, the NAPIs initialized by netif_napi_add must be
deleted with netif_napi_del. Likewise, when the driver unloads, it should
delete the NAPI before unregistering the netdevice using unregister_netdev.

(cherry picked from commit ad1be8d345416a794dea39761a374032aa471a76)
Signed-off-by: Devendra Naga <devendra.aaru@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Joe Jin <joe.jin@oracle.com>

8139cp/8139too: terminate the eeprom access with the right opmode

Currently, we terminate the eeprom access through clearing the CS by:

RTL_W8 (Cfg9346, ~EE_CS); or writeb (~EE_CS, ee_addr);

This leaves the eeprom in the "Config. Register Write Enable"
state, which is not expected, as the highest two bits are set to
binary 11 (expected is the "Normal" mode, binary 00). Solve this by writing
0x0 instead of ~EE_CS when terminating the eeprom access.
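
The arithmetic can be checked with a few lines of C. The value 0x08 for the
chip-select bit is an assumption made for this sketch, and the DEMO_*/demo_*
names are hypothetical:

```c
#include <stdint.h>

#define DEMO_EE_CS 0x08u	/* assumed value of the chip-select bit */

/* The operating mode lives in the two highest bits of the 8-bit register. */
static unsigned int demo_opmode(uint8_t cfg9346)
{
	return cfg9346 >> 6;
}

/* Non-zero while chip select is still asserted. */
static unsigned int demo_cs_asserted(uint8_t cfg9346)
{
	return cfg9346 & DEMO_EE_CS;
}
```

Under that assumption, (uint8_t)~DEMO_EE_CS is 0xF7: chip select is cleared,
but demo_opmode() yields 3 (binary 11). Writing 0x00 clears chip select as
well and leaves the mode field at 0.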

(cherry picked from commit 0bc777bca480357941418952cf228484f5485daf)
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Joe Jin <joe.jin@oracle.com>

8139cp: set ring address before enabling receiver

Currently, we enable the receiver before setting the ring address, which could
lead the card to DMA into unexpected areas. Solve this by setting the ring
address before enabling the receiver.

BTW, I found and tested this in qemu as I didn't have an 8139cp card at hand.
Please review it carefully.

(cherry picked from commit b01af4579ec41f48e9b9c774e70bd6474ad210db)
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Joe Jin <joe.jin@oracle.com>

r8169: support the new RTL8411 chip.

Compared with previous chipsets, it needs no special action through the
jumbo{enable/disable} helpers to operate with jumbo frames.

(cherry picked from commit b3d7b2f2f07ff0ab87442f2d499f2860ef59bfaa)
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Acked-by: Francois Romieu <romieu@fr.zoreil.com>
Signed-off-by: Joe Jin <joe.jin@oracle.com>

r8169: adjust some functions of 8111f

Put some settings of 8111f into one function which may be reused.

(cherry picked from commit 5f886e08901adaaaa1c79d1f964035aee6a29370)
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: Joe Jin <joe.jin@oracle.com>

r8169: support the new RTL8402 chip.

(cherry picked from commit 7e18dca16246b2891239cfc3c6e2dfcea715d353)
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: Joe Jin <joe.jin@oracle.com>

r8169: add device specific CSI access helpers.

New chipsets need it.

(cherry picked from commit beb1fe184f673fae83ddd9beca3fe662019ef876)
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: Joe Jin <joe.jin@oracle.com>

r8169: modify pll power function

Adjust r810x_pll_power_down, r810x_pll_power_up, and r8168_pll_power_up.
Always power up device during rtl_open. For r810x, turn off more power
when the WOL is disabled.

(cherry picked from commit 0004299ad41885a0a1fd321715fe7396be17ce35)
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: Joe Jin <joe.jin@oracle.com>

r8169: 8168c and later require bit 0x20 to be set in Config2 for PME signaling.

The new 84xx stopped flying below the radars.

(cherry picked from commit d387b427c973974dd619a33549c070ac5d0e089f)
Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Cc: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: Joe Jin <joe.jin@oracle.com>

r8169: Config1 is read-only on 8168c and later.

Suggested by Hayes.

(cherry picked from commit 851e60221926a53344b4227879858bef841b0477)
Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Cc: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: Joe Jin <joe.jin@oracle.com>

8139too: dev->{base_addr, irq} removal.

(backported from upstream commit 65712ec016788538d27c0b0452e57b751776914e)
Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Signed-off-by: Joe Jin <joe.jin@oracle.com>

8139cp: stop using net_device.{base_addr, irq}.

(backported from upstream commit a69afe3263717ba9384cf18d05722c598f6820af)
Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Signed-off-by: Joe Jin <joe.jin@oracle.com>

r8169.c: fix comment typo

The below patch fixes a typo that I found while reading the code.

(cherry picked from commit a9d7e794ea66902a255be6e87f633286d04c2b39)
Signed-off-by: Justin P. Mattock <justinmattock@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Joe Jin <joe.jin@oracle.com>

r8169: move rtl_cfg_info closer to its caller.

(cherry picked from commit 31fa8b1855cb1f1fd99e2f2f9b8f2c8f113e9f2e)
Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Cc: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: Joe Jin <joe.jin@oracle.com>

r8169: move the netpoll handler after the irq handler.

(cherry picked from commit dc1c00ce70da5d3bb3fc97707e04f598ff72e7ba)
Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Cc: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: Joe Jin <joe.jin@oracle.com>

r8169: move rtl8169_open after rtl_task it depends on.

(backported from upstream commit df43ac7831a0e321b6b183b7eb48ae4577207453)
Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Cc: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: Joe Jin <joe.jin@oracle.com>