www.infradead.org Git - users/jedix/linux-maple.git/log

mm: compaction: make isolate_lru_page() filter-aware

commit 39deaf8585152f1a35c1676d3d7dc6ae0fb65967 upstream.

Stable note: Not tracked in Bugzilla. THP and compaction disrupt the LRU
list leading to poor reclaim decisions which has a variable
performance impact.

In async mode, compaction doesn't migrate dirty or writeback pages, so
it's pointless to isolate such a page only to re-add it to the LRU list.

Of course, a page that is dirty or under writeback when compaction isolates
it might be clean by the time we try to migrate it, so it could be
migrated. But that is very unlikely, as the isolate and migration cycle is
much faster than writeout.

So, this patch reduces CPU overhead and prevents unnecessary LRU churning.
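
The filtering idea can be sketched in user space as follows. This is only an illustration: `struct page_model`, `MODE_CLEAN_ONLY` and `try_isolate()` are hypothetical names, not the kernel's, and the real check sits in the kernel's LRU isolation path and differs in detail.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the page state the filter cares about. */
struct page_model {
	bool dirty;
	bool writeback;
};

/* Hypothetical mode flag: the caller only wants pages it can migrate now. */
#define MODE_CLEAN_ONLY 0x1u

/*
 * Returns true if the page may be isolated under the given mode.
 * With MODE_CLEAN_ONLY set (the async compaction case), dirty or
 * writeback pages are rejected up front, so they never leave the LRU
 * and never have to be put back.
 */
bool try_isolate(const struct page_model *page, unsigned int mode)
{
	assert(page != NULL);
	if ((mode & MODE_CLEAN_ONLY) && (page->dirty || page->writeback))
		return false;
	return true;
}
```

A caller without the flag still isolates dirty pages, which matches the idea that only async compaction needs the filter.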

Signed-off-by: Minchan Kim <minchan.kim@gmail.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Acked-by: Rik van Riel <riel@redhat.com>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 19faec0520b3b16dfd58cde30938a3c4d3dcdd5b)

Signed-off-by: Jerry Snitselaar <jerry.snitselaar@oracle.com>

mm: change isolate mode from #define to bitwise type

commit 4356f21d09283dc6d39a6f7287a65ddab61e2808 upstream.

Stable note: Not tracked in Bugzilla. This patch makes later patches
easier to apply but has no other impact.

Replace the ISOLATE_XXX macros with a bitwise isolate_mode_t type. Normally,
macros aren't recommended as they are type-unsafe and make debugging harder
because the symbols cannot be passed through to the debugger.

Quote from Johannes
" Hmm, it would probably be cleaner to fully convert the isolation mode
into independent flags. INACTIVE, ACTIVE, BOTH is currently a
tri-state among flags, which is a bit ugly."

This patch moves isolate mode from swap.h to mmzone.h by memcontrol.h
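
A minimal sketch of what independent mode flags look like, assuming illustrative names and values (`isolate_mode_model_t`, `MODEL_ISOLATE_*`, `mode_accepts()`) rather than the kernel's definitions. The kernel additionally annotates such types for the sparse checker; that is omitted here.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed for illustration: a plain unsigned type carrying flag bits. */
typedef unsigned int isolate_mode_model_t;

#define MODEL_ISOLATE_INACTIVE ((isolate_mode_model_t)0x1) /* inactive pages */
#define MODEL_ISOLATE_ACTIVE   ((isolate_mode_model_t)0x2) /* active pages */

/* "Both" is just the OR of the two bits, not a third state. */
#define MODEL_ISOLATE_BOTH (MODEL_ISOLATE_INACTIVE | MODEL_ISOLATE_ACTIVE)

/* Returns true if a page with the given activity is acceptable for mode. */
bool mode_accepts(isolate_mode_model_t mode, bool page_is_active)
{
	isolate_mode_model_t wanted =
		page_is_active ? MODEL_ISOLATE_ACTIVE : MODEL_ISOLATE_INACTIVE;

	assert(mode != 0); /* a caller must ask for at least one list */
	return (mode & wanted) != 0;
}
```

With independent bits, further filters can be added as new flags without touching the existing ones.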

Signed-off-by: Minchan Kim <minchan.kim@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Rik van Riel <riel@redhat.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit a15a3971cc49eefbde40b397a446c0fa9c5fed9c)

Signed-off-by: Jerry Snitselaar <jerry.snitselaar@oracle.com>

mm: compaction: trivial clean up in acct_isolated()

commit b9e84ac1536d35aee03b2601f19694949f0bd506 upstream.

Stable note: Not tracked in Bugzilla. This patch makes later patches
easier to apply but has no other impact.

acct_isolated of compaction uses page_lru_base_type which returns only
the base type of the LRU list, so it never returns LRU_ACTIVE_ANON or
LRU_ACTIVE_FILE. In addition, cc->nr_[anon|file] is used only in
acct_isolated, so they don't need to be fields in compact_control.

This patch removes the fields from compact_control and clarifies the purpose
of acct_isolated, which counts the number of anon|file pages isolated.

Signed-off-by: Minchan Kim <minchan.kim@gmail.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Acked-by: Rik van Riel <riel@redhat.com>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit f665a680f89357a6773fb97684690c76933888f6)

Signed-off-by: Jerry Snitselaar <jerry.snitselaar@oracle.com>

vmscan: abort reclaim/compaction if compaction can proceed

commit e0c23279c9f800c403f37511484d9014ac83adec upstream.

Stable note: Not tracked on Bugzilla. THP and compaction were found to
aggressively reclaim pages and stall systems under different
situations that were addressed piecemeal over time.

If compaction can proceed, shrink_zones() stops doing any work but its
callers still call shrink_slab() which raises the priority and potentially
sleeps. This is unnecessary and wasteful so this patch aborts direct
reclaim/compaction entirely if compaction can proceed.

Signed-off-by: Mel Gorman <mgorman@suse.de>
Acked-by: Rik van Riel <riel@redhat.com>
Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
Acked-by: Johannes Weiner <jweiner@redhat.com>
Cc: Josh Boyer <jwboyer@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 4682e89d1455d66e2536d9efb2875d61a1f1f294)

Signed-off-by: Jerry Snitselaar <jerry.snitselaar@oracle.com>

vmscan: limit direct reclaim for higher order allocations

commit e0887c19b2daa140f20ca8104bdc5740f39dbb86 upstream.

Stable note: Not tracked on Bugzilla. THP and compaction were found to
aggressively reclaim pages and stall systems under different
situations that were addressed piecemeal over time.  Paragraph
3 of this changelog is the motivation for this patch.

When suffering from memory fragmentation due to unfreeable pages, THP page
faults will repeatedly try to compact memory.  Due to the unfreeable
pages, compaction fails.

Needless to say, at that point page reclaim also fails to create free
contiguous 2MB areas.  However, that doesn't stop the current code from
trying, over and over again, and freeing a minimum of 4MB (2UL <<
sc->order pages) at every single invocation.
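
The 4MB figure can be checked with a small worked example. It assumes 4 KiB base pages and a THP order of 9 (a 2MB huge page), which is the usual x86-64 configuration but is not stated in the changelog; the function names are illustrative.

```c
#include <assert.h>

/* Assumed for illustration: 4 KiB base pages, as on x86-64. */
#define MODEL_PAGE_SIZE 4096UL

/* Bytes covered by one allocation of the given order. */
unsigned long order_bytes(unsigned int order)
{
	assert(order < 8 * sizeof(unsigned long) - 1);
	return (1UL << order) * MODEL_PAGE_SIZE;
}

/* Minimum reclaim per invocation as quoted above: 2UL << order pages. */
unsigned long min_reclaim_bytes(unsigned int order)
{
	assert(order < 8 * sizeof(unsigned long) - 1);
	return (2UL << order) * MODEL_PAGE_SIZE;
}
```

For order 9 this gives 512 pages (2MB) per huge page and 1024 pages (4MB) reclaimed per invocation.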

This resulted in my 12GB system having 2-3GB free memory, a corresponding
amount of used swap and very sluggish response times.

This can be avoided by having the direct reclaim code not reclaim from
zones that already have plenty of free memory available for compaction.

If compaction still fails due to unmovable memory, doing additional
reclaim will only hurt the system, not help.

[jweiner@redhat.com: change comment to explain the order check]
Signed-off-by: Rik van Riel <riel@redhat.com>
Acked-by: Johannes Weiner <jweiner@redhat.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
Signed-off-by: Johannes Weiner <jweiner@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 4d4724067d512e7f17010112da8ec64917c192e7)

Signed-off-by: Jerry Snitselaar <jerry.snitselaar@oracle.com>

vmscan: reduce wind up shrinker->nr when shrinker can't do work

commit 3567b59aa80ac4417002bf58e35dce5c777d4164 upstream.

Stable note: Not tracked in Bugzilla. This patch reduces excessive
reclaim of slab objects reducing the amount of information that
has to be brought back in from disk. The third and fourth paragraphs
in the series describe the impact.

When a shrinker returns -1 to shrink_slab() to indicate it cannot do
any work given the current memory reclaim requirements, it adds the
entire total_scan count to shrinker->nr. The idea behind this is that
when the shrinker is next called and can do work, it will do the work
of the previously aborted shrinker call as well.

However, if a filesystem is doing lots of allocation with GFP_NOFS
set, then we get many, many more aborts from the shrinkers than we
do successful calls. The result is that shrinker->nr winds up to
its maximum permissible value (twice the current cache size) and
then when the next shrinker call that can do work is issued, it
has enough scan count built up to free the entire cache twice over.

This manifests itself in the cache going from full to empty in a
matter of seconds, even when only a small part of the cache is
needed to be emptied to free sufficient memory.

Under metadata intensive workloads on ext4 and XFS, I'm seeing the
VFS caches increase memory consumption up to 75% of memory (no page
cache pressure) over a period of 30-60s, and then the shrinker
empties them down to zero in the space of 2-3s. This cycle repeats
over and over again, with the shrinker completely trashing the inode
and dentry caches every minute or so the workload continues.

This behaviour was made obvious by the shrink_slab tracepoints added
earlier in the series, and made worse by the patch that corrected
the concurrent accounting of shrinker->nr.

To avoid this problem, stop repeated small increments of the total
scan value from winding shrinker->nr up to a value that can cause
the entire cache to be freed. We still need to allow it to wind up,
so use the delta as the "large scan" threshold check - if the delta
is more than a quarter of the entire cache size, then it is a large
scan and allowed to cause lots of windup because we are clearly
needing to free lots of memory.

If it isn't a large scan then limit the total scan to half the size
of the cache so that windup never increases to consume the whole
cache. Reducing the total scan limit further does not allow enough
wind-up to maintain the current levels of performance, whilst a
higher threshold does not prevent the windup from freeing the entire
cache under sustained workloads.
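
The two limits described above can be sketched as a pure function. This is a user-space illustration under assumed names (`clamp_total_scan()`, `cache_size`), not the code in shrink_slab(); the thresholds are the ones given in the text (a quarter of the cache for a "large scan", half the cache as the windup cap, twice the cache as the absolute maximum).

```c
#include <assert.h>

/*
 * total_scan: accumulated windup plus this call's delta.
 * delta:      the scan count computed for this call alone.
 * cache_size: number of objects currently in the cache.
 */
unsigned long clamp_total_scan(unsigned long total_scan,
			       unsigned long delta,
			       unsigned long cache_size)
{
	assert(delta <= total_scan);

	/* Not a large scan: never let windup exceed half the cache. */
	if (delta < cache_size / 4 && total_scan > cache_size / 2)
		total_scan = cache_size / 2;

	/* Absolute ceiling: twice the current cache size. */
	if (total_scan > cache_size * 2)
		total_scan = cache_size * 2;

	return total_scan;
}
```

With a cache of 1000 objects, a small delta of 10 on top of a windup of 1900 is cut to 500, while a delta of 300 is treated as a large scan and left at 1900.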

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 7554e3446a916363447a81a29f9300d3f2fbf503)

Signed-off-by: Jerry Snitselaar <jerry.snitselaar@oracle.com>

vmscan: shrinker->nr updates race and go wrong

commit acf92b485cccf028177f46918e045c0c4e80ee10 upstream.

Stable note: Not tracked in Bugzilla. This patch reduces excessive
reclaim of slab objects reducing the amount of information
that has to be brought back in from disk.

shrink_slab() allows shrinkers to be called in parallel so the
struct shrinker can be updated concurrently. It does not provide any
exclusion for such updates, so we can get the shrinker->nr value
increasing or decreasing incorrectly.

As a result, when a shrinker repeatedly returns a value of -1 (e.g.
a VFS shrinker called w/ GFP_NOFS), the shrinker->nr goes haywire,
sometimes updating with the scan count that wasn't used, sometimes
losing it altogether. Worse is when a shrinker does work and that
update is lost due to racy updates, which means the shrinker will do
the work again!

Fix this by making the total_scan calculations independent of
shrinker->nr, and making the shrinker->nr updates atomic w.r.t.
other updates via cmpxchg loops.
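
A compare-and-exchange retry loop of this kind can be sketched with C11 atomics. The kernel uses its own cmpxchg() primitive; `struct shrinker_model`, `take_deferred()` and `return_unused()` are hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for the shared deferred-scan counter. */
struct shrinker_model {
	atomic_long nr;
};

/* Atomically take the whole deferred count, leaving zero behind. */
long take_deferred(struct shrinker_model *s)
{
	long old;

	assert(s != NULL);
	old = atomic_load(&s->nr);
	/* On failure, old is refreshed with the current value; retry. */
	while (!atomic_compare_exchange_weak(&s->nr, &old, 0))
		;
	return old;
}

/* Atomically add unused scan work back; returns the new count. */
long return_unused(struct shrinker_model *s, long unused)
{
	long old, new_nr;

	assert(s != NULL);
	old = atomic_load(&s->nr);
	do {
		new_nr = old + unused; /* recomputed from the latest value */
	} while (!atomic_compare_exchange_weak(&s->nr, &old, new_nr));
	return new_nr;
}
```

Because each caller works on a private copy between the two operations, a concurrent update is retried rather than silently overwritten.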

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 6a5091a09f9278f8f821e3f33ac748633d143cea)

Signed-off-by: Jerry Snitselaar <jerry.snitselaar@oracle.com>

vmscan: add shrink_slab tracepoints

commit 095760730c1047c69159ce88021a7fa3833502c8 upstream.

Stable note: This patch makes later patches easier to apply but otherwise
        has little to justify it. It is a diagnostic patch that was part
        of a series addressing excessive slab shrinking after GFP_NOFS
        failures. There is detailed information on the series' motivation
        at https://lkml.org/lkml/2011/6/2/42 .

It is impossible to understand what the shrinkers are actually doing
without instrumenting the code, so add some tracepoints to allow
insight to be gained.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Mel Gorman <mgorman@suse.de>
(cherry picked from commit 5e5b3d2ed3aee6f8bbe38c0945876aacce11ff03)

Signed-off-by: Jerry Snitselaar <jerry.snitselaar@oracle.com>

vmscan: clear ZONE_CONGESTED for zone with good watermark

commit 439423f6894aa0dec22187526827456f5004baed upstream.

Stable note: Not tracked in Bugzilla. kswapd is responsible for clearing
ZONE_CONGESTED after it balances a zone and this patch fixes a bug
where that was failing to happen. Without this patch, processes
can stall in wait_iff_congested unnecessarily. For users, this can
look like an interactivity stall but some workloads would see it
as a sudden drop in throughput.

ZONE_CONGESTED is only cleared in kswapd, but pages can be freed in any
task.  It's possible ZONE_CONGESTED isn't cleared in some cases:

1. the zone is already balanced just entering balance_pgdat() for
    order-0 because concurrent tasks free memory.  In this case, later
    check will skip the zone as it's balanced so the flag isn't cleared.

2. high order balance falls back to order-0.  Quote from Mel: At the
    end of balance_pgdat(), kswapd uses the following logic;

    If reclaiming at high order {
        for each zone {
            if all_unreclaimable
                skip
            if watermark is not met
                order = 0
                loop again

            /* watermark is met */
            clear congested
        }
    }

    i.e. it clears ZONE_CONGESTED if the zone is balanced.  If not,
    it restarts balancing at order-0.  However, if the higher zones are
    balanced for order-0, kswapd will miss clearing ZONE_CONGESTED as
    that only happens after a zone is shrunk.  This can mean that
    wait_iff_congested() stalls unnecessarily.

This patch makes kswapd clear ZONE_CONGESTED during its initial
highmem->dma scan for zones that are already balanced.
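
The change can be pictured with a small model of the initial scan. `struct zone_model` and `clear_congested_on_balanced()` are hypothetical; the real code tests watermarks and zone flags rather than two booleans.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical zone state: only what this sketch needs. */
struct zone_model {
	bool balanced;   /* watermark already met */
	bool congested;  /* stand-in for the ZONE_CONGESTED flag */
};

/*
 * Walk the zones from the highest index down (highmem towards dma) and
 * clear the congested flag on every zone that is already balanced,
 * instead of leaving it for a later shrink that will never happen.
 */
void clear_congested_on_balanced(struct zone_model *zones, size_t nr_zones)
{
	assert(zones != NULL);
	for (size_t i = nr_zones; i-- > 0; ) {
		if (zones[i].balanced)
			zones[i].congested = false;
	}
}
```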

Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 564ea9dd5ab042cb2fe8373f4d627073706e1d4f)

Signed-off-by: Jerry Snitselaar <jerry.snitselaar@oracle.com>

mm: vmscan: fix force-scanning small targets without swap

commit a4d3e9e76337059406fcf3ead288c0df22a790e9 upstream.

Stable note: Not tracked in Bugzilla. This patch augments an earlier commit
that avoids scanning priority being artificially raised. The older
fix was particularly important for small memcgs to avoid calling
wait_iff_congested() unnecessarily.

Without swap, anonymous pages are not scanned. As such, they should not
count when considering force-scanning a small target if there is no swap.

Otherwise, targets are not force-scanned even when their effective scan
number is zero and the other conditions--kswapd/memcg--apply.

This fixes 246e87a93934 ("memcg: fix get_scan_count() for small
targets").

[akpm@linux-foundation.org: fix comment]
Signed-off-by: Johannes Weiner <jweiner@redhat.com>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Cc: Ying Han <yinghan@google.com>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Acked-by: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 33c17eafdeefb08fbb6ded946abcf024f76c9615)

Signed-off-by: Jerry Snitselaar <jerry.snitselaar@oracle.com>

mm: memory hotplug: Check if pages are correctly reserved on a per-section basis

commit 2bbcb8788311a40714b585fc11b51da6ffa2ab92 upstream.

Stable note: Fixes https://bugzilla.novell.com/show_bug.cgi?id=721039 .
Without the patch, memory hot-add can fail for kernel configurations
that do not set CONFIG_SPARSEMEM_VMEMMAP.

(Resending as I am not seeing it in -next so maybe it got lost)

mm: memory hotplug: Check if pages are correctly reserved on a per-section basis

It is expected that memory being brought online is PageReserved
similar to what happens when the page allocator is being brought up.
Memory is onlined in "memory blocks" which consist of one or more
sections. Unfortunately, the code that verifies PageReserved is
currently assuming that the memmap backing all these pages is virtually
contiguous which is only the case when CONFIG_SPARSEMEM_VMEMMAP is set.
As a result, memory hot-add is failing on those configurations with
the message;

kernel: section number XXX page number 256 not reserved, was it already online?

This patch updates the PageReserved check to lookup struct page once
per section to guarantee the correct struct page is being checked.
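
The per-section lookup can be illustrated as follows. The types and the tiny section size are assumptions made for the sketch (`struct page_model`, `struct section_model`, `MODEL_PAGES_PER_SECTION`); the point is only that each section's memmap is looked up separately instead of indexing one array across section boundaries.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_PAGES_PER_SECTION 4 /* tiny, for illustration only */

struct page_model {
	bool reserved;
};

/* Each section owns its own memmap; the arrays need not be adjacent. */
struct section_model {
	struct page_model *memmap;
};

bool pages_correctly_reserved(const struct section_model *sections,
			      size_t first_section, size_t nr_sections)
{
	assert(sections != NULL);
	for (size_t s = first_section; s < first_section + nr_sections; s++) {
		/* Look the memmap up once per section... */
		const struct page_model *map = sections[s].memmap;

		assert(map != NULL);
		/* ...and only index within that section. */
		for (size_t i = 0; i < MODEL_PAGES_PER_SECTION; i++) {
			if (!map[i].reserved)
				return false;
		}
	}
	return true;
}
```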

[Check pages within sections properly: rientjes@google.com]
[original patch by: nfont@linux.vnet.ibm.com]
Signed-off-by: Mel Gorman <mgorman@suse.de>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Tested-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
(cherry picked from commit 1126e70953638f9516b6a0b96385799c708815e4)

Signed-off-by: Jerry Snitselaar <jerry.snitselaar@oracle.com>

dm raid1: fix crash with mirror recovery and discard

commit 751f188dd5ab95b3f2b5f2f467c38aae5a2877eb upstream.

This patch fixes a crash when a discard request is sent during mirror
recovery.

Firstly, some background.  Generally, the following sequence happens during
mirror synchronization:
- function do_recovery is called
- do_recovery calls dm_rh_recovery_prepare
- dm_rh_recovery_prepare uses a semaphore to limit the number of
  simultaneously recovered regions (by default the semaphore value is 1,
  so only one region at a time is recovered)
- dm_rh_recovery_prepare calls __rh_recovery_prepare,
  __rh_recovery_prepare asks the log driver for the next region to
  recover. Then, it sets the region state to DM_RH_RECOVERING. If there
  are no pending I/Os on this region, the region is added to
  quiesced_regions list. If there are pending I/Os, the region is not
  added to any list. It is added to the quiesced_regions list later (by
  dm_rh_dec function) when all I/Os finish.
- when the region is on quiesced_regions list, there are no I/Os in
  flight on this region. The region is popped from the list in
  dm_rh_recovery_start function. Then, a kcopyd job is started in the
  recover function.
- when the kcopyd job finishes, recovery_complete is called. It calls
  dm_rh_recovery_end. dm_rh_recovery_end adds the region to
  recovered_regions or failed_recovered_regions list (depending on
  whether the copy operation was successful or not).

The above mechanism assumes that if the region is in DM_RH_RECOVERING
state, no new I/Os are started on this region. When I/O is started,
dm_rh_inc_pending is called, which increases reg->pending count. When
I/O is finished, dm_rh_dec is called. It decreases reg->pending count.
If the count is zero and the region was in DM_RH_RECOVERING state,
dm_rh_dec adds it to the quiesced_regions list.

Consequently, if we call dm_rh_inc_pending/dm_rh_dec while the region is
in DM_RH_RECOVERING state, it could be added to quiesced_regions list
multiple times or it could be added to this list when kcopyd is copying
data (it is assumed that the region is not on any list while kcopyd does
its jobs). This results in memory corruption and crash.

There already exist bypasses for REQ_FLUSH requests: REQ_FLUSH requests
do not belong to any region, so they are always added to the sync list
in do_writes. dm_rh_inc_pending does not increase count for REQ_FLUSH
requests. In mirror_end_io, dm_rh_dec is never called for REQ_FLUSH
requests. These bypasses avoid the crash possibility described above.

These bypasses were improperly implemented for REQ_DISCARD when
the mirror target gained discard support in commit
5fc2ffeabb9ee0fc0e71ff16b49f34f0ed3d05b4 (dm raid1: support discard).

In do_writes, REQ_DISCARD requests are always added to the sync queue and
immediately dispatched (even if the region is in DM_RH_RECOVERING).  However,
dm_rh_inc and dm_rh_dec are called for REQ_DISCARD requests.  So it violates the
rule that no I/Os are started on DM_RH_RECOVERING regions, and causes the list
corruption described above.

This patch changes it so that REQ_DISCARD requests follow the same path
as REQ_FLUSH. This avoids the crash.
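
The accounting rule can be sketched like this. Everything here is a simplified model with hypothetical names (`enum req_kind_model`, `struct region_model`, `io_start()`, `io_end()`); the real driver works on bios and the region hash, not on these types.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical request kinds standing in for the bio flags in the text. */
enum req_kind_model { REQ_KIND_WRITE, REQ_KIND_FLUSH, REQ_KIND_DISCARD };

/* Hypothetical per-region state: only the in-flight I/O count. */
struct region_model {
	int pending;
};

/* Flush and discard requests bypass per-region accounting entirely. */
bool counts_against_region(enum req_kind_model kind)
{
	return kind == REQ_KIND_WRITE;
}

/* Called when an I/O is issued (the increment side). */
void io_start(struct region_model *reg, enum req_kind_model kind)
{
	assert(reg != NULL);
	if (counts_against_region(kind))
		reg->pending++;
}

/* Called when an I/O completes (the decrement side). */
void io_end(struct region_model *reg, enum req_kind_model kind)
{
	assert(reg != NULL);
	if (counts_against_region(kind)) {
		assert(reg->pending > 0);
		reg->pending--;
	}
}
```

Since a discard never touches `pending`, it cannot move a recovering region onto a list a second time.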

Reference: https://bugzilla.redhat.com/837607

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit fbb41f55c42a4f4e708c9e9af926dc6227a5b52d)

Signed-off-by: Jerry Snitselaar <jerry.snitselaar@oracle.com>

UBIFS: fix a bug in empty space fix-up

commit c6727932cfdb13501108b16c38463c09d5ec7a74 upstream.

UBIFS has a feature called "empty space fix-up" which is a quirk to work-around
limitations of dumb flasher programs. Namely, of those flashers that are unable
to skip NAND pages full of 0xFFs while flashing, resulting in empty space at
the end of half-filled eraseblocks to be unusable for UBIFS. This feature is
relatively new (introduced in v3.0).

The fix-up routine (fixup_free_space()) is executed only once at the very first
mount if the superblock has the 'space_fixup' flag set (can be done with -F
option of mkfs.ubifs). It basically reads all the UBIFS data and metadata and
writes it back to the same LEB. The routine assumes the image is pristine and
does not have anything in the journal.

There was a bug in 'fixup_free_space()' where it fixed up the log incorrectly.
All but one LEB of the log of a pristine file-system are empty. And one
contains just a commit start node. And 'fixup_free_space()' just unmapped this
LEB, which resulted in wiping the commit start node. As a result, some users
were unable to mount the file-system next time with the following symptom:

UBIFS error (pid 1): replay_log_leb: first log node at LEB 3:0 is not CS node
UBIFS error (pid 1): replay_log_leb: log error detected while replaying the log at LEB 3:0

The root-cause of this bug was that 'fixup_free_space()' wrongly assumed
that the beginning of empty space in the log head (c->lhead_offs) was known
on mount. However, this is not the case: it was always 0. UBIFS does not store
it in the master node and finds it out by scanning the log on every mount.

The fix is simple - just pass commit start node size instead of 0 to
'fixup_leb()'.

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@linux.intel.com>
Reported-by: Iwo Mergler <Iwo.Mergler@netcommwireless.com>
Tested-by: Iwo Mergler <Iwo.Mergler@netcommwireless.com>
Reported-by: James Nute <newten82@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit cd050f56481c26e8f2a1d2fc89188d6c92537545)

Signed-off-by: Jerry Snitselaar <jerry.snitselaar@oracle.com>

mm: fix lost kswapd wakeup in kswapd_stop()

commit 1c7e7f6c0703d03af6bcd5ccc11fc15d23e5ecbe upstream.

Offlining memory may block forever, waiting for kswapd() to wake up
because kswapd() does not check the event kthread->should_stop before
sleeping.

The proper pattern, from Documentation/memory-barriers.txt, is:

   ---  waker  ---
   event_indicated = 1;
   wake_up_process(event_daemon);

   ---  sleeper  ---
   for (;;) {
      set_current_state(TASK_UNINTERRUPTIBLE);
      if (event_indicated)
         break;
      schedule();
   }

   set_current_state() may be wrapped by:
      prepare_to_wait();

In the kswapd() case, event_indicated is kthread->should_stop.

  === offlining memory (waker) ===
   kswapd_stop()
      kthread_stop()
         kthread->should_stop = 1
         wake_up_process()
         wait_for_completion()

  ===  kswapd_try_to_sleep (sleeper) ===
   kswapd_try_to_sleep()
      prepare_to_wait()
           .
           .
      schedule()
           .
           .
      finish_wait()

The schedule() needs to be protected by a test of kthread->should_stop,
which is wrapped by kthread_should_stop().
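
The ordering requirement can be modelled with C11 atomics. This is a single-threaded sketch with hypothetical names (`struct kthread_model`, `request_stop()`, `may_schedule()`); it shows only the re-check of the stop flag after the task has announced it is going to sleep, not the scheduler interaction.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the thread's stop request flag. */
struct kthread_model {
	atomic_bool should_stop;
};

/* Waker side: publish the event first; the wake-up would follow. */
void request_stop(struct kthread_model *k)
{
	assert(k != NULL);
	atomic_store(&k->should_stop, true);
}

/*
 * Sleeper side: called after the task has marked itself as about to
 * sleep. Returns true only if no stop request is pending, i.e. only
 * then is it safe to call schedule().
 */
bool may_schedule(struct kthread_model *k)
{
	assert(k != NULL);
	return !atomic_load(&k->should_stop);
}
```

If the stop request lands before the check, the sleeper skips schedule() and proceeds to finish_wait() instead of sleeping through the wake-up.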

Reproducer:
   Do heavy file I/O in background.
   Do a memory offline/online in a tight loop

Signed-off-by: Aaditya Kumar <aaditya.kumar@ap.sony.com>
Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Reviewed-by: Minchan Kim <minchan@kernel.org>
Acked-by: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 6d40de834ce9bb2964a70b7f91a98406eceb0399)

Signed-off-by: Jerry Snitselaar <jerry.snitselaar@oracle.com>

ntp: Fix STA_INS/DEL clearing bug

commit 6b1859dba01c7d512b72d77e3fd7da8354235189 upstream.

In commit 6b43ae8a619d17c4935c3320d2ef9e92bdeed05d, I
introduced a bug that kept the STA_INS or STA_DEL bit
from being cleared from time_status via adjtimex()
without forcing STA_PLL first.

Usually once the STA_INS is set, it isn't cleared
until the leap second is applied, so it's unlikely this
affected anyone. However during testing I noticed it
took some effort to cancel a leap second once STA_INS
was set.

Signed-off-by: John Stultz <johnstul@us.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Link: http://lkml.kernel.org/r/1342156917-25092-2-git-send-email-john.stultz@linaro.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit dccecc646f06f06db8c32fc6615fee847852cec6)

Signed-off-by: Jerry Snitselaar <jerry.snitselaar@oracle.com>

cifs: always update the inode cache with the results from a FIND_*

commit cd60042cc1392e79410dc8de9e9c1abb38a29e57 upstream.

When we get back a FIND_FIRST/NEXT result, we have some info about the
dentry that we use to instantiate a new inode. We were ignoring and
discarding that info when we had an existing dentry in the cache.

Fix this by updating the inode in place when we find an existing dentry
and the uniqueid is the same.

Reported-and-Tested-by: Andrew Bartlett <abartlet@samba.org>
Reported-by: Bill Robertson <bill_robertson@debortoli.com.au>
Reported-by: Dion Edwards <dion_edwards@debortoli.com.au>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit adccea444c2df5660fff32fe75563075b7d237f7)

Signed-off-by: Jerry Snitselaar <jerry.snitselaar@oracle.com>

SPEC: v2.6.39-300.9.0

Signed-off-by: Joe Jin <joe.jin@oracle.com>

cciss: Update HPSA_BOUNDARY.

Orabug: 14681165
When commit 06a315b was reverted, HPSA_BOUNDARY was not updated. As a result,
some devices may not work if cciss_allow_hpsa=1 is passed to the cciss driver.

Signed-off-by: Joe Jin <joe.jin@oracle.com>

SPEC: v2.6.39-300.8.0
Signed-off-by: Guru Anbalagane <guru.anbalagane@oracle.com>

ocfs2: Fix oops in ocfs2_fast_symlink_readpage() code path

from ocfs2-devel
Reported-and-Tested-by: Vincent Etienne <vetienne@aprogsys.com>
Signed-off-by: Sunil Mushran <sunil.mushran@gmail.com>

SPEC: v2.6.39-300.7.0

Signed-off-by: Maxim Uvarov <maxim.uvarov@oracle.com>

Merge branch 'uek2-merge' of git://ca-git.us.oracle.com/linux-konrad-public

Merge branch 'uek2-2.6.39-300-lpfc-update' of git://ca-git.us.oracle.com/linux-snits-public

hrtimer: fix kabi break.

Signed-off-by: Joe Jin <joe.jin@oracle.com>

timekeeping: Add missing update call in timekeeping_resume()

This is a backport of 3e997130bd2e8c6f5aaa49d6e3161d4d29b43ab0

The leap second rework unearthed another issue of inconsistent data.

On timekeeping_resume() the timekeeper data is updated, but nothing
calls timekeeping_update(), so now the update code in the timer
interrupt sees stale values.

This has been the case before those changes, but then the timer
interrupt was using stale data as well so this went unnoticed for quite
some time.

Add the missing update call, so all the data is consistent everywhere.

Reported-by: Andreas Schwab <schwab@linux-m68k.org>
Reported-and-tested-by: "Rafael J. Wysocki" <rjw@sisk.pl>
Reported-and-tested-by: Martin Steigerwald <Martin@lichtvoll.de>
Cc: John Stultz <johnstul@us.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Prarit Bhargava <prarit@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <johnstul@us.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <johnstul@us.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 0851978b661f25192ff763289698f3175b1bab42)

Signed-off-by: Joe Jin <joe.jin@oracle.com>

hrtimer: Update hrtimer base offsets each hrtimer_interrupt

This is a backport of 5baefd6d84163443215f4a99f6a20f054ef11236

The update of the hrtimer base offsets on all cpus cannot be made
atomically from the timekeeper.lock held and interrupt disabled region
as smp function calls are not allowed there.

clock_was_set(), which enforces the update on all cpus, is called
either from preemptible process context in case of do_settimeofday()
or from the softirq context when the offset modification happened in
the timer interrupt itself due to a leap second.

In both cases there is a race window for an hrtimer interrupt between
dropping timekeeper lock, enabling interrupts and clock_was_set()
issuing the updates. Any interrupt which arrives in that window will
see the new time but operate on stale offsets.

So we need to make sure that an hrtimer interrupt always sees a
consistent state of time and offsets.

ktime_get_update_offsets() allows us to get the current monotonic time
and update the per cpu hrtimer base offsets from hrtimer_interrupt()
to capture a consistent state of monotonic time and the offsets. The
function replaces the existing ktime_get() calls in hrtimer_interrupt().

The overhead of the new function vs. ktime_get() is minimal as it just
adds two store operations.

This ensures that any changes to realtime or boottime offsets are
noticed and stored into the per-cpu hrtimer base structures, prior to
any hrtimer expiration and guarantees that timers are not expired early.
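
The shape of such a combined read can be sketched as below. The struct and function names are hypothetical (`struct timekeeper_model`, `struct hrtimer_base_model`, `get_update_offsets_model()`), and the locking that makes the three values mutually consistent in the kernel is deliberately left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical snapshot of the timekeeper state, in nanoseconds. */
struct timekeeper_model {
	long long mono_ns;      /* current monotonic time */
	long long offs_real_ns; /* monotonic -> realtime offset */
	long long offs_boot_ns; /* monotonic -> boottime offset */
};

/* Hypothetical per-CPU copy of the offsets used when expiring timers. */
struct hrtimer_base_model {
	long long offs_real_ns;
	long long offs_boot_ns;
};

/*
 * Return the monotonic time and refresh the per-CPU offsets in the same
 * step. Compared with reading the time alone, the extra cost is the two
 * stores below. In the kernel the reads are done under the timekeeper's
 * lock; that is omitted in this sketch.
 */
long long get_update_offsets_model(const struct timekeeper_model *tk,
				   struct hrtimer_base_model *base)
{
	assert(tk != NULL && base != NULL);
	base->offs_real_ns = tk->offs_real_ns;
	base->offs_boot_ns = tk->offs_boot_ns;
	return tk->mono_ns;
}
```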

Signed-off-by: John Stultz <johnstul@us.ibm.com>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Prarit Bhargava <prarit@redhat.com>
Link: http://lkml.kernel.org/r/1341960205-56738-8-git-send-email-johnstul@us.ibm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <johnstul@us.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit bb6ed34f2a6eeb40608b8ca91f3ec90ec9dca26f)

Signed-off-by: Joe Jin <joe.jin@oracle.com>

timekeeping: Provide hrtimer update function

This is a backport of f6c06abfb3972ad4914cef57d8348fcb2932bc3b

To finally fix the infamous leap second issue and other race windows
caused by functions which change the offsets between the various time
bases (CLOCK_MONOTONIC, CLOCK_REALTIME and CLOCK_BOOTTIME) we need a
function which atomically gets the current monotonic time and updates
the offsets of CLOCK_REALTIME and CLOCK_BOOTTIME with minimalistic
overhead. The previous patch which provides ktime_t offsets allows us
to make this function almost as cheap as ktime_get() which is going to
be replaced in hrtimer_interrupt().

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Prarit Bhargava <prarit@redhat.com>
Signed-off-by: John Stultz <johnstul@us.ibm.com>
Link: http://lkml.kernel.org/r/1341960205-56738-7-git-send-email-johnstul@us.ibm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <johnstul@us.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 22f4bbcfb131e2392c78ad67af35fdd436d4dd54)

Signed-off-by: Joe Jin <joe.jin@oracle.com>

hrtimers: Move lock held region in hrtimer_interrupt()

This is a backport of 196951e91262fccda81147d2bcf7fdab08668b40

We need to update the base offsets from this code and we need to do
that under base->lock. Move the lock held region around the
ktime_get() calls. The ktime_get() calls are going to be replaced with
a function which gets the time and the offsets atomically.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Prarit Bhargava <prarit@redhat.com>
Signed-off-by: John Stultz <johnstul@us.ibm.com>
Link: http://lkml.kernel.org/r/1341960205-56738-6-git-send-email-johnstul@us.ibm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <johnstul@us.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 6c89f2ce05ea7e26a7580ad9eb950f2c4f10891b)

Signed-off-by: Joe Jin <joe.jin@oracle.com>

timekeeping: Maintain ktime_t based offsets for hrtimers

This is a backport of 5b9fe759a678e05be4937ddf03d50e950207c1c0

We need to update the hrtimer clock offsets from the hrtimer interrupt
context. To avoid conversions from timespec to ktime_t maintain a
ktime_t based representation of those offsets in the timekeeper. This
puts the conversion overhead into the code which updates the
underlying offsets and provides fast accessible values in the hrtimer
interrupt.
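
A rough illustration of "convert once on update, read cheaply later" follows. It is a userspace sketch under stated assumptions: the struct and function names are hypothetical, a plain 64-bit nanosecond count stands in for ktime_t, and the sign convention (the realtime offset being the negated wall_to_monotonic value) is an assumption of the sketch rather than something stated in the message above.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

#define NSEC_PER_SEC_SKETCH 1000000000LL

/* Hypothetical timekeeper fragment: the timespec plus a cached ns offset. */
struct tk_offsets_sketch {
    struct timespec wall_to_monotonic; /* authoritative value */
    int64_t offs_real_ns;              /* cached, ready for the fast path */
};

static int64_t timespec_to_ns_sketch(struct timespec ts)
{
    return (int64_t)ts.tv_sec * NSEC_PER_SEC_SKETCH + ts.tv_nsec;
}

/* Slow path: runs whenever the underlying offset changes. */
static void set_wall_to_monotonic_sketch(struct tk_offsets_sketch *tk,
                                         struct timespec wtm)
{
    assert(wtm.tv_nsec >= 0 && wtm.tv_nsec < NSEC_PER_SEC_SKETCH);
    tk->wall_to_monotonic = wtm;
    /* Assumed convention: realtime = monotonic - wall_to_monotonic. */
    tk->offs_real_ns = -timespec_to_ns_sketch(wtm);
}

/* Fast path: what an interrupt handler would read, with no conversion. */
static int64_t get_offs_real_sketch(const struct tk_offsets_sketch *tk)
{
    return tk->offs_real_ns;
}
```

The conversion cost is paid by the rare updater, so the frequently executed reader only loads a precomputed value.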

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <johnstul@us.ibm.com>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Prarit Bhargava <prarit@redhat.com>
Link: http://lkml.kernel.org/r/1341960205-56738-4-git-send-email-johnstul@us.ibm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <johnstul@us.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 03a90b9a6f7eec70edde4eb1f88fa8a5c058d85e)

Signed-off-by: Joe Jin <joe.jin@oracle.com>

timekeeping: Fix leapsecond triggered load spike issue

This is a backport of 4873fa070ae84a4115f0b3c9dfabc224f1bc7c51

The timekeeping code misses an update of the hrtimer subsystem after a
leap second happened. As a result, timers based on CLOCK_REALTIME
expire a second early or late, depending on whether a leap second was
inserted or deleted, until some operation triggers that update. Unless
the update happens by some other means, this discrepancy between the
timekeeping and the hrtimer data persists and timers keep expiring
early or late.

The reported immediate workaround - $ date -s "`date`" - causes a
call to clock_was_set() which updates the hrtimer data structures.
See: http://www.sheeri.com/content/mysql-and-leap-second-high-cpu-and-fix

Add the missing clock_was_set() call to update_wall_time() in case of
a leap second event. The actual update is deferred to softirq context
as the necessary smp function call cannot be invoked from hard
interrupt context.

Signed-off-by: John Stultz <johnstul@us.ibm.com>
Reported-by: Jan Engelhardt <jengelh@inai.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Prarit Bhargava <prarit@redhat.com>
Link: http://lkml.kernel.org/r/1341960205-56738-3-git-send-email-johnstul@us.ibm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <johnstul@us.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit d21e4baf4523fec26e3c70cb78b013ad3b245c83)

Signed-off-by: Joe Jin <joe.jin@oracle.com>

hrtimer: Provide clock_was_set_delayed()

This is a backport of f55a6faa384304c89cfef162768e88374d3312cb

clock_was_set() cannot be called from hard interrupt context because
it calls on_each_cpu().

For fixing the widely reported leap seconds issue it is necessary to
call it from hard interrupt context, i.e. the timer tick code, which
does the timekeeping updates.

Provide a new function which records the request in the hrtimer cpu base
structure of the cpu on which it is called and raises the hrtimer
softirq. We then execute the clock_was_set() notification from
softirq context in run_hrtimer_softirq(). The hrtimer softirq is
rarely used, so polling the flag there is not a performance issue.

[ tglx: Made it depend on CONFIG_HIGH_RES_TIMERS. We really should get
rid of all this ifdeffery ASAP ]
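
The "mark a flag now, do the work later" pattern described above can be sketched in plain C. This is a hypothetical userspace model, not kernel code: the struct, its fields and the *_sketch function names are invented for illustration, and the softirq is modelled as an ordinary function call made later from a safe context.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-cpu state; only the fields needed for the sketch. */
struct cpu_base_sketch {
    bool clock_was_set_pending; /* set from "hard interrupt" context */
    bool softirq_raised;        /* models raising the hrtimer softirq */
    unsigned notifications;     /* counts completed clock_was_set() runs */
};

/* Stand-in for the expensive notification that must not run in hard irq. */
static void clock_was_set_sketch(struct cpu_base_sketch *b)
{
    b->notifications++;
}

/* Cheap and safe from hard interrupt context: only mark and raise. */
static void clock_was_set_delayed_sketch(struct cpu_base_sketch *b)
{
    assert(b);
    b->clock_was_set_pending = true;
    b->softirq_raised = true;
}

/* Runs later in "softirq" context: poll the flag and do the real work. */
static void run_hrtimer_softirq_sketch(struct cpu_base_sketch *b)
{
    assert(b);
    b->softirq_raised = false;
    if (b->clock_was_set_pending) {
        b->clock_was_set_pending = false;
        clock_was_set_sketch(b);
    }
}
```

Calling the delayed variant only flips two flags; the notification itself happens once, on the next run of the softirq handler, and later runs find the flag clear.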

Signed-off-by: John Stultz <johnstul@us.ibm.com>
Reported-by: Jan Engelhardt <jengelh@inai.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Prarit Bhargava <prarit@redhat.com>
Link: http://lkml.kernel.org/r/1341960205-56738-2-git-send-email-johnstul@us.ibm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <johnstul@us.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 62b787f886e2d96cc7c5428aeee05dbe32a9531b)

Signed-off-by: Joe Jin <joe.jin@oracle.com>

time: Move common updates to a function

This is a backport of cc06268c6a87db156af2daed6e96a936b955cc82

While not a bugfix itself, it allows the following fixes to be
backported in a more straightforward manner.

CC: Thomas Gleixner <tglx@linutronix.de>
CC: Eric Dumazet <eric.dumazet@gmail.com>
CC: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit c7e2580578671c4d19a1a83e6fdb2482cc136283)

Signed-off-by: Joe Jin <joe.jin@oracle.com>

timekeeping: Fix CLOCK_MONOTONIC inconsistency during leapsecond

This is a backport of fad0c66c4bb836d57a5f125ecd38bed653ca863a
which resolves a bug in the previous commit.

Commit 6b43ae8a61 (ntp: Fix leap-second hrtimer livelock) broke the
leapsecond update of CLOCK_MONOTONIC. The missing leapsecond update to
wall_to_monotonic causes discontinuities in CLOCK_MONOTONIC.

Adjust wall_to_monotonic when NTP inserted a leapsecond.
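
The arithmetic behind this adjustment can be shown with a small sketch, assuming CLOCK_MONOTONIC is derived as wall time plus wall_to_monotonic. The struct and function names are hypothetical and only whole seconds are modelled.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical clock state, whole seconds only. */
struct clock_sketch {
    int64_t xtime_sec;             /* wall clock (CLOCK_REALTIME) */
    int64_t wall_to_monotonic_sec; /* offset added to get CLOCK_MONOTONIC */
};

static int64_t monotonic_sec_sketch(const struct clock_sketch *c)
{
    return c->xtime_sec + c->wall_to_monotonic_sec;
}

/*
 * Apply a leap second: leap is -1 when a second is inserted (the wall
 * clock steps back) and +1 when one is deleted. Compensating in
 * wall_to_monotonic keeps the monotonic clock free of discontinuities.
 */
static void apply_leap_sketch(struct clock_sketch *c, int leap)
{
    assert(leap == -1 || leap == 1);
    c->xtime_sec += leap;
    c->wall_to_monotonic_sec -= leap;
}
```

Dropping the second assignment reproduces the class of bug described above: the wall clock steps but the sum no longer stays constant, so the monotonic clock jumps.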

Reported-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Tested-by: Richard Cochran <richardcochran@gmail.com>
Link: http://lkml.kernel.org/r/1338400497-12420-1-git-send-email-john.stultz@linaro.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <johnstul@us.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit c33f2424c3941986d402c81d380d4e805870a20f)
Conflicts:
kernel/time/timekeeping.c

Signed-off-by: Joe Jin <joe.jin@oracle.com>

ntp: Correct TAI offset during leap second

This is a backport of dd48d708ff3e917f6d6b6c2b696c3f18c019feed

When repeating a UTC time value during a leap second (when the UTC
time should be 23:59:60), the TAI timescale should not stop. The kernel
NTP code increments the TAI offset one second too late. This patch fixes
the issue by incrementing the offset during the leap second itself.
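
As an illustration of why the offset must change during the repeated second, here is a simplified model in which TAI is computed as UTC plus an offset. All names are hypothetical, the leap second is modelled as stepping the UTC counter back by one at a chosen instant, and the starting offset used in any example run is arbitrary.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical NTP state for the sketch, whole seconds only. */
struct ntp_sketch {
    int64_t utc_sec;   /* kernel UTC seconds counter */
    int tai_offset;    /* TAI - UTC in seconds */
    int64_t leap_at;   /* UTC value at which the inserted second takes effect */
    bool leap_pending; /* an insertion has been scheduled */
};

static int64_t tai_sec_sketch(const struct ntp_sketch *n)
{
    return n->utc_sec + n->tai_offset;
}

/* Advance the clock by one second of real time. */
static void tick_second_sketch(struct ntp_sketch *n)
{
    assert(n);
    n->utc_sec++;
    if (n->leap_pending && n->utc_sec == n->leap_at) {
        n->utc_sec--;    /* repeat the previous UTC value (23:59:60) */
        n->tai_offset++; /* bump the offset in the same step */
        n->leap_pending = false;
    }
}
```

Because the offset is incremented in the same step that repeats the UTC value, the derived TAI value advances by exactly one on every tick. Deferring the increment by one tick would make TAI stall for a second and then jump.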

Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 96bab736bad82423c2b312d602689a9078481fa9)

Signed-off-by: Joe Jin <joe.jin@oracle.com>

Revert "3.0.x: hrtimer: Fix clock_was_set so it is safe to call from irq context"

This reverts commit c51e012012e48ca262d4b489e33bc113bb5ac74d.

Revert "3.0.x: time: Fix leapsecond triggered hrtimer/futex load spike issue"

This reverts commit aac67aba83c32bd03f4b59bdd932a076afbee089.

Revert "3.0.x: hrtimer: Update hrtimer base offsets each hrtimer_interrupt"

This reverts commit 54b16ee687c86dfd6c94e49bdaa1535a3bf3cc9f.

scsi/lpfc: Resolve spinlock issue

Modify lpfc_scsi_cmd_iocb_cmpl() and lpfc_abort_handler() to use
spin_lock_irqsave/spin_unlock_irqrestore to lock phba->hbalock.

Signed-off-by: Jerry Snitselaar <jerry.snitselaar@oracle.com>

scsi/lpfc: Update lpfc version for 8.3.5.82.2p driver release

commit id: None

Signed-off-by: Jerry Snitselaar <jerry.snitselaar@oracle.com>

scsi/lpfc: Fix null pointer error for piocbq

commit id: a629852ab810015223eec7a2f31a6bd5f93c83cf

Signed-off-by: Jerry Snitselaar <jerry.snitselaar@oracle.com>

scsi/lpfc: Add missing jumps to mempool_free to fix potential memory leak

commit id: 4f4c18634d2a05079194ba333c7882349f25d6f7

Signed-off-by: Jerry Snitselaar <jerry.snitselaar@oracle.com>

scsi/lpfc: Fixed leaking memory from pci dma pool

commit id: http://marc.info/?l=linux-scsi&m=134496910830011

Signed-off-by: Jerry Snitselaar <jerry.snitselaar@oracle.com>

scsi/lpfc: Logged XRI of the SCSI command to be aborted on abort handler timeout

commit id: http://marc.info/?l=linux-scsi&m=134496908630003

Signed-off-by: Jerry Snitselaar <jerry.snitselaar@oracle.com>

scsi/lpfc: Fix bug with driver logging too many fcp underrun messages

commit id: http://marc.info/?l=linux-scsi&m=134496907930001

Signed-off-by: Jerry Snitselaar <jerry.snitselaar@oracle.com>

scsi/lpfc: Fixed unnecessary SCSI device reset escalation due to LLD handling of I/O abort

commit id: http://marc.info/?l=linux-scsi&m=134496910030010

Signed-off-by: Jerry Snitselaar <jerry.snitselaar@oracle.com>

scsi/lpfc: Fixed system panic due to midlayer abort and driver complete race on SCSI cmd

commit id: 4f2e66c6d225a14fcf77d826fe71f6137cb27352

Signed-off-by: Jerry Snitselaar <jerry.snitselaar@oracle.com>

scsi/lpfc: Fix unable to create vports on FCoE SLI4 adapter

commit id: a7dd9c0f44966b4328b52c5e32f8c3345e3482e5

Signed-off-by: Jerry Snitselaar <jerry.snitselaar@oracle.com>

scsi/lpfc: Fix BlockGuard lpfc_printf_vlog messages

commit id: http://marc.info/?l=linux-scsi&m=134496906329998

Signed-off-by: Jerry Snitselaar <jerry.snitselaar@oracle.com>

scsi/lpfc: Fix parameter field in CQE to mask for LOCAL_REJECT status

commit id: http://marc.info/?l=linux-scsi&m=134496905829996

Signed-off-by: Jerry Snitselaar <jerry.snitselaar@oracle.com>

scsi/lpfc: Fixed new requirement compatibility with Resource and Capacity Descriptors

commit id: http://marc.info/?l=linux-scsi&m=134496904429993

Signed-off-by: Jerry Snitselaar <jerry.snitselaar@oracle.com>

scsi/lpfc: Fixed incomplete list of SLI4 commands with extended 300 second timeout value

commit id: http://marc.info/?l=linux-scsi&m=134401210706394

Signed-off-by: Jerry Snitselaar <jerry.snitselaar@oracle.com>

scsi/lpfc: Fix switching ports on Fabric causing additional fc_host rport entries

commit id: http://marc.info/?l=linux-scsi&m=134401198306327

Signed-off-by: Jerry Snitselaar <jerry.snitselaar@oracle.com>

scsi/lpfc: Fix conflicts in log message numbers

commit id: http://marc.info/?l=linux-scsi&m=134401198306327

Signed-off-by: Jerry Snitselaar <jerry.snitselaar@oracle.com>

scsi/lpfc: Fixed kernel panic after scsi_eh escalation by checking the proper return status

commit id: http://marc.info/?l=linux-scsi&m=134401197206322

Signed-off-by: Jerry Snitselaar <jerry.snitselaar@oracle.com>

scsi/lpfc: Fix driver not checking data transferred on write commands

commit id: http://marc.info/?l=linux-scsi&m=134401196206316

Signed-off-by: Jerry Snitselaar <jerry.snitselaar@oracle.com>

scsi/lpfc: Fix bug with message 2520 appearing in the messages file

commit id: http://marc.info/?l=linux-scsi&m=134401194806312

Signed-off-by: Jerry Snitselaar <jerry.snitselaar@oracle.com>

scsi/lpfc: Fix bug with rrq_pool not being destroyed during driver removal

commit id: http://marc.info/?l=linux-scsi&m=134401193306311

Signed-off-by: Jerry Snitselaar <jerry.snitselaar@oracle.com>

scsi/lpfc: Fix Driver not attaching to OCe14000 adapters

commit id: http://marc.info/?l=linux-scsi&m=134401245106524

Signed-off-by: Jerry Snitselaar <jerry.snitselaar@oracle.com>

scsi/lpfc: Fix bug with driver not setting the diag set valid bit for loopback testing

commit id: http://marc.info/?l=linux-scsi&m=134401185006275

Signed-off-by: Jerry Snitselaar <jerry.snitselaar@oracle.com>

scsi/lpfc: Fix bug with driver not reporting misconfigured ports for Ganymede

commit id: 4b8bae08b296a1199ef40f21ea7f4685b2c56ec7

Signed-off-by: Jerry Snitselaar <jerry.snitselaar@oracle.com>

scsi/lpfc: Fix System Panic During IO Test using Medusa tool

commit id: 6b415f5d6c05eb7f4808e98baf539c5dbc53cdbc

Signed-off-by: Jerry Snitselaar <jerry.snitselaar@oracle.com>

scsi/lpfc: Fix fcp_imax module parameter to dynamically change FCP EQ delay multiplier

commit id: 173edbb2c326ce4839bae8caa868fe83ce46dda3

Signed-off-by: Jerry Snitselaar <jerry.snitselaar@oracle.com>

scsi/lpfc: Fix successful aborts returning incorrect status

commit id: 3a70730aa06c37d46086ecdbca7107531fe2d2c5

Signed-off-by: Jerry Snitselaar <jerry.snitselaar@oracle.com>

scsi/lpfc: Fixed system held-up when performing resource provision through same PCI function

commit id: 618a5230b8fa62bc7901b8b754b4379b3fcfa0f9

Signed-off-by: Jerry Snitselaar <jerry.snitselaar@oracle.com>

scsi/lpfc: Fixed debug helper routine failed to dump CQ and EQ entries in non-MSI-X mode

commit id: 3b3da6a974357887c73c5ee61988dbe3a8f62d88

Signed-off-by: Jerry Snitselaar <jerry.snitselaar@oracle.com>

scsi/lpfc: Fixed system crash due to not providing SCSI error-handling host reset handler

commit id: 27b01b821f136e657c28078007a865a307816c1a

Signed-off-by: Jerry Snitselaar <jerry.snitselaar@oracle.com>

scsi/lpfc: Fix bug with driver using the wrong xritag when sending an els echo

commit id: 93d1379e6924daef1968779d97c46ba2e0915fd2

Signed-off-by: Jerry Snitselaar <jerry.snitselaar@oracle.com>

scsi/lpfc: Increment capability to dump various SLI4 queues via debug helper routines

commit id: 809c75368d94d73c1fb4f1e6e3578ae3b5b72b1c

Signed-off-by: Jerry Snitselaar <jerry.snitselaar@oracle.com>

scsi/lpfc: Fix unsol abts xri lookup

commit id: ee0f4fe17b0fda87c7f4eb3ec6e96ef8291419bd

Signed-off-by: Jerry Snitselaar <jerry.snitselaar@oracle.com>

scsi/lpfc: Bug fixes for LPe16000 to LPe16000 discovery (CR 130446)

commit id: 939723a4a680a7863fc95179b1480c5529f31d88

Signed-off-by: Jerry Snitselaar <jerry.snitselaar@oracle.com>

scsi/lpfc: Reregister VPI for SLI3 after cable moved to new 8Gb FC Adapter port

commit id: 27aa1b73539f2c7118a68c9baaad590d3a92462f

Signed-off-by: Jerry Snitselaar <jerry.snitselaar@oracle.com>

scsi/lpfc: Fix driver crash during back-to-back ramp events

commit id: 75ad83a452116c00c092bdc4c842c4401cd24080

Signed-off-by: Jerry Snitselaar <jerry.snitselaar@oracle.com>

scsi/lpfc: Fix log message 2597 displayed when no error is detected

commit id: cc459f19e32bdc783f9f0ce5c872c1ff399e3e82

Signed-off-by: Jerry Snitselaar <jerry.snitselaar@oracle.com>

scsi/lpfc: Address FCP LOG support for Finisar trace correlation

commit id: 5a0d80fc0dd3c134d42df34e66e0f5fc91261b53

Signed-off-by: Jerry Snitselaar <jerry.snitselaar@oracle.com>

scsi/lpfc: Fix kernel panic when going into sleep state

commit id: 043c956f50ee9e19a02a681cdf198b0b964cf772

Signed-off-by: Jerry Snitselaar <jerry.snitselaar@oracle.com>

scsi/lpfc: Fix error message displayed even when not an error

commit id: 81378052645b137e9973aa5e5b2bc0ddd69023d8

Signed-off-by: Jerry Snitselaar <jerry.snitselaar@oracle.com>

scsi/lpfc: Fix Read Link status data

commit id: 37db57e32bd1b00170fdd38ab36a7f2acdd7557c

Signed-off-by: Jerry Snitselaar <jerry.snitselaar@oracle.com>

scsi/lpfc: Fix initiator sending flogi after acking flogi from target

commit id: e64464391d39b69c950d3645f001eb1af7a8bfd0

Signed-off-by: Jerry Snitselaar <jerry.snitselaar@oracle.com>

scsi/lpfc: Fix bug with driver not supporting the get controller attributes command

commit id: b99570dd63757834cd0c21e1b117c857af90a04a

Signed-off-by: Jerry Snitselaar <jerry.snitselaar@oracle.com>

scsi/lpfc: Incremented capability for handling SLI4-port XRI resource-provisioning profile change

commit id: 8a9d2e8003040d2e1cd24ac5e83bb30b68f7f488

Signed-off-by: Jerry Snitselaar <jerry.snitselaar@oracle.com>

scsi/lpfc: Sync driver base with upstream code

commit id: None

Signed-off-by: Jerry Snitselaar <jerry.snitselaar@oracle.com>

scsi/lpfc: Change default DA_ID support from disabled to enabled

commit id: cf9712403f384f9e832f489e7f41ab535c8f1a74

Signed-off-by: Jerry Snitselaar <jerry.snitselaar@oracle.com>

scsi/lpfc: Fix bug with driver unload leaving a scsi host for a vport around

commit id: bdcd2b926192c7f690a9cb4fb2de30eb820983fc

Signed-off-by: Jerry Snitselaar <jerry.snitselaar@oracle.com>

scsi/lpfc: Incremented capability for T10 DIF debugfs error injection (CR 123966)

commit id: 4ac9b22625333f9d86c01df702c83d2dfe732131

Signed-off-by: Jerry Snitselaar <jerry.snitselaar@oracle.com>

scsi/lpfc: Update copyright date for files modified in 2012

commit id: d85296cfddb0a4702bc9b05a6f288516b0adb6ba

Signed-off-by: Jerry Snitselaar <jerry.snitselaar@oracle.com>

scsi/lpfc: Refine T10 DIF debugfs error injection capability for verification usage (CR 123966)

commit id: acd6859b084d1e1b3ec8bc9befe6532223260d33

Signed-off-by: Jerry Snitselaar <jerry.snitselaar@oracle.com>

scsi/lpfc: Update copyright date for files modified in 2012

commit id: d85296cfddb0a4702bc9b05a6f288516b0adb6ba

Signed-off-by: Jerry Snitselaar <jerry.snitselaar@oracle.com>

scsi/lpfc: Make BA_ACC work on a fully qualified exchange (CR 126289)

commit id: f09c3acc451670a6f635a45acc6bdf4dc7ef2a4b

Signed-off-by: Jerry Snitselaar <jerry.snitselaar@oracle.com>

scsi/lpfc: Fix KERNEL allocation while lock held

commit id: b42c07c8ade6ae9d74f0fd01638760650b049cdd

Signed-off-by: Jerry Snitselaar <jerry.snitselaar@oracle.com>

scsi/lpfc: Incorrect usage of bghm for BlockGuard errors (CR 127022)

commit id: acd6859b084d1e1b3ec8bc9befe6532223260d33

Signed-off-by: Jerry Snitselaar <jerry.snitselaar@oracle.com>

scsi/lpfc: Fixed capability to inject T10 DIF errors via debugfs (CR 123966)

commit id: acd6859b084d1e1b3ec8bc9befe6532223260d33

Signed-off-by: Jerry Snitselaar <jerry.snitselaar@oracle.com>

scsi/lpfc: Fix SLI4 BlockGuard behavior when protection data is generated by HBA (CR 121980)

commit id: acd6859b084d1e1b3ec8bc9befe6532223260d33

Signed-off-by: Jerry Snitselaar <jerry.snitselaar@oracle.com>

scsi/lpfc: Fixed driver logging in area of SLI4 port error attention and reset recovery

commit id: 6b5151fd7baec6812fece993ddd7a2cf9fd0125f

Signed-off-by: Jerry Snitselaar <jerry.snitselaar@oracle.com>

scsi/lpfc: Fixed the ability to process T10 DIF/Blockguard with SLI4 16Gb FC Adapters (CR 121980)

commit id: acd6859b084d1e1b3ec8bc9befe6532223260d33

Signed-off-by: Jerry Snitselaar <jerry.snitselaar@oracle.com>

scsi/lpfc: Merge from upstream: scsi: Fix up files implicitly depending on module.h inclusion

commit id: acf3368ffb75fc4a83726655d697e79646fe4eb3

Signed-off-by: Jerry Snitselaar <jerry.snitselaar@oracle.com>

Merge branch 'stable/for-linus-3.7.rebased' into uek2-merge

* stable/for-linus-3.7.rebased:
xen/p2m: Fix off-by-one error in checking the P2M tree directory.

xen/p2m: Fix off-by-one error in checking the P2M tree directory.

We would traverse the full P2M top directory (from 0->MAX_DOMAIN_PAGES
inclusive) when trying to figure out whether we can re-use some of the
P2M middle leafs.

This meant that if the kernel was compiled with MAX_DOMAIN_PAGES=512
we would try to use the 512th entry. Fortunately for us the p2m_top_index
has a check for this:

BUG_ON(pfn >= MAX_P2M_PFN);

which we hit and saw this:

(XEN) domain_crash_sync called from entry.S
(XEN) Domain 0 (vcpu#0) crashed on cpu#0:
(XEN) ----[ Xen-4.1.2-OVM  x86_64  debug=n  Tainted:    C ]----
(XEN) CPU:    0
(XEN) RIP:    e033:[<ffffffff819cadeb>]
(XEN) RFLAGS: 0000000000000212   EM: 1   CONTEXT: pv guest
(XEN) rax: ffffffff81db5000   rbx: ffffffff81db4000   rcx: 0000000000000000
(XEN) rdx: 0000000000480211   rsi: 0000000000000000   rdi: ffffffff81db4000
(XEN) rbp: ffffffff81793db8   rsp: ffffffff81793d38   r8:  0000000008000000
(XEN) r9:  4000000000000000   r10: 0000000000000000   r11: ffffffff81db7000
(XEN) r12: 0000000000000ff8   r13: ffffffff81df1ff8   r14: ffffffff81db6000
(XEN) r15: 0000000000000ff8   cr0: 000000008005003b   cr4: 00000000000026f0
(XEN) cr3: 0000000661795000   cr2: 0000000000000000
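
The class of bug described above, an inclusive upper bound on a walk over a fixed-size directory, can be reproduced in a few lines. This is a hypothetical sketch, not the Xen P2M code: the 512-slot size merely echoes the figure in the message, and the names are invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical directory size: valid indices are 0..511. */
#define TOP_ENTRIES_SKETCH 512u

/* Same shape as a "pfn >= MAX" guard: anything at or past the size is bad. */
static bool top_index_valid_sketch(size_t idx)
{
    return idx < TOP_ENTRIES_SKETCH;
}

/* Count how many indices a walk touches that lie outside the directory. */
static size_t out_of_range_visits_sketch(bool inclusive_bound)
{
    size_t last = inclusive_bound ? TOP_ENTRIES_SKETCH
                                  : TOP_ENTRIES_SKETCH - 1;
    size_t bad = 0;

    assert(last <= TOP_ENTRIES_SKETCH);
    for (size_t i = 0; i <= last; i++) {
        if (!top_index_valid_sketch(i))
            bad++; /* real code would trip its bounds check here */
    }
    return bad;
}
```

With the inclusive bound the walk touches exactly one invalid index; with the exclusive bound it touches none.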

Fixes-Oracle-Bug: 14570662
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>