]> www.infradead.org Git - nvme.git/log
nvme.git
5 years agomm/page_poison.c: replace bool variable with static key
Mateusz Nosek [Fri, 16 Oct 2020 03:07:33 +0000 (20:07 -0700)]
mm/page_poison.c: replace bool variable with static key

Variable 'want_page_poisoning' is a switch deciding if page poisoning
should be enabled.  This patch changes it to be static key.

Signed-off-by: Mateusz Nosek <mateusznosek0@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Oscar Salvador <OSalvador@suse.com>
Link: https://lkml.kernel.org/r/20200921152931.938-1-mateusznosek0@gmail.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agomm,hwpoison: try to narrow window race for free pages
Oscar Salvador [Fri, 16 Oct 2020 03:07:29 +0000 (20:07 -0700)]
mm,hwpoison: try to narrow window race for free pages

Aristeu Rozanski reported that a customer test case started to report
-EBUSY after the hwpoison rework patchset.

There is a race window between spotting a free page and taking it off its
buddy freelist, so it might be that by the time we try to take it off, the
page has been already allocated.

This patch tries to handle such race window by trying to handle the new
type of page again if the page was allocated under us.

Reported-by: Aristeu Rozanski <aris@ruivo.org>
Signed-off-by: Oscar Salvador <osalvador@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Tested-by: Aristeu Rozanski <aris@ruivo.org>
Acked-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Dmitry Yakunin <zeil@yandex-team.ru>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Oscar Salvador <osalvador@suse.com>
Cc: Qian Cai <cai@lca.pw>
Cc: Tony Luck <tony.luck@intel.com>
Link: https://lkml.kernel.org/r/20200922135650.1634-15-osalvador@suse.de
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agomm,hwpoison: double-check page count in __get_any_page()
Naoya Horiguchi [Fri, 16 Oct 2020 03:07:25 +0000 (20:07 -0700)]
mm,hwpoison: double-check page count in __get_any_page()

Soft offlining could fail with EIO due to the race condition with hugepage
migration.  This issuse became visible due to the change by previous patch
that makes soft offline handler take page refcount by its own.  We have no
way to directly pin zero refcount page, and the page considered as a zero
refcount page could be allocated just after the first check.

This patch adds the second check to find the race and gives us chance to
handle it more reliably.

Reported-by: Qian Cai <cai@lca.pw>
Signed-off-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Aristeu Rozanski <aris@ruivo.org>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Dmitry Yakunin <zeil@yandex-team.ru>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Oscar Salvador <osalvador@suse.com>
Cc: Tony Luck <tony.luck@intel.com>
Link: https://lkml.kernel.org/r/20200922135650.1634-14-osalvador@suse.de
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agomm,hwpoison: introduce MF_MSG_UNSPLIT_THP
Naoya Horiguchi [Fri, 16 Oct 2020 03:07:21 +0000 (20:07 -0700)]
mm,hwpoison: introduce MF_MSG_UNSPLIT_THP

memory_failure() is supposed to call action_result() when it handles a
memory error event, but there's one missing case.  So let's add it.

I find that include/ras/ras_event.h has some other MF_MSG_* undefined, so
this patch also adds them.

Signed-off-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Signed-off-by: Oscar Salvador <osalvador@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Aristeu Rozanski <aris@ruivo.org>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Dmitry Yakunin <zeil@yandex-team.ru>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Oscar Salvador <osalvador@suse.com>
Cc: Qian Cai <cai@lca.pw>
Cc: Tony Luck <tony.luck@intel.com>
Link: https://lkml.kernel.org/r/20200922135650.1634-13-osalvador@suse.de
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agomm,hwpoison: return 0 if the page is already poisoned in soft-offline
Oscar Salvador [Fri, 16 Oct 2020 03:07:17 +0000 (20:07 -0700)]
mm,hwpoison: return 0 if the page is already poisoned in soft-offline

Currently, there is an inconsistency when calling soft-offline from
different paths on a page that is already poisoned.

1) madvise:

        madvise_inject_error skips any poisoned page and continues
        the loop.
        If that was the only page to madvise, it returns 0.

2) /sys/devices/system/memory/:

        When calling soft_offline_page_store()->soft_offline_page(),
        we return -EBUSY in case the page is already poisoned.
        This is inconsistent with a) the above example and b)
        memory_failure, where we return 0 if the page was poisoned.

Fix this by dropping the PageHWPoison() check in madvise_inject_error, and
let soft_offline_page return 0 if it finds the page already poisoned.

Please, note that this represents a user-api change, since now the return
error when calling soft_offline_page_store()->soft_offline_page() will be
different.

Signed-off-by: Oscar Salvador <osalvador@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Aristeu Rozanski <aris@ruivo.org>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Dmitry Yakunin <zeil@yandex-team.ru>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Oscar Salvador <osalvador@suse.com>
Cc: Qian Cai <cai@lca.pw>
Cc: Tony Luck <tony.luck@intel.com>
Link: https://lkml.kernel.org/r/20200922135650.1634-12-osalvador@suse.de
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agomm,hwpoison: refactor soft_offline_huge_page and __soft_offline_page
Oscar Salvador [Fri, 16 Oct 2020 03:07:13 +0000 (20:07 -0700)]
mm,hwpoison: refactor soft_offline_huge_page and __soft_offline_page

Merging soft_offline_huge_page and __soft_offline_page let us get rid of
quite some duplicated code, and makes the code much easier to follow.

Now, __soft_offline_page will handle both normal and hugetlb pages.

Signed-off-by: Oscar Salvador <osalvador@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Aristeu Rozanski <aris@ruivo.org>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Dmitry Yakunin <zeil@yandex-team.ru>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Oscar Salvador <osalvador@suse.com>
Cc: Qian Cai <cai@lca.pw>
Cc: Tony Luck <tony.luck@intel.com>
Link: https://lkml.kernel.org/r/20200922135650.1634-11-osalvador@suse.de
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agomm,hwpoison: rework soft offline for in-use pages
Oscar Salvador [Fri, 16 Oct 2020 03:07:09 +0000 (20:07 -0700)]
mm,hwpoison: rework soft offline for in-use pages

This patch changes the way we set and handle in-use poisoned pages.  Until
now, poisoned pages were released to the buddy allocator, trusting that
the checks that take place at allocation time would act as a safe net and
would skip that page.

This has proved to be wrong, as we got some pfn walkers out there, like
compaction, that all they care is the page to be in a buddy freelist.

Although this might not be the only user, having poisoned pages in the
buddy allocator seems a bad idea as we should only have free pages that
are ready and meant to be used as such.

Before explaining the taken approach, let us break down the kind of pages
we can soft offline.

- Anonymous THP (after the split, they end up being 4K pages)
- Hugetlb
- Order-0 pages (that can be either migrated or invalited)

* Normal pages (order-0 and anon-THP)

  - If they are clean and unmapped page cache pages, we invalidate
    then by means of invalidate_inode_page().
  - If they are mapped/dirty, we do the isolate-and-migrate dance.

Either way, do not call put_page directly from those paths.  Instead, we
keep the page and send it to page_handle_poison to perform the right
handling.

page_handle_poison sets the HWPoison flag and does the last put_page.

Down the chain, we placed a check for HWPoison page in
free_pages_prepare, that just skips any poisoned page, so those pages
do not end up in any pcplist/freelist.

After that, we set the refcount on the page to 1 and we increment
the poisoned pages counter.

If we see that the check in free_pages_prepare creates trouble, we can
always do what we do for free pages:

  - wait until the page hits buddy's freelists
  - take it off, and flag it

The downside of the above approach is that we could race with an
allocation, so by the time we  want to take the page off the buddy, the
page has been already allocated so we cannot soft offline it.
But the user could always retry it.

* Hugetlb pages

  - We isolate-and-migrate them

After the migration has been successful, we call dissolve_free_huge_page,
and we set HWPoison on the page if we succeed.
Hugetlb has a slightly different handling though.

While for non-hugetlb pages we cared about closing the race with an
allocation, doing so for hugetlb pages requires quite some additional
and intrusive code (we would need to hook in free_huge_page and some other
places).
So I decided to not make the code overly complicated and just fail
normally if the page we allocated in the meantime.

We can always build on top of this.

As a bonus, because of the way we handle now in-use pages, we no longer
need the put-as-isolation-migratetype dance, that was guarding for poisoned
pages to end up in pcplists.

Signed-off-by: Oscar Salvador <osalvador@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Aristeu Rozanski <aris@ruivo.org>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Dmitry Yakunin <zeil@yandex-team.ru>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Oscar Salvador <osalvador@suse.com>
Cc: Qian Cai <cai@lca.pw>
Cc: Tony Luck <tony.luck@intel.com>
Link: https://lkml.kernel.org/r/20200922135650.1634-10-osalvador@suse.de
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agomm,hwpoison: rework soft offline for free pages
Oscar Salvador [Fri, 16 Oct 2020 03:07:05 +0000 (20:07 -0700)]
mm,hwpoison: rework soft offline for free pages

When trying to soft-offline a free page, we need to first take it off the
buddy allocator.  Once we know is out of reach, we can safely flag it as
poisoned.

take_page_off_buddy will be used to take a page meant to be poisoned off
the buddy allocator.  take_page_off_buddy calls break_down_buddy_pages,
which splits a higher-order page in case our page belongs to one.

Once the page is under our control, we call page_handle_poison to set it
as poisoned and grab a refcount on it.

Signed-off-by: Oscar Salvador <osalvador@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Aristeu Rozanski <aris@ruivo.org>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Dmitry Yakunin <zeil@yandex-team.ru>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Oscar Salvador <osalvador@suse.com>
Cc: Qian Cai <cai@lca.pw>
Cc: Tony Luck <tony.luck@intel.com>
Link: https://lkml.kernel.org/r/20200922135650.1634-9-osalvador@suse.de
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agomm,hwpoison: unify THP handling for hard and soft offline
Oscar Salvador [Fri, 16 Oct 2020 03:07:01 +0000 (20:07 -0700)]
mm,hwpoison: unify THP handling for hard and soft offline

Place the THP's page handling in a helper and use it from both hard and
soft-offline machinery, so we get rid of some duplicated code.

Signed-off-by: Oscar Salvador <osalvador@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Aristeu Rozanski <aris@ruivo.org>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Dmitry Yakunin <zeil@yandex-team.ru>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Oscar Salvador <osalvador@suse.com>
Cc: Qian Cai <cai@lca.pw>
Cc: Tony Luck <tony.luck@intel.com>
Link: https://lkml.kernel.org/r/20200922135650.1634-8-osalvador@suse.de
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agomm,hwpoison: kill put_hwpoison_page
Oscar Salvador [Fri, 16 Oct 2020 03:06:57 +0000 (20:06 -0700)]
mm,hwpoison: kill put_hwpoison_page

After commit 4e41a30c6d50 ("mm: hwpoison: adjust for new thp
refcounting"), put_hwpoison_page got reduced to a put_page.  Let us just
use put_page instead.

Signed-off-by: Oscar Salvador <osalvador@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Aristeu Rozanski <aris@ruivo.org>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Dmitry Yakunin <zeil@yandex-team.ru>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Oscar Salvador <osalvador@suse.com>
Cc: Qian Cai <cai@lca.pw>
Cc: Tony Luck <tony.luck@intel.com>
Link: https://lkml.kernel.org/r/20200922135650.1634-7-osalvador@suse.de
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agomm,hwpoison: refactor madvise_inject_error
Oscar Salvador [Fri, 16 Oct 2020 03:06:53 +0000 (20:06 -0700)]
mm,hwpoison: refactor madvise_inject_error

Make a proper if-else condition for {hard,soft}-offline.

Signed-off-by: Oscar Salvador <osalvador@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Qian Cai <cai@lca.pw>
Cc: Tony Luck <tony.luck@intel.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Aristeu Rozanski <aris@ruivo.org>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Dmitry Yakunin <zeil@yandex-team.ru>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Link: https://lkml.kernel.org/r/20200908075626.11976-3-osalvador@suse.de
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agomm,hwpoison: unexport get_hwpoison_page and make it static
Oscar Salvador [Fri, 16 Oct 2020 03:06:50 +0000 (20:06 -0700)]
mm,hwpoison: unexport get_hwpoison_page and make it static

Since get_hwpoison_page is only used in memory-failure code now, let us
un-export it and make it private to that code.

Signed-off-by: Oscar Salvador <osalvador@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Aristeu Rozanski <aris@ruivo.org>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Dmitry Yakunin <zeil@yandex-team.ru>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Oscar Salvador <osalvador@suse.com>
Cc: Qian Cai <cai@lca.pw>
Cc: Tony Luck <tony.luck@intel.com>
Link: https://lkml.kernel.org/r/20200922135650.1634-5-osalvador@suse.de
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agomm,hwpoison-inject: don't pin for hwpoison_filter
Naoya Horiguchi [Fri, 16 Oct 2020 03:06:46 +0000 (20:06 -0700)]
mm,hwpoison-inject: don't pin for hwpoison_filter

Another memory error injection interface debugfs:hwpoison/corrupt-pfn also
takes bogus refcount for hwpoison_filter().  It's justified because this
does a coarse filter, expecting that memory_failure() redoes the check for
sure.

Signed-off-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Signed-off-by: Oscar Salvador <osalvador@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Aristeu Rozanski <aris@ruivo.org>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Dmitry Yakunin <zeil@yandex-team.ru>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Oscar Salvador <osalvador@suse.com>
Cc: Qian Cai <cai@lca.pw>
Cc: Tony Luck <tony.luck@intel.com>
Link: https://lkml.kernel.org/r/20200922135650.1634-4-osalvador@suse.de
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agomm, hwpoison: remove recalculating hpage
Naoya Horiguchi [Fri, 16 Oct 2020 03:06:42 +0000 (20:06 -0700)]
mm, hwpoison: remove recalculating hpage

hpage is never used after try_to_split_thp_page() in memory_failure(), so
we don't have to update hpage.  So let's not recalculate/use hpage.

Suggested-by: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
Signed-off-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Signed-off-by: Oscar Salvador <osalvador@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Aristeu Rozanski <aris@ruivo.org>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Dmitry Yakunin <zeil@yandex-team.ru>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Oscar Salvador <osalvador@suse.com>
Cc: Qian Cai <cai@lca.pw>
Cc: Tony Luck <tony.luck@intel.com>
Link: https://lkml.kernel.org/r/20200922135650.1634-3-osalvador@suse.de
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agomm,hwpoison: cleanup unused PageHuge() check
Naoya Horiguchi [Fri, 16 Oct 2020 03:06:38 +0000 (20:06 -0700)]
mm,hwpoison: cleanup unused PageHuge() check

Patch series "HWPOISON: soft offline rework", v7.

This patchset fixes a couple of issues that the patchset Naoya sent [1]
contained due to rebasing problems and a misunterdansting.

Main focus of this series is to stabilize soft offline.  Historically soft
offlined pages have suffered from racy conditions because PageHWPoison is
used to a little too aggressively, which (directly or indirectly) invades
other mm code which cares little about hwpoison.  This results in
unexpected behavior or kernel panic, which is very far from soft offline's
"do not disturb userspace or other kernel component" policy.  An example
of this can be found here [2].

Along with several cleanups, this code refactors and changes the way soft
offline work.  Main point of this change set is to contain target page
"via buddy allocator" or in migrating path.  For ther former we first free
the target page as we do for normal pages, and once it has reached buddy
and it has been taken off the freelists, we flag it as HWpoison.  For the
latter we never get to release the page in unmap_and_move, so the page is
under our control and we can handle it in hwpoison code.

[1] https://patchwork.kernel.org/cover/11704083/
[2] https://lore.kernel.org/linux-mm/20190826104144.GA7849@linux/T/#u

This patch (of 14):

Drop the PageHuge check, which is dead code since memory_failure() forks
into memory_failure_hugetlb() for hugetlb pages.

memory_failure() and memory_failure_hugetlb() shares some functions like
hwpoison_user_mappings() and identify_page_state(), so they should
properly handle 4kB page, thp, and hugetlb.

Signed-off-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Signed-off-by: Oscar Salvador <osalvador@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Dmitry Yakunin <zeil@yandex-team.ru>
Cc: Qian Cai <cai@lca.pw>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
Cc: Aristeu Rozanski <aris@ruivo.org>
Cc: Oscar Salvador <osalvador@suse.com>
Link: https://lkml.kernel.org/r/20200922135650.1634-1-osalvador@suse.de
Link: https://lkml.kernel.org/r/20200922135650.1634-2-osalvador@suse.de
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agomm/readahead: pass a file_ra_state into force_page_cache_ra
David Howells [Fri, 16 Oct 2020 03:06:35 +0000 (20:06 -0700)]
mm/readahead: pass a file_ra_state into force_page_cache_ra

The file_ra_state being passed into page_cache_sync_readahead() was being
ignored in favour of using the one embedded in the struct file.  The only
caller for which this makes a difference is the fsverity code if the file
has been marked as POSIX_FADV_RANDOM, but it's confusing and worth fixing.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Eric Biggers <ebiggers@google.com>
Link: https://lkml.kernel.org/r/20200903140844.14194-10-willy@infradead.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agomm/filemap: fold ra_submit into do_sync_mmap_readahead
David Howells [Fri, 16 Oct 2020 03:06:31 +0000 (20:06 -0700)]
mm/filemap: fold ra_submit into do_sync_mmap_readahead

Fold ra_submit() into its last remaining user and pass the
readahead_control struct to both do_page_cache_ra() and
page_cache_sync_ra().

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Eric Biggers <ebiggers@google.com>
Link: https://lkml.kernel.org/r/20200903140844.14194-9-willy@infradead.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agomm/readahead: add page_cache_sync_ra and page_cache_async_ra
Matthew Wilcox (Oracle) [Fri, 16 Oct 2020 03:06:28 +0000 (20:06 -0700)]
mm/readahead: add page_cache_sync_ra and page_cache_async_ra

Reimplement page_cache_sync_readahead() and page_cache_async_readahead()
as wrappers around versions of the function which take a readahead_control
in preparation for making do_sync_mmap_readahead() pass down an RAC
struct.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Eric Biggers <ebiggers@google.com>
Link: https://lkml.kernel.org/r/20200903140844.14194-8-willy@infradead.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agomm/readahead: pass readahead_control to force_page_cache_ra
David Howells [Fri, 16 Oct 2020 03:06:24 +0000 (20:06 -0700)]
mm/readahead: pass readahead_control to force_page_cache_ra

Reimplement force_page_cache_readahead() as a wrapper around
force_page_cache_ra().  Pass the existing readahead_control from
page_cache_sync_readahead().

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Eric Biggers <ebiggers@google.com>
Link: https://lkml.kernel.org/r/20200903140844.14194-7-willy@infradead.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agomm/readahead: make ondemand_readahead take a readahead_control
David Howells [Fri, 16 Oct 2020 03:06:21 +0000 (20:06 -0700)]
mm/readahead: make ondemand_readahead take a readahead_control

Make ondemand_readahead() take a readahead_control struct in preparation
for making do_sync_mmap_readahead() pass down an RAC struct.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Eric Biggers <ebiggers@google.com>
Link: https://lkml.kernel.org/r/20200903140844.14194-6-willy@infradead.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agomm/readahead: make do_page_cache_ra take a readahead_control
Matthew Wilcox (Oracle) [Fri, 16 Oct 2020 03:06:17 +0000 (20:06 -0700)]
mm/readahead: make do_page_cache_ra take a readahead_control

Rename __do_page_cache_readahead() to do_page_cache_ra() and call it
directly from ondemand_readahead() instead of indirecting via ra_submit().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Eric Biggers <ebiggers@google.com>
Link: https://lkml.kernel.org/r/20200903140844.14194-5-willy@infradead.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agomm/readahead: make page_cache_ra_unbounded take a readahead_control
Matthew Wilcox (Oracle) [Fri, 16 Oct 2020 03:06:14 +0000 (20:06 -0700)]
mm/readahead: make page_cache_ra_unbounded take a readahead_control

Define it in the callers instead of in page_cache_ra_unbounded().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Eric Biggers <ebiggers@google.com>
Link: https://lkml.kernel.org/r/20200903140844.14194-4-willy@infradead.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agomm/readahead: add DEFINE_READAHEAD
Matthew Wilcox (Oracle) [Fri, 16 Oct 2020 03:06:10 +0000 (20:06 -0700)]
mm/readahead: add DEFINE_READAHEAD

Patch series "Readahead patches for 5.9/5.10".

These are infrastructure for both the THP patchset and for the fscache
rewrite,

For both pieces of infrastructure being build on top of this patchset, we
want the ractl to be available higher in the call-stack.

For David's work, he wants to add the 'critical page' to the ractl so that
he knows which page NEEDS to be brought in from storage, and which ones
are nice-to-have.  We might want something similar in block storage too.
It used to be simple -- the first page was the critical one, but then mmap
added fault-around and so for that usecase, the middle page is the
critical one.  Anyway, I don't have any code to show that yet, we just
know that the lowest point in the callchain where we have that information
is do_sync_mmap_readahead() and so the ractl needs to start its life
there.

For THP, we havew the code that needs it.  It's actually the apex patch to
the series; the one which finally starts to allocate THPs and present them
to consenting filesystems:
http://git.infradead.org/users/willy/pagecache.git/commitdiff/798bcf30ab2eff278caad03a9edca74d2f8ae760

This patch (of 8):

Allow for a more concise definition of a struct readahead_control.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Eric Biggers <ebiggers@google.com>
Cc: David Howells <dhowells@redhat.com>
Link: https://lkml.kernel.org/r/20200903140844.14194-1-willy@infradead.org
Link: https://lkml.kernel.org/r/20200903140844.14194-3-willy@infradead.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agomm: fix a race during THP splitting
Huang Ying [Fri, 16 Oct 2020 03:06:07 +0000 (20:06 -0700)]
mm: fix a race during THP splitting

It is reported that the following bug is triggered if the HDD is used as
swap device,

[ 5758.157556] BUG: kernel NULL pointer dereference, address: 0000000000000007
[ 5758.165331] #PF: supervisor write access in kernel mode
[ 5758.171161] #PF: error_code(0x0002) - not-present page
[ 5758.176894] PGD 0 P4D 0
[ 5758.179721] Oops: 0002 [#1] SMP PTI
[ 5758.183614] CPU: 10 PID: 316 Comm: kswapd1 Kdump: loaded Tainted: G S               --------- ---  5.9.0-0.rc3.1.tst.el8.x86_64 #1
[ 5758.196717] Hardware name: Intel Corporation S2600CP/S2600CP, BIOS SE5C600.86B.02.01.0002.082220131453 08/22/2013
[ 5758.208176] RIP: 0010:split_swap_cluster+0x47/0x60
[ 5758.213522] Code: c1 e3 06 48 c1 eb 0f 48 8d 1c d8 48 89 df e8 d0 20 6a 00 80 63 07 fb 48 85 db 74 16 48 89 df c6 07 00 66 66 66 90 31 c0 5b c3 <80> 24 25 07 00 00 00 fb 31 c0 5b c3 b8 f0 ff ff ff 5b c3 66 0f 1f
[ 5758.234478] RSP: 0018:ffffb147442d7af0 EFLAGS: 00010246
[ 5758.240309] RAX: 0000000000000000 RBX: 000000000014b217 RCX: ffffb14779fd9000
[ 5758.248281] RDX: 000000000014b217 RSI: ffff9c52f2ab1400 RDI: 000000000014b217
[ 5758.256246] RBP: ffffe00c51168080 R08: ffffe00c5116fe08 R09: ffff9c52fffd3000
[ 5758.264208] R10: ffffe00c511537c8 R11: ffff9c52fffd3c90 R12: 0000000000000000
[ 5758.272172] R13: ffffe00c51170000 R14: ffffe00c51170000 R15: ffffe00c51168040
[ 5758.280134] FS:  0000000000000000(0000) GS:ffff9c52f2a80000(0000) knlGS:0000000000000000
[ 5758.289163] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 5758.295575] CR2: 0000000000000007 CR3: 0000000022a0e003 CR4: 00000000000606e0
[ 5758.303538] Call Trace:
[ 5758.306273]  split_huge_page_to_list+0x88b/0x950
[ 5758.311433]  deferred_split_scan+0x1ca/0x310
[ 5758.316202]  do_shrink_slab+0x12c/0x2a0
[ 5758.320491]  shrink_slab+0x20f/0x2c0
[ 5758.324482]  shrink_node+0x240/0x6c0
[ 5758.328469]  balance_pgdat+0x2d1/0x550
[ 5758.332652]  kswapd+0x201/0x3c0
[ 5758.336157]  ? finish_wait+0x80/0x80
[ 5758.340147]  ? balance_pgdat+0x550/0x550
[ 5758.344525]  kthread+0x114/0x130
[ 5758.348126]  ? kthread_park+0x80/0x80
[ 5758.352214]  ret_from_fork+0x22/0x30
[ 5758.356203] Modules linked in: fuse zram rfkill sunrpc intel_rapl_msr intel_rapl_common sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp mgag200 iTCO_wdt crct10dif_pclmul iTCO_vendor_support drm_kms_helper crc32_pclmul ghash_clmulni_intel syscopyarea sysfillrect sysimgblt fb_sys_fops cec rapl joydev intel_cstate ipmi_si ipmi_devintf drm intel_uncore i2c_i801 ipmi_msghandler pcspkr lpc_ich mei_me i2c_smbus mei ioatdma ip_tables xfs libcrc32c sr_mod sd_mod cdrom t10_pi sg igb ahci libahci i2c_algo_bit crc32c_intel libata dca wmi dm_mirror dm_region_hash dm_log dm_mod
[ 5758.412673] CR2: 0000000000000007
[    0.000000] Linux version 5.9.0-0.rc3.1.tst.el8.x86_64 (mockbuild@x86-vm-15.build.eng.bos.redhat.com) (gcc (GCC) 8.3.1 20191121 (Red Hat 8.3.1-5), GNU ld version 2.30-79.el8) #1 SMP Wed Sep 9 16:03:34 EDT 2020

After further digging it's found that the following race condition exists in the
original implementation,

CPU1                                                             CPU2
----                                                             ----
deferred_split_scan()
  split_huge_page(page) /* page isn't compound head */
    split_huge_page_to_list(page, NULL)
      __split_huge_page(page, )
        ClearPageCompound(head)
        /* unlock all subpages except page (not head) */
                                                                 add_to_swap(head)  /* not THP */
                                                                   get_swap_page(head)
                                                                   add_to_swap_cache(head, )
                                                                     SetPageSwapCache(head)
     if PageSwapCache(head)
       split_swap_cluster(/* swap entry of head */)
         /* Deref sis->cluster_info: NULL accessing! */

So, in split_huge_page_to_list(), PageSwapCache() is called for the already
split and unlocked "head", which may be added to swap cache in another CPU.  So
split_swap_cluster() may be called wrongly.

To fix the race, the call to split_swap_cluster() is moved to
__split_huge_page() before all subpages are unlocked.  So that the
PageSwapCache() is stable.

Fixes: 59807685a7e77 ("mm, THP, swap: support splitting THP for THP swap out")
Reported-by: Rafael Aquini <aquini@redhat.com>
Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Tested-by: Rafael Aquini <aquini@redhat.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Matthew Wilcox <willy@infradead.org>
Link: https://lkml.kernel.org/r/20201009073647.1531083-1-ying.huang@intel.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agofs: do not update nr_thps for mappings which support THPs
Matthew Wilcox (Oracle) [Fri, 16 Oct 2020 03:06:03 +0000 (20:06 -0700)]
fs: do not update nr_thps for mappings which support THPs

The nr_thps counter is to support THPs in the page cache when the
filesystem doesn't understand THPs.  Eventually it will be removed, but we
should still support filesystems which do not understand THPs yet.  Move
the nr_thp manipulation functions to filemap.h since they're page-cache
specific.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Dave Chinner <dchinner@redhat.com>
Cc: Christoph Hellwig <hch@infradead.org>
Link: https://lkml.kernel.org/r/20200916032717.22917-2-willy@infradead.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agofs: add a filesystem flag for THPs
Matthew Wilcox (Oracle) [Fri, 16 Oct 2020 03:06:00 +0000 (20:06 -0700)]
fs: add a filesystem flag for THPs

The page cache needs to know whether the filesystem supports THPs so that
it doesn't send THPs to filesystems which can't handle them.  Dave Chinner
points out that getting from the page mapping to the filesystem type is
too many steps (mapping->host->i_sb->s_type->fs_flags) so cache that
information in the address space flags.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Dave Chinner <dchinner@redhat.com>
Cc: Christoph Hellwig <hch@infradead.org>
Link: https://lkml.kernel.org/r/20200916032717.22917-1-willy@infradead.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agomm/vmscan: allow arbitrary sized pages to be paged out
Matthew Wilcox (Oracle) [Fri, 16 Oct 2020 03:05:56 +0000 (20:05 -0700)]
mm/vmscan: allow arbitrary sized pages to be paged out

Remove the assumption that a compound page has HPAGE_PMD_NR pins from the
page cache.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: SeongJae Park <sjpark@amazon.de>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: "Huang, Ying" <ying.huang@intel.com>
Link: https://lkml.kernel.org/r/20200908195539.25896-12-willy@infradead.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agomm/page-writeback: support tail pages in wait_for_stable_page
Matthew Wilcox (Oracle) [Fri, 16 Oct 2020 03:05:53 +0000 (20:05 -0700)]
mm/page-writeback: support tail pages in wait_for_stable_page

page->mapping is undefined for tail pages, so operate exclusively on the
head page.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: SeongJae Park <sjpark@amazon.de>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Huang Ying <ying.huang@intel.com>
Link: https://lkml.kernel.org/r/20200908195539.25896-11-willy@infradead.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agomm/truncate: fix truncation for pages of arbitrary size
Matthew Wilcox (Oracle) [Fri, 16 Oct 2020 03:05:50 +0000 (20:05 -0700)]
mm/truncate: fix truncation for pages of arbitrary size

Remove the assumption that a compound page is HPAGE_PMD_SIZE, and the
assumption that any page is PAGE_SIZE.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: SeongJae Park <sjpark@amazon.de>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Huang Ying <ying.huang@intel.com>
Link: https://lkml.kernel.org/r/20200908195539.25896-10-willy@infradead.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agomm/rmap: fix assumptions of THP size
Matthew Wilcox (Oracle) [Fri, 16 Oct 2020 03:05:46 +0000 (20:05 -0700)]
mm/rmap: fix assumptions of THP size

Ask the page what size it is instead of assuming it's PMD size.  Do this
for anon pages as well as file pages for when someone decides to support
that.  Leave the assumption alone for pages which are PMD mapped; we don't
currently grow THPs beyond PMD size, so we don't need to change this code
yet.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: SeongJae Park <sjpark@amazon.de>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Huang Ying <ying.huang@intel.com>
Link: https://lkml.kernel.org/r/20200908195539.25896-9-willy@infradead.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agomm/huge_memory: fix can_split_huge_page assumption of THP size
Matthew Wilcox (Oracle) [Fri, 16 Oct 2020 03:05:43 +0000 (20:05 -0700)]
mm/huge_memory: fix can_split_huge_page assumption of THP size

Ask the page how many subpages it has instead of assuming it's PMD size.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: SeongJae Park <sjpark@amazon.de>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: "Huang, Ying" <ying.huang@intel.com>
Link: https://lkml.kernel.org/r/20200908195539.25896-8-willy@infradead.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agomm/huge_memory: fix page_trans_huge_mapcount assumption of THP size
Matthew Wilcox (Oracle) [Fri, 16 Oct 2020 03:05:39 +0000 (20:05 -0700)]
mm/huge_memory: fix page_trans_huge_mapcount assumption of THP size

Ask the page what size it is instead of assuming it's PMD size.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: SeongJae Park <sjpark@amazon.de>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Huang Ying <ying.huang@intel.com>
Link: https://lkml.kernel.org/r/20200908195539.25896-7-willy@infradead.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agomm/huge_memory: fix split assumption of page size
Kirill A. Shutemov [Fri, 16 Oct 2020 03:05:36 +0000 (20:05 -0700)]
mm/huge_memory: fix split assumption of page size

File THPs may now be of arbitrary size, and we can't rely on that size
after doing the split so remember the number of pages before we start the
split.

Signed-off-by: Kirill A. Shutemov <kirill@shutemov.name>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: SeongJae Park <sjpark@amazon.de>
Cc: Huang Ying <ying.huang@intel.com>
Link: https://lkml.kernel.org/r/20200908195539.25896-6-willy@infradead.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agomm/huge_memory: fix total_mapcount assumption of page size
Kirill A. Shutemov [Fri, 16 Oct 2020 03:05:33 +0000 (20:05 -0700)]
mm/huge_memory: fix total_mapcount assumption of page size

File THPs may now be of arbitrary order.

Signed-off-by: Kirill A. Shutemov <kirill@shutemov.name>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: SeongJae Park <sjpark@amazon.de>
Cc: Huang Ying <ying.huang@intel.com>
Link: https://lkml.kernel.org/r/20200908195539.25896-5-willy@infradead.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agomm/page_owner: change split_page_owner to take a count
Matthew Wilcox (Oracle) [Fri, 16 Oct 2020 03:05:29 +0000 (20:05 -0700)]
mm/page_owner: change split_page_owner to take a count

The implementation of split_page_owner() prefers a count rather than the
old order of the page.  When we support a variable size THP, we won't
have the order at this point, but we will have the number of pages.
So change the interface to what the caller and callee would prefer.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: SeongJae Park <sjpark@amazon.de>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Huang Ying <ying.huang@intel.com>
Link: https://lkml.kernel.org/r/20200908195539.25896-4-willy@infradead.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agomm/memory: remove page fault assumption of compound page size
Matthew Wilcox (Oracle) [Fri, 16 Oct 2020 03:05:26 +0000 (20:05 -0700)]
mm/memory: remove page fault assumption of compound page size

A compound page in the page cache will not necessarily be of PMD size,
so check explicitly.

[willy@infradead.org: fix remove page fault assumption of compound page size]
Link: https://lkml.kernel.org/r/20201001152259.14932-1-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Huang Ying <ying.huang@intel.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Link: https://lkml.kernel.org/r/20200908195539.25896-3-willy@infradead.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agomm/filemap: fix page cache removal for arbitrary sized THPs
Matthew Wilcox (Oracle) [Fri, 16 Oct 2020 03:05:23 +0000 (20:05 -0700)]
mm/filemap: fix page cache removal for arbitrary sized THPs

Patch series "Remove assumptions of THP size".

There are a number of places in the VM which assume that a THP is a PMD in
size.  That's true today, and remains true after this patch series, but
this is a prerequisite for switching to arbitrary-sized THPs.
thp_nr_pages() still returns either HPAGE_PMD_NR or 1, but will be changed
later.

This patch (of 11):

page_cache_free_page() assumes THPs are PMD_SIZE; fix that assumption.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Huang Ying <ying.huang@intel.com>
Link: https://lkml.kernel.org/r/20200908195539.25896-1-willy@infradead.org
Link: https://lkml.kernel.org/r/20200908195539.25896-2-willy@infradead.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agomm/filemap: fix storing to a THP shadow entry
Matthew Wilcox (Oracle) [Fri, 16 Oct 2020 03:05:20 +0000 (20:05 -0700)]
mm/filemap: fix storing to a THP shadow entry

When a THP is removed from the page cache by reclaim, we replace it with a
shadow entry that occupies all slots of the XArray previously occupied by
the THP.  If the user then accesses that page again, we only allocate a
single page, but storing it into the shadow entry replaces all entries
with that one page.  That leads to bugs like

page dumped because: VM_BUG_ON_PAGE(page_to_pgoff(page) != offset)
------------[ cut here ]------------
kernel BUG at mm/filemap.c:2529!

https://bugzilla.kernel.org/show_bug.cgi?id=206569

This is hard to reproduce with mainline, but happens regularly with the
THP patchset (as so many more THPs are created).  This solution is take
from the THP patchset.  It splits the shadow entry into order-0 pieces at
the time that we bring a new page into cache.

Fixes: 99cb0dbd47a1 ("mm,thp: add read-only THP support for (non-shmem) FS")
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: "Kirill A . Shutemov" <kirill@shutemov.name>
Cc: Qian Cai <cai@lca.pw>
Link: https://lkml.kernel.org/r/20200903183029.14930-4-willy@infradead.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agoXArray: add xas_split
Matthew Wilcox (Oracle) [Fri, 16 Oct 2020 03:05:16 +0000 (20:05 -0700)]
XArray: add xas_split

In order to use multi-index entries for huge pages in the page cache, we
need to be able to split a multi-index entry (eg if a file is truncated in
the middle of a huge page entry).  This version does not support splitting
more than one level of the tree at a time.  This is an acceptable
limitation for the page cache as we do not expect to support order-12
pages in the near future.

[akpm@linux-foundation.org: export xas_split_alloc() to modules]
[willy@infradead.org: fix xarray split]
Link: https://lkml.kernel.org/r/20200910175450.GV6583@casper.infradead.org
[willy@infradead.org: fix xarray]
Link: https://lkml.kernel.org/r/20201001233943.GW20115@casper.infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: "Kirill A . Shutemov" <kirill@shutemov.name>
Cc: Qian Cai <cai@lca.pw>
Cc: Song Liu <songliubraving@fb.com>
Link: https://lkml.kernel.org/r/20200903183029.14930-3-willy@infradead.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agoXArray: add xa_get_order
Matthew Wilcox (Oracle) [Fri, 16 Oct 2020 03:05:13 +0000 (20:05 -0700)]
XArray: add xa_get_order

Patch series "Fix read-only THP for non-tmpfs filesystems".

As described more verbosely in the [3/3] changelog, we can inadvertently
put an order-0 page in the page cache which occupies 512 consecutive
entries.  Users are running into this if they enable the
READ_ONLY_THP_FOR_FS config option; see
https://bugzilla.kernel.org/show_bug.cgi?id=206569 and Qian Cai has also
reported it here:
https://lore.kernel.org/lkml/20200616013309.GB815@lca.pw/

This is a rather intrusive way of fixing the problem, but has the
advantage that I've actually been testing it with the THP patches, which
means that it sees far more use than it does upstream -- indeed, Song has
been entirely unable to reproduce it.  It also has the advantage that it
removes a few patches from my gargantuan backlog of THP patches.

This patch (of 3):

This function returns the order of the entry at the index.  We need this
because there isn't space in the shadow entry to encode its order.

[akpm@linux-foundation.org: export xa_get_order to modules]

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: "Kirill A . Shutemov" <kirill@shutemov.name>
Cc: Qian Cai <cai@lca.pw>
Cc: Song Liu <songliubraving@fb.com>
Link: https://lkml.kernel.org/r/20200903183029.14930-1-willy@infradead.org
Link: https://lkml.kernel.org/r/20200903183029.14930-2-willy@infradead.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agomm/debug_vm_pgtable: avoid doing memory allocation with pgtable_t mapped.
Aneesh Kumar K.V [Fri, 16 Oct 2020 03:05:10 +0000 (20:05 -0700)]
mm/debug_vm_pgtable: avoid doing memory allocation with pgtable_t mapped.

With highmem, pte_alloc_map() keep the level4 page table mapped using
kmap_atomic().  Avoid doing new memory allocation with page table mapped
like above.

[    9.409233] BUG: sleeping function called from invalid context at mm/page_alloc.c:4822
[    9.410557] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 1, name: swapper
[    9.411932] no locks held by swapper/1.
[    9.412595] CPU: 0 PID: 1 Comm: swapper Not tainted 5.9.0-rc3-00323-gc50eb1ed654b5 #2
[    9.413824] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
[    9.415207] Call Trace:
[    9.415651]  ? ___might_sleep.cold+0xa7/0xcc
[    9.416367]  ? __alloc_pages_nodemask+0x14c/0x5b0
[    9.417055]  ? swap_migration_tests+0x50/0x293
[    9.417704]  ? debug_vm_pgtable+0x4bc/0x708
[    9.418287]  ? swap_migration_tests+0x293/0x293
[    9.418911]  ? do_one_initcall+0x82/0x3cb
[    9.419465]  ? parse_args+0x1bd/0x280
[    9.419983]  ? rcu_read_lock_sched_held+0x36/0x60
[    9.420673]  ? trace_initcall_level+0x1f/0xf3
[    9.421279]  ? trace_initcall_level+0xbd/0xf3
[    9.421881]  ? do_basic_setup+0x9d/0xdd
[    9.422410]  ? do_basic_setup+0xc3/0xdd
[    9.422938]  ? kernel_init_freeable+0x72/0xa3
[    9.423539]  ? rest_init+0x134/0x134
[    9.424055]  ? kernel_init+0x5/0x12c
[    9.424574]  ? ret_from_fork+0x19/0x30

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Link: https://lkml.kernel.org/r/20200913110327.645310-1-aneesh.kumar@linux.ibm.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agomm/debug_vm_pgtable: avoid none pte in pte_clear_test
Aneesh Kumar K.V [Fri, 16 Oct 2020 03:05:06 +0000 (20:05 -0700)]
mm/debug_vm_pgtable: avoid none pte in pte_clear_test

pte_clear_tests operate on an existing pte entry.  Make sure that is not a
none pte entry.

[aneesh.kumar@linux.ibm.com: avoid kernel crash with riscv]
Link: https://lkml.kernel.org/r/20201015033206.140550-1-aneesh.kumar@linux.ibm.com
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nathan Chancellor <natechancellor@gmail.com>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Link: https://lkml.kernel.org/r/20200902114222.181353-14-aneesh.kumar@linux.ibm.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agomm/debug_vm_pgtable/hugetlb: disable hugetlb test on ppc64
Aneesh Kumar K.V [Fri, 16 Oct 2020 03:05:03 +0000 (20:05 -0700)]
mm/debug_vm_pgtable/hugetlb: disable hugetlb test on ppc64

The seems to be missing quite a lot of details w.r.t allocating the
correct pgtable_t page (huge_pte_alloc()), holding the right lock
(huge_pte_lock()) etc.  The vma used is also not a hugetlb VMA.

ppc64 do have runtime checks within CONFIG_DEBUG_VM for most of these.
Hence disable the test on ppc64.

[anshuman.khandual@arm.com: drop hugetlb_advanced_tests()]
Link: https://lore.kernel.org/lkml/289c3fdb-1394-c1af-bdc4-5542907089dc@linux.ibm.com/#t
Link: https://lkml.kernel.org/r/1600914446-21890-1-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lkml.kernel.org/r/20200902114222.181353-13-aneesh.kumar@linux.ibm.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agomm/debug_vm_pgtable/pmd_clear: don't use pmd/pud_clear on pte entries
Aneesh Kumar K.V [Fri, 16 Oct 2020 03:04:59 +0000 (20:04 -0700)]
mm/debug_vm_pgtable/pmd_clear: don't use pmd/pud_clear on pte entries

pmd_clear() should not be used to clear pmd level pte entries.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lkml.kernel.org/r/20200902114222.181353-12-aneesh.kumar@linux.ibm.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agomm/debug_vm_pgtable/thp: use page table depost/withdraw with THP
Aneesh Kumar K.V [Fri, 16 Oct 2020 03:04:56 +0000 (20:04 -0700)]
mm/debug_vm_pgtable/thp: use page table depost/withdraw with THP

Architectures like ppc64 use deposited page table while updating the huge
pte entries.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lkml.kernel.org/r/20200902114222.181353-11-aneesh.kumar@linux.ibm.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agomm/debug_vm_pgtable/locks: take correct page table lock
Aneesh Kumar K.V [Fri, 16 Oct 2020 03:04:53 +0000 (20:04 -0700)]
mm/debug_vm_pgtable/locks: take correct page table lock

Make sure we call pte accessors with correct lock held.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lkml.kernel.org/r/20200902114222.181353-10-aneesh.kumar@linux.ibm.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agomm/debug_vm_pgtable/locks: move non page table modifying test together
Aneesh Kumar K.V [Fri, 16 Oct 2020 03:04:49 +0000 (20:04 -0700)]
mm/debug_vm_pgtable/locks: move non page table modifying test together

This will help in adding proper locks in a later patch

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lkml.kernel.org/r/20200902114222.181353-9-aneesh.kumar@linux.ibm.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agomm/debug_vm_pgtable/set_pte/pmd/pud: don't use set_*_at to update an existing pte...
Aneesh Kumar K.V [Fri, 16 Oct 2020 03:04:46 +0000 (20:04 -0700)]
mm/debug_vm_pgtable/set_pte/pmd/pud: don't use set_*_at to update an existing pte entry

set_pte_at() should not be used to set a pte entry at locations that
already holds a valid pte entry.  Architectures like ppc64 don't do TLB
invalidate in set_pte_at() and hence expect it to be used to set locations
that are not a valid PTE.

Link: https://lkml.kernel.org/r/20200902114222.181353-8-aneesh.kumar@linux.ibm.com
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agomm/debug_vm_pgtable/savedwrite: enable savedwrite test with CONFIG_NUMA_BALANCING
Aneesh Kumar K.V [Fri, 16 Oct 2020 03:04:40 +0000 (20:04 -0700)]
mm/debug_vm_pgtable/savedwrite: enable savedwrite test with CONFIG_NUMA_BALANCING

Saved write support was added to track the write bit of a pte after
marking the pte protnone.  This was done so that AUTONUMA can convert a
write pte to protnone and still track the old write bit.  When converting
it back we set the pte write bit correctly thereby avoiding a write fault
again.  Hence enable the test only when CONFIG_NUMA_BALANCING is enabled
and use protnone protflags.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lkml.kernel.org/r/20200902114222.181353-6-aneesh.kumar@linux.ibm.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agomm/debug_vm_pgtables/hugevmap: use the arch helper to identify huge vmap support.
Aneesh Kumar K.V [Fri, 16 Oct 2020 03:04:36 +0000 (20:04 -0700)]
mm/debug_vm_pgtables/hugevmap: use the arch helper to identify huge vmap support.

ppc64 supports huge vmap only with radix translation.  Hence use arch
helper to determine the huge vmap support.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lkml.kernel.org/r/20200902114222.181353-5-aneesh.kumar@linux.ibm.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agomm/debug_vm_pgtable/ppc64: avoid setting top bits in radom value
Aneesh Kumar K.V [Fri, 16 Oct 2020 03:04:33 +0000 (20:04 -0700)]
mm/debug_vm_pgtable/ppc64: avoid setting top bits in radom value

ppc64 use bit 62 to indicate a pte entry (_PAGE_PTE).  Avoid setting that
bit in random value.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lkml.kernel.org/r/20200902114222.181353-4-aneesh.kumar@linux.ibm.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agopowerpc/mm: move setting pte specific flags to pfn_pte
Aneesh Kumar K.V [Fri, 16 Oct 2020 03:04:29 +0000 (20:04 -0700)]
powerpc/mm: move setting pte specific flags to pfn_pte

powerpc used to set the pte specific flags in set_pte_at().  This is
different from other architectures.  To be consistent with other
architecture update pfn_pte to set _PAGE_PTE on ppc64.  Also, drop now
unused pte_mkpte.

We add a VM_WARN_ON() to catch the usage of calling set_pte_at() without
setting _PAGE_PTE bit.  We will remove that after a few releases.

With respect to huge pmd entries, pmd_mkhuge() takes care of adding the
_PAGE_PTE bit.

[akpm@linux-foundation.org: whitespace fix, per Christophe]

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lkml.kernel.org/r/20200902114222.181353-3-aneesh.kumar@linux.ibm.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agopowerpc/mm: add DEBUG_VM WARN for pmd_clear
Aneesh Kumar K.V [Fri, 16 Oct 2020 03:04:26 +0000 (20:04 -0700)]
powerpc/mm: add DEBUG_VM WARN for pmd_clear

Patch series "mm/debug_vm_pgtable fixes", v4.

This patch series includes fixes for debug_vm_pgtable test code so that
they follow page table updates rules correctly.  The first two patches
introduce changes w.r.t ppc64.

Hugetlb test is disabled on ppc64 because that needs larger change to satisfy
page table update rules.

These tests are broken w.r.t page table update rules and results in kernel
crash as below.

[   21.083519] kernel BUG at arch/powerpc/mm/pgtable.c:304!
cpu 0x0: Vector: 700 (Program Check) at [c000000c6d1e76c0]
    pc: c00000000009a5ec: assert_pte_locked+0x14c/0x380
    lr: c0000000005eeeec: pte_update+0x11c/0x190
    sp: c000000c6d1e7950
   msr: 8000000002029033
  current = 0xc000000c6d172c80
  paca    = 0xc000000003ba0000   irqmask: 0x03   irq_happened: 0x01
    pid   = 1, comm = swapper/0
kernel BUG at arch/powerpc/mm/pgtable.c:304!
[link register   ] c0000000005eeeec pte_update+0x11c/0x190
[c000000c6d1e79500000000000000001 (unreliable)
[c000000c6d1e79b0c0000000005eee14 pte_update+0x44/0x190
[c000000c6d1e7a10c000000001a2ca9c pte_advanced_tests+0x160/0x3d8
[c000000c6d1e7ab0c000000001a2d4fc debug_vm_pgtable+0x7e8/0x1338
[c000000c6d1e7ba0c0000000000116ec do_one_initcall+0xac/0x5f0
[c000000c6d1e7c80c0000000019e4fac kernel_init_freeable+0x4dc/0x5a4
[c000000c6d1e7db0c000000000012474 kernel_init+0x24/0x160
[c000000c6d1e7e20c00000000000cbd0 ret_from_kernel_thread+0x5c/0x6c

With DEBUG_VM disabled

[   20.530152] BUG: Kernel NULL pointer dereference on read at 0x00000000
[   20.530183] Faulting instruction address: 0xc0000000000df330
cpu 0x33: Vector: 380 (Data SLB Access) at [c000000c6d19f700]
    pc: c0000000000df330: memset+0x68/0x104
    lr: c00000000009f6d8: hash__pmdp_huge_get_and_clear+0xe8/0x1b0
    sp: c000000c6d19f990
   msr: 8000000002009033
   dar: 0
  current = 0xc000000c6d177480
  paca    = 0xc00000001ec4f400   irqmask: 0x03   irq_happened: 0x01
    pid   = 1, comm = swapper/0
[link register   ] c00000000009f6d8 hash__pmdp_huge_get_and_clear+0xe8/0x1b0
[c000000c6d19f990c00000000009f748 hash__pmdp_huge_get_and_clear+0x158/0x1b0 (unreliable)
[c000000c6d19fa10c0000000019ebf30 pmd_advanced_tests+0x1f0/0x378
[c000000c6d19fab0c0000000019ed088 debug_vm_pgtable+0x79c/0x1244
[c000000c6d19fba0c0000000000116ec do_one_initcall+0xac/0x5f0
[c000000c6d19fc80c0000000019a4fac kernel_init_freeable+0x4dc/0x5a4
[c000000c6d19fdb0c000000000012474 kernel_init+0x24/0x160
[c000000c6d19fe20c00000000000cbd0 ret_from_kernel_thread+0x5c/0x6c

This patch (of 13):

With the hash page table, the kernel should not use pmd_clear for clearing
huge pte entries.  Add a DEBUG_VM WARN to catch the wrong usage.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Link: https://lkml.kernel.org/r/20200902114222.181353-1-aneesh.kumar@linux.ibm.com
Link: https://lkml.kernel.org/r/20200902114222.181353-2-aneesh.kumar@linux.ibm.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agodevice-dax/kmem: fix resource release
Dan Williams [Fri, 16 Oct 2020 03:04:22 +0000 (20:04 -0700)]
device-dax/kmem: fix resource release

The conversion to request_mem_region() is broken because it assumes that
the range is marked busy prior to release.  However, due to the way that
the kmem driver manipulates the IORESOURCE_BUSY flag (clears it to let
{add,remove}_memory() handle busy) it requires a manual release_resource()
to perform cleanup.

Given that the actual 'struct resource *' needs to be recalled, not just
the range, add that tracking to the kmem driver-data.

Fixes: 0513bd5bb114 ("device-dax/kmem: replace release_resource() with release_mem_region()")
Reported-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Brice Goglin <Brice.Goglin@inria.fr>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Jia He <justin.he@arm.com>
Cc: Joao Martins <joao.m.martins@oracle.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lkml.kernel.org/r/160272252925.3136502.17220638073995895400.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agoMerge tag 'linux-kselftest-kunit-fixes-5.10-rc1' of git://git.kernel.org/pub/scm...
Linus Torvalds [Thu, 15 Oct 2020 22:17:06 +0000 (15:17 -0700)]
Merge tag 'linux-kselftest-kunit-fixes-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull Kunit updates from Shuah Khan:
 "Several kunit tool bug fixes in flag handling, run outside kernel
  tree, make errors, and generating results"

* tag 'linux-kselftest-kunit-fixes-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  kunit: tool: fix display of make errors
  kunit: tool: handle when .kunit exists but .kunitconfig does not
  kunit: tool: fix --alltests flag
  kunit: tool: allow generating test results in JSON
  kunit: tool: fix running kunit_tool from outside kernel tree

5 years agoMerge tag 'linux-kselftest-next-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kerne...
Linus Torvalds [Thu, 15 Oct 2020 22:14:32 +0000 (15:14 -0700)]
Merge tag 'linux-kselftest-next-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull kselftest updates from Shuah Khan:

 - speed up headers_install done during selftest build

 - add generic make nesting support

 - add support to select individual tests:

   Selftests build/install generates run_kselftest.sh script to run
   selftests on a target system. Currently the script doesn't have
   support for selecting individual tests. Add support for it.

   With this enhancement, user can select test collections (or tests)
   individually. e.g:

      run_kselftest.sh -c seccomp -t timers:posix_timers -t timers:nanosleep

   Additionally adds a way to list all known tests with "-l", usage with
   "-h", and perform a dry run without running tests with "-n".

* tag 'linux-kselftest-next-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  doc: dev-tools: kselftest.rst: Update examples and paths
  selftests/run_kselftest.sh: Make each test individually selectable
  selftests: Extract run_kselftest.sh and generate stand-alone test list
  selftests: Add missing gitignore entries
  selftests: more general make nesting support
  selftests: use "$(MAKE)" instead of "make" for headers_install

5 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
Linus Torvalds [Thu, 15 Oct 2020 22:11:56 +0000 (15:11 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial

Pull trivial updates from Jiri Kosina:
 "The latest advances in computer science from the trivial queue"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial:
  xtensa: fix Kconfig typo
  spelling.txt: Remove some duplicate entries
  mtd: rawnand: oxnas: cleanup/simplify code
  selftests: vm: add fragment CONFIG_GUP_BENCHMARK
  perf: Fix opt help text for --no-bpf-event
  HID: logitech-dj: Fix spelling in comment
  bootconfig: Fix kernel message mentioning CONFIG_BOOT_CONFIG
  MAINTAINERS: rectify MMP SUPPORT after moving cputype.h
  scif: Fix spelling of EACCES
  printk: fix global comment
  lib/bitmap.c: fix spello
  fs: Fix missing 'bit' in comment

5 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid
Linus Torvalds [Thu, 15 Oct 2020 22:09:12 +0000 (15:09 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid

Pull HID updates from Jiri Kosina:

 - Lenovo X1 Tablet support improvements from Mikael Wikström

 - "heartbeat" report fix for several Wacom devices from Jason Gerecke

 - bounds checking fix in hid-roccat from Dan Carpenter

 - stylus battery reporting fix from Dmitry Torokhov

 - i2c-hid support for wakeup from suspend-to-idle from Kai-Heng Feng

 - new driver for Vivaldi devices from Sean O'Brien

 - other assorted small fixes and device ID additions

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid:
  HID: i2c-hid: Enable wakeup capability from Suspend-to-Idle
  HID: add vivaldi HID driver
  HID: hid-input: fix stylus battery reporting
  HID: wacom: Avoid entering wacom_wac_pen_report for pad / battery
  HID: i2c-hid: fix kerneldoc warnings in i2c-hid-core.c
  HID: core: fix kerneldoc warnings in hid-core.c
  HID: multitouch: Lenovo X1 Tablet Gen2 trackpoint and buttons
  HID: multitouch: Lenovo X1 Tablet Gen3 trackpoint and buttons
  HID: alps: clean up indentation issue
  HID: intel-ish-hid: simplify the return expression of ishtp_bus_remove_device()
  HID: hid-debug: fix nonblocking read semantics wrt EIO/ERESTARTSYS
  HID: i2c-hid: Prefer asynchronous probe
  HID: ite: Add USB id match for Acer One S1003 keyboard dock
  HID: roccat: add bounds checking in kone_sysfs_write_settings()
  HID: wiimote: narrow spinlock range in wiimote_hid_event()
  HID: wiimote: make handlers[] const
  HID: apple: Add support for Matias wireless keyboard
  HID: cp2112: Use irqchip template

5 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/livepatchin...
Linus Torvalds [Thu, 15 Oct 2020 22:07:57 +0000 (15:07 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/livepatching/livepatching

Pull livepatching update from Jiri Kosina:
 "livepatching kselftest output fix from Miroslav Benes"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/livepatching/livepatching:
  selftests/livepatch: Do not check order when using "comm" for dmesg checking

5 years agoMerge tag 'dio_for_v5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack...
Linus Torvalds [Thu, 15 Oct 2020 22:03:10 +0000 (15:03 -0700)]
Merge tag 'dio_for_v5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs

Pull direct-io fix from Jan Kara:
 "Fix for unaligned direct IO read past EOF in legacy DIO code"

* tag 'dio_for_v5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
  direct-io: defer alignment check until after the EOF check
  direct-io: don't force writeback for reads beyond EOF
  direct-io: clean up error paths of do_blockdev_direct_IO

5 years agoMerge tag 'fs_for_v5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack...
Linus Torvalds [Thu, 15 Oct 2020 21:56:15 +0000 (14:56 -0700)]
Merge tag 'fs_for_v5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs

Pull UDF, reiserfs, ext2, quota fixes from Jan Kara:

 - a couple of UDF fixes for issues found by syzbot fuzzing

 - a couple of reiserfs fixes for issues found by syzbot fuzzing

 - some minor ext2 cleanups

 - quota patches to support grace times beyond year 2038 for XFS quota
   APIs

* tag 'fs_for_v5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
  reiserfs: Fix oops during mount
  udf: Limit sparing table size
  udf: Remove pointless union in udf_inode_info
  udf: Avoid accessing uninitialized data on failed inode read
  quota: clear padding in v2r1_mem2diskdqb()
  reiserfs: Initialize inode keys properly
  udf: Fix memory leak when mounting
  udf: Remove redundant initialization of variable ret
  reiserfs: only call unlock_new_inode() if I_NEW
  ext2: Fix some kernel-doc warnings in balloc.c
  quota: Expand comment describing d_itimer
  quota: widen timestamps for the fs_disk_quota structure
  reiserfs: Fix memory leak in reiserfs_parse_options()
  udf: Use kvzalloc() in udf_sb_alloc_bitmap()
  ext2: remove duplicate include

5 years agoMerge tag 'configfs-5.10' of git://git.infradead.org/users/hch/configfs
Linus Torvalds [Thu, 15 Oct 2020 21:52:45 +0000 (14:52 -0700)]
Merge tag 'configfs-5.10' of git://git.infradead.org/users/hch/configfs

Pull configfs updates from Christoph Hellwig:
 "Various cleanups for the configfs samples (Bartosz Golaszewski)"

* tag 'configfs-5.10' of git://git.infradead.org/users/hch/configfs:
  samples: configfs: prefer pr_err() over bare printk(KERN_ERR
  samples: configfs: don't use spaces before tabs
  samples: configfs: consolidate local variables of the same type
  samples: configfs: don't reinitialize variables which are already zeroed
  samples: configfs: replace simple_strtoul() with kstrtoint()
  samples: configfs: fix alignment in item struct
  samples: configfs: drop unnecessary ternary operators
  samples: configfs: remove redundant newlines
  MAINTAINERS: add the sample directory to the configfs entry

5 years agoMerge tag 'dma-mapping-5.10' of git://git.infradead.org/users/hch/dma-mapping
Linus Torvalds [Thu, 15 Oct 2020 21:43:29 +0000 (14:43 -0700)]
Merge tag 'dma-mapping-5.10' of git://git.infradead.org/users/hch/dma-mapping

Pull dma-mapping updates from Christoph Hellwig:

 - rework the non-coherent DMA allocator

 - move private definitions out of <linux/dma-mapping.h>

 - lower CMA_ALIGNMENT (Paul Cercueil)

 - remove the omap1 dma address translation in favor of the common code

 - make dma-direct aware of multiple dma offset ranges (Jim Quinlan)

 - support per-node DMA CMA areas (Barry Song)

 - increase the default seg boundary limit (Nicolin Chen)

 - misc fixes (Robin Murphy, Thomas Tai, Xu Wang)

 - various cleanups

* tag 'dma-mapping-5.10' of git://git.infradead.org/users/hch/dma-mapping: (63 commits)
  ARM/ixp4xx: add a missing include of dma-map-ops.h
  dma-direct: simplify the DMA_ATTR_NO_KERNEL_MAPPING handling
  dma-direct: factor out a dma_direct_alloc_from_pool helper
  dma-direct check for highmem pages in dma_direct_alloc_pages
  dma-mapping: merge <linux/dma-noncoherent.h> into <linux/dma-map-ops.h>
  dma-mapping: move large parts of <linux/dma-direct.h> to kernel/dma
  dma-mapping: move dma-debug.h to kernel/dma/
  dma-mapping: remove <asm/dma-contiguous.h>
  dma-mapping: merge <linux/dma-contiguous.h> into <linux/dma-map-ops.h>
  dma-contiguous: remove dma_contiguous_set_default
  dma-contiguous: remove dev_set_cma_area
  dma-contiguous: remove dma_declare_contiguous
  dma-mapping: split <linux/dma-mapping.h>
  cma: decrease CMA_ALIGNMENT lower limit to 2
  firewire-ohci: use dma_alloc_pages
  dma-iommu: implement ->alloc_noncoherent
  dma-mapping: add new {alloc,free}_noncoherent dma_map_ops methods
  dma-mapping: add a new dma_alloc_pages API
  dma-mapping: remove dma_cache_sync
  53c700: convert to dma_alloc_noncoherent
  ...

5 years agoMerge tag 'dmaengine-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul...
Linus Torvalds [Thu, 15 Oct 2020 21:33:52 +0000 (14:33 -0700)]
Merge tag 'dmaengine-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine

Pull dmaengine updates from Vinod Koul:
 "Core:

   - Mark dma_request_slave_channel() deprecated in favour of
     dma_request_chan()

   - subsystem conversion for tasklet_setup() API

   - subsystem removal of local dma_parms for arm drivers

  Also updates to bunch of driver notably TI, DW and AXI-DMAC"

* tag 'dmaengine-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine: (104 commits)
  dmaengine: owl-dma: fix kernel-doc style for enum
  dmaengine: zynqmp_dma: fix kernel-doc style for tasklet
  dmaengine: xilinx_dma: fix kernel-doc style for tasklet
  dmaengine: qcom: bam_dma: fix kernel-doc style for tasklet
  dmaengine: altera-msgdma: fix kernel-doc style for tasklet
  dmaengine: xilinx: dpdma: convert tasklets to use new tasklet_setup() API
  dmaengine: sf-pdma: convert tasklets to use new tasklet_setup() API
  dt-bindings: Fix 'reg' size issues in zynqmp examples
  dmaengine: rcar-dmac: drop double zeroing
  dmaengine: sh: drop double zeroing
  dmaengine: ioat: Allocate correct size for descriptor chunk
  dmaengine: ti: k3-udma: use devm_platform_ioremap_resource_byname
  dmaengine: fsl: remove bad channel update
  dmaengine: dma-jz4780: Fix race in jz4780_dma_tx_status
  dmaengine: pl330: fix argument for tasklet
  dmaengine: dmatest: Return boolean result directly in filter()
  dmaengine: dmatest: Check list for emptiness before access its last entry
  dmaengine: ti: k3-udma-glue: fix channel enable functions
  dmaengine: iop-adma: Fix pointer cast warnings
  dmaengine: dw-edma: Fix Using plain integer as NULL pointer in dw-edma-v0-debugfs.c
  ...

5 years agoMerge branch 'for-5.10/i2c-hid' into for-linus
Jiri Kosina [Thu, 15 Oct 2020 18:46:23 +0000 (20:46 +0200)]
Merge branch 'for-5.10/i2c-hid' into for-linus

- i2c-hid support for wakeup from suspend-to-idle

5 years agoHID: i2c-hid: Enable wakeup capability from Suspend-to-Idle
Kai-Heng Feng [Thu, 9 Jul 2020 07:57:29 +0000 (15:57 +0800)]
HID: i2c-hid: Enable wakeup capability from Suspend-to-Idle

Many laptops can be woken up from Suspend-to-Idle by touchpad. This is
also the default behavior on other OSes.

However, if touchpad and touchscreen contact to each other when lid is
closed, wakeup events can be triggered inadventertly.

So let's disable the wakeup by default, but enable the wakeup capability
so users can enable it at their own discretion.

Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
5 years agoMerge branch 'for-5.9/upstream-fixes' into for-linus
Jiri Kosina [Thu, 15 Oct 2020 18:41:43 +0000 (20:41 +0200)]
Merge branch 'for-5.9/upstream-fixes' into for-linus

- "heartbeat" report fix for several Wacom devices
- Lenovo X1 Tablet support improvements
- new device IDs
- bounds checking fix in hid-roccat
- stylus battery reporting fix

5 years agoMerge branch 'for-5.10/wiimote' into for-linus
Jiri Kosina [Thu, 15 Oct 2020 18:41:00 +0000 (20:41 +0200)]
Merge branch 'for-5.10/wiimote' into for-linus

- code cleanups for hid-wiimote

5 years agoMerge branch 'for-5.10/vivaldi' into for-linus
Jiri Kosina [Thu, 15 Oct 2020 18:40:02 +0000 (20:40 +0200)]
Merge branch 'for-5.10/vivaldi' into for-linus

- driver for Vivaldi devices (keyboards which provide vendor-defined (Google)
  usages in their descriptor)

5 years agoMerge branch 'for-5.10/intel-ish-hid' into for-linus
Jiri Kosina [Thu, 15 Oct 2020 18:39:41 +0000 (20:39 +0200)]
Merge branch 'for-5.10/intel-ish-hid' into for-linus

- intel-ish-hid code cleanup

5 years agoMerge branch 'for-5.10/i2c-hid' into for-linus
Jiri Kosina [Thu, 15 Oct 2020 18:39:08 +0000 (20:39 +0200)]
Merge branch 'for-5.10/i2c-hid' into for-linus

- prefer async probing in i2c-hid even if built-in

5 years agoMerge branch 'for-5.10/cp2112' into for-linus
Jiri Kosina [Thu, 15 Oct 2020 18:38:07 +0000 (20:38 +0200)]
Merge branch 'for-5.10/cp2112' into for-linus

- make cp2112 driver use irqchip template properly

5 years agoMerge branch 'for-5.10/core' into for-linus
Jiri Kosina [Thu, 15 Oct 2020 18:37:01 +0000 (20:37 +0200)]
Merge branch 'for-5.10/core' into for-linus

- nonblocking read semantics fix for hid-debug

5 years agoMerge branch 'for-5.10/apple' into for-linus
Jiri Kosina [Thu, 15 Oct 2020 18:35:46 +0000 (20:35 +0200)]
Merge branch 'for-5.10/apple' into for-linus

- support for Matias wireless (identifies itself as ISO RevB Alu)

5 years agoMerge tag 'sound-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai...
Linus Torvalds [Thu, 15 Oct 2020 18:07:44 +0000 (11:07 -0700)]
Merge tag 'sound-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound

Pull sound updates from Takashi Iwai:
 "The amount of changes is smaller at this round (what a surprise), but
  lots of activity is seen. Most of changes are about ASoC driver
  development, especially Intel platforms. Here are some highlights:

  General:
   - Replace all tasklet usages with other alternatives
   - Cleanup of the ASoC error unwinding code
   - Fixes for trivial issues caught by static checker
   - Spell fixes allover the places

  ALSA Core:
   - Lockdep fix for control devices
   - Fix for potential OSS sequencer mutex stalls

  HD-audio and USB-audio:
   - SoundBlaster AE-7 support
   - Changes in quirk table for the rename handling
   - Quirks for HP and ASUS machines, Pioneer DJ DJM-250MK2.

  ASoC:
   - Lots of updates for Intel SOF and SoundWire enablement
   - Replacement of the DSP driver for some older x86 systems; the new
     code was written from scratch, better maintenance expected
   - Helpers for parsing auxiluary devices from the device tree
   - New support for AllWinner A64, Cirrus Logic CS4234, Mediatek MT6359
     Microchip S/PDIF TX and RX controllers, Realtek RT1015P, and Texas
     Instruments J721E, TAS2110, TAS2564 and TAS2764"

* tag 'sound-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (498 commits)
  ALSA: hda/hdmi: fix incorrect locking in hdmi_pcm_close
  ALSA: hda: fix jack detection with Realtek codecs when in D3
  ALSA: fireworks: use semicolons rather than commas to separate statements
  ALSA: hda: use semicolons rather than commas to separate statements
  ALSA: hda/i915 - fix list corruption with concurrent probes
  ASoC: dmaengine: Document support for TX only or RX only streams
  ASoC: mchp-spdiftx: remove 'TX' from playback stream name
  ASoC: ti: davinci-mcasp: Use &pdev->dev for early dev_warn
  ASoC: tas2764: Add the driver for the TAS2764
  dt-bindings: tas2764: Add the TAS2764 binding doc
  ASoC: Intel: catpt: Add explicit DMADEVICES kconfig dependency
  ASoC: Intel: catpt: Fix compilation when CONFIG_MODULES is disabled
  ASoC: stm32: dfsdm: add actual resolution trace
  ASoC: stm32: dfsdm: change rate limits
  ASoC: qcom: sc7180: Add support for audio over DP
  Asoc: qcom: lpass-platform : Increase buffer size
  ASoC: qcom: Add support for lpass hdmi driver
  Asoc: qcom: lpass:Update lpaif_dmactl members order
  Asoc:qcom:lpass-cpu:Update dts property read API
  ASoC: dt-bindings: Add dt binding for lpass hdmi
  ...

5 years agoMerge tag 'drm-next-2020-10-15' of git://anongit.freedesktop.org/drm/drm
Linus Torvalds [Thu, 15 Oct 2020 17:46:16 +0000 (10:46 -0700)]
Merge tag 'drm-next-2020-10-15' of git://anongit.freedesktop.org/drm/drm

Pull drm updates from Dave Airlie:
 "Not a major amount of change, the i915 trees got split into display
  and gt trees to better facilitate higher level review, and there's a
  major refactoring of i915 GEM locking to use more core kernel concepts
  (like ww-mutexes). msm gets per-process pagetables, older AMD SI cards
  get DC support, nouveau got a bump in displayport support with common
  code extraction from i915.

  Outside of drm this contains a couple of patches for hexint
  moduleparams which you've acked, and a virtio common code tree that
  you should also get via it's regular path.

  New driver:
   - Cadence MHDP8546 DisplayPort bridge driver

  core:
   - cross-driver scatterlist cleanups
   - devm_drm conversions
   - remove drm_dev_init
   - devm_drm_dev_alloc conversion

  ttm:
   - lots of refactoring and cleanups

  bridges:
   - chained bridge support in more drivers

  panel:
   - misc new panels

  scheduler:
   - cleanup priority levels

  displayport:
   - refactor i915 code into helpers for nouveau

  i915:
   - split into display and GT trees
   - WW locking refactoring in GEM
   - execbuf2 extension mechanism
   - syncobj timeline support
   - GEN 12 HOBL display powersaving
   - Rocket Lake display additions
   - Disable FBC on Tigerlake
   - Tigerlake Type-C + DP improvements
   - Hotplug interrupt refactoring

  amdgpu:
   - Sienna Cichlid updates
   - Navy Flounder updates
   - DCE6 (SI) support for DC
   - Plane rotation enabled
   - TMZ state info ioctl
   - PCIe DPC recovery support
   - DC interrupt handling refactor
   - OLED panel fixes

  amdkfd:
   - add SMI events for thermal throttling
   - SMI interface events ioctl update
   - process eviction counters

  radeon:
   - move to dma_ for allocations
   - expose sclk via sysfs

  msm:
   - DSI support for sm8150/sm8250
   - per-process GPU pagetable support
   - Displayport support

  mediatek:
   - move HDMI phy driver to PHY
   - convert mtk-dpi to bridge API
   - disable mt2701 tmds

  tegra:
   - bridge support

  exynos:
   - misc cleanups

  vc4:
   - dual display cleanups

  ast:
   - cleanups

  gma500:
   - conversion to GPIOd API

  hisilicon:
   - misc reworks

  ingenic:
   - clock handling and format improvements

  mcde:
   - DSI support

  mgag200:
   - desktop g200 support

  mxsfb:
   - i.MX7 + i.MX8M
   - alpha plane support

  panfrost:
   - devfreq support
   - amlogic SoC support

  ps8640:
   - EDID from eDP retrieval

  tidss:
   - AM65xx YUV workaround

  virtio:
   - virtio-gpu exported resources

  rcar-du:
   - R8A7742, R8A774E1 and R8A77961 support
   - YUV planar format fixes
   - non-visible plane handling
   - VSP device reference count fix
   - Kconfig fix to avoid displaying disabled options in .config"

* tag 'drm-next-2020-10-15' of git://anongit.freedesktop.org/drm/drm: (1494 commits)
  drm/ingenic: Fix bad revert
  drm/amdgpu: Fix invalid number of character '{' in amdgpu_acpi_init
  drm/amdgpu: Remove warning for virtual_display
  drm/amdgpu: kfd_initialized can be static
  drm/amd/pm: setup APU dpm clock table in SMU HW initialization
  drm/amdgpu: prevent spurious warning
  drm/amdgpu/swsmu: fix ARC build errors
  drm/amd/display: Fix OPTC_DATA_FORMAT programming
  drm/amd/display: Don't allow pstate if no support in blank
  drm/panfrost: increase readl_relaxed_poll_timeout values
  MAINTAINERS: Update entry for st7703 driver after the rename
  Revert "gpu/drm: ingenic: Add option to mmap GEM buffers cached"
  drm/amd/display: HDMI remote sink need mode validation for Linux
  drm/amd/display: Change to correct unit on audio rate
  drm/amd/display: Avoid set zero in the requested clk
  drm/amdgpu: align frag_end to covered address space
  drm/amdgpu: fix NULL pointer dereference for Renoir
  drm/vmwgfx: fix regression in thp code due to ttm init refactor.
  drm/amdgpu/swsmu: add interrupt work handler for smu11 parts
  drm/amdgpu/swsmu: add interrupt work function
  ...

5 years agoMerge tag 'char-misc-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregk...
Linus Torvalds [Thu, 15 Oct 2020 17:01:51 +0000 (10:01 -0700)]
Merge tag 'char-misc-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc

Pull char/misc driver updates from Greg KH:
 "Here is the big set of char, misc, and other assorted driver subsystem
  patches for 5.10-rc1.

  There's a lot of different things in here, all over the drivers/
  directory. Some summaries:

   - soundwire driver updates

   - habanalabs driver updates

   - extcon driver updates

   - nitro_enclaves new driver

   - fsl-mc driver and core updates

   - mhi core and bus updates

   - nvmem driver updates

   - eeprom driver updates

   - binder driver updates and fixes

   - vbox minor bugfixes

   - fsi driver updates

   - w1 driver updates

   - coresight driver updates

   - interconnect driver updates

   - misc driver updates

   - other minor driver updates

  All of these have been in linux-next for a while with no reported
  issues"

* tag 'char-misc-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (396 commits)
  binder: fix UAF when releasing todo list
  docs: w1: w1_therm: Fix broken xref, mistakes, clarify text
  misc: Kconfig: fix a HISI_HIKEY_USB dependency
  LSM: Fix type of id parameter in kernel_post_load_data prototype
  misc: Kconfig: add a new dependency for HISI_HIKEY_USB
  firmware_loader: fix a kernel-doc markup
  w1: w1_therm: make w1_poll_completion static
  binder: simplify the return expression of binder_mmap
  test_firmware: Test partial read support
  firmware: Add request_partial_firmware_into_buf()
  firmware: Store opt_flags in fw_priv
  fs/kernel_file_read: Add "offset" arg for partial reads
  IMA: Add support for file reads without contents
  LSM: Add "contents" flag to kernel_read_file hook
  module: Call security_kernel_post_load_data()
  firmware_loader: Use security_post_load_data()
  LSM: Introduce kernel_post_load_data() hook
  fs/kernel_read_file: Add file_size output argument
  fs/kernel_read_file: Switch buffer size arg to size_t
  fs/kernel_read_file: Remove redundant size argument
  ...

5 years agoMerge tag 'usb-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Linus Torvalds [Thu, 15 Oct 2020 16:51:18 +0000 (09:51 -0700)]
Merge tag 'usb-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb

Pull USB/PHY/Thunderbolt driver updates from Greg KH:
 "Here is the big set of USB, PHY, and Thunderbolt driver updates for
  5.10-rc1.

  Lots of tiny different things for these subsystems are in here,
  including:

   - phy driver updates

   - thunderbolt / USB 4 updates and additions

   - USB gadget driver updates

   - xhci fixes and updates

   - typec driver additions and updates

   - api conversions to various drivers for core kernel api changes

   - new USB control message functions to make it harder to get wrong,
     as found by syzbot (took 2 tries to get it right)

   - lots of tiny USB driver fixes and updates all over the place

  All of these have been in linux-next for a while, with the exception
  of the last "obviously correct" patch that updated a FALLTHROUGH
  comment that got merged last weekend"

* tag 'usb-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (374 commits)
  usb: musb: gadget: Use fallthrough pseudo-keyword
  usb: typec: Add QCOM PMIC typec detection driver
  USB: serial: option: add Cellient MPL200 card
  usb: typec: tcpci_maxim: Add support for Sink FRS
  usb: typec: tcpci: Implement callbacks for FRS
  usb: typec: tcpm: Add support for Sink Fast Role SWAP(FRS)
  usb: typec: tcpci_maxim: Chip level TCPC driver
  usb: typec: tcpci: Add set_vbus tcpci callback
  usb: typec: tcpci: Add a getter method to retrieve tcpm_port reference
  usbip: vhci_hcd: fix calling usb_hcd_giveback_urb() with irqs enabled
  usb: cdc-acm: add quirk to blacklist ETAS ES58X devices
  USB: serial: ftdi_sio: use cur_altsetting for consistency
  USB: serial: option: Add Telit FT980-KS composition
  USB: core: remove polling for /sys/kernel/debug/usb/devices
  usb: typec: add support for STUSB160x Type-C controller family
  usb: typec: add typec_find_pwr_opmode
  usb: typec: hd3ss3220: Use OF graph API to get the connector fwnode
  dt-bindings: usb: renesas,usb3-peri: Document HS and SS data bus
  dt-bindings: usb: convert ti,hd3ss3220 bindings to json-schema
  usb: dwc2: Fix INTR OUT transfers in DDMA mode.
  ...

5 years agoMerge tag 'staging-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh...
Linus Torvalds [Thu, 15 Oct 2020 16:46:23 +0000 (09:46 -0700)]
Merge tag 'staging-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging

Pull staging / IIO driver updates from Greg KH:
 "Here is the large set of staging and IIO driver updates for 5.10-rc1.

  Included in here are:

   - new IIO drivers

   - new IIO driver frameworks

   - various IIO driver fixes and updates

   - IIO device tree conversions to yaml

   - so many minor staging driver coding style cleanups

   - most cdev driver moved out of staging

   - no staging drivers added or removed

  Full details are in the shortlog.

  All of these have been in linux-next for a while with no reported
  issues"

* tag 'staging-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: (476 commits)
  staging: comedi: check validity of wMaxPacketSize of usb endpoints found
  staging: wfx: improve robustness of wfx_get_hw_rate()
  staging: wfx: drop unicode characters from strings
  staging: wfx: gpiod_get_value() can return an error
  staging: wfx: increase robustness of hif_generic_confirm()
  staging: wfx: wfx_init_common() returns NULL on error
  staging: wfx: standardize the error when vif does not exist
  staging: wfx: check memory allocation
  staging: wfx: improve error handling of hif_join()
  staging: dpaa2-switch: add a dpaa2_switch prefix to all functions in ethsw.c
  staging: dpaa2-switch: add a dpaa2_switch_ prefix to all functions in ethsw-ethtool.c
  staging: rtl8188eu: Fix long lines
  dt-bindings: staging: wfx: silabs,wfx yaml conversion
  staging: wfx: update copyrights dates
  staging: wfx: fix QoS priority for slow buses
  staging: wfx: fix BA sessions for older firmwares
  staging: wfx: remove remaining code of 'secure link' feature
  staging: wfx: fix handling of MMIC error
  staging: vchiq: Fix list_for_each exit tests
  staging: greybus: use __force when assigning __u8 value to snd_ctl_elem_type_t
  ...

5 years agoMerge tag 'spdx-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh...
Linus Torvalds [Wed, 14 Oct 2020 23:19:42 +0000 (16:19 -0700)]
Merge tag 'spdx-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/spdx

Pull SPDX updates from Greg KH:
 "Here are some SPDX-specific changes for 5.10-rc1.

  They include:

   - driver fixes to make spdxcheck.pl work properly

   - add GFDL licenses as "deprecated" but required due to some of our
     documentation using them

   - add Zlib license as "deprecated" but required because we have code
     with this license in the tree.

   - convert some drivers to have SPDX identifiers that previously
     didn't have them.

  All have been in linux-next for a very long time with no reported
  issues"

* tag 'spdx-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/spdx:
  scripts/spdxcheck.py: handle license identifiers in XML comments
  net/mlx5: IPsec: make spdxcheck.py happy
  LICENSES/deprecated: add Zlib license text
  LICENSE: add GFDL deprecated licenses
  net/qla3xxx: Convert to SPDX license identifiers
  net/qlge: Convert to SPDX license identifiers
  net/qlcnic: Convert to SPDX license identifiers
  scsi/qla2xxx: Convert to SPDX license identifiers
  scsi/qla4xxx: Convert to SPDX license identifiers

5 years agoMerge tag 'driver-core-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Wed, 14 Oct 2020 23:09:32 +0000 (16:09 -0700)]
Merge tag 'driver-core-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull driver core updates from Greg KH:
 "Here is the "big" set of driver core patches for 5.10-rc1

  They include a lot of different things, all related to the driver core
  and/or some driver logic:

   - sysfs common write functions to make it easier to audit sysfs
     attributes

   - device connection cleanups and fixes

   - devm helpers for a few functions

   - NOIO allocations for when devices are being removed

   - minor cleanups and fixes

  All have been in linux-next for a while with no reported issues"

* tag 'driver-core-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (31 commits)
  regmap: debugfs: use semicolons rather than commas to separate statements
  platform/x86: intel_pmc_core: do not create a static struct device
  drivers core: node: Use a more typical macro definition style for ACCESS_ATTR
  drivers core: Use sysfs_emit for shared_cpu_map_show and shared_cpu_list_show
  mm: and drivers core: Convert hugetlb_report_node_meminfo to sysfs_emit
  drivers core: Miscellaneous changes for sysfs_emit
  drivers core: Reindent a couple uses around sysfs_emit
  drivers core: Remove strcat uses around sysfs_emit and neaten
  drivers core: Use sysfs_emit and sysfs_emit_at for show(device *...) functions
  sysfs: Add sysfs_emit and sysfs_emit_at to format sysfs output
  dyndbg: use keyword, arg varnames for query term pairs
  driver core: force NOIO allocations during unplug
  platform_device: switch to simpler IDA interface
  driver core: platform: Document return type of more functions
  Revert "driver core: Annotate dev_err_probe() with __must_check"
  Revert "test_firmware: Test platform fw loading on non-EFI systems"
  iio: adc: xilinx-xadc: use devm_krealloc()
  hwmon: pmbus: use more devres helpers
  devres: provide devm_krealloc()
  syscore: Use pm_pr_dbg() for syscore_{suspend,resume}()
  ...

5 years agoMerge tag 'tty-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Linus Torvalds [Wed, 14 Oct 2020 23:05:52 +0000 (16:05 -0700)]
Merge tag 'tty-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty

Pull tty/serial updates from Greg KH:
 "Here is the big set of tty and serial driver patches for 5.10-rc1.

  Lots of little things in here, including:

   - tasklet_setup api conversions

   - sysrq support for capital letters

   - vt and vc cleanups and unwinding the mess some more

   - serial driver updates and minor tweaks

   - new device ids

   - rs485 support for some drivers

   - serial binding documentation updates

   - lots of small serial driver changes for reported issues

  All have been in linux-next for a while with no reported issues"

* tag 'tty-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (79 commits)
  serial: mcf: add sysrq capability
  serial: fsl_lpuart: add sysrq support when using dma
  fbcon: remove no-op fbcon_set_origin()
  tty/sysrq: Extend the sysrq_key_table to cover capital letters
  serial: max310x: rework RX interrupt handling
  serial: 8250_dw: Fix clk-notifier/port suspend deadlock
  serial: 8250: Skip uninitialized TTY port baud rate update
  serial: 8250: Discard RTS/DTS setting from clock update method
  tty: serial: imx: disable TXDC IRQ in imx_uart_shutdown() to avoid IRQ storm
  serial: 8250_fsl: Fix TX interrupt handling condition
  serial: pl011: Fix lockdep splat when handling magic-sysrq interrupt
  tty: serial: fsl_lpuart: fix lpuart32_poll_get_char
  tty: serial: lpuart: fix lpuart32_write usage
  serial: qcom_geni_serial: To correct QUP Version detection logic
  serial: mvebu-uart: fix unused variable warning
  vt_ioctl: make VT_RESIZEX behave like VT_RESIZE
  serial: mvebu-uart: simplify the return expression of mvebu_uart_probe()
  tty: serial: imx: fix link error with CONFIG_SERIAL_CORE_CONSOLE=n
  tty: hvc: fix link error with CONFIG_SERIAL_CORE_CONSOLE=n
  pch_uart: drop double zeroing
  ...

5 years agopowerpc32: don't adjust unmoved stack pointer in csum_partial_copy_generic() epilogue
Jason A. Donenfeld [Wed, 14 Oct 2020 23:02:09 +0000 (01:02 +0200)]
powerpc32: don't adjust unmoved stack pointer in csum_partial_copy_generic() epilogue

A recent change to the checksum code removed usage of some extra
arguments, alongside with storage on the stack for those, and the stack
pointer no longer needed to be adjusted in the function prologue.

But a left over subtraction wasn't removed in the function epilogue,
causing the function to return with the stack pointer moved 16 bytes
away from where it should have.  This corrupted local state and lead to
weird crashes.

This simply removes the leftover instruction from the epilogue.

Fixes: 70d65cd555c5 ("ppc: propagate the calling conventions change down to csum_partial_copy_generic()")
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agoMerge tag 'backlight-next-5.10' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Wed, 14 Oct 2020 22:59:42 +0000 (15:59 -0700)]
Merge tag 'backlight-next-5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/backlight

Pull backlight updates from Lee Jones:
 "New Drivers:
   - Add support for KTD253

  Fix-ups:
   - Add Device Tree documentation; common, kinetic,ktd253
   - Use correct header(s); tosa_lcd, tosa_bl

  Bug Fixes:
   - Fix refcount imbalance; sky81452-backlight"

* tag 'backlight-next-5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/backlight:
  backlight: tosa_bl: Include the right header
  backlight: tosa_lcd: Include the right header
  backlight: Add Kinetic KTD253 backlight driver
  dt-bindings: backlight: Add Kinetic KTD253 bindings
  dt-bindings: backlight: Add some common backlight properties
  backlight: sky81452-backlight: Fix refcount imbalance on error

5 years agoMerge tag 'mfd-next-5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd
Linus Torvalds [Wed, 14 Oct 2020 22:56:58 +0000 (15:56 -0700)]
Merge tag 'mfd-next-5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd

Pull MFD updates from Lee Jones:
 "New Drivers:
   - Add support for initialising shared (between children) Regmaps
   - Add support for Kontron SL28CPLD
   - Add support for ENE KB3930 Embedded Controller
   - Add support for Intel FPGA PAC MAX 10 BMC

  New Device Support:
   - Add support for Power to Ricoh RN5T618
   - Add support for UART to Intel Lakefield
   - Add support for LP87524_Q1 to Texas Instruments LP87565

  New Functionality:
   - Device Tree; ene-kb3930, sl28cpld, syscon, lp87565, lp87524-q1
   - Use new helper dev_err_probe(); madera-core, stmfx, wcd934x
   - Use new GPIOD API; dm355evm_msp
   - Add wake-up capability; sprd-sc27xx-spi
   - Add ACPI support; kempld-core

  Fix-ups:
   - Trivial (spelling/whitespace); Kconfig, ab8500
   - Fix for unused variables; khadas-mcu, kempld-core
   - Remove unused header file(s); mt6360-core
   - Use correct IRQ flags in docs; act8945a, gateworks-gsc, rohm,bd70528-pmic
   - Add COMPILE_TEST support; asic3, tmio_core
   - Add dependency on I2C; SL28CPLD

  Bug Fixes:
   - Fix memory leak(s); sm501
   - Do not free regmap_config's 'name' until exit; syscon"

* tag 'mfd-next-5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd: (34 commits)
  mfd: kempld-core: Fix unused variable 'kempld_acpi_table' when !ACPI
  mfd: sl28cpld: Depend on I2C
  mfd: asic3: Build if COMPILE_TEST=y
  dt-bindings: mfd: Correct interrupt flags in examples
  mfd: Add ACPI support to Kontron PLD driver
  mfd: intel-m10-bmc: Add Intel MAX 10 BMC chip support for Intel FPGA PAC
  mfd: lp87565: Add LP87524-Q1 variant
  dt-bindings: mfd: Add LP87524-Q1
  dt-bindings: mfd: lp87565: Convert to yaml
  mfd: mt6360: Remove unused include <linux/version.h>
  mfd: sm501: Fix leaks in probe()
  mfd: syscon: Don't free allocated name for regmap_config
  dt-bindings: mfd: syscon: Document Exynos3 and Exynos5433 compatibles
  dt-bindings: mfd: syscon: Merge Samsung Exynos Sysreg bindings
  dt-bindings: mfd: ab8500: Remove weird Unicode characters
  mfd: sprd: Add wakeup capability for PMIC IRQ
  mfd: intel-lpss: Add device IDs for UART ports for Lakefield
  mfd: dm355evm_msp: Convert LEDs to GPIO descriptor table
  mfd: wcd934x: Simplify with dev_err_probe()
  mfd: stmfx: Simplify with dev_err_probe()
  ...

5 years agoMerge tag 'devicetree-for-5.10' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Wed, 14 Oct 2020 22:31:58 +0000 (15:31 -0700)]
Merge tag 'devicetree-for-5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux

Pull devicetree updates from Rob Herring:

 - Update dtc to upstream version v1.6.0-31-gcbca977ea121

 - dtx_diff help text reformatting

 - Speed-up validation time for binding and dtb checks using json for
   intermediate files

 - Add support for running yamllint on DT schema files

 - Remove old booting-without-of.rst

 - Extend the example schema to address common issues

 - Cleanup handling of additionalProperties/unevaluatedProperties

 - Ensure all DSI controller schemas reference dsi-controller.yaml

 - Vendor prefixes for Zealz, Wandbord/Technexion, Embest RIoT, Rex,
   DFI, and Cisco Meraki

 - Convert at25, SPMI bus, TI hwlock, HiSilicon Hi3660 USB3 PHY, Arm
   SP805 watchdog, Arm SP804, and Samsung 11-pin USB connector to DT
   schema

 - Convert HiSilicon SoC and syscon bindings to DT schema

 - Convert SiFive Risc-V L2 cache, PLIC, PRCI, and PWM to DT schema

 - Convert i.MX bindings for w1, crypto, rng, SIM, PM, DDR, SATA, vf610
   GPIO, and UART to DT schema

 - Add i.MX 8M compatible strings

 - Add LM81 and DS1780 as trivial devices

 - Various missing properties added to fix dtb validation warnings

* tag 'devicetree-for-5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: (111 commits)
  dt-bindings: misc: explicitly add #address-cells for slave mode
  spi: dt-bindings: spi-controller: explicitly require #address-cells=<0> for slave mode
  dt: Remove booting-without-of.rst
  dt-bindings: update usb-c-connector example
  dt-bindings: arm: hisilicon: add missing properties into cpuctrl.yaml
  dt-bindings: arm: hisilicon: add missing properties into sysctrl.yaml
  dt-bindings: pwm: imx: document i.MX compatibles
  scripts/dtc: Update to upstream version v1.6.0-31-gcbca977ea121
  dt-bindings: Add running yamllint to dt_binding_check
  dt-bindings: powerpc: Add a schema for the 'sleep' property
  dt-bindings: pinctrl: sirf: Fix typo abitrary
  dt-bindings: pinctrl: qcom: Fix typo abitrary
  dt-bindings: Explicitly allow additional properties in common schemas
  dt-bindings: Use 'additionalProperties' instead of 'unevaluatedProperties'
  dt-bindings: Add missing 'unevaluatedProperties'
  Docs: Fixing spelling errors in Documentation/devicetree/bindings/
  dt-bindings: arm: hisilicon: convert Hi6220 domain controller bindings to json-schema
  dt-bindings: riscv: convert pwm bindings to json-schema
  dt-bindings: riscv: convert plic bindings to json-schema
  dt-bindings: fu540: prci: convert PRCI bindings to json-schema
  ...

5 years agoMerge tag 'pinctrl-v5.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw...
Linus Torvalds [Wed, 14 Oct 2020 22:25:04 +0000 (15:25 -0700)]
Merge tag 'pinctrl-v5.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl

Pull pin control updates from Linus Walleij:
 "Core changes:

   - NONE whatsoever, we don't even touch the core files this time
     around.

  New drivers:

   - New driver for the Toshiba Visconti SoC.

   - New subdriver for the Qualcomm MSM8226 SoC.

   - New subdriver for the Actions Semiconductor S500 SoC.

   - New subdriver for the Mediatek MT8192 SoC.

   - New subdriver for the Microchip SAMA7G5 SoC.

  Driver enhancements:

   - Intel Cherryview and Baytrail cleanups and refactorings.

   - Enhanced support for the Renesas R8A7790, more pins and groups.

   - Some optimizations for the MCP23S08 MCP23x17 variant.

   - Some cleanups around the Actions Semiconductor subdrivers.

   - A bunch of cleanups around the SH-PFC and Emma Mobile drivers.

   - The "SH-PFC" (literally SuperH pin function controller, I think)
     subdirectory is now renamed to the more neutral "renesas", as these
     are not very much centered around SuperH anymore.

   - Non-critical fixes for the Aspeed driver.

   - Non-critical fixes for the Ingenic (MIPS!) driver.

   - Fix a bunch of missing pins on the AMD pinctrl driver"

* tag 'pinctrl-v5.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: (78 commits)
  pinctrl: amd: Add missing pins to the pin group list
  dt-bindings: pinctrl: sunxi: Allow pinctrl with more interrupt banks
  pinctrl: visconti: PINCTRL_TMPV7700 should depend on ARCH_VISCONTI
  pinctrl: mediatek: Free eint data on failure
  pinctrl: single: fix debug output when #pinctrl-cells = 2
  pinctrl: single: fix pinctrl_spec.args_count bounds check
  pinctrl: sunrisepoint: Modify COMMUNITY macros to be consistent
  pinctrl: cannonlake: Modify COMMUNITY macros to be consistent
  pinctrl: tigerlake: Fix register offsets for TGL-H variant
  pinctrl: Document pinctrl-single,pins when #pinctrl-cells = 2
  pinctrl: mediatek: use devm_platform_ioremap_resource_byname()
  pinctrl: nuvoton: npcm7xx: Constify static ops structs
  pinctrl: mediatek: mt7622: add antsel pins/groups
  pinctrl: ocelot: simplify the return expression of ocelot_gpiochip_register()
  pinctrl: at91-pio4: add support for sama7g5 SoC
  dt-bindings: pinctrl: at91-pio4: add microchip,sama7g5
  pinctrl: spear: simplify the return expression of tvc_connect()
  pinctrl: spear: simplify the return expression of spear310_pinctrl_probe
  pinctrl: sprd: use module_platform_driver to simplify the code
  pinctrl: Ingenic: Add I2S pins support for Ingenic SoCs.
  ...

5 years agoMerge tag 'leds-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/pavel...
Linus Torvalds [Wed, 14 Oct 2020 22:22:07 +0000 (15:22 -0700)]
Merge tag 'leds-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/pavel/linux-leds

Pull LED updates from Pavel Machek:
 "Quite a lot of stuff is going on here. Great cleanups/fixes from Marek
  and others are biggest part.

  I limited CPU LED trigger to 8 LEDs, because it was willing to
  register 1024 'triggers' on machine with 1024 CPUs. I don't believe it
  will cause any problems, but we can raise the limit if it does"

* tag 'leds-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/pavel/linux-leds: (84 commits)
  leds: pwm: Remove platform_data support
  leds: lm3697: Fix out-of-bound access
  leds: ns2: do not guard OF match pointer with of_match_ptr
  leds: ns2: convert to fwnode API
  leds: tlc591xx: fix leak of device node iterator
  leds: pca963x: use struct led_init_data when registering
  leds: pca963x: register LEDs immediately after parsing, get rid of platdata
  leds: tca6507: remove binding comment
  leds: tca6507: cosmetic change: use helper variable
  leds: tca6507: do not set GPIO names
  dt-bindings: leds: tca6507: convert to YAML
  ledtrig-cpu: Limit to 8 CPUs
  leds: TODO: Add documentation about possible subsystem improvements
  leds: pca9532: read pwm settings from device tree
  leds: pca9532: correct shift computation in pca9532_getled
  leds: lm36274: Fix warning for undefined parameters
  leds: lm3532: Fix warnings for undefined parameters
  leds: pca963x: use flexible array
  leds: pca963x: cosmetic: rename variables
  leds: pca963x: cosmetic: rename variables
  ...

5 years agoMerge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Linus Torvalds [Wed, 14 Oct 2020 22:15:35 +0000 (15:15 -0700)]
Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI updates from James Bottomley:
 "The usual driver updates (ufs, qla2xxx, tcmu, ibmvfc, lpfc, smartpqi,
  hisi_sas, qedi, qedf, mpt3sas) and minor bug fixes.

  There are only three core changes: adding sense codes, cleaning up
  noretry and adding an option for limitless retries"

* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (226 commits)
  scsi: hisi_sas: Recover PHY state according to the status before reset
  scsi: hisi_sas: Filter out new PHY up events during suspend
  scsi: hisi_sas: Add device link between SCSI devices and hisi_hba
  scsi: hisi_sas: Add check for methods _PS0 and _PR0
  scsi: hisi_sas: Add controller runtime PM support for v3 hw
  scsi: hisi_sas: Switch to new framework to support suspend and resume
  scsi: hisi_sas: Use hisi_hba->cq_nvecs for calling calling synchronize_irq()
  scsi: qedf: Remove redundant assignment to variable 'rc'
  scsi: lpfc: Remove unneeded variable 'status' in lpfc_fcp_cpu_map_store()
  scsi: snic: Convert to use DEFINE_SEQ_ATTRIBUTE macro
  scsi: qla4xxx: Delete unneeded variable 'status' in qla4xxx_process_ddb_changed
  scsi: sun_esp: Use module_platform_driver to simplify the code
  scsi: sun3x_esp: Use module_platform_driver to simplify the code
  scsi: sni_53c710: Use module_platform_driver to simplify the code
  scsi: qlogicpti: Use module_platform_driver to simplify the code
  scsi: mac_esp: Use module_platform_driver to simplify the code
  scsi: jazz_esp: Use module_platform_driver to simplify the code
  scsi: mvumi: Fix error return in mvumi_io_attach()
  scsi: lpfc: Drop nodelist reference on error in lpfc_gen_req()
  scsi: be2iscsi: Fix a theoretical leak in beiscsi_create_eqs()
  ...

5 years agoMerge tag 'for-5.10/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Wed, 14 Oct 2020 22:05:38 +0000 (15:05 -0700)]
Merge tag 'for-5.10/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm

Pull device mapper updates from Mike Snitzer:

 - Improve DM core's bio splitting to use blk_max_size_offset(). Also
   fix bio splitting for bios that were deferred to the worker thread
   due to a DM device being suspended.

 - Remove DM core's special handling of NVMe devices now that block core
   has internalized efficiencies drivers previously needed to be
   concerned about (via now removed direct_make_request).

 - Fix request-based DM to not bounce through indirect dm_submit_bio;
   instead have block core make direct call to blk_mq_submit_bio().

 - Various DM core cleanups to simplify and improve code.

 - Update DM cryot to not use drivers that set
   CRYPTO_ALG_ALLOCATES_MEMORY.

 - Fix DM raid's raid1 and raid10 discard limits for the purposes of
   linux-stable. But then remove DM raid's discard limits settings now
   that MD raid can efficiently handle large discards.

 - A couple small cleanups across various targets.

* tag 'for-5.10/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
  dm: fix request-based DM to not bounce through indirect dm_submit_bio
  dm: remove special-casing of bio-based immutable singleton target on NVMe
  dm: export dm_copy_name_and_uuid
  dm: fix comment in __dm_suspend()
  dm: fold dm_process_bio() into dm_submit_bio()
  dm: fix missing imposition of queue_limits from dm_wq_work() thread
  dm snap persistent: simplify area_io()
  dm thin metadata: Remove unused local variable when create thin and snap
  dm raid: remove unnecessary discard limits for raid10
  dm raid: fix discard limits for raid1 and raid10
  dm crypt: don't use drivers that have CRYPTO_ALG_ALLOCATES_MEMORY
  dm: use dm_table_get_device_name() where appropriate in targets
  dm table: make 'struct dm_table' definition accessible to all of DM core
  dm: eliminate need for start_io_acct() forward declaration
  dm: simplify __process_abnormal_io()
  dm: push use of on-stack flush_bio down to __send_empty_flush()
  dm: optimize max_io_len() by inlining max_io_len_target_boundary()
  dm: push md->immutable_target optimization down to __process_bio()
  dm: change max_io_len() to use blk_max_size_offset()
  dm table: stack 'chunk_sectors' limit to account for target-specific splitting

5 years agoMerge tag 'for-linus-5.10-1' of git://github.com/cminyard/linux-ipmi
Linus Torvalds [Wed, 14 Oct 2020 22:00:20 +0000 (15:00 -0700)]
Merge tag 'for-linus-5.10-1' of git://github.com/cminyard/linux-ipmi

Pull IPMI updates from Corey Minyard:
 "Some minor bug fixes, return values, cleanups of prints, conversion of
  tasklets to the new API.

  The biggest change is retrying the initial information fetch from the
  management controller. If that fails, the iterface is not operational,
  and one group was having trouble with the management controller not
  being ready when the OS started up. So a retry was added"

* tag 'for-linus-5.10-1' of git://github.com/cminyard/linux-ipmi:
  ipmi_si: Fix wrong return value in try_smi_init()
  ipmi: msghandler: Fix a signedness bug
  ipmi: add retry in try_get_dev_id()
  ipmi: Clean up some printks
  ipmi:msghandler: retry to get device id on an error
  ipmi:sm: Print current state when the state is invalid
  ipmi: Reset response handler when failing to send the command
  ipmi: add a newline when printing parameter 'panic_op' by sysfs
  char: ipmi: convert tasklets to use new tasklet_setup() API

5 years agoMerge branch 'for-5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Linus Torvalds [Wed, 14 Oct 2020 21:58:32 +0000 (14:58 -0700)]
Merge branch 'for-5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup

Pull cgroup updates from Tejun Heo:
 "Two minor changes.

  One makes cgroup interface files ignore zero-sized writes rather than
  triggering -EINVAL on them. The other change is a cleanup which
  doesn't cause any behavior changes"

* 'for-5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  cgroup: Zero sized write should be no-op
  cgroup: remove redundant kernfs_activate in cgroup_setup_root()

5 years agofs: fix NULL dereference due to data race in prepend_path()
Andrii Nakryiko [Wed, 14 Oct 2020 20:45:28 +0000 (13:45 -0700)]
fs: fix NULL dereference due to data race in prepend_path()

Fix data race in prepend_path() with re-reading mnt->mnt_ns twice
without holding the lock.

is_mounted() does check for NULL, but is_anon_ns(mnt->mnt_ns) might
re-read the pointer again which could be NULL already, if in between
reads one of kern_unmount()/kern_unmount_array()/umount_tree() sets
mnt->mnt_ns to NULL.

This is seen in production with the following stack trace:

  BUG: kernel NULL pointer dereference, address: 0000000000000048
  ...
  RIP: 0010:prepend_path.isra.4+0x1ce/0x2e0
  Call Trace:
    d_path+0xe6/0x150
    proc_pid_readlink+0x8f/0x100
    vfs_readlink+0xf8/0x110
    do_readlinkat+0xfd/0x120
    __x64_sys_readlinkat+0x1a/0x20
    do_syscall_64+0x42/0x110
    entry_SYSCALL_64_after_hwframe+0x44/0xa9

Fixes: f2683bd8d5bd ("[PATCH] fix d_absolute_path() interplay with fsmount()")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agoMerge tag 'threads-v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner...
Linus Torvalds [Wed, 14 Oct 2020 21:39:20 +0000 (14:39 -0700)]
Merge tag 'threads-v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux

Pull pidfd updates from Christian Brauner:
 "This introduces a new extension to the pidfd_open() syscall. Users can
  now raise the new PIDFD_NONBLOCK flag to support non-blocking pidfd
  file descriptors. This has been requested for uses in async process
  management libraries such as async-pidfd in Rust.

  Ever since the introduction of pidfds and more advanced async io
  various programming languages such as Rust have grown support for
  async event libraries. These libraries are created to help build
  epoll-based event loops around file descriptors. A common pattern is
  to automatically make all file descriptors they manage to O_NONBLOCK.

  For such libraries the EAGAIN error code is treated specially. When a
  function is called that returns EAGAIN the function isn't called again
  until the event loop indicates the the file descriptor is ready.
  Supporting EAGAIN when waiting on pidfds makes such libraries just
  work with little effort.

  This introduces a new flag PIDFD_NONBLOCK that is equivalent to
  O_NONBLOCK. This follows the same patterns we have for other (anon
  inode) file descriptors such as EFD_NONBLOCK, IN_NONBLOCK,
  SFD_NONBLOCK, TFD_NONBLOCK and the same for close-on-exec flags.

  Passing a non-blocking pidfd to waitid() currently has no effect, i.e.
  is not supported. There are users which would like to use waitid() on
  pidfds that are O_NONBLOCK and mix it with pidfds that are blocking
  and both pass them to waitid().

  The expected behavior is to have waitid() return -EAGAIN for
  non-blocking pidfds and to block for blocking pidfds without needing
  to perform any additional checks for flags set on the pidfd before
  passing it to waitid(). Non-blocking pidfds will return EAGAIN from
  waitid() when no child process is ready yet. Returning -EAGAIN for
  non-blocking pidfds makes it easier for event loops that handle EAGAIN
  specially.

  It also makes the API more consistent and uniform. In essence,
  waitid() is treated like a read on a non-blocking pidfd or a recvmsg()
  on a non-blocking socket.

  With the addition of support for non-blocking pidfds we support the
  same functionality that sockets do. For sockets() recvmsg() supports
  MSG_DONTWAIT for pidfds waitid() supports WNOHANG. Both flags are
  per-call options. In contrast non-blocking pidfds and non-blocking
  sockets are a setting on an open file description affecting all
  threads in the calling process as well as other processes that hold
  file descriptors referring to the same open file description. Both
  behaviors, per call and per open file description, have genuine
  use-cases.

  The interaction with the WNOHANG flag is documented as follows:

   - If a non-blocking pidfd is passed and WNOHANG is not raised we
     simply raise the WNOHANG flag internally. When do_wait() returns
     indicating that there are eligible child processes but none have
     exited yet we set EAGAIN. If no child process exists we continue
     returning ECHILD.

   - If a non-blocking pidfd is passed and WNOHANG is raised waitid()
     will continue returning 0, i.e. it will not set EAGAIN. This ensure
     backwards compatibility with applications passing WNOHANG
     explicitly with pidfds"

* tag 'threads-v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
  tests: remove O_NONBLOCK before waiting for WSTOPPED
  tests: add waitid() tests for non-blocking pidfds
  tests: port pidfd_wait to kselftest harness
  pidfd: support PIDFD_NONBLOCK in pidfd_open()
  exit: support non-blocking pidfds

5 years agoMerge tag 'kernel-clone-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/braune...
Linus Torvalds [Wed, 14 Oct 2020 21:32:52 +0000 (14:32 -0700)]
Merge tag 'kernel-clone-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux

Pull kernel_clone() updates from Christian Brauner:
 "During the v5.9 merge window we reworked the process creation
  codepaths across multiple architectures. After this work we were only
  left with the _do_fork() helper based on the struct kernel_clone_args
  calling convention. As was pointed out _do_fork() isn't valid
  kernelese especially for a helper that isn't just static.

  This series removes the _do_fork() helper and introduces the new
  kernel_clone() helper. The process creation cleanup didn't change the
  name to something more reasonable mainly because _do_fork() was used
  in quite a few places. So sending this as a separate series seemed the
  better strategy.

  I originally intended to send this early in the v5.9 development cycle
  after the merge window had closed but given that this was touching
  quite a few places I decided to defer this until the v5.10 merge
  window"

* tag 'kernel-clone-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
  sched: remove _do_fork()
  tracing: switch to kernel_clone()
  kgdbts: switch to kernel_clone()
  kprobes: switch to kernel_clone()
  x86: switch to kernel_clone()
  sparc: switch to kernel_clone()
  nios2: switch to kernel_clone()
  m68k: switch to kernel_clone()
  ia64: switch to kernel_clone()
  h8300: switch to kernel_clone()
  fork: introduce kernel_clone()

5 years agoMerge tag 'linux-kselftest-fixes-5.10-rc1' of git://git.kernel.org/pub/scm/linux...
Linus Torvalds [Wed, 14 Oct 2020 21:23:51 +0000 (14:23 -0700)]
Merge tag 'linux-kselftest-fixes-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull kselftest updates from Shuah Khan:

 - a selftests harness fix to flush stdout before forking to avoid
   parent and child printing duplicates messages. This is evident when
   test output is redirected to a file.

 - a tools/ wide change to avoid comma separated statements from Joe
   Perches. This fix spans tools/lib, tools/power/cpupower, and
   selftests.

* tag 'linux-kselftest-fixes-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  tools: Avoid comma separated statements
  selftests/harness: Flush stdout before forking

5 years agoMerge tag 'xfs-5.10-merge-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
Linus Torvalds [Wed, 14 Oct 2020 21:06:06 +0000 (14:06 -0700)]
Merge tag 'xfs-5.10-merge-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux

Pull xfs updates from Darrick Wong:
 "The biggest changes are two new features for the ondisk metadata: one
  to record the sizes of the inode btrees in the AG to increase
  redundancy checks and to improve mount times; and a second new feature
  to support timestamps until the year 2486.

  We also fixed a problem where reflinking into a file that requires
  synchronous writes wouldn't actually flush the updates to disk; clean
  up a fair amount of cruft; and started fixing some bugs in the
  realtime volume code.

  Summary:

   - Clean up the buffer ioend calling path so that the retry strategy
     isn't quite so scattered everywhere.

   - Clean up m_sb_bp handling.

   - New feature: storing inode btree counts in the AGI to speed up
     certain mount time per-AG block reservation operatoins and add a
     little more metadata redundancy.

   - New feature: Widen inode timestamps and quota grace expiration
     timestamps to support dates through the year 2486.

   - Get rid of more of our custom buffer allocation API wrappers.

   - Use a proper VLA for shortform xattr structure namevals.

   - Force the log after reflinking or deduping into a file that is
     opened with O_SYNC or O_DSYNC.

   - Fix some math errors in the realtime allocator"

* tag 'xfs-5.10-merge-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: (42 commits)
  xfs: ensure that fpunch, fcollapse, and finsert operations are aligned to rt extent size
  xfs: make sure the rt allocator doesn't run off the end
  xfs: Remove unneeded semicolon
  xfs: force the log after remapping a synchronous-writes file
  xfs: Convert xfs_attr_sf macros to inline functions
  xfs: Use variable-size array for nameval in xfs_attr_sf_entry
  xfs: Remove typedef xfs_attr_shortform_t
  xfs: remove typedef xfs_attr_sf_entry_t
  xfs: Remove kmem_zalloc_large()
  xfs: enable big timestamps
  xfs: trace timestamp limits
  xfs: widen ondisk quota expiration timestamps to handle y2038+
  xfs: widen ondisk inode timestamps to deal with y2038+
  xfs: redefine xfs_ictimestamp_t
  xfs: redefine xfs_timestamp_t
  xfs: move xfs_log_dinode_to_disk to the log recovery code
  xfs: refactor quota timestamp coding
  xfs: refactor default quota grace period setting code
  xfs: refactor quota expiration timer modification
  xfs: explicitly define inode timestamp range
  ...

5 years agoMerge tag 'iomap-5.10-merge-4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
Linus Torvalds [Wed, 14 Oct 2020 19:23:00 +0000 (12:23 -0700)]
Merge tag 'iomap-5.10-merge-4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux

Pull iomap updates from Darrick Wong:
 "There's not a lot of new stuff going on here -- a little bit of code
  refactoring to make iomap workable with btrfs' fsync locking model,
  cleanups in preparation for adding THP support for filesystems, and
  fixing a data corruption issue for blocksize < pagesize filesystems.

  Summary:

   - Don't WARN_ON weird states that unprivileged users can create.

   - Don't invalidate page cache when direct writes want to fall back to
     buffered.

   - Fix some problems when readahead ios fail.

   - Fix a problem where inline data pages weren't getting flushed
     during an unshare operation.

   - Rework iomap to support arbitrarily many blocks per page in
     preparation to support THP for the page cache.

   - Fix a bug in the blocksize < pagesize buffered io path where we
     could fail to initialize the many-blocks-per-page uptodate bitmap
     correctly when the backing page is actually up to date. This could
     cause us to forget to write out dirty pages.

   - Split out the generic_write_sync at the end of the directio write
     path so that btrfs can drop the inode lock before sync'ing the
     file.

   - Call inode_dio_end before trying to sync the file after a O_DSYNC
     direct write (instead of afterwards) to match the behavior of the
     old directio code"

* tag 'iomap-5.10-merge-4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
  iomap: Call inode_dio_end() before generic_write_sync()
  iomap: Allow filesystem to call iomap_dio_complete without i_rwsem
  iomap: Set all uptodate bits for an Uptodate page
  iomap: Change calling convention for zeroing
  iomap: Convert iomap_write_end types
  iomap: Convert write_count to write_bytes_pending
  iomap: Convert read_count to read_bytes_pending
  iomap: Support arbitrarily many blocks per page
  iomap: Use bitmap ops to set uptodate bits
  iomap: Use kzalloc to allocate iomap_page
  fs: Introduce i_blocks_per_page
  iomap: Fix misplaced page flushing
  iomap: Use round_down/round_up macros in __iomap_write_begin
  iomap: Mark read blocks uptodate in write_begin
  iomap: Clear page error before beginning a write
  iomap: Fix direct I/O write consistency check
  iomap: fix WARN_ON_ONCE() from unprivileged users

5 years agoMerge tag 'iommu-updates-v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Wed, 14 Oct 2020 19:08:34 +0000 (12:08 -0700)]
Merge tag 'iommu-updates-v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu

Pull iommu updates from Joerg Roedel:

 - ARM-SMMU Updates from Will:

      - Continued SVM enablement, where page-table is shared with CPU

      - Groundwork to support integrated SMMU with Adreno GPU

      - Allow disabling of MSI-based polling on the kernel command-line

      - Minor driver fixes and cleanups (octal permissions, error
        messages, ...)

 - Secure Nested Paging Support for AMD IOMMU. The IOMMU will fault when
   a device tries DMA on memory owned by a guest. This needs new
   fault-types as well as a rewrite of the IOMMU memory semaphore for
   command completions.

 - Allow broken Intel IOMMUs (wrong address widths reported) to still be
   used for interrupt remapping.

 - IOMMU UAPI updates for supporting vSVA, where the IOMMU can access
   address spaces of processes running in a VM.

 - Support for the MT8167 IOMMU in the Mediatek IOMMU driver.

 - Device-tree updates for the Renesas driver to support r8a7742.

 - Several smaller fixes and cleanups all over the place.

* tag 'iommu-updates-v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (57 commits)
  iommu/vt-d: Gracefully handle DMAR units with no supported address widths
  iommu/vt-d: Check UAPI data processed by IOMMU core
  iommu/uapi: Handle data and argsz filled by users
  iommu/uapi: Rename uapi functions
  iommu/uapi: Use named union for user data
  iommu/uapi: Add argsz for user filled data
  docs: IOMMU user API
  iommu/qcom: add missing put_device() call in qcom_iommu_of_xlate()
  iommu/arm-smmu-v3: Add SVA device feature
  iommu/arm-smmu-v3: Check for SVA features
  iommu/arm-smmu-v3: Seize private ASID
  iommu/arm-smmu-v3: Share process page tables
  iommu/arm-smmu-v3: Move definitions to a header
  iommu/io-pgtable-arm: Move some definitions to a header
  iommu/arm-smmu-v3: Ensure queue is read after updating prod pointer
  iommu/amd: Re-purpose Exclusion range registers to support SNP CWWB
  iommu/amd: Add support for RMP_PAGE_FAULT and RMP_HW_ERR
  iommu/amd: Use 4K page for completion wait write-back semaphore
  iommu/tegra-smmu: Allow to group clients in same swgroup
  iommu/tegra-smmu: Fix iova->phys translation
  ...

5 years agoMerge branch 'stable/for-linus-5.10' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Wed, 14 Oct 2020 19:00:02 +0000 (12:00 -0700)]
Merge branch 'stable/for-linus-5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb

Pull swiotlb updates from Konrad Rzeszutek Wilk:
 "Minor enhancement of using %p to print phys_addr_r and also compiler
  warnings"

* 'stable/for-linus-5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb:
  swiotlb: Mark max_segment with static keyword
  swiotlb: Declare swiotlb_late_init_with_default_size() in header
  swiotlb: Use %pa to print phys_addr_t variables