www.infradead.org Git - users/jedix/linux-maple.git/commit

vpda: try to fix the potential crash due to misusing __GFP_NOFAIL

Patch series "mm: clarify nofail memory allocation", v2.

__GFP_NOFAIL carries the semantics of never failing, so its callers
do not check the return value:
  %__GFP_NOFAIL: The VM implementation _must_ retry infinitely: the caller
  cannot handle allocation failures. The allocation could block
  indefinitely but will never return with failure. Testing for
  failure is pointless.

However, __GFP_NOFAIL can sometimes fail if it exceeds size limits or is
used with GFP_ATOMIC/GFP_NOWAIT in a non-sleepable context.  This can
expose security vulnerabilities due to potential NULL dereferences.

Since __GFP_NOFAIL does not support non-blocking allocation, we introduce
GFP_NOFAIL with inclusive blocking semantics and encourage using
GFP_NOFAIL as a replacement for __GFP_NOFAIL in non-mm.

If we must still fail a nofail allocation, we should trigger a BUG rather
than exposing NULL dereferences to callers who do not check the return
value.

* The discussion started from this topic:
[PATCH RFC] mm: warn potential return NULL for kmalloc_array and
             kvmalloc_array with __GFP_NOFAIL

https://lore.kernel.org/linux-mm/20240717230025.77361-1-21cnbao@gmail.com/

This patch (of 4):

mm doesn't support non-blockable __GFP_NOFAIL allocation.  Because
__GFP_NOFAIL without direct reclamation may just result in a busy loop
within non-sleepable contexts.

static inline struct page *
__alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
                                                struct alloc_context *ac)
{
        ...
        /*
         * Make sure that __GFP_NOFAIL request doesn't leak out and make sure
         * we always retry
         */
        if (gfp_mask & __GFP_NOFAIL) {
                /*
                 * All existing users of the __GFP_NOFAIL are blockable, so warn
                 * of any new users that actually require GFP_NOWAIT
                 */
                if (WARN_ON_ONCE_GFP(!can_direct_reclaim, gfp_mask))
                        goto fail;
                ...
        }
        ...
fail:
        warn_alloc(gfp_mask, ac->nodemask,
                        "page allocation failure: order:%u", order);
got_pg:
        return page;
}

Let's move the memory allocation out of the atomic context and use the
normal sleepable context to get pages.

Link: https://lkml.kernel.org/r/20240731000155.109583-1-21cnbao@gmail.com
Link: https://lkml.kernel.org/r/20240731000155.109583-2-21cnbao@gmail.com
Signed-off-by: Barry Song <v-songbaohua@oppo.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Cc: "Eugenio Pérez" <eperezma@redhat.com>
Cc: Maxime Coquelin <maxime.coquelin@redhat.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Hailong.Liu <hailong.liu@oppo.com>
Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Uladzislau Rezki (Sony) <urezki@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Kees Cook <kees@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

author	Barry Song <v-songbaohua@oppo.com>
	Wed, 31 Jul 2024 00:01:52 +0000 (12:01 +1200)
committer	Andrew Morton <akpm@linux-foundation.org>
	Sat, 17 Aug 2024 00:52:43 +0000 (17:52 -0700)
commit	da04c5689e88a86c3d6fbc3a6ad65d1ca36a7043
tree	3ef43e231ff47b86ec7bb52b6c84064808462548	tree
parent	6c639cb9b2c08d79380e1a5fc4a00e5a4d43b9d3	commit \| diff

drivers/vdpa/vdpa_user/iova_domain.c		diff \| blob \| history
drivers/vdpa/vdpa_user/iova_domain.h		diff \| blob \| history
drivers/vdpa/vdpa_user/vduse_dev.c		diff \| blob \| history