www.infradead.org Git - users/hch/dma-mapping.git/log
5 years ago io-wq: re-set NUMA node affinities if CPUs come online
Jens Axboe [Thu, 22 Oct 2020 15:02:50 +0000 (09:02 -0600)]
io-wq: re-set NUMA node affinities if CPUs come online

We correctly set io-wq NUMA node affinities when the io-wq context is
set up, but if an entire node CPU set is offlined and then brought back
online, the per node affinities are broken. Ensure that we set them
again whenever a CPU comes online. This ensures that we always track
the right node affinity. The usual cpuhp notifiers are used to drive it.

Reported-by: Zhang Qiang <qiang.zhang@windriver.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
5 years ago io_uring: don't reuse linked_timeout
Pavel Begunkov [Tue, 20 Oct 2020 22:50:27 +0000 (23:50 +0100)]
io_uring: don't reuse linked_timeout

Clear linked_timeout for next requests in __io_queue_sqe() so we won't
queue it up unnecessarily when it's going to be punted.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Cc: stable@vger.kernel.org # v5.9
Signed-off-by: Jens Axboe <axboe@kernel.dk>
5 years ago io_uring: unify fsize with def->work_flags
Jens Axboe [Tue, 20 Oct 2020 20:28:41 +0000 (14:28 -0600)]
io_uring: unify fsize with def->work_flags

This one was missed in the earlier conversion; it should be included like
any of the other IO identity flags. Make sure we restore to RLIM_INFINITY
when dropping the personality again.

Fixes: 98447d65b4a7 ("io_uring: move io identity items into separate struct")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
5 years ago io_uring: fix racy REQ_F_LINK_TIMEOUT clearing
Pavel Begunkov [Mon, 19 Oct 2020 15:39:16 +0000 (16:39 +0100)]
io_uring: fix racy REQ_F_LINK_TIMEOUT clearing

io_link_timeout_fn() removes REQ_F_LINK_TIMEOUT from the link head's
flags. This is not atomic and may race with what the head is doing.

If io_link_timeout_fn() doesn't clear the flag, as forced by this patch,
then it may happen that for "req -> link_timeout1 -> link_timeout2",
__io_kill_linked_timeout() would find link_timeout2 and try to cancel
it, so miscounting references. Teach it to ignore such double timeouts
by marking the active one with a new flag in io_prep_linked_timeout().

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
5 years ago io_uring: do poll's hash_node init in common code
Pavel Begunkov [Sun, 18 Oct 2020 09:17:43 +0000 (10:17 +0100)]
io_uring: do poll's hash_node init in common code

Move INIT_HLIST_NODE(&req->hash_node) into __io_arm_poll_handler(), so
that it isn't duplicated and the common poll code is responsible for
it.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
5 years ago io_uring: inline io_poll_task_handler()
Pavel Begunkov [Sun, 18 Oct 2020 09:17:42 +0000 (10:17 +0100)]
io_uring: inline io_poll_task_handler()

io_poll_task_handler() doesn't add clarity, inline it in its only user.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
5 years ago io_uring: remove extra ->file check in poll prep
Pavel Begunkov [Sun, 18 Oct 2020 09:17:41 +0000 (10:17 +0100)]
io_uring: remove extra ->file check in poll prep

io_poll_add_prep() doesn't need to verify ->file because it's already
done in io_init_req().

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
5 years ago io_uring: make cached_cq_overflow non atomic_t
Pavel Begunkov [Sun, 18 Oct 2020 09:17:40 +0000 (10:17 +0100)]
io_uring: make cached_cq_overflow non atomic_t

ctx->cached_cq_overflow is changed only under completion_lock. Convert
it from atomic_t to just int, and mark all places where it's read without
the lock with READ_ONCE, which guarantees atomicity (relaxed ordering).

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
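
The pattern described above (a plain int written only under a lock, with
lockless readers going through a once-only load) can be sketched in
userspace C. This is an illustration under stated assumptions, not the
kernel code: read_once_int(), write_once_int(), struct ring_ctx and the
two helpers are hypothetical names, the volatile casts only approximate
READ_ONCE()/WRITE_ONCE(), and the lock itself is left to the caller.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Userspace stand-ins for READ_ONCE()/WRITE_ONCE(), limited to int.
 * The volatile access asks the compiler for exactly one load or store.
 */
static inline int read_once_int(const int *p)
{
    return *(const volatile int *)p;
}

static inline void write_once_int(int *p, int v)
{
    *(volatile int *)p = v;
}

/* Hypothetical context; only the counter matters for the sketch. */
struct ring_ctx {
    int cached_cq_overflow; /* written only with the completion lock held */
};

/* Writer side: the caller is assumed to hold the completion lock. */
static void account_overflow(struct ring_ctx *ctx)
{
    assert(ctx != NULL);
    write_once_int(&ctx->cached_cq_overflow, ctx->cached_cq_overflow + 1);
}

/* Reader side: no lock taken, so the value is only a snapshot. */
static int overflow_snapshot(const struct ring_ctx *ctx)
{
    assert(ctx != NULL);
    return read_once_int(&ctx->cached_cq_overflow);
}
```

Because every writer is serialized by the lock, the increment does not
need an atomic read-modify-write; readers only need a load that cannot
be torn or repeated by the compiler.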
5 years ago io_uring: inline io_fail_links()
Pavel Begunkov [Sun, 18 Oct 2020 09:17:39 +0000 (10:17 +0100)]
io_uring: inline io_fail_links()

Inline io_fail_links() and kill extra io_cqring_ev_posted().

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
5 years ago io_uring: kill ref get/drop in personality init
Pavel Begunkov [Sun, 18 Oct 2020 09:17:38 +0000 (10:17 +0100)]
io_uring: kill ref get/drop in personality init

Don't take an identity on personality/creds init only to drop it a few
lines after. Extract a function which prepares req->work but leaves it
without identity.

Note: it's safe to not check REQ_F_WORK_INITIALIZED there because
nobody had a chance to init it before io_init_req().

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
5 years ago io_uring: flags-based creds init in queue
Pavel Begunkov [Sun, 18 Oct 2020 09:17:37 +0000 (10:17 +0100)]
io_uring: flags-based creds init in queue

Use IO_WQ_WORK_CREDS to figure out if req has creds to be used.
After the recent changes it should rely only on the flags, not on the
value of work.creds.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
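
A minimal sketch of the flags-based check, assuming a hypothetical
struct sketch_work and flag SKETCH_WORK_CREDS (neither is the kernel's
definition): the pointer field may hold a stale or preset value, so only
the flag decides whether creds are in use.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag bit standing in for IO_WQ_WORK_CREDS. */
#define SKETCH_WORK_CREDS (1u << 0)

/* Hypothetical work item: flags say which members are meaningful. */
struct sketch_work {
    unsigned int flags;
    const void *creds; /* may be set even when the flag is clear */
};

/* Return the creds to use, or NULL when the flag says there are none. */
static const void *sketch_work_creds(const struct sketch_work *w)
{
    assert(w != NULL);
    return (w->flags & SKETCH_WORK_CREDS) ? w->creds : NULL;
}
```

Testing the pointer for NULL instead would misfire as soon as the member
is always populated, which is the direction the surrounding series takes.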
5 years ago io_uring: use blk_queue_nowait() to check if NOWAIT supported
Jeffle Xu [Mon, 19 Oct 2020 08:59:42 +0000 (16:59 +0800)]
io_uring: use blk_queue_nowait() to check if NOWAIT supported

commit 021a24460dc2 ("block: add QUEUE_FLAG_NOWAIT") adds a new helper
function blk_queue_nowait() to check if the bdev supports handling of
REQ_NOWAIT or not. Since then bio-based dm device can also support
REQ_NOWAIT, and currently only dm-linear supports that since
commit 6abc49468eea ("dm: add support for REQ_NOWAIT and enable it for
linear target").

Signed-off-by: Jeffle Xu <jefflexu@linux.alibaba.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
5 years ago mm: use limited read-ahead to satisfy read
Jens Axboe [Sat, 17 Oct 2020 15:25:52 +0000 (09:25 -0600)]
mm: use limited read-ahead to satisfy read

For the case where read-ahead is disabled on the file, or if the cgroup
is congested, ensure that we can at least do 1 page of read-ahead to
make progress on the read in an async fashion. This could potentially be
larger, but it's not needed in terms of functionality, so let's err on
the side of caution as larger counts of pages may run into reclaim
issues (particularly if we're congested).

This makes sure we're not hitting the potentially sync ->readpage() path
for IO that is marked IOCB_WAITQ, which could cause us to block. It also
means we'll use the same path for IO, regardless of whether or not
read-ahead happens to be disabled on the lower level device.

Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Reported-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reported-by: Hao_Xu <haoxu@linux.alibaba.com>
[axboe: updated for new ractl API]
Signed-off-by: Jens Axboe <axboe@kernel.dk>
5 years ago mm: mark async iocb read as NOWAIT once some data has been copied
Jens Axboe [Sat, 17 Oct 2020 14:31:29 +0000 (08:31 -0600)]
mm: mark async iocb read as NOWAIT once some data has been copied

Once we've copied some data for an iocb that is marked with IOCB_WAITQ,
we should no longer attempt to async lock a new page. Instead make sure
we return the copied amount, and let the caller retry, instead of
returning -EIOCBQUEUED for a new page.

This should only be possible with read-ahead disabled on the below
device, and multiple threads racing on the same file. Haven't been able
to reproduce on anything else.

Cc: stable@vger.kernel.org # v5.9
Fixes: 1a0a7853b901 ("mm: support async buffered reads in generic_file_buffered_read()")
Reported-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
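
The return-value rule in the message above can be sketched as a small
loop. Everything here is a hypothetical model rather than the mm code:
sketch_buffered_read(), the page_ready array, SKETCH_PAGE_SIZE and
SKETCH_EIOCBQUEUED (a stand-in value for the kernel-internal error code)
are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_EIOCBQUEUED 529   /* stand-in for the "queued" error code */
#define SKETCH_PAGE_SIZE   4096L /* assumed page size for the sketch */

/*
 * Walk the pages of a read. page_ready[i] != 0 means page i can be
 * copied without waiting. On the first page that would need an async
 * wait: return what was already copied if anything was, otherwise
 * report that the request was queued.
 */
static long sketch_buffered_read(const int *page_ready, size_t nr_pages)
{
    long copied = 0;

    assert(page_ready != NULL || nr_pages == 0);
    for (size_t i = 0; i < nr_pages; i++) {
        if (!page_ready[i]) {
            if (copied > 0)
                return copied;          /* short read, caller retries */
            return -SKETCH_EIOCBQUEUED; /* nothing copied yet */
        }
        copied += SKETCH_PAGE_SIZE;
    }
    return copied;
}
```

The point of the rule is that a caller never sees both "some bytes were
copied" and "the request is queued" for the same call.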
5 years ago io_uring: fix double poll mask init
Pavel Begunkov [Fri, 16 Oct 2020 19:55:56 +0000 (20:55 +0100)]
io_uring: fix double poll mask init

__io_queue_proc() is used by both poll reqs and apoll. Don't use
req->poll.events to copy poll mask because for apoll it aliases with
private data of the request.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
5 years ago io-wq: inherit audit loginuid and sessionid
Jens Axboe [Thu, 15 Oct 2020 19:46:44 +0000 (13:46 -0600)]
io-wq: inherit audit loginuid and sessionid

Make sure the async io-wq workers inherit the loginuid and sessionid from
the original task, and restore them to unset once we're done with the
async work item.

While at it, disable the ability for kernel threads to write to their own
loginuid.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
5 years ago io_uring: use percpu counters to track inflight requests
Jens Axboe [Thu, 15 Oct 2020 22:24:45 +0000 (16:24 -0600)]
io_uring: use percpu counters to track inflight requests

Even though we place the req_issued and req_complete in separate
cachelines, there's considerable overhead in doing the atomics
particularly on the completion side.

Get rid of having the two counters, and just use a percpu_counter for
this. That's what it was made for, after all. This considerably
reduces the overhead in __io_free_req().

Signed-off-by: Jens Axboe <axboe@kernel.dk>
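
The idea behind a per-CPU counter can be sketched in userspace as a
sharded counter: each CPU updates its own cacheline-sized slot, and the
total is computed only when someone asks for it. This is not the
kernel's percpu_counter implementation; struct sharded_counter,
NR_SHARDS, CACHELINE and both helpers are hypothetical, and the sketch
is single-threaded.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NR_SHARDS 4  /* assumed number of CPUs in the sketch */
#define CACHELINE 64 /* assumed cacheline size in bytes */

/* One slot per CPU, padded so that slots do not share a cacheline. */
struct shard {
    int64_t count;
    char pad[CACHELINE - sizeof(int64_t)];
};

struct sharded_counter {
    struct shard shards[NR_SHARDS];
};

/* Fast path: touch only the slot of the CPU doing the update. */
static void counter_add(struct sharded_counter *c, unsigned int cpu,
                        int64_t delta)
{
    assert(c != NULL);
    c->shards[cpu % NR_SHARDS].count += delta;
}

/* Slow path: walk all slots. A single slot may legitimately be negative. */
static int64_t counter_sum(const struct sharded_counter *c)
{
    int64_t sum = 0;

    assert(c != NULL);
    for (unsigned int i = 0; i < NR_SHARDS; i++)
        sum += c->shards[i].count;
    return sum;
}
```

With issue counted as +1 and completion as -1, the number of inflight
requests is the sum, even when a request is issued on one CPU and
completed on another.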
5 years ago io_uring: assign new io_identity for task if members have changed
Jens Axboe [Thu, 15 Oct 2020 23:38:03 +0000 (17:38 -0600)]
io_uring: assign new io_identity for task if members have changed

This avoids doing a copy for each new async IO, if some parts of the
io_identity have changed. We avoid reference counting for the normal
fast path of nothing ever changing.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
5 years ago io_uring: store io_identity in io_uring_task
Jens Axboe [Thu, 15 Oct 2020 15:02:33 +0000 (09:02 -0600)]
io_uring: store io_identity in io_uring_task

This is, by definition, a per-task structure. So store it in the
task context, instead of carrying it in each io_kiocb. We're being
a bit inefficient if members have changed, as that requires an alloc and
copy of a new io_identity struct. The next patch will fix that up.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
5 years ago io_uring: COW io_identity on mismatch
Jens Axboe [Thu, 15 Oct 2020 14:46:24 +0000 (08:46 -0600)]
io_uring: COW io_identity on mismatch

If the io_identity doesn't completely match the task, then create a
copy of it and use that. The existing copy remains valid until the last
user of it has gone away.

This also changes the personality lookup to be indexed by io_identity,
instead of creds directly.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
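
Copy-on-write of a reference-counted identity can be sketched as
follows. The struct layout, field names and helper functions are
hypothetical, the reference count is a plain int because the sketch is
single-threaded, and none of this is the io_uring implementation.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical identity: a refcount plus two fields that may change. */
struct sketch_identity {
    int refs;
    unsigned int uid;
    unsigned long fsize;
};

static struct sketch_identity *identity_new(unsigned int uid,
                                            unsigned long fsize)
{
    struct sketch_identity *id = malloc(sizeof(*id));

    if (!id)
        return NULL;
    id->refs = 1;
    id->uid = uid;
    id->fsize = fsize;
    return id;
}

static void identity_get(struct sketch_identity *id)
{
    id->refs++;
}

/* Drop one reference; returns 1 if this call freed the object. */
static int identity_put(struct sketch_identity *id)
{
    assert(id->refs > 0);
    if (--id->refs > 0)
        return 0;
    free(id);
    return 1;
}

/*
 * Return an identity matching (uid, fsize). On a match the current one
 * is reused. On a mismatch a private copy is made and the caller's
 * reference to the old one is dropped; other holders keep it alive.
 */
static struct sketch_identity *identity_cow(struct sketch_identity *cur,
                                            unsigned int uid,
                                            unsigned long fsize)
{
    struct sketch_identity *copy;

    if (cur->uid == uid && cur->fsize == fsize)
        return cur;

    copy = malloc(sizeof(*copy));
    if (!copy)
        return NULL; /* cur is left untouched on failure */
    *copy = *cur;
    copy->refs = 1;
    copy->uid = uid;
    copy->fsize = fsize;
    identity_put(cur);
    return copy;
}
```

An in-flight request that took a reference before the change keeps
seeing the old values until it drops that reference.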
5 years ago io_uring: move io identity items into separate struct
Jens Axboe [Wed, 14 Oct 2020 16:48:51 +0000 (10:48 -0600)]
io_uring: move io identity items into separate struct

io-wq contains a pointer to the identity, which we just hold in io_kiocb
for now. This is in preparation for putting this outside io_kiocb. The
only exception is struct files_struct, which we'll need different rules
for to avoid a circular dependency.

No functional changes in this patch.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
5 years ago io_uring: rely solely on work flags to determine personality.
Jens Axboe [Wed, 14 Oct 2020 16:12:37 +0000 (10:12 -0600)]
io_uring: rely solely on work flags to determine personality.

We solely rely on work->work_flags now, so use that for proper checking
and clearing/dropping of various identity items.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
5 years ago io_uring: pass required context in as flags
Jens Axboe [Wed, 14 Oct 2020 15:23:55 +0000 (09:23 -0600)]
io_uring: pass required context in as flags

We have a number of bits that decide what context to inherit. Set up
io-wq flags for these instead. This is in preparation for always having
the various members set, but not always needing them for all requests.

No intended functional changes in this patch.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
5 years ago io-wq: assign NUMA node locality if appropriate
Jens Axboe [Thu, 15 Oct 2020 16:13:07 +0000 (10:13 -0600)]
io-wq: assign NUMA node locality if appropriate

There was an assumption that kthread_create_on_node() would properly set
NUMA affinities in terms of CPUs allowed, but it doesn't. Make sure we
do this when creating an io-wq context on NUMA.

Cc: stable@vger.kernel.org
Stefan Metzmacher <metze@samba.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
5 years ago io_uring: fix error path cleanup in io_sqe_files_register()
Jens Axboe [Wed, 14 Oct 2020 13:35:57 +0000 (07:35 -0600)]
io_uring: fix error path cleanup in io_sqe_files_register()

syzbot reports the following crash:

general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] PREEMPT SMP KASAN
KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
CPU: 1 PID: 8927 Comm: syz-executor.3 Not tainted 5.9.0-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:io_file_from_index fs/io_uring.c:5963 [inline]
RIP: 0010:io_sqe_files_register fs/io_uring.c:7369 [inline]
RIP: 0010:__io_uring_register fs/io_uring.c:9463 [inline]
RIP: 0010:__do_sys_io_uring_register+0x2fd2/0x3ee0 fs/io_uring.c:9553
Code: ec 03 49 c1 ee 03 49 01 ec 49 01 ee e8 57 61 9c ff 41 80 3c 24 00 0f 85 9b 09 00 00 4d 8b af b8 01 00 00 4c 89 e8 48 c1 e8 03 <80> 3c 28 00 0f 85 76 09 00 00 49 8b 55 00 89 d8 c1 f8 09 48 98 4c
RSP: 0018:ffffc90009137d68 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffc9000ef2a000
RDX: 0000000000040000 RSI: ffffffff81d81dd9 RDI: 0000000000000005
RBP: dffffc0000000000 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffffed1012882a37
R13: 0000000000000000 R14: ffffed1012882a38 R15: ffff888094415000
FS:  00007f4266f3c700(0000) GS:ffff8880ae500000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000000118c000 CR3: 000000008e57d000 CR4: 00000000001506e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x45de59
Code: 0d b4 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 db b3 fb ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007f4266f3bc78 EFLAGS: 00000246 ORIG_RAX: 00000000000001ab
RAX: ffffffffffffffda RBX: 00000000000083c0 RCX: 000000000045de59
RDX: 0000000020000280 RSI: 0000000000000002 RDI: 0000000000000005
RBP: 000000000118bf68 R08: 0000000000000000 R09: 0000000000000000
R10: 40000000000000a1 R11: 0000000000000246 R12: 000000000118bf2c
R13: 00007fff2fa4f12f R14: 00007f4266f3c9c0 R15: 000000000118bf2c
Modules linked in:
---[ end trace 2a40a195e2d5e6e6 ]---
RIP: 0010:io_file_from_index fs/io_uring.c:5963 [inline]
RIP: 0010:io_sqe_files_register fs/io_uring.c:7369 [inline]
RIP: 0010:__io_uring_register fs/io_uring.c:9463 [inline]
RIP: 0010:__do_sys_io_uring_register+0x2fd2/0x3ee0 fs/io_uring.c:9553
Code: ec 03 49 c1 ee 03 49 01 ec 49 01 ee e8 57 61 9c ff 41 80 3c 24 00 0f 85 9b 09 00 00 4d 8b af b8 01 00 00 4c 89 e8 48 c1 e8 03 <80> 3c 28 00 0f 85 76 09 00 00 49 8b 55 00 89 d8 c1 f8 09 48 98 4c
RSP: 0018:ffffc90009137d68 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffc9000ef2a000
RDX: 0000000000040000 RSI: ffffffff81d81dd9 RDI: 0000000000000005
RBP: dffffc0000000000 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffffed1012882a37
R13: 0000000000000000 R14: ffffed1012882a38 R15: ffff888094415000
FS:  00007f4266f3c700(0000) GS:ffff8880ae400000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000000074a918 CR3: 000000008e57d000 CR4: 00000000001506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400

which is a copy of fget failure condition jumping to cleanup, but the
cleanup requires ctx->file_data to be assigned. Assign it at setup time,
and ensure that we clear it again for the error path exit.

Fixes: 5398ae698525 ("io_uring: clean file_data access in files_register")
Reported-by: syzbot+f4ebcc98223dafd8991e@syzkaller.appspotmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
5 years ago Revert "io_uring: mark io_uring_fops/io_op_defs as __read_mostly"
Jens Axboe [Tue, 13 Oct 2020 21:01:40 +0000 (15:01 -0600)]
Revert "io_uring: mark io_uring_fops/io_op_defs as __read_mostly"

This reverts commit 738277adc81929b3e7c9b63fec6693868cc5f931.

This change didn't make a lot of sense, and as Linus reports, it actually
fails on clang:

   /tmp/io_uring-dd40c4.s:26476: Warning: ignoring changed section
   attributes for .data..read_mostly

The arrays are already marked const so, by definition, they are not
just read-mostly, they are read-only.

Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
5 years ago io_uring: fix REQ_F_COMP_LOCKED by killing it
Pavel Begunkov [Tue, 13 Oct 2020 08:44:00 +0000 (09:44 +0100)]
io_uring: fix REQ_F_COMP_LOCKED by killing it

REQ_F_COMP_LOCKED is used and implemented in a buggy way. The problem is
that the flag is set before io_put_req() but not cleared after, and if
that wasn't the final reference, the request will be freed with the flag
set from some other context, which may not hold a spinlock. That means
possible races with removing linked timeouts and unsynchronised
completion (e.g. access to CQ).

Instead of fixing REQ_F_COMP_LOCKED, kill the flag and use
task_work_add() to move such requests to a fresh context to free from
it, as was done with __io_free_req_finish().

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
5 years ago io_uring: dig out COMP_LOCK from deep call chain
Pavel Begunkov [Tue, 13 Oct 2020 08:43:59 +0000 (09:43 +0100)]
io_uring: dig out COMP_LOCK from deep call chain

io_req_clean_work() checks REQ_F_COMP_LOCKED to pass this two layers up.
Move the check up into __io_free_req(), so at least it doesn't look so
ugly and would facilitate further changes.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
5 years ago io_uring: don't put a poll req under spinlock
Pavel Begunkov [Tue, 13 Oct 2020 08:43:58 +0000 (09:43 +0100)]
io_uring: don't put a poll req under spinlock

Move io_put_req() in io_poll_task_handler() from under spinlock. This
eliminates the need to use REQ_F_COMP_LOCKED, at the expense of
potentially having to grab the lock again. That's still a better trade
off than relying on the locked flag.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
5 years ago io_uring: don't unnecessarily clear F_LINK_TIMEOUT
Pavel Begunkov [Tue, 13 Oct 2020 08:43:57 +0000 (09:43 +0100)]
io_uring: don't unnecessarily clear F_LINK_TIMEOUT

If a request had REQ_F_LINK_TIMEOUT it would've been cleared in
__io_kill_linked_timeout() by the time of __io_fail_links(), so no need
to care about it.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
5 years ago io_uring: don't set COMP_LOCKED if won't put
Pavel Begunkov [Tue, 13 Oct 2020 08:43:56 +0000 (09:43 +0100)]
io_uring: don't set COMP_LOCKED if won't put

__io_kill_linked_timeout() sets REQ_F_COMP_LOCKED for a linked timeout
even if it can't cancel it, e.g. it's already running. It not only races
with io_link_timeout_fn() for ->flags field, but also leaves the flag
set and so io_link_timeout_fn() may find it and decide that it holds the
lock. Hopefully, the second problem is only a potential one.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
5 years ago io_uring: Fix sizeof() mismatch
Colin Ian King [Mon, 12 Oct 2020 14:03:41 +0000 (15:03 +0100)]
io_uring: Fix sizeof() mismatch

An incorrect sizeof() is being used: sizeof(file_data->table) is the
size of the pointer, but it should be sizeof(*file_data->table).

Fixes: 5398ae698525 ("io_uring: clean file_data access in files_register")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Addresses-Coverity: ("Sizeof not portable (SIZEOF_MISMATCH)")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
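
The difference between the two sizeof() forms is easy to check in a
standalone sketch. The types below are hypothetical stand-ins, not the
io_uring structures: struct table_entry, struct file_data and the three
helpers exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical element type of the table. */
struct table_entry {
    void *files[8];
};

/* Hypothetical container holding a pointer to an array of entries. */
struct file_data {
    struct table_entry *table;
};

/* Wrong for sizing elements: this is the size of the pointer itself. */
static size_t wrong_elem_size(const struct file_data *fd)
{
    return sizeof(fd->table);
}

/* Right: the size of the object the pointer points to. */
static size_t right_elem_size(const struct file_data *fd)
{
    return sizeof(*fd->table);
}

/* Allocate nr zeroed entries using the element size; 0 on success. */
static int alloc_table(struct file_data *fd, size_t nr)
{
    assert(fd != NULL);
    fd->table = calloc(nr, sizeof(*fd->table));
    return fd->table ? 0 : -1;
}
```

sizeof() does not evaluate its operand, so sizeof(*fd->table) is safe
even before the pointer has been assigned.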
5 years ago Merge tag 'ovl-update-5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs
Linus Torvalds [Fri, 16 Oct 2020 22:29:46 +0000 (15:29 -0700)]
Merge tag 'ovl-update-5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs

Pull overlayfs updates from Miklos Szeredi:

 - Improve performance for certain container setups by introducing a
   "volatile" mode

 - ioctl improvements

 - continue preparation for unprivileged overlay mounts

* tag 'ovl-update-5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
  ovl: use generic vfs_ioc_setflags_prepare() helper
  ovl: support [S|G]ETFLAGS and FS[S|G]ETXATTR ioctls for directories
  ovl: rearrange ovl_can_list()
  ovl: enumerate private xattrs
  ovl: pass ovl_fs down to functions accessing private xattrs
  ovl: drop flags argument from ovl_do_setxattr()
  ovl: adhere to the vfs_ vs. ovl_do_ conventions for xattrs
  ovl: use ovl_do_getxattr() for private xattr
  ovl: fold ovl_getxattr() into ovl_get_redirect_xattr()
  ovl: clean up ovl_getxattr() in copy_up.c
  duplicate ovl_getxattr()
  ovl: provide a mount option "volatile"
  ovl: check for incompatible features in work dir

5 years ago Merge tag 'afs-fixes-20201016' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowe...
Linus Torvalds [Fri, 16 Oct 2020 22:22:41 +0000 (15:22 -0700)]
Merge tag 'afs-fixes-20201016' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs

Pull afs updates from David Howells:
 "A collection of fixes to fix afs_cell struct refcounting, thereby
  fixing a slew of related syzbot bugs:

   - Fix the cell tree in the netns to use an rwsem rather than RCU.

     There seem to be some problems deriving from the use of RCU and a
     seqlock to walk the rbtree, but it's not entirely clear what since
     there are several different failures being seen.

     Changing things to use an rwsem instead makes it more robust. The
     extra performance derived from using RCU isn't necessary in this
     case since the only time we're looking up a cell is during mount or
     when cells are being manually added.

   - Fix the refcounting by splitting the usage counter into a memory
     refcount and an active users counter. The usage counter was doing
     double duty, keeping track of whether a cell is still in use and
     keeping track of when it needs to be destroyed - but this makes the
     clean up tricky. Separating these out simplifies the logic.

   - Fix purging a cell that has an alias. A cell alias pins the cell
     it's an alias of, but the alias is always later in the list. Trying
     to purge in a single pass causes rmmod to hang in such a case.

   - Fix cell removal. If a cell's manager is requeued whilst it's
     removing itself, the manager will run again and re-remove itself,
     causing problems in various places. Follow Hillf Danton's
     suggestion to insert a more terminal state that causes the manager
     to do nothing post-removal.

  In addition to the above, two other changes:

   - Add a tracepoint for the cell refcount and active users count. This
     helped with debugging the above and may be useful again in future.

   - Downgrade an assertion to a print when a still-active server is
     seen during purging. This was happening as a consequence of
     incomplete cell removal before the servers were cleaned up"

* tag 'afs-fixes-20201016' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
  afs: Don't assert on unpurgeable server records
  afs: Add tracing for cell refcount and active user count
  afs: Fix cell removal
  afs: Fix cell purging with aliases
  afs: Fix cell refcounting by splitting the usage counter
  afs: Fix rapid cell addition/removal by not using RCU on cells tree
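
The refcount split described in the pull message (one counter for
memory lifetime, one for active users) can be sketched roughly as below.
This is a simplified, single-threaded model under assumptions: struct
sketch_cell, its fields and the helper names are hypothetical, and the
real afs code involves a manager and timers that are not modelled here.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical cell record with the two counters kept separate. */
struct sketch_cell {
    int ref;       /* memory refcount: free the record when it hits 0 */
    int active;    /* active users: unpublish when it hits 0 */
    int published; /* 1 while lookups may still find the cell */
};

static struct sketch_cell *cell_alloc(void)
{
    struct sketch_cell *c = malloc(sizeof(*c));

    if (!c)
        return NULL;
    c->ref = 1; /* the reference held by whoever published the cell */
    c->active = 0;
    c->published = 1;
    return c;
}

/* Drop a memory reference; returns 1 if this call freed the record. */
static int cell_put(struct sketch_cell *c)
{
    assert(c->ref > 0);
    if (--c->ref > 0)
        return 0;
    free(c);
    return 1;
}

/* Start using the cell: an active user also pins the memory. */
static void cell_use(struct sketch_cell *c)
{
    c->ref++;
    c->active++;
}

/* Stop using the cell: the last active user unpublishes it. */
static int cell_unuse(struct sketch_cell *c)
{
    assert(c->active > 0);
    if (--c->active == 0)
        c->published = 0;
    return cell_put(c);
}
```

Keeping "is anyone using this" apart from "may this memory be freed"
lets teardown run in two clear steps instead of one overloaded counter.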

5 years ago Merge tag 'f2fs-for-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeu...
Linus Torvalds [Fri, 16 Oct 2020 22:14:43 +0000 (15:14 -0700)]
Merge tag 'f2fs-for-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs

Pull f2fs updates from Jaegeuk Kim:
 "In this round, we've added new features such as zone capacity for ZNS
  and a new GC policy, ATGC, along with in-memory segment management. In
  addition, we could improve the decompression speed significantly by
  changing virtual mapping method. Even though we've fixed lots of small
  bugs in compression support, I feel that it becomes more stable so
  that I could give it a try in production.

  Enhancements:
   - support zone capacity in NVMe Zoned Namespace devices
   - introduce in-memory current segment management
   - add standard casefolding support
   - support age threshold based garbage collection
   - improve decompression speed by changing virtual mapping method

  Bug fixes:
   - fix condition checks in some ioctl() such as compression, move_range, etc
   - fix 32/64bits support in data structures
   - fix memory allocation in zstd decompress
   - add some boundary checks to avoid kernel panic on corrupted image
   - fix disallowing compression for non-empty file
   - fix slab leakage of compressed block writes

  In addition, it includes code refactoring for better readability and
  minor bug fixes for compression and zoned device support"

* tag 'f2fs-for-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (51 commits)
  f2fs: code cleanup by removing unnecessary check
  f2fs: wait for sysfs kobject removal before freeing f2fs_sb_info
  f2fs: fix writecount false positive in releasing compress blocks
  f2fs: introduce check_swap_activate_fast()
  f2fs: don't issue flush in f2fs_flush_device_cache() for nobarrier case
  f2fs: handle errors of f2fs_get_meta_page_nofail
  f2fs: fix to set SBI_NEED_FSCK flag for inconsistent inode
  f2fs: reject CASEFOLD inode flag without casefold feature
  f2fs: fix memory alignment to support 32bit
  f2fs: fix slab leak of rpages pointer
  f2fs: compress: fix to disallow enabling compress on non-empty file
  f2fs: compress: introduce cic/dic slab cache
  f2fs: compress: introduce page array slab cache
  f2fs: fix to do sanity check on segment/section count
  f2fs: fix to check segment boundary during SIT page readahead
  f2fs: fix uninit-value in f2fs_lookup
  f2fs: remove unneeded parameter in find_in_block()
  f2fs: fix wrong total_sections check and fsmeta check
  f2fs: remove duplicated code in sanity_check_area_boundary
  f2fs: remove unused check on version_bitmap
  ...

5 years ago Merge tag 'docs/v5.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab...
Linus Torvalds [Fri, 16 Oct 2020 22:02:21 +0000 (15:02 -0700)]
Merge tag 'docs/v5.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media

Pull documentation updates from Mauro Carvalho Chehab:
 "A series of patches addressing warnings produced by make htmldocs.
  This includes:

   - kernel-doc markup fixes

   - ReST fixes

   - Updates to the build system in order to support newer versions of
     the docs build toolchain (Sphinx)

  After this series, the number of html build warnings should reduce
  significantly, and building with Sphinx 3.1 or later should now be
  supported (although it is still recommended to use Sphinx 2.4.4).

  As agreed with Jon, I should be sending you a late pull request by the
  end of the merge window addressing remaining issues with docs build,
  as there are a number of warning fixes that depend on pull requests
  that should be happening along the merge window.

  The end goal is to have a clean htmldocs build on Kernel 5.10.

  PS. It should be noted that Sphinx 3.0 is not currently supported,
  as it lacks support for C domain namespaces. This feature, needed in
  order to document uAPI system calls with Sphinx 3.x, was added only in
  Sphinx 3.1"

* tag 'docs/v5.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (75 commits)
  PM / devfreq: remove a duplicated kernel-doc markup
  mm/doc: fix a literal block markup
  workqueue: fix a kernel-doc warning
  docs: virt: user_mode_linux_howto_v2.rst: fix a literal block markup
  Input: sparse-keymap: add a description for @sw
  rcu/tree: docs: document bkvcache new members at struct kfree_rcu_cpu
  nl80211: docs: add a description for s1g_cap parameter
  usb: docs: document altmode register/unregister functions
  kunit: test.h: fix a bad kernel-doc markup
  drivers: core: fix kernel-doc markup for dev_err_probe()
  docs: bio: fix a kerneldoc markup
  kunit: test.h: solve kernel-doc warnings
  block: bio: fix a warning at the kernel-doc markups
  docs: powerpc: syscall64-abi.rst: fix a malformed table
  drivers: net: hamradio: fix document location
  net: appletalk: Kconfig: Fix docs location
  dt-bindings: fix references to files converted to yaml
  memblock: get rid of a :c:type leftover
  math64.h: kernel-docs: Convert some markups into normal comments
  media: uAPI: buffer.rst: remove a left-over documentation
  ...

5 years ago Merge tag 'trace-v5.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt...
Linus Torvalds [Fri, 16 Oct 2020 21:56:52 +0000 (14:56 -0700)]
Merge tag 'trace-v5.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull tracing fix from Steven Rostedt:
 "Fix mismatch section of adding early trace events.

  Fixes the issue of a mismatch section that was missed due to gcc
  inlining the offending function, while clang did not (and reported the
  issue)"

* tag 'trace-v5.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  tracing: Remove __init from __trace_early_add_new_event()

5 years ago Merge tag 'printk-for-5.10-fixup' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Fri, 16 Oct 2020 19:52:37 +0000 (12:52 -0700)]
Merge tag 'printk-for-5.10-fixup' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux

Pull printk fix from Petr Mladek:
 "Prevent overflow in the new lockless ringbuffer"

* tag 'printk-for-5.10-fixup' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux:
  printk: ringbuffer: Wrong data pointer when appending small string

5 years ago Merge tag 'kgdb-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/danielt...
Linus Torvalds [Fri, 16 Oct 2020 19:47:18 +0000 (12:47 -0700)]
Merge tag 'kgdb-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/danielt/linux

Pull kgdb updates from Daniel Thompson:
 "A fairly modest set of changes for this cycle.

  Of particular note are an earlycon fix from Doug Anderson and my own
  changes to get kgdb/kdb to honour the kprobe blocklist. The latter
  creates a safety rail that strongly encourages developers not to place
  breakpoints in, for example, arch specific trap handling code.

  Also included are a couple of small fixes and tweaks: an API update,
  eliminate a coverity dead code warning, improved handling of search
  during multi-line printk and a couple of typo corrections"

* tag 'kgdb-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/danielt/linux:
  kdb: Fix pager search for multi-line strings
  kernel: debug: Centralize dbg_[de]activate_sw_breakpoints
  kgdb: Add NOKPROBE labels on the trap handler functions
  kgdb: Honour the kprobe blocklist when setting breakpoints
  kernel/debug: Fix spelling mistake in debug_core.c
  kdb: Use newer api for tasklist scanning
  kgdb: Make "kgdbcon" work properly with "kgdb_earlycon"
  kdb: remove unnecessary null check of dbg_io_ops

5 years ago Merge tag 'mips_5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux
Linus Torvalds [Fri, 16 Oct 2020 19:40:55 +0000 (12:40 -0700)]
Merge tag 'mips_5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux

Pull MIPS updates from Thomas Bogendoerfer:

 - removed support for PNX833x alias NXT_STB22x

 - included Ingenic SoC support into generic MIPS kernels

 - added support for new Ingenic SoCs

 - converted workaround selection to use Kconfig

 - replaced old boot mem functions by memblock_*

 - enabled COP2 usage in kernel for Loongson64 to make use
   of 16byte load/stores possible

 - cleanups and fixes

* tag 'mips_5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux: (92 commits)
  MIPS: DEC: Restore bootmem reservation for firmware working memory area
  MIPS: dec: fix section mismatch
  bcm963xx_tag.h: fix duplicated word
  mips: ralink: enable zboot support
  MIPS: ingenic: Remove CPU_SUPPORTS_HUGEPAGES
  MIPS: cpu-probe: remove MIPS_CPU_BP_GHIST option bit
  MIPS: cpu-probe: introduce exclusive R3k CPU probe
  MIPS: cpu-probe: move fpu probing/handling into its own file
  MIPS: replace add_memory_region with memblock
  MIPS: Loongson64: Clean up numa.c
  MIPS: Loongson64: Select SMP in Kconfig to avoid build error
  mips: octeon: Add Ubiquiti E200 and E220 boards
  MIPS: SGI-IP28: disable use of ll/sc in kernel
  MIPS: tx49xx: move tx4939_add_memory_regions into only user
  MIPS: pgtable: Remove used PAGE_USERIO define
  MIPS: alchemy: Share prom_init implementation
  MIPS: alchemy: Fix build breakage, if TOUCHSCREEN_WM97XX is disabled
  MIPS: process: include exec.h header in process.c
  MIPS: process: Add prototype for function arch_dup_task_struct
  MIPS: idle: Add prototype for function check_wait
  ...

5 years agoMerge tag 's390-5.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Linus Torvalds [Fri, 16 Oct 2020 19:36:38 +0000 (12:36 -0700)]
Merge tag 's390-5.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux

Pull s390 updates from Vasily Gorbik:

 - Remove address space overrides using set_fs()

 - Convert to generic vDSO

 - Convert to generic page table dumper

 - Add ARCH_HAS_DEBUG_WX support

 - Add leap seconds handling support

 - Add NVMe firmware-assisted kernel dump support

 - Extend NVMe boot support with memory clearing control and addition of
   kernel parameters

 - AP bus and zcrypt api code rework. Add adapter configure/deconfigure
   interface. Extend debug features. Add failure injection support

 - Add ECC secure private keys support

 - Add KASan support for running protected virtualization host with
   4-level paging

 - Utilize destroy page ultravisor call to speed up secure guests
   shutdown

 - Implement ioremap_wc() and ioremap_prot() with MIO in PCI code

 - Various checksum improvements

 - Other small various fixes and improvements all over the code

* tag 's390-5.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (85 commits)
  s390/uaccess: fix indentation
  s390/uaccess: add default cases for __put_user_fn()/__get_user_fn()
  s390/zcrypt: fix wrong format specifications
  s390/kprobes: move insn_page to text segment
  s390/sie: fix typo in SIGP code description
  s390/lib: fix kernel doc for memcmp()
  s390/zcrypt: Introduce Failure Injection feature
  s390/zcrypt: move ap_msg param one level up the call chain
  s390/ap/zcrypt: revisit ap and zcrypt error handling
  s390/ap: Support AP card SCLP config and deconfig operations
  s390/sclp: Add support for SCLP AP adapter config/deconfig
  s390/ap: add card/queue deconfig state
  s390/ap: add error response code field for ap queue devices
  s390/ap: split ap queue state machine state from device state
  s390/zcrypt: New config switch CONFIG_ZCRYPT_DEBUG
  s390/zcrypt: introduce msg tracking in zcrypt functions
  s390/startup: correct early pgm check info formatting
  s390: remove orphaned extern variables declarations
  s390/kasan: make sure int handler always run with DAT on
  s390/ipl: add support to control memory clearing for nvme re-IPL
  ...

5 years agoMerge tag 'powerpc-5.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc...
Linus Torvalds [Fri, 16 Oct 2020 19:21:15 +0000 (12:21 -0700)]
Merge tag 'powerpc-5.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc updates from Michael Ellerman:

 - A series from Nick adding ARCH_WANT_IRQS_OFF_ACTIVATE_MM & selecting
   it for powerpc, as well as a related fix for sparc.

 - Remove support for PowerPC 601.

 - Some fixes for watchpoints & addition of a new ptrace flag for
   detecting ISA v3.1 (Power10) watchpoint features.

 - A fix for kernels using 4K pages and the hash MMU on bare metal
   Power9 systems with > 16TB of RAM, or RAM on the 2nd node.

 - A basic idle driver for shallow stop states on Power10.

 - Tweaks to our sched domains code to better inform the scheduler about
   the hardware topology on Power9/10, where two SMT4 cores can be
   presented by firmware as an SMT8 core.

 - A series doing further reworks & cleanups of our EEH code.

 - Addition of a filter for RTAS (firmware) calls done via sys_rtas(),
   to prevent root from overwriting kernel memory.

 - Other smaller features, fixes & cleanups.

Thanks to: Alexey Kardashevskiy, Andrew Donnellan, Aneesh Kumar K.V,
Athira Rajeev, Biwen Li, Cameron Berkenpas, Cédric Le Goater, Christophe
Leroy, Christoph Hellwig, Colin Ian King, Daniel Axtens, David Dai, Finn
Thain, Frederic Barrat, Gautham R. Shenoy, Greg Kurz, Gustavo Romero,
Ira Weiny, Jason Yan, Joel Stanley, Jordan Niethe, Kajol Jain, Konrad
Rzeszutek Wilk, Laurent Dufour, Leonardo Bras, Liu Shixin, Luca
Ceresoli, Madhavan Srinivasan, Mahesh Salgaonkar, Nathan Lynch, Nicholas
Mc Guire, Nicholas Piggin, Nick Desaulniers, Oliver O'Halloran, Pedro
Miraglia Franco de Carvalho, Pratik Rajesh Sampat, Qian Cai, Qinglang
Miao, Ravi Bangoria, Russell Currey, Satheesh Rajendran, Scott Cheloha,
Segher Boessenkool, Srikar Dronamraju, Stan Johnson, Stephen Kitt,
Stephen Rothwell, Thiago Jung Bauermann, Tyrel Datwyler, Vaibhav Jain,
Vaidyanathan Srinivasan, Vasant Hegde, Wang Wensheng, Wolfram Sang, Yang
Yingliang, zhengbin.

* tag 'powerpc-5.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (228 commits)
  Revert "powerpc/pci: unmap legacy INTx interrupts when a PHB is removed"
  selftests/powerpc: Fix eeh-basic.sh exit codes
  cpufreq: powernv: Fix frame-size-overflow in powernv_cpufreq_reboot_notifier
  powerpc/time: Make get_tb() common to PPC32 and PPC64
  powerpc/time: Make get_tbl() common to PPC32 and PPC64
  powerpc/time: Remove get_tbu()
  powerpc/time: Avoid using get_tbl() and get_tbu() internally
  powerpc/time: Make mftb() common to PPC32 and PPC64
  powerpc/time: Rename mftbl() to mftb()
  powerpc/32s: Remove #ifdef CONFIG_PPC_BOOK3S_32 in head_book3s_32.S
  powerpc/32s: Rename head_32.S to head_book3s_32.S
  powerpc/32s: Setup the early hash table at all time.
  powerpc/time: Remove ifdef in get_dec() and set_dec()
  powerpc: Remove get_tb_or_rtc()
  powerpc: Remove __USE_RTC()
  powerpc: Tidy up a bit after removal of PowerPC 601.
  powerpc: Remove support for PowerPC 601
  powerpc: Remove PowerPC 601
  powerpc: Drop SYNC_601() ISYNC_601() and SYNC()
  powerpc: Remove CONFIG_PPC601_SYNC_FIX
  ...

5 years agoMerge branch 'akpm' (patches from Andrew)
Linus Torvalds [Fri, 16 Oct 2020 18:31:55 +0000 (11:31 -0700)]
Merge branch 'akpm' (patches from Andrew)

Merge more updates from Andrew Morton:
 "155 patches.

  Subsystems affected by this patch series: mm (dax, debug, thp,
  readahead, page-poison, util, memory-hotplug, zram, cleanups), misc,
  core-kernel, get_maintainer, MAINTAINERS, lib, bitops, checkpatch,
  binfmt, ramfs, autofs, nilfs, rapidio, panic, relay, kgdb, ubsan,
  romfs, and fault-injection"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (155 commits)
  lib, uaccess: add failure injection to usercopy functions
  lib, include/linux: add usercopy failure capability
  ROMFS: support inode blocks calculation
  ubsan: introduce CONFIG_UBSAN_LOCAL_BOUNDS for Clang
  sched.h: drop in_ubsan field when UBSAN is in trap mode
  scripts/gdb/tasks: add headers and improve spacing format
  scripts/gdb/proc: add struct mount & struct super_block addr in lx-mounts command
  kernel/relay.c: drop unneeded initialization
  panic: dump registers on panic_on_warn
  rapidio: fix the missed put_device() for rio_mport_add_riodev
  rapidio: fix error handling path
  nilfs2: fix some kernel-doc warnings for nilfs2
  autofs: harden ioctl table
  ramfs: fix nommu mmap with gaps in the page cache
  mm: remove the now-unnecessary mmget_still_valid() hack
  mm/gup: take mmap_lock in get_dump_page()
  binfmt_elf, binfmt_elf_fdpic: use a VMA list snapshot
  coredump: rework elf/elf_fdpic vma_dump_size() into common helper
  coredump: refactor page range dumping into common helper
  coredump: let dump_emit() bail out on short writes
  ...

5 years agolib, uaccess: add failure injection to usercopy functions
Albert van der Linde [Fri, 16 Oct 2020 03:13:50 +0000 (20:13 -0700)]
lib, uaccess: add failure injection to usercopy functions

To test fault-tolerance of user memory access functions, introduce fault
injection to usercopy functions.

If a failure is expected return either -EFAULT or the total amount of
bytes that were not copied.
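
As a rough userspace model of the behaviour described above (not the kernel
implementation), an injected failure makes the copy helper report that no
bytes were copied. The names inject_usercopy_fault, should_fail_usercopy_model()
and copy_from_user_model() are hypothetical and exist only for this sketch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical switch standing in for the fault-injection configuration. */
static bool inject_usercopy_fault;

/* Model of the decision hook: true means "fail this copy". */
static bool should_fail_usercopy_model(void)
{
    return inject_usercopy_fault;
}

/*
 * Model of a copy_from_user()-style helper: the return value is the
 * number of bytes that were NOT copied, so 0 means success.
 */
static size_t copy_from_user_model(void *to, const void *from, size_t n)
{
    if (should_fail_usercopy_model())
        return n;          /* injected failure: nothing was copied */
    memcpy(to, from, n);
    return 0;
}
```

Helpers that return an error code instead of a byte count would return
-EFAULT on the injected path rather than n.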

Signed-off-by: Albert van der Linde <alinde@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Akinobu Mita <akinobu.mita@gmail.com>
Reviewed-by: Alexander Potapenko <glider@google.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Marco Elver <elver@google.com>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Christoph Hellwig <hch@lst.de>
Link: http://lkml.kernel.org/r/20200831171733.955393-3-alinde@google.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agolib, include/linux: add usercopy failure capability
Albert van der Linde [Fri, 16 Oct 2020 03:13:46 +0000 (20:13 -0700)]
lib, include/linux: add usercopy failure capability

Patch series "add fault injection to user memory access", v3.

The goal of this series is to improve testing of fault-tolerance in usages
of user memory access functions, by adding support for fault injection.

syzkaller/syzbot are using the existing fault injection modes and will use
this particular feature also.

The first patch adds failure injection capability for usercopy functions.
The second changes usercopy functions to use this new failure capability
(copy_from_user, ...).  The third patch adds get/put/clear_user failures
to x86.

This patch (of 3):

Add a failure injection capability to improve testing of fault-tolerance
in usages of user memory access functions.

Add CONFIG_FAULT_INJECTION_USERCOPY to enable faults in usercopy
functions.  The should_fail_usercopy function is to be called by these
functions (copy_from_user, get_user, ...) in order to fail or not.

Signed-off-by: Albert van der Linde <alinde@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Akinobu Mita <akinobu.mita@gmail.com>
Reviewed-by: Alexander Potapenko <glider@google.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Christoph Hellwig <hch@lst.de>
Link: http://lkml.kernel.org/r/20200831171733.955393-1-alinde@google.com
Link: http://lkml.kernel.org/r/20200831171733.955393-2-alinde@google.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agoROMFS: support inode blocks calculation
Libing Zhou [Fri, 16 Oct 2020 03:13:42 +0000 (20:13 -0700)]
ROMFS: support inode blocks calculation

When the 'stat' tool is used to display file status, the 'Blocks' field is
always '0'.  This is a problem for tools such as 'du' (e.g. busybox 'du'),
which always report a size of '0' for files under ROMFS because they
calculate the size from the number of 512B blocks.

This patch calculates the approximate number of 512B blocks based on inode size.
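
A minimal sketch of such a calculation, assuming the count is simply the
inode size rounded up to whole 512-byte units. The helper name
approx_blocks_512() is hypothetical and is not taken from the patch.

```c
#include <stdint.h>

/*
 * Number of 512-byte blocks needed to hold `size` bytes, rounded up.
 * Written as shift plus remainder test so that it cannot overflow
 * for sizes close to UINT64_MAX.
 */
static uint64_t approx_blocks_512(uint64_t size)
{
    return (size >> 9) + ((size & 511) != 0);
}
```

With this rounding, a 1-byte file occupies one block and a 513-byte file
occupies two, which gives 'du'-style tools a non-zero size to report.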

Signed-off-by: Libing Zhou <libing.zhou@nokia-sbell.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Link: http://lkml.kernel.org/r/20200811052606.4243-1-libing.zhou@nokia-sbell.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agoubsan: introduce CONFIG_UBSAN_LOCAL_BOUNDS for Clang
George Popescu [Fri, 16 Oct 2020 03:13:38 +0000 (20:13 -0700)]
ubsan: introduce CONFIG_UBSAN_LOCAL_BOUNDS for Clang

When the kernel is compiled with Clang, -fsanitize=bounds expands to
-fsanitize=array-bounds and -fsanitize=local-bounds.

Enabling -fsanitize=local-bounds with Clang has the unfortunate
side-effect of inserting traps; this goes back to its original intent,
which was as a hardening and not a debugging feature [1].  The same
feature made its way into -fsanitize=bounds, but the traps remained.  For
that reason, -fsanitize=bounds was split into 'array-bounds' and
'local-bounds' [2].

Since 'local-bounds' doesn't behave like a normal sanitizer, enable it
with Clang only if trapping behaviour was requested by
CONFIG_UBSAN_TRAP=y.

Add the UBSAN_LOCAL_BOUNDS config to Kconfig.ubsan to enable the
'local-bounds' option by default when UBSAN_TRAP is enabled.

[1] http://lists.llvm.org/pipermail/llvm-dev/2012-May/049972.html
[2] http://lists.llvm.org/pipermail/cfe-commits/Week-of-Mon-20131021/091536.html

Suggested-by: Marco Elver <elver@google.com>
Signed-off-by: George Popescu <georgepope@android.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: David Brazdil <dbrazdil@google.com>
Reviewed-by: Marco Elver <elver@google.com>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Michal Marek <michal.lkml@markovi.net>
Cc: Nathan Chancellor <natechancellor@gmail.com>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: https://lkml.kernel.org/r/20200922074330.2549523-1-georgepope@google.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agosched.h: drop in_ubsan field when UBSAN is in trap mode
Elena Petrova [Fri, 16 Oct 2020 03:13:35 +0000 (20:13 -0700)]
sched.h: drop in_ubsan field when UBSAN is in trap mode

The in_ubsan field of task_struct is only used in lib/ubsan.c, which in
turn is only built under `ifneq ($(CONFIG_UBSAN_TRAP),y)`.

Removing the unnecessary field from task_struct will help preserve the ABI
between vanilla and CONFIG_UBSAN_TRAP'ed kernels.  In particular, this
will help enable the bounds sanitizer transparently for Android's GKI.

Signed-off-by: Elena Petrova <lenaptr@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: Jann Horn <jannh@google.com>
Link: https://lkml.kernel.org/r/20200910134802.3160311-1-lenaptr@google.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agoscripts/gdb/tasks: add headers and improve spacing format
Ritesh Harjani [Fri, 16 Oct 2020 03:13:32 +0000 (20:13 -0700)]
scripts/gdb/tasks: add headers and improve spacing format

Example output with the patch:
      TASK          PID    COMM
0xffffffff82c2b8c0   0   swapper/0
0xffff888a0ba20040   1   systemd
0xffff888a0ba24040   2   kthreadd
0xffff888a0ba28040   3   rcu_gp

Without the patch:
0xffffffff82c2b8c0 <init_task> 0 swapper/0
0xffff888a0ba20040 1 systemd
0xffff888a0ba24040 2 kthreadd
0xffff888a0ba28040 3 rcu_gp

Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Jan Kiszka <jan.kiszka@siemens.com>
Cc: Kieran Bingham <kbingham@kernel.org>
Link: http://lkml.kernel.org/r/54c868c79b5fc364a8be7799891934a6fe6d1464.1597742951.git.riteshh@linux.ibm.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agoscripts/gdb/proc: add struct mount & struct super_block addr in lx-mounts command
Ritesh Harjani [Fri, 16 Oct 2020 03:13:29 +0000 (20:13 -0700)]
scripts/gdb/proc: add struct mount & struct super_block addr in lx-mounts command

This is often useful when debugging FS-related issues.

Example output:
      mount          super_block     devname pathname fstype options
0xffff888a0bfa4b40 0xffff888a0bfc1000 none / rootfs rw 0 0
0xffff888a033f75c0 0xffff8889fcf65000 /dev/root / ext4 rw,relatime 0 0
0xffff8889fc8ce040 0xffff888a0bb51000 devtmpfs /dev devtmpfs rw,relatime 0 0

Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Jan Kiszka <jan.kiszka@siemens.com>
Cc: Kieran Bingham <kbingham@kernel.org>
Link: http://lkml.kernel.org/r/a3c4177e1597b3e06d66d55e07d72c0c46a03571.1597742951.git.riteshh@linux.ibm.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agokernel/relay.c: drop unneeded initialization
Sudip Mukherjee [Fri, 16 Oct 2020 03:13:25 +0000 (20:13 -0700)]
kernel/relay.c: drop unneeded initialization

The variable 'consumed' is initialized with the consumed count but
immediately after that the consumed count is updated and assigned to
'consumed' again thus overwriting the previous value.  So, drop the
unneeded initialization.

Signed-off-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Link: https://lkml.kernel.org/r/20201005205727.1147-1-sudipm.mukherjee@gmail.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agopanic: dump registers on panic_on_warn
Alexey Kardashevskiy [Fri, 16 Oct 2020 03:13:22 +0000 (20:13 -0700)]
panic: dump registers on panic_on_warn

Currently we print the stack and registers for ordinary warnings but not
for panic_on_warn, which looks like an oversight: panic() will reboot the
machine but won't print registers.

This moves printing of registers and modules earlier.

This does not move the stack dumping as panic() dumps it.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Cc: Douglas Anderson <dianders@chromium.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Rafael Aquini <aquini@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Cc: Nicholas Piggin <npiggin@gmail.com>
Link: https://lkml.kernel.org/r/20200804095054.68724-1-aik@ozlabs.ru
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agorapidio: fix the missed put_device() for rio_mport_add_riodev
Jing Xiangfeng [Fri, 16 Oct 2020 03:13:18 +0000 (20:13 -0700)]
rapidio: fix the missed put_device() for rio_mport_add_riodev

rio_mport_add_riodev() fails to call put_device() when the device already
exists.  Add the missing function call to fix it.

Fixes: e8de370188d0 ("rapidio: add mport char device driver")
Signed-off-by: Jing Xiangfeng <jingxiangfeng@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Alexandre Bounine <alex.bou9@gmail.com>
Cc: Gustavo A. R. Silva <gustavoars@kernel.org>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Madhuparna Bhowmik <madhuparnabhowmik10@gmail.com>
Link: https://lkml.kernel.org/r/20200922072525.42330-1-jingxiangfeng@huawei.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agorapidio: fix error handling path
Souptick Joarder [Fri, 16 Oct 2020 03:13:15 +0000 (20:13 -0700)]
rapidio: fix error handling path

rio_dma_transfer() attempts to clamp the return value of
pin_user_pages_fast() to be >= 0.  However, the attempt fails because
nr_pages is overridden a few lines later, and restored to the undesirable
-ERRNO value.

The return value is ultimately stored in nr_pages, which in turn is passed
to unpin_user_pages(), which expects nr_pages >= 0, else, disaster.

Fix this by fixing the nesting of the assignment to nr_pages: nr_pages
should be clamped to zero if pin_user_pages_fast() returns -ERRNO, or set
to the return value of pin_user_pages_fast(), otherwise.
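
The intended nesting can be sketched in plain C as follows. This is an
illustration only: struct pin_outcome and handle_pin_result() are
hypothetical names, and returning -EFAULT for a partial pin is an assumption
of the sketch rather than something stated in this changelog.

```c
#include <errno.h>

/* Hypothetical result pair: the count that is safe to unpin, plus an error. */
struct pin_outcome {
    long nr_pages;  /* always >= 0, safe to pass to an unpin routine */
    int err;        /* 0 on success, negative errno otherwise */
};

/*
 * `pinned` models the return value of a pin call: a negative errno on
 * failure, otherwise the number of pages actually pinned.
 */
static struct pin_outcome handle_pin_result(long pinned, long requested)
{
    struct pin_outcome out = { .nr_pages = 0, .err = 0 };

    if (pinned < 0) {
        out.err = (int)pinned;      /* nr_pages stays clamped to zero */
    } else {
        out.nr_pages = pinned;      /* unpin exactly what was pinned */
        if (pinned != requested)
            out.err = -EFAULT;      /* assumption: partial pin is an error */
    }
    return out;
}
```

The point of the structure is that nr_pages is assigned exactly once per
branch, so a later assignment cannot restore the negative value.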

[jhubbard@nvidia.com: new changelog]

Fixes: e8de370188d09 ("rapidio: add mport char device driver")
Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Alexandre Bounine <alex.bou9@gmail.com>
Cc: Gustavo A. R. Silva <gustavoars@kernel.org>
Cc: Madhuparna Bhowmik <madhuparnabhowmik10@gmail.com>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Link: https://lkml.kernel.org/r/1600227737-20785-1-git-send-email-jrdr.linux@gmail.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agonilfs2: fix some kernel-doc warnings for nilfs2
Wang Hai [Fri, 16 Oct 2020 03:13:11 +0000 (20:13 -0700)]
nilfs2: fix some kernel-doc warnings for nilfs2

Fixes the following W=1 kernel build warning(s):

fs/nilfs2/bmap.c:378: warning: Excess function parameter 'bhp' description in 'nilfs_bmap_assign'
fs/nilfs2/cpfile.c:907: warning: Excess function parameter 'status' description in 'nilfs_cpfile_change_cpmode'
fs/nilfs2/cpfile.c:946: warning: Excess function parameter 'stat' description in 'nilfs_cpfile_get_stat'
fs/nilfs2/page.c:76: warning: Excess function parameter 'inode' description in 'nilfs_forget_buffer'
fs/nilfs2/sufile.c:563: warning: Excess function parameter 'stat' description in 'nilfs_sufile_get_stat'

Signed-off-by: Wang Hai <wanghai38@huawei.com>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Link: https://lkml.kernel.org/r/1601386269-2423-1-git-send-email-konishi.ryusuke@gmail.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agoautofs: harden ioctl table
Matthew Wilcox [Fri, 16 Oct 2020 03:13:08 +0000 (20:13 -0700)]
autofs: harden ioctl table

The table of ioctl functions should be marked const in order to put them
in read-only memory, and we should use array_index_nospec() to avoid
speculation disclosing the contents of kernel memory to userspace.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Ian Kent <raven@themaw.net>
Link: https://lkml.kernel.org/r/20200818122203.GO17456@casper.infradead.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agoramfs: fix nommu mmap with gaps in the page cache
Matthew Wilcox (Oracle) [Fri, 16 Oct 2020 03:13:04 +0000 (20:13 -0700)]
ramfs: fix nommu mmap with gaps in the page cache

ramfs needs to check that pages are both physically contiguous and
contiguous in the file.  If the page cache happens to have, eg, page A for
index 0 of the file, no page for index 1, and page A+1 for index 2, then
an mmap of the first two pages of the file will succeed when it should
fail.
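
The two conditions can be modelled in userspace roughly as below. The
struct cached_page type and pages_contiguous() are hypothetical names for
this sketch; the page frame number stands in for the physical page.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a page cache entry found by a lookup. */
struct cached_page {
    unsigned long index;  /* offset of the page within the file */
    unsigned long pfn;    /* physical page frame number */
};

/*
 * Returns true only if the n pages are consecutive both in the file
 * (starting at first_index) and in physical memory.
 */
static bool pages_contiguous(const struct cached_page *pages, size_t n,
                             unsigned long first_index)
{
    for (size_t i = 0; i < n; i++) {
        if (pages[i].index != first_index + i)
            return false;   /* gap in the file: a page is missing */
        if (pages[i].pfn != pages[0].pfn + i)
            return false;   /* not physically contiguous */
    }
    return true;
}
```

In the scenario above, a lookup returns page A at index 0 and page A+1 at
index 2; the physical check alone passes, and only the index check catches
the gap.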

Fixes: 642fb4d1f1dd ("[PATCH] NOMMU: Provide shared-writable mmap support on ramfs")
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: David Howells <dhowells@redhat.com>
Link: https://lkml.kernel.org/r/20200914122239.GO6583@casper.infradead.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agomm: remove the now-unnecessary mmget_still_valid() hack
Jann Horn [Fri, 16 Oct 2020 03:13:00 +0000 (20:13 -0700)]
mm: remove the now-unnecessary mmget_still_valid() hack

The preceding patches have ensured that core dumping properly takes the
mmap_lock.  Thanks to that, we can now remove mmget_still_valid() and all
its users.

Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: "Eric W . Biederman" <ebiederm@xmission.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Link: http://lkml.kernel.org/r/20200827114932.3572699-8-jannh@google.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agomm/gup: take mmap_lock in get_dump_page()
Jann Horn [Fri, 16 Oct 2020 03:12:57 +0000 (20:12 -0700)]
mm/gup: take mmap_lock in get_dump_page()

Properly take the mmap_lock before calling into the GUP code from
get_dump_page(); and play nice, allowing the GUP code to drop the
mmap_lock if it has to sleep.

As Linus pointed out, we don't actually need the VMA because
__get_user_pages() will flush the dcache for us if necessary.

Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: "Eric W . Biederman" <ebiederm@xmission.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Link: http://lkml.kernel.org/r/20200827114932.3572699-7-jannh@google.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agobinfmt_elf, binfmt_elf_fdpic: use a VMA list snapshot
Jann Horn [Fri, 16 Oct 2020 03:12:54 +0000 (20:12 -0700)]
binfmt_elf, binfmt_elf_fdpic: use a VMA list snapshot

In both binfmt_elf and binfmt_elf_fdpic, use a new helper
dump_vma_snapshot() to take a snapshot of the VMA list (including the gate
VMA, if we have one) while protected by the mmap_lock, and then use that
snapshot instead of walking the VMA list without locking.

An alternative approach would be to keep the mmap_lock held across the
entire core dumping operation; however, keeping the mmap_lock locked while
we may be blocked for an unbounded amount of time (e.g.  because we're
dumping to a FUSE filesystem or so) isn't really optimal; the mmap_lock
blocks things like the ->release handler of userfaultfd, and we don't
really want critical system daemons to grind to a halt just because
someone "gifted" them SCM_RIGHTS to an eternally-locked userfaultfd, or
something like that.

Since both the normal ELF code and the FDPIC ELF code need this
functionality (and if any other binfmt wants to add coredump support in
the future, they'd probably need it, too), implement this with a common
helper in fs/coredump.c.

A downside of this approach is that we now need a bigger amount of kernel
memory per userspace VMA in the normal ELF case, and that we need O(n)
kernel memory in the FDPIC ELF case at all; but 40 bytes per VMA shouldn't
be terribly bad.

There currently is a data race between stack expansion and anything that
reads ->vm_start or ->vm_end under the mmap_lock held in read mode; to
mitigate that for core dumping, take the mmap_lock in write mode when
taking a snapshot of the VMA hierarchy.  (If we only took the mmap_lock in
read mode, we could end up with a corrupted core dump if someone does
get_user_pages_remote() concurrently.  Not really a major problem, but
taking the mmap_lock either way works here, so we might as well avoid the
issue.) (This doesn't do anything about the existing data races with stack
expansion in other mm code.)

Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: "Eric W . Biederman" <ebiederm@xmission.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Link: http://lkml.kernel.org/r/20200827114932.3572699-6-jannh@google.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agocoredump: rework elf/elf_fdpic vma_dump_size() into common helper
Jann Horn [Fri, 16 Oct 2020 03:12:50 +0000 (20:12 -0700)]
coredump: rework elf/elf_fdpic vma_dump_size() into common helper

At the moment, the binfmt_elf and binfmt_elf_fdpic code have slightly
different code to figure out which VMAs should be dumped, and if so,
whether the dump should contain the entire VMA or just its first page.

Eliminate duplicate code by reworking the binfmt_elf version into a
generic core dumping helper in coredump.c.

As part of that, change the heuristic for detecting executable/library
header pages to check whether the inode is executable instead of looking
at the file mode.

This is less problematic in terms of locking because it lets us avoid
get_user() under the mmap_sem.  (And arguably it looks nicer and makes
more sense in generic code.)

Adjust a little bit based on the binfmt_elf_fdpic version: ->anon_vma is
only meaningful under CONFIG_MMU, otherwise we have to assume that the VMA
has been written to.

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: "Eric W . Biederman" <ebiederm@xmission.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Link: http://lkml.kernel.org/r/20200827114932.3572699-5-jannh@google.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agocoredump: refactor page range dumping into common helper
Jann Horn [Fri, 16 Oct 2020 03:12:46 +0000 (20:12 -0700)]
coredump: refactor page range dumping into common helper

Both fs/binfmt_elf.c and fs/binfmt_elf_fdpic.c need to dump ranges of
pages into the coredump file.  Extract that logic into a common helper.

Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: "Eric W . Biederman" <ebiederm@xmission.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Link: http://lkml.kernel.org/r/20200827114932.3572699-4-jannh@google.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agocoredump: let dump_emit() bail out on short writes
Jann Horn [Fri, 16 Oct 2020 03:12:43 +0000 (20:12 -0700)]
coredump: let dump_emit() bail out on short writes

dump_emit() has a retry loop, but there seems to be no way for that retry
logic to actually be used; and it was also buggy, writing the same data
repeatedly after a short write.

Let's just bail out on a short write.
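
A minimal userspace sketch of the "bail out on a short write" policy,
assuming a writer callback that returns the number of bytes written or a
negative value on error. The names write_fn, emit_all_or_fail(),
full_writer() and half_writer() are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical writer: returns bytes written, or a negative value on error. */
typedef long (*write_fn)(const void *buf, size_t len);

/*
 * Emit the buffer with a single write and report failure on anything
 * other than a complete write, instead of retrying.
 */
static bool emit_all_or_fail(write_fn write_out, const void *buf, size_t len)
{
    long n = write_out(buf, len);

    return n >= 0 && (size_t)n == len;
}

/* Example writers used to exercise both outcomes. */
static long full_writer(const void *buf, size_t len)
{
    (void)buf;
    return (long)len;        /* pretends everything was written */
}

static long half_writer(const void *buf, size_t len)
{
    (void)buf;
    return (long)(len / 2);  /* simulates a short write */
}
```

Because there is no retry loop, there is also no offset bookkeeping to get
wrong, which is the class of bug the changelog describes.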

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: "Eric W . Biederman" <ebiederm@xmission.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Link: http://lkml.kernel.org/r/20200827114932.3572699-3-jannh@google.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agobinfmt_elf_fdpic: stop using dump_emit() on user pointers on !MMU
Jann Horn [Fri, 16 Oct 2020 03:12:40 +0000 (20:12 -0700)]
binfmt_elf_fdpic: stop using dump_emit() on user pointers on !MMU

Patch series "Fix ELF / FDPIC ELF core dumping, and use mmap_lock properly in there", v5.

At the moment, we have that rather ugly mmget_still_valid() helper to work
around <https://crbug.com/project-zero/1790>: ELF core dumping doesn't
take the mmap_sem while traversing the task's VMAs, and if anything (like
userfaultfd) then remotely messes with the VMA tree, fireworks ensue.  So
at the moment we use mmget_still_valid() to bail out in any writers that
might be operating on a remote mm's VMAs.

With this series, I'm trying to get rid of the need for that as cleanly as
possible.  ("cleanly" meaning "avoid holding the mmap_lock across
unbounded sleeps".)

Patches 1, 2, 3 and 4 are relatively unrelated cleanups in the core
dumping code.

Patches 5 and 6 implement the main change: Instead of repeatedly accessing
the VMA list with sleeps in between, we snapshot it at the start with
proper locking, and then later we just use our copy of the VMA list.  This
ensures that the kernel won't crash, that VMA metadata in the coredump is
consistent even in the presence of concurrent modifications, and that any
virtual addresses that aren't being concurrently modified have their
contents show up in the core dump properly.

The disadvantage of this approach is that we need a bit more memory during
core dumping for storing metadata about all VMAs.

At the end of the series, patch 7 removes the old workaround for this
issue (mmget_still_valid()).

I have tested:

 - Creating a simple core dump on X86-64 still works.
 - The created coredump on X86-64 opens in GDB and looks plausible.
 - X86-64 core dumps contain the first page for executable mappings at
   offset 0, and don't contain the first page for non-executable file
   mappings or executable mappings at offset !=0.
 - NOMMU 32-bit ARM can still generate plausible-looking core dumps
   through the FDPIC implementation. (I can't test this with GDB because
   GDB is missing some structure definition for nommu ARM, but I've
   poked around in the hexdump and it looked decent.)

This patch (of 7):

dump_emit() is for kernel pointers, and VMAs describe userspace memory.
Let's be tidy here and avoid accessing userspace pointers under KERNEL_DS,
even if it probably doesn't matter much on !MMU systems - especially given
that it looks like we can just use the same get_dump_page() as on MMU if
we move it out of the CONFIG_MMU block.

One small change we have to make in get_dump_page() is to use
__get_user_pages_locked() instead of __get_user_pages(), since the latter
doesn't exist on nommu.  On mmu builds, __get_user_pages_locked() will
just call __get_user_pages() for us.

Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: "Eric W . Biederman" <ebiederm@xmission.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Link: http://lkml.kernel.org/r/20200827114932.3572699-1-jannh@google.com
Link: http://lkml.kernel.org/r/20200827114932.3572699-2-jannh@google.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agotools/testing/selftests: add self-test for verifying load alignment
Chris Kennelly [Fri, 16 Oct 2020 03:12:36 +0000 (20:12 -0700)]
tools/testing/selftests: add self-test for verifying load alignment

This produces a PIE binary with a variety of p_align requirements,
suitable for verifying that the load address meets that alignment
requirement.

Signed-off-by: Chris Kennelly <ckennelly@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Fangrui Song <maskray@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Sandeep Patil <sspatil@google.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Link: https://lkml.kernel.org/r/20200820170541.1132271-3-ckennelly@google.com
Link: https://lkml.kernel.org/r/20200821233848.3904680-3-ckennelly@google.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agofs/binfmt_elf: use PT_LOAD p_align values for suitable start address
Chris Kennelly [Fri, 16 Oct 2020 03:12:32 +0000 (20:12 -0700)]
fs/binfmt_elf: use PT_LOAD p_align values for suitable start address

Patch series "Selecting Load Addresses According to p_align", v3.

The current ELF loading mechanism provides page-aligned mappings.  This
can lead to the program being loaded in a way unsuitable for file-backed,
transparent huge pages when handling PIE executables.

While specifying -z,max-page-size=0x200000 to the linker will generate
suitably aligned segments for huge pages on x86_64, the executable needs
to be loaded at a suitably aligned address as well.  This alignment
requires the binary's cooperation, as distinct segments need to be
appropriately padded to be eligible for THP.

For binaries built with increased alignment, this limits the number of
bits usable for ASLR, but provides some randomization over using fixed
load addresses/non-PIE binaries.

This patch (of 2):

The current ELF loading mechanism provides page-aligned mappings.  This
can lead to the program being loaded in a way unsuitable for file-backed,
transparent huge pages when handling PIE executables.

For binaries built with increased alignment, this limits the number of
bits usable for ASLR, but provides some randomization over using fixed
load addresses/non-PIE binaries.
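
The idea of picking a load address from the PT_LOAD p_align values can be sketched in userspace as follows.  This is an illustration under stated assumptions, not the code in fs/binfmt_elf.c: the struct, the 4096-byte base page size, the choice to ignore non-power-of-two alignments, and all sketch_* names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PT_LOAD   1u	/* PT_LOAD has the value 1 in the ELF spec */
#define SKETCH_PAGE_SIZE 4096u	/* assumed base page size */

/* Hypothetical cut-down program header: only the fields used here. */
struct sketch_phdr {
	uint32_t p_type;
	uint64_t p_align;
};

/* Largest power-of-two p_align among PT_LOAD segments, at least one page. */
uint64_t sketch_max_alignment(const struct sketch_phdr *ph, size_t n)
{
	uint64_t align = SKETCH_PAGE_SIZE;

	for (size_t i = 0; i < n; i++) {
		uint64_t a = ph[i].p_align;

		if (ph[i].p_type != SKETCH_PT_LOAD)
			continue;
		if (a == 0 || (a & (a - 1)) != 0)
			continue;	/* assumption: skip invalid alignments */
		if (a > align)
			align = a;
	}
	return align;
}

/* Round a candidate load address down to the given power-of-two alignment. */
uint64_t sketch_align_down(uint64_t addr, uint64_t align)
{
	return addr & ~(align - 1);
}
```

With a 2 MiB alignment the low 21 bits of the chosen address are forced to zero, which is where the reduction in ASLR entropy mentioned above comes from.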

Tested by verifying program with -Wl,-z,max-page-size=0x200000 loading.

[akpm@linux-foundation.org: fix max() warning]
[ckennelly@google.com: augment comment]
Link: https://lkml.kernel.org/r/20200821233848.3904680-2-ckennelly@google.com
Signed-off-by: Chris Kennelly <ckennelly@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Sandeep Patil <sspatil@google.com>
Cc: Fangrui Song <maskray@google.com>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Shuah Khan <shuah@kernel.org>
Link: https://lkml.kernel.org/r/20200820170541.1132271-1-ckennelly@google.com
Link: https://lkml.kernel.org/r/20200820170541.1132271-2-ckennelly@google.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agocheckpatch: add new warnings to author signoff checks.
Dwaipayan Ray [Fri, 16 Oct 2020 03:12:28 +0000 (20:12 -0700)]
checkpatch: add new warnings to author signoff checks.

The author signed-off-by checks are currently very vague.  Cases like same
name or same address are not handled separately.

For example, running checkpatch on commit be6577af0cef ("parisc: Add
atomic64_set_release() define to avoid CPU soft lockups"), gives:

WARNING: Missing Signed-off-by: line by nominal patch author
'John David Anglin <dave.anglin@bell.net>'

The signoff line was:
"Signed-off-by: Dave Anglin <dave.anglin@bell.net>"

Clearly the author has signed off but with a slightly different version
of his name. A more appropriate warning would have been to point out
the name mismatch instead.

Previously, the values assumed by $authorsignoff were either 0 or 1
to indicate whether a proper sign off by author is present.
Extended the checks to handle four new cases.

$authorsignoff values now denote the following:

0: Missing sign off by patch author.

1: Sign off present and identical.

2: Addresses and names match, but comments differ.
   "James Watson(JW) <james@gmail.com>", "James Watson <james@gmail.com>"

3: Addresses match, but names are different.
   "James Watson <james@gmail.com>", "James <james@gmail.com>"

4: Names match, but addresses are different.
   "James Watson <james@watson.com>", "James Watson <james@gmail.com>"

5: Names match, addresses excluding subaddress details (RFC 5233) match.
   "James Watson <james@gmail.com>", "James Watson <james+a@gmail.com>"

Also introduced a new message type FROM_SIGN_OFF_MISMATCH
for cases 2, 3, 4 and 5.
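
The five non-zero cases can be sketched in C as below.  checkpatch itself is written in Perl, so this is only an illustration of the classification: the struct, the function names, the exact (case-sensitive) string comparisons and the order of the checks are assumptions, and the identities are taken as already split into name, comment and address.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical pre-parsed identity, e.g. "James Watson(JW) <james@gmail.com>". */
struct ident {
	const char *name;
	const char *comment;
	const char *address;
};

/* Copy addr to out without a "+detail" subaddress (RFC 5233); outlen >= 1. */
void strip_subaddress(const char *addr, char *out, size_t outlen)
{
	const char *at = strchr(addr, '@');
	const char *plus = strchr(addr, '+');
	size_t j = 0;

	for (const char *p = addr; *p != '\0' && j + 1 < outlen; p++) {
		if (at && plus && plus < at && p >= plus && p < at)
			continue;	/* skip "+detail" in the local part */
		out[j++] = *p;
	}
	out[j] = '\0';
}

/* Returns 1..5 as listed above, or 0 if neither name nor address match. */
int classify_signoff(const struct ident *author, const struct ident *sob)
{
	int same_name = strcmp(author->name, sob->name) == 0;
	int same_addr = strcmp(author->address, sob->address) == 0;
	char a[256], b[256];	/* longer addresses are truncated in this sketch */

	if (same_name && same_addr)
		return strcmp(author->comment, sob->comment) == 0 ? 1 : 2;
	if (same_addr)
		return 3;
	if (!same_name)
		return 0;
	strip_subaddress(author->address, a, sizeof(a));
	strip_subaddress(sob->address, b, sizeof(b));
	return strcmp(a, b) == 0 ? 5 : 4;
}
```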

Suggested-by: Joe Perches <joe@perches.com>
Signed-off-by: Dwaipayan Ray <dwaipayanray1@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Joe Perches <joe@perches.com>
Link: https://lore.kernel.org/linux-kernel-mentees/c1ca28e77e8e3bfa7aadf3efa8ed70f97a9d369c.camel@perches.com/
Link: https://lkml.kernel.org/r/20201007192029.551744-1-dwaipayanray1@gmail.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agocheckpatch: fix false positive on empty block comment lines
Łukasz Stelmach [Fri, 16 Oct 2020 03:12:25 +0000 (20:12 -0700)]
checkpatch: fix false positive on empty block comment lines

To avoid false positives in presence of SPDX-License-Identifier in
networking files it is required to increase the leeway for empty block
comment lines by one line.

For example, checking drivers/net/loopback.c which starts with

    // SPDX-License-Identifier: GPL-2.0-or-later
    /*
     * INET          An implementation of the TCP/IP protocol suite for the LINUX

results in an unnecessary warning

    WARNING: networking block comments don't use an empty /* line, use /* Comment...
    +/*
    + * INET                An implementation of the TCP/IP protocol suite for the LINUX

Signed-off-by: Łukasz Stelmach <l.stelmach@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Joe Perches <joe@perches.com>
Cc: Bartłomiej Żolnierkiewicz <b.zolnierkie@samsung.co>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Link: https://lkml.kernel.org/r/20201006083509.19934-1-l.stelmach@samsung.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agocheckpatch: fix multi-statement macro checks for while blocks.
Dwaipayan Ray [Fri, 16 Oct 2020 03:12:22 +0000 (20:12 -0700)]
checkpatch: fix multi-statement macro checks for while blocks.

Checkpatch.pl doesn't have a check for excluding while (...) {...} blocks
from MULTISTATEMENT_MACRO_USE_DO_WHILE error.

For example, running checkpatch.pl on the file mm/maccess.c in the kernel
generates the following error:

ERROR: Macros with complex values should be enclosed in parentheses
+#define copy_from_kernel_nofault_loop(dst, src, len, type, err_label)  \
+       while (len >= sizeof(type)) {                                   \
+               __get_kernel_nofault(dst, src, type, err_label);        \
+               dst += sizeof(type);                                    \
+               src += sizeof(type);                                    \
+               len -= sizeof(type);                                    \
+       }

The error is misleading for this case.  Enclosing it in parentheses
doesn't make any sense.

Checkpatch already has an exception list for such common macro types.
Added a new exception for while (...) {...} style blocks to the same.

In addition, the brace flatten logic was modified by changing the
substitution characters from "1" to "1u".  This was done to ensure that
macros in the form "#define foo(bar) while(bar){bar--;}" were also
correctly processed.

Link: https://lore.kernel.org/linux-kernel-mentees/dc985938aa3986702815a0bd68dfca8a03c85447.camel@perches.com/
Suggested-by: Joe Perches <joe@perches.com>
Signed-off-by: Dwaipayan Ray <dwaipayanray1@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Link: https://lkml.kernel.org/r/20201001171903.312021-1-dwaipayanray1@gmail.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agocheckpatch: emit a warning on embedded filenames
Joe Perches [Fri, 16 Oct 2020 03:12:19 +0000 (20:12 -0700)]
checkpatch: emit a warning on embedded filenames

Embedding the complete filename path inside the file isn't particularly
useful as often the path is moved around and becomes incorrect.

Emit a warning when the source contains the filename.

[akpm@linux-foundation.org: remove stray " di"]

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Link: https://lkml.kernel.org/r/1fd5f9188a14acdca703ca00301ee323de672a8d.camel@perches.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agocheckpatch: extend author Signed-off-by check for split From: header
Dwaipayan Ray [Fri, 16 Oct 2020 03:12:15 +0000 (20:12 -0700)]
checkpatch: extend author Signed-off-by check for split From: header

Checkpatch did not handle cases where the author From: header was split
into multiple lines.  The author identity could not be resolved and
checkpatch generated a false NO_AUTHOR_SIGN_OFF warning.

A typical example is commit e33bcbab16d1 ("tee: add support for session's
client UUID generation").  When checkpatch was run on this commit, it
displayed:

"WARNING:NO_AUTHOR_SIGN_OFF: Missing Signed-off-by: line by nominal
patch author ''"

This was due to split header lines not being handled properly.  The
author of that check noted the limitation himself in commit cd2614967d8b
("checkpatch: warn if missing author Signed-off-by"):

"Split From: headers are not fully handled: only the first part
is compared."

Support split From: headers by correctly parsing the header extension
lines.  RFC 5322, Section 2.2.3 states that each extended line must start
with a WSP character (a space or htab).  The solution is therefore to
concatenate the lines which start with a WSP to get the correct long
header.
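
A minimal C sketch of that unfolding rule, for illustration only (checkpatch is Perl and its actual handling may differ, for instance in whether the leading whitespace of a continuation line is kept; the function name is hypothetical):

```c
#include <stddef.h>

/*
 * Copy the first header field from `in` into `out`, joining continuation
 * lines (lines that start with a space or a tab).  The line break is
 * dropped and the whitespace that follows it is kept.  Returns the number
 * of bytes written, excluding the terminating NUL.
 */
size_t unfold_first_header(const char *in, char *out, size_t outlen)
{
	size_t j = 0;

	if (outlen == 0)
		return 0;
	for (size_t i = 0; in[i] != '\0' && j + 1 < outlen; i++) {
		if (in[i] == '\r' && in[i + 1] == '\n')
			continue;	/* CRLF: decide on the '\n' that follows */
		if (in[i] == '\n') {
			if (in[i + 1] == ' ' || in[i + 1] == '\t')
				continue;	/* folded line: drop only the break */
			break;		/* the next line is a new header */
		}
		out[j++] = in[i];
	}
	out[j] = '\0';
	return j;
}
```

Applied to a From: header split over two lines, this yields the single long header that the sign-off comparison needs.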

Suggested-by: Joe Perches <joe@perches.com>
Signed-off-by: Dwaipayan Ray <dwaipayanray1@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Tested-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Reviewed-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Acked-by: Joe Perches <joe@perches.com>
Link: https://lore.kernel.org/linux-kernel-mentees/f5d8124e54a50480b0a9fa638787bc29b6e09854.camel@perches.com/
Link: https://lkml.kernel.org/r/20200921085436.63003-1-dwaipayanray1@gmail.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agocheckpatch: allow not using -f with files that are in git
Joe Perches [Fri, 16 Oct 2020 03:12:12 +0000 (20:12 -0700)]
checkpatch: allow not using -f with files that are in git

If a file exists in git and checkpatch is used without the -f flag for
scanning a file, then checkpatch will scan the file assuming it's a patch
and emit:

ERROR: Does not appear to be a unified-diff format patch

Change the behavior to assume the -f flag if the file exists in git.

[joe@perches.com: fix git "fatal" warning if file argument outside kernel tree]
Link: https://lkml.kernel.org/r/b6afa04112d450c2fc120a308d706acd60cee294.camel@perches.com
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Julia Lawall <julia.lawall@inria.fr>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Link: https://lkml.kernel.org/r/45b81a48e1568bd0126a96f5046eb7aaae9b83c9.camel@perches.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agocheckpatch: warn on self-assignments
Joe Perches [Fri, 16 Oct 2020 03:12:09 +0000 (20:12 -0700)]
checkpatch: warn on self-assignments

The uninitialized_var() macro was removed recently via commit 63a0895d960a
("compiler: Remove uninitialized_var() macro") as it's not a particularly
useful warning and its use can "paper over real bugs".

Add a checkpatch test to warn on self-assignments, which are used as a
means to avoid compiler warnings and as a back-door mechanism to reproduce
the old uninitialized_var() macro behavior.

[akpm@linux-foundation.org: coding style fixes]

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Gustavo A. R. Silva <gustavoars@kernel.org>
Cc: Denis Efremov <efremov@linux.com>
Cc: Julia Lawall <julia.lawall@inria.fr>
Link: https://lkml.kernel.org/r/afc2cffdd315d3e4394af149278df9e8af7f49f4.camel@perches.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agoconst_structs.checkpatch: add pinctrl_ops and pinmux_ops
Rikard Falkeborn [Fri, 16 Oct 2020 03:12:05 +0000 (20:12 -0700)]
const_structs.checkpatch: add pinctrl_ops and pinmux_ops

All usages of these structs in include/linux are const pointers, and all
instances in the kernel that are not const, except one, can be made const
(patches have been posted for those separately).

Signed-off-by: Rikard Falkeborn <rikard.falkeborn@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Cc: Joe Perches <joe@perches.com>
Cc: Andreas Färber <afaerber@suse.de>
Cc: Rikard Falkeborn <rikard.falkeborn@gmail.com>
Link: https://lkml.kernel.org/r/20200830224352.37114-1-rikard.falkeborn@gmail.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agocheckpatch: warn if trace_printk and friends are called
Nicolas Boichat [Fri, 16 Oct 2020 03:12:02 +0000 (20:12 -0700)]
checkpatch: warn if trace_printk and friends are called

trace_printk is meant as a debugging tool, and should not be compiled into
production code without specific debug Kconfig options enabled, or source
code changes, as indicated by the warning that shows up on boot if any
trace_printk is called:

 **   NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE   **
 **                                                      **
 ** trace_printk() being used. Allocating extra memory.  **
 **                                                      **
 ** This means that this is a DEBUG kernel and it is     **
 ** unsafe for production use.                           **

Let's warn developers when they try to submit such a change.

Signed-off-by: Nicolas Boichat <drinkcat@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Joe Perches <joe@perches.com>
Link: https://lkml.kernel.org/r/20200825193600.v2.1.I723c43c155f02f726c97501be77984f1e6bb740a@changeid
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agoconst_structs.checkpatch: add phy_ops
Rikard Falkeborn [Fri, 16 Oct 2020 03:11:59 +0000 (20:11 -0700)]
const_structs.checkpatch: add phy_ops

All usages of phy_ops in include/linux use const phy_ops * and all
instances of phy_ops in the kernel that are not const already can be made
const (patches have been posted for those separately).

Suggested-by: Joe Perches <joe@perches.com>
Signed-off-by: Rikard Falkeborn <rikard.falkeborn@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Kishon Vijay Abraham I <kishon@ti.com>
Cc: Vinod Koul <vkoul@kernel.org>
Link: https://lkml.kernel.org/r/20200824214132.9072-1-rikard.falkeborn@gmail.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agocheckpatch: add test for comma use that should be semicolon
Joe Perches [Fri, 16 Oct 2020 03:11:56 +0000 (20:11 -0700)]
checkpatch: add test for comma use that should be semicolon

There are commas used as statement terminations that should typically have
used semicolons instead.  Only direct assignments or use of a single
function or value on a single line are detected by this test.

e.g.:
foo = bar(), /* typical use is semicolon not comma */
bar = baz();

Add an imperfect test to detect these comma uses.

No false positives were found in testing, but many types of false
negatives are possible.

e.g.:
foo = bar() + 1, /* comma use, but not direct assignment */
bar = baz();
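
One situation where the difference is more than cosmetic, shown here as an illustration that is not taken from the patch itself (function names are hypothetical): under an unbraced if, a comma keeps both assignments inside the guarded statement, while a semicolon ends it after the first.

```c
/* Comma: both assignments form one expression statement guarded by the if. */
int with_comma(int cond)
{
	int foo = 0, bar = 0;

	if (cond)
		foo = 1,
		bar = 2;
	return foo * 10 + bar;
}

/* Semicolon: only the first assignment is guarded; bar is always set. */
int with_semicolon(int cond)
{
	int foo = 0, bar = 0;

	if (cond)
		foo = 1;
	bar = 2;
	return foo * 10 + bar;
}
```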

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Link: https://lkml.kernel.org/r/3bf27caf462007dfa75647b040ab3191374a59de.camel@perches.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agocheckpatch: move repeated word test
Joe Perches [Fri, 16 Oct 2020 03:11:52 +0000 (20:11 -0700)]
checkpatch: move repeated word test

Currently this test only works on .[ch] files.

Move the test to check more file types and the commit log.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/180b3b5677771c902b2e2f7a2b7090ede65fe004.camel@perches.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agocheckpatch: add --kconfig-prefix
Jerome Forissier [Fri, 16 Oct 2020 03:11:49 +0000 (20:11 -0700)]
checkpatch: add --kconfig-prefix

Kconfig allows customizing the CONFIG_ prefix via the $CONFIG_
environment variable.  Out-of-tree projects may therefore use Kconfig with
a different prefix, or they may use a custom configuration tool which does
not use the CONFIG_ prefix at all.  Such projects may still want to adhere
to the Linux kernel coding style and run checkpatch.pl.

One example is OP-TEE [1] which does not use Kconfig but does have
configuration options prefixed with CFG_.  It also mostly follows the
kernel coding style and therefore being able to use checkpatch is quite
valuable.

To make this possible, add the --kconfig-prefix command line option.

[1] https://github.com/OP-TEE/optee_os

Signed-off-by: Jerome Forissier <jerome@forissier.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Joe Perches <joe@perches.com>
Link: http://lkml.kernel.org/r/20200818081732.800449-1-jerome@forissier.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agobitops: use the same mechanism for get_count_order[_long]
Wei Yang [Fri, 16 Oct 2020 03:11:46 +0000 (20:11 -0700)]
bitops: use the same mechanism for get_count_order[_long]

These two functions share the same logic.
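
The shared logic is roughly "decrement, then find the last set bit".  A portable userspace sketch, assuming loop-based stand-ins for the kernel's fls() and fls_long() (all sketch_* names are hypothetical):

```c
/* 1-based index of the highest set bit, 0 for an input of 0. */
int sketch_fls(unsigned int x)
{
	int bits = 0;

	while (x != 0) {
		bits++;
		x >>= 1;
	}
	return bits;
}

/* Same as sketch_fls(), for unsigned long. */
int sketch_fls_long(unsigned long x)
{
	int bits = 0;

	while (x != 0) {
		bits++;
		x >>= 1;
	}
	return bits;
}

/* Smallest order such that (1 << order) >= count; -1 for count == 0. */
int sketch_get_count_order(unsigned int count)
{
	if (count == 0)
		return -1;
	return sketch_fls(count - 1);
}

/* The unsigned long variant uses exactly the same mechanism. */
int sketch_get_count_order_long(unsigned long l)
{
	if (l == 0)
		return -1;
	return sketch_fls_long(l - 1);
}
```

Decrementing first means exact powers of two and all other values go through the same path, so no separate power-of-two check is needed.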

Signed-off-by: Wei Yang <richard.weiyang@linux.alibaba.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Christian Brauner <christian.brauner@ubuntu.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lkml.kernel.org/r/20200807085837.11697-3-richard.weiyang@linux.alibaba.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agobitops: simplify get_count_order_long()
Wei Yang [Fri, 16 Oct 2020 03:11:41 +0000 (20:11 -0700)]
bitops: simplify get_count_order_long()

These two cases could be unified into one.

Signed-off-by: Wei Yang <richard.weiyang@linux.alibaba.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Christian Brauner <christian.brauner@ubuntu.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lkml.kernel.org/r/20200807085837.11697-2-richard.weiyang@linux.alibaba.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agolib/crc32.c: fix trivial typo in preprocessor condition
Tobias Jordan [Fri, 16 Oct 2020 03:11:38 +0000 (20:11 -0700)]
lib/crc32.c: fix trivial typo in preprocessor condition

Whether crc32_be needs a lookup table is chosen based on CRC_LE_BITS.
Obviously, the _be function should be governed by the _BE_ define.

This probably never pops up as it's hard to come up with a configuration
where CRC_BE_BITS isn't the same as CRC_LE_BITS and as nobody is using
bitwise CRC anyway.

Fixes: 46c5801eaf86 ("crc32: bolt on crc32c")
Signed-off-by: Tobias Jordan <kernel@cdqe.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Krzysztof Kozlowski <krzk@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Link: https://lkml.kernel.org/r/20200923182122.GA3338@agrajag.zerfleddert.de
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agolib/test_hmm.c: fix an error code in dmirror_allocate_chunk()
Dan Carpenter [Fri, 16 Oct 2020 03:11:34 +0000 (20:11 -0700)]
lib/test_hmm.c: fix an error code in dmirror_allocate_chunk()

This is supposed to return false on failure, not a negative error code.

Fixes: 170e38548b81 ("mm/hmm/test: use after free in dmirror_allocate_chunk()")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Ralph Campbell <rcampbell@nvidia.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Dan Williams <dan.j.williams@intel.com>
Link: https://lkml.kernel.org/r/20201010200812.GA1886610@mwanda
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agoinclude/linux/list.h: add a macro to test if entry is pointing to the head
Andy Shevchenko [Fri, 16 Oct 2020 03:11:31 +0000 (20:11 -0700)]
include/linux/list.h: add a macro to test if entry is pointing to the head

Add a macro to test if entry is pointing to the head of the list which is
useful in cases like:

  list_for_each_entry(pos, &head, member) {
    if (cond)
      break;
  }
  if (list_entry_is_head(pos, &head, member))
    return -ERRNO;

which avoids the need for an additional variable to track whether the
loop was stopped in the middle.
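
A self-contained userspace sketch of the pattern follows.  The list type, container_of() and list_entry_is_head() are cut-down imitations of the kernel ones, written out by hand so the example builds outside the kernel; struct item, list_add_tail_sketch() and find_item() are hypothetical.

```c
#include <stddef.h>

struct list_head {
	struct list_head *next, *prev;
};

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* True when pos is the pseudo-entry that corresponds to the list head. */
#define list_entry_is_head(pos, head, member) (&(pos)->member == (head))

struct item {
	int value;
	struct list_head node;
};

void list_add_tail_sketch(struct list_head *entry, struct list_head *head)
{
	entry->prev = head->prev;
	entry->next = head;
	head->prev->next = entry;
	head->prev = entry;
}

/* Return the first item whose value equals wanted, or NULL if none does. */
struct item *find_item(struct list_head *head, int wanted)
{
	struct item *pos;

	/* Hand-expanded equivalent of list_for_each_entry(pos, head, node). */
	for (pos = container_of(head->next, struct item, node);
	     !list_entry_is_head(pos, head, node);
	     pos = container_of(pos->node.next, struct item, node)) {
		if (pos->value == wanted)
			break;
	}
	if (list_entry_is_head(pos, head, node))
		return NULL;	/* loop ran to completion: nothing matched */
	return pos;
}
```

No separate "found" flag is needed: after the loop, pos either names a real entry or wraps back around to the head.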

While here, convert the list_for_each_entry*() family of macros to use the new one.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Cezary Rojewski <cezary.rojewski@intel.com>
Link: https://lkml.kernel.org/r/20200929134342.51489-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agolib/percpu_counter.c: use helper macro abs()
Miaohe Lin [Fri, 16 Oct 2020 03:11:28 +0000 (20:11 -0700)]
lib/percpu_counter.c: use helper macro abs()

Use helper macro abs() to simplify the "x >= t || x <= -t" cmp.
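
The equivalence behind this simplification can be checked with a small userspace sketch.  llabs() stands in for the kernel's abs() macro, and the function names are hypothetical; the most negative value is excluded because its absolute value is not representable.

```c
#include <stdlib.h>	/* llabs */

/* The original two-sided comparison. */
int crosses_explicit(long long x, long long t)
{
	return x >= t || x <= -t;
}

/* The simplified form using an absolute value. */
int crosses_abs(long long x, long long t)
{
	return llabs(x) >= t;
}
```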

Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Link: https://lkml.kernel.org/r/20200927122746.5964-1-linmiaohe@huawei.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agolib/scatterlist.c: avoid a double memset
Christophe JAILLET [Fri, 16 Oct 2020 03:11:25 +0000 (20:11 -0700)]
lib/scatterlist.c: avoid a double memset

'sgl' is zeroed a few lines below in 'sg_init_table()'. There is no need to
clear it twice.

Remove the redundant initialization.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Link: https://lkml.kernel.org/r/20200920071544.368841-1-christophe.jaillet@wanadoo.fr
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agolib/idr.c: document that ida_simple_{get,remove}() are deprecated
Stephen Boyd [Fri, 16 Oct 2020 03:11:21 +0000 (20:11 -0700)]
lib/idr.c: document that ida_simple_{get,remove}() are deprecated

These two functions are deprecated.  Users should call ida_alloc() or
ida_free() respectively instead.  Add documentation to this effect until
the macro can be removed.

Signed-off-by: Stephen Boyd <swboyd@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Tri Vo <trong@android.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Matthew Wilcox <willy@infradead.org>
Link: https://lkml.kernel.org/r/20200910055246.2297797-2-swboyd@chromium.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agolib/idr.c: document calling context for IDA APIs mustn't use locks
Stephen Boyd [Fri, 16 Oct 2020 03:11:17 +0000 (20:11 -0700)]
lib/idr.c: document calling context for IDA APIs mustn't use locks

The documentation for these functions indicates that callers don't need to
hold a lock while calling them, but that documentation is only in one
place under "IDA Usage".  Let's state the same information on each IDA
function so that it's clear what the calling context requires.
Furthermore, let's document ida_simple_get() with the same information so
that callers know how this API works.

Signed-off-by: Stephen Boyd <swboyd@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Tri Vo <trong@android.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Matthew Wilcox <willy@infradead.org>
Link: https://lkml.kernel.org/r/20200910055246.2297797-1-swboyd@chromium.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agolib/mpi/mpi-bit.c: fix spello of "functions"
Randy Dunlap [Fri, 16 Oct 2020 03:11:14 +0000 (20:11 -0700)]
lib/mpi/mpi-bit.c: fix spello of "functions"

Fix typo/spello of "functions".

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Link: https://lkml.kernel.org/r/8df15173-a6df-9426-7cad-a2d279bf1170@infradead.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agolib: test_sysctl: delete duplicated words
Randy Dunlap [Fri, 16 Oct 2020 03:11:10 +0000 (20:11 -0700)]
lib: test_sysctl: delete duplicated words

Drop the repeated word "the".

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Link: https://lkml.kernel.org/r/20200823040520.1999-1-rdunlap@infradead.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agolib: syscall: delete duplicated words
Randy Dunlap [Fri, 16 Oct 2020 03:11:07 +0000 (20:11 -0700)]
lib: syscall: delete duplicated words

Drop the repeated word "the".

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Link: https://lkml.kernel.org/r/20200823040514.26136-1-rdunlap@infradead.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agolib: radix-tree: delete duplicated words
Randy Dunlap [Fri, 16 Oct 2020 03:11:04 +0000 (20:11 -0700)]
lib: radix-tree: delete duplicated words

Drop the repeated word "be".

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Link: https://lkml.kernel.org/r/20200823040508.26086-1-rdunlap@infradead.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agolib: earlycpio: delete duplicated words
Randy Dunlap [Fri, 16 Oct 2020 03:11:01 +0000 (20:11 -0700)]
lib: earlycpio: delete duplicated words

Drop the repeated word "the".

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Link: https://lkml.kernel.org/r/20200823040455.25995-1-rdunlap@infradead.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agolib: dynamic_queue_limits: delete duplicated words + fix typo
Randy Dunlap [Fri, 16 Oct 2020 03:10:57 +0000 (20:10 -0700)]
lib: dynamic_queue_limits: delete duplicated words + fix typo

Drop the repeated word "the".
Fix spelling of "excess".

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Link: https://lkml.kernel.org/r/20200823040449.25946-1-rdunlap@infradead.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agolib: decompress_bunzip2: delete duplicated words
Randy Dunlap [Fri, 16 Oct 2020 03:10:51 +0000 (20:10 -0700)]
lib: decompress_bunzip2: delete duplicated words

Drop the repeated word "how".

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Link: https://lkml.kernel.org/r/20200823040436.25852-1-rdunlap@infradead.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agolib: libcrc32c: delete duplicated words
Randy Dunlap [Fri, 16 Oct 2020 03:10:48 +0000 (20:10 -0700)]
lib: libcrc32c: delete duplicated words

Drop the repeated word "the".

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Link: https://lkml.kernel.org/r/20200823040430.25807-1-rdunlap@infradead.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agolib: bitmap: delete duplicated words
Randy Dunlap [Fri, 16 Oct 2020 03:10:45 +0000 (20:10 -0700)]
lib: bitmap: delete duplicated words

Drop the repeated word "an".

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Link: https://lkml.kernel.org/r/20200823040424.25760-1-rdunlap@infradead.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years ago MAINTAINERS: jarkko.sakkinen@linux.intel.com -> jarkko@kernel.org
Jarkko Sakkinen [Fri, 16 Oct 2020 03:10:41 +0000 (20:10 -0700)]
MAINTAINERS: jarkko.sakkinen@linux.intel.com -> jarkko@kernel.org

Use @kernel.org address as the main communications end point.  Update the
corresponding M-entries and .mailmap (for git shortlog translation).

Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Joe Perches <joe@perches.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Kees Cook <keescook@chromium.org>
Cc: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Rob Herring <robh@kernel.org>
Link: https://lkml.kernel.org/r/20201015142710.8371-1-jarkko.sakkinen@linux.intel.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years ago get_maintainer: exclude MAINTAINERS file(s) from --git-fallback
Joe Perches [Fri, 16 Oct 2020 03:10:37 +0000 (20:10 -0700)]
get_maintainer: exclude MAINTAINERS file(s) from --git-fallback

MAINTAINERS files generally have no specific maintainer but are updated by
individuals for subsystems all over the source tree.

Exclude MAINTAINERS file(s) from --git-fallback searches so the unlucky
individuals that update the files the most are not shown by default.
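
The exclusion described above can be illustrated with a minimal sketch. It is
written in Python for illustration only and is not the get_maintainer.pl
implementation; the function name git_fallback_candidates is hypothetical, and
matching on the file's base name is an assumption about what counts as a
MAINTAINERS file.

```python
import os
from typing import Iterable, List


def git_fallback_candidates(files: Iterable[str]) -> List[str]:
    """Return the files that may still use the git-history fallback.

    Assumption: any path whose base name is exactly "MAINTAINERS" is
    skipped, so frequent editors of those files are not listed by default.
    """
    return [f for f in files if os.path.basename(f) != "MAINTAINERS"]
```

With this sketch, a list such as ["MAINTAINERS", "lib/bitmap.c"] keeps only
"lib/bitmap.c" as a candidate for the fallback search.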

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Rob Herring <robh@kernel.org>
Cc: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Cc: "David S. Miller" <davem@davemloft.net>
Link: https://lkml.kernel.org/r/2bacb0a9c06fbb6d56a43bf930e808c74243c908.camel@perches.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years ago get_maintainer: add test for file in VCS
Joe Perches [Fri, 16 Oct 2020 03:10:34 +0000 (20:10 -0700)]
get_maintainer: add test for file in VCS

It's somewhat common for me to ask get_maintainer to tell me who maintains
a patch file rather than the files modified by the patch.

Emit a warning when get_maintainer.pl -f <patchfile> is given a file that
is not tracked in the VCS.
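
A minimal sketch of such a check follows, assuming "in VCS" means "tracked by
git". It is written in Python for illustration and is not the script's actual
code; the names git_tracks and check_file_argument and the warning text are
hypothetical. The tracking test is passed in as a parameter so the logic can
be exercised without a repository.

```python
import subprocess
import sys
from typing import Callable, Optional


def git_tracks(path: str) -> bool:
    """Return True if git reports `path` as tracked in the current work tree."""
    result = subprocess.run(
        ["git", "ls-files", "--error-unmatch", "--", path],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def check_file_argument(
    path: str, is_tracked: Callable[[str], bool] = git_tracks
) -> Optional[str]:
    """Warn on stderr when a file passed with -f is not under version control.

    Returns the warning text, or None when the file is tracked.
    """
    if is_tracked(path):
        return None
    message = f"warning: file '{path}' not found in version control"
    print(message, file=sys.stderr)
    return message
```

A patch file saved locally is usually untracked, so passing it with -f would
trigger the warning in this sketch, while a tracked source file would not.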

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Link: https://lkml.kernel.org/r/f63229c051567041819f25e76f49d83c6e4c0f71.camel@perches.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>