]> www.infradead.org Git - nvme.git/log
nvme.git
4 years agodm: introduce zone append emulation
Damien Le Moal [Tue, 25 May 2021 21:25:00 +0000 (06:25 +0900)]
dm: introduce zone append emulation

For zoned targets that cannot support zone append operations, implement
an emulation using regular write operations. If the original BIO
submitted by the user is a zone append operation, change its clone into
a regular write operation directed at the target zone write pointer
position.

To do so, an array of write pointer offsets (write pointer position
relative to the start of a zone) is added to struct mapped_device. All
operations that modify a sequential zone write pointer (writes, zone
reset, zone finish and zone append) are intersepted in __map_bio() and
processed using the new functions dm_zone_map_bio().

Detection of the target ability to natively support zone append
operations is done from dm_table_set_restrictions() by calling the
function dm_set_zones_restrictions(). A target that does not support
zone append operation, either by explicitly declaring it using the new
struct dm_target field zone_append_not_supported, or because the device
table contains a non-zoned device, has its mapped device marked with the
new flag DMF_ZONE_APPEND_EMULATED. The helper function
dm_emulate_zone_append() is introduced to test a mapped device for this
new flag.

Atomicity of the zones write pointer tracking and updates is done using
a zone write locking mechanism based on a bitmap. This is similar to
the block layer method but based on BIOs rather than struct request.
A zone write lock is taken in dm_zone_map_bio() for any clone BIO with
an operation type that changes the BIO target zone write pointer
position. The zone write lock is released if the clone BIO is failed
before submission or when dm_zone_endio() is called when the clone BIO
completes.

The zone write lock bitmap of the mapped device, together with a bitmap
indicating zone types (conv_zones_bitmap) and the write pointer offset
array (zwp_offset) are allocated and initialized with a full device zone
report in dm_set_zones_restrictions() using the function
dm_revalidate_zones().

For failed operations that may have modified a zone write pointer, the
zone write pointer offset is marked as invalid in dm_zone_endio().
Zones with an invalid write pointer offset are checked and the write
pointer updated using an internal report zone operation when the
faulty zone is accessed again by the user.

All functions added for this emulation have a minimal overhead for
zoned targets natively supporting zone append operations. Regular
device targets are also not affected. The added code also does not
impact builds with CONFIG_BLK_DEV_ZONED disabled by stubbing out all
dm zone related functions.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
4 years agodm: rearrange core declarations for extended use from dm-zone.c
Damien Le Moal [Tue, 25 May 2021 21:24:59 +0000 (06:24 +0900)]
dm: rearrange core declarations for extended use from dm-zone.c

Move the definitions of struct dm_target_io, struct dm_io and the bits
of the flags field of struct mapped_device from dm.c to dm-core.h to
make them usable from dm-zone.c. For the same reason, declare
dec_pending() in dm-core.h after renaming it to dm_io_dec_pending().
And for symmetry of the function names, introduce the inline helper
dm_io_inc_pending() instead of directly using atomic_inc() calls.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
4 years agoblock: introduce BIO_ZONE_WRITE_LOCKED bio flag
Damien Le Moal [Tue, 25 May 2021 21:24:53 +0000 (06:24 +0900)]
block: introduce BIO_ZONE_WRITE_LOCKED bio flag

Introduce the BIO flag BIO_ZONE_WRITE_LOCKED to indicate that a BIO owns
the write lock of the zone it is targeting. This is the counterpart of
the struct request flag RQF_ZONE_WRITE_LOCKED.

This new BIO flag is reserved for now for zone write locking control
for device mapper targets exposing a zoned block device. Since in this
case, the lock flag must not be propagated to the struct request that
will be used to process the BIO, a BIO private flag is used rather than
changing the RQF_ZONE_WRITE_LOCKED request flag into a common REQ_XXX
flag that could be used for both BIO and request. This avoids conflicts
down the stack with the block IO scheduler zone write locking
(in mq-deadline).

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Acked-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
4 years agoblock: introduce bio zone helpers
Damien Le Moal [Tue, 25 May 2021 21:24:52 +0000 (06:24 +0900)]
block: introduce bio zone helpers

Introduce the helper functions bio_zone_no() and bio_zone_is_seq().
Both are the BIO counterparts of the request helpers blk_rq_zone_no()
and blk_rq_zone_is_seq(), respectively returning the number of the
target zone of a bio and true if the BIO target zone is sequential.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Acked-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
4 years agoblock: improve handling of all zones reset operation
Damien Le Moal [Tue, 25 May 2021 21:24:51 +0000 (06:24 +0900)]
block: improve handling of all zones reset operation

SCSI, ZNS and null_blk zoned devices support resetting all zones using
a single command (REQ_OP_ZONE_RESET_ALL), as indicated using the device
request queue flag QUEUE_FLAG_ZONE_RESETALL. This flag is not set for
device mapper targets creating zoned devices. In this case, a user
request for resetting all zones of a device is processed in
blkdev_zone_mgmt() by issuing a REQ_OP_ZONE_RESET operation for each
zone of the device. This leads to different behaviors of the
BLKRESETZONE ioctl() depending on the target device support for the
reset all operation. E.g.

blkzone reset /dev/sdX

will reset all zones of a SCSI device using a single command that will
ignore conventional, read-only or offline zones.

But a dm-linear device including conventional, read-only or offline
zones cannot be reset in the same manner as some of the single zone
reset operations issued by blkdev_zone_mgmt() will fail. E.g.:

blkzone reset /dev/dm-Y
blkzone: /dev/dm-0: BLKRESETZONE ioctl failed: Remote I/O error

To simplify applications and tools development, unify the behavior of
the all-zone reset operation by modifying blkdev_zone_mgmt() to not
issue a zone reset operation for conventional, read-only and offline
zones, thus mimicking what an actual reset-all device command does on a
device supporting REQ_OP_ZONE_RESET_ALL. This emulation is done using
the new function blkdev_zone_reset_all_emulated(). The zones needing a
reset are identified using a bitmap that is initialized using a zone
report. Since empty zones do not need a reset, also ignore these zones.
The function blkdev_zone_reset_all() is introduced for block devices
natively supporting reset all operations. blkdev_zone_mgmt() is modified
to call either function to execute an all zone reset request.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
[hch: split into multiple functions]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Acked-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
4 years agodm: Forbid requeue of writes to zones
Damien Le Moal [Tue, 25 May 2021 21:24:58 +0000 (06:24 +0900)]
dm: Forbid requeue of writes to zones

A target map method requesting the requeue of a bio with
DM_MAPIO_REQUEUE or completing it with DM_ENDIO_REQUEUE can cause
unaligned write errors if the bio is a write operation targeting a
sequential zone. If a zoned target request such a requeue, warn about
it and kill the IO.

The function dm_is_zone_write() is introduced to detect write operations
to zoned targets.

This change does not affect the target drivers supporting zoned devices
and exposing a zoned device, namely dm-crypt, dm-linear and dm-flakey as
none of these targets ever request a requeue.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
4 years agodm: Introduce dm_report_zones()
Damien Le Moal [Tue, 25 May 2021 21:24:57 +0000 (06:24 +0900)]
dm: Introduce dm_report_zones()

To simplify the implementation of the report_zones operation of a zoned
target, introduce the function dm_report_zones() to set a target
mapping start sector in struct dm_report_zones_args and call
blkdev_report_zones(). This new function is exported and the report
zones callback function dm_report_zones_cb() is not.

dm-linear, dm-flakey and dm-crypt are modified to use dm_report_zones().

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
4 years agodm: move zone related code to dm-zone.c
Damien Le Moal [Tue, 25 May 2021 21:24:56 +0000 (06:24 +0900)]
dm: move zone related code to dm-zone.c

Move core and table code used for zoned targets and conditionally
defined with #ifdef CONFIG_BLK_DEV_ZONED to the new file dm-zone.c.
This file is conditionally compiled depending on CONFIG_BLK_DEV_ZONED.
The small helper dm_set_zones_restrictions() is introduced to
initialize a mapped device request queue zone attributes in
dm_table_set_restrictions().

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
4 years agodm: cleanup device_area_is_invalid()
Damien Le Moal [Tue, 25 May 2021 21:24:55 +0000 (06:24 +0900)]
dm: cleanup device_area_is_invalid()

In device_area_is_invalid(), use bdev_is_zoned() instead of open
coding the test on the zoned model returned by bdev_zoned_model().

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
4 years agodm: Fix dm_accept_partial_bio() relative to zone management commands
Damien Le Moal [Tue, 25 May 2021 21:24:54 +0000 (06:24 +0900)]
dm: Fix dm_accept_partial_bio() relative to zone management commands

Fix dm_accept_partial_bio() to actually check that zone management
commands are not passed as explained in the function documentation
comment. Also, since a zone append operation cannot be split, add
REQ_OP_ZONE_APPEND as a forbidden command.

White lines are added around the group of BUG_ON() calls to make the
code more legible.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
4 years agodm zoned: check zone capacity
Damien Le Moal [Wed, 19 May 2021 01:26:16 +0000 (10:26 +0900)]
dm zoned: check zone capacity

The dm-zoned target cannot support zoned block devices with zones that
have a capacity smaller than the zone size (e.g. NVMe zoned namespaces)
due to the current chunk zone mapping implementation as it is assumed
that zones and chunks have the same size with all blocks usable.
If a zoned drive is found to have zones with a capacity different from
the zone size, fail the target initialization.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Cc: stable@vger.kernel.org # v5.9+
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
4 years agodm table: Constify static struct blk_ksm_ll_ops
Rikard Falkeborn [Wed, 26 May 2021 21:06:37 +0000 (23:06 +0200)]
dm table: Constify static struct blk_ksm_ll_ops

The only usage of dm_ksm_ll_ops is to make a copy of it to the ksm_ll_ops
field in the blk_keyslot_manager struct. Make it const to allow the
compiler to put it in read-only memory.

Signed-off-by: Rikard Falkeborn <rikard.falkeborn@gmail.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
4 years agodm writecache: interrupt writeback if suspended
Mikulas Patocka [Wed, 26 May 2021 19:49:03 +0000 (15:49 -0400)]
dm writecache: interrupt writeback if suspended

If the DM device is suspended, interrupt the writeback sequence so
that there is no excessive suspend delay.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
4 years agodm writecache: don't split bios when overwriting contiguous cache content
Mikulas Patocka [Wed, 26 May 2021 14:29:45 +0000 (10:29 -0400)]
dm writecache: don't split bios when overwriting contiguous cache content

If dm-writecache overwrites existing cached data, it splits the
incoming bio into many block-sized bios. The I/O scheduler does merge
these bios into one large request but this needless splitting and
merging causes performance degradation.

Fix this by avoiding bio splitting if the cache target area that is
being overwritten is contiguous.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
4 years agodm kcopyd: avoid spin_lock_irqsave from process context
Mikulas Patocka [Wed, 26 May 2021 14:18:06 +0000 (10:18 -0400)]
dm kcopyd: avoid spin_lock_irqsave from process context

The functions "pop", "push_head", "do_work" can only be called from
process context. Therefore, replace spin_lock_irq{save,restore} with
spin_{lock,unlock}_irq.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
4 years agodm kcopyd: avoid useless atomic operations
Mikulas Patocka [Wed, 26 May 2021 14:16:01 +0000 (10:16 -0400)]
dm kcopyd: avoid useless atomic operations

The functions set_bit and clear_bit are atomic. We don't need
atomicity when making flags for dm-kcopyd. So, change them to direct
manipulation of the flags.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
4 years agodm space map disk: cache a small number of index entries
Joe Thornber [Tue, 13 Apr 2021 12:09:32 +0000 (13:09 +0100)]
dm space map disk: cache a small number of index entries

The disk space map stores it's index entries in a btree, these are
accessed very frequently, so having a few cached makes a big difference
to performance.

With this change provisioning a new block takes roughly 20% less cpu.

Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
4 years agodm space maps: improve performance with inc/dec on ranges of blocks
Joe Thornber [Tue, 13 Apr 2021 10:03:45 +0000 (11:03 +0100)]
dm space maps: improve performance with inc/dec on ranges of blocks

When we break sharing on btree nodes we typically need to increment
the reference counts to every value held in the node.  This can
cause a lot of repeated calls to the space maps.  Fix this by changing
the interface to the space map inc/dec methods to take ranges of
adjacent blocks to be operated on.

For installations that are using a lot of snapshots this will reduce
cpu overhead of fundamental operations such as provisioning a new block,
or deleting a snapshot, by as much as 10 times.

Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
4 years agodm space maps: don't reset space map allocation cursor when committing
Joe Thornber [Tue, 13 Apr 2021 08:03:49 +0000 (09:03 +0100)]
dm space maps: don't reset space map allocation cursor when committing

Current commit code resets the place where the search for free blocks
will begin back to the start of the metadata device.  There are a couple
of repercussions to this:

- The first allocation after the commit is likely to take longer than
  normal as it searches for a free block in an area that is likely to
  have very few free blocks (if any).

- Any free blocks it finds will have been recently freed.  Reusing them
  means we have fewer old copies of the metadata to aid recovery from
  hardware error.

Fix these issues by leaving the cursor alone, only resetting when the
search hits the end of the metadata device.

Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
4 years agodm btree: improve btree residency
Joe Thornber [Thu, 8 Apr 2021 12:47:08 +0000 (13:47 +0100)]
dm btree: improve btree residency

This commit improves the residency of btrees built in the metadata for
dm-thin and dm-cache.

When inserting a new entry into a full btree node the current code
splits the node into two.  This can result in very many half full nodes,
particularly if the insertions are occurring in an ascending order (as
happens in dm-thin with large writes).

With this commit, when we insert into a full node we first try and move
some entries to a neighbouring node that has space, failing that it
tries to split two neighbouring nodes into three.

Results are given below.  'Residency' is how full nodes are on average
as a percentage.  Average instruction counts for the operations
are given to show the extra processing has little overhead.

                         +--------------------------+--------------------------+
                         |         Before           |         After            |
+------------+-----------+-----------+--------------+-----------+--------------+
|    Test    |   Phase   | Residency | Instructions | Residency | Instructions |
+------------+-----------+-----------+--------------+-----------+--------------+
| Ascending  | insert    |        50 |         1876 |        96 |         1930 |
|            | overwrite |        50 |         1789 |        96 |         1746 |
|            | lookup    |        50 |          778 |        96 |          778 |
| Descending | insert    |        50 |         3024 |        96 |         3181 |
|            | overwrite |        50 |         1789 |        96 |         1746 |
|            | lookup    |        50 |          778 |        96 |          778 |
| Random     | insert    |        68 |         3800 |        84 |         3736 |
|            | overwrite |        68 |         4254 |        84 |         3911 |
|            | lookup    |        68 |          779 |        84 |          779 |
| Runs       | insert    |        63 |         2546 |        82 |         2815 |
|            | overwrite |        63 |         2013 |        82 |         1986 |
|            | lookup    |        63 |          778 |        82 |          779 |
+------------+-----------+-----------+--------------+-----------+--------------+

   Ascending - keys are inserted in ascending order.
   Descending - keys are inserted in descending order.
   Random - keys are inserted in random order.
   Runs - keys are split into ascending runs of ~20 length.  Then
          the runs are shuffled.

Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Colin Ian King <colin.king@canonical.com> # contains_key() fix
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
4 years agodm snapshot: properly fix a crash when an origin has no snapshots
Mikulas Patocka [Tue, 25 May 2021 17:17:19 +0000 (13:17 -0400)]
dm snapshot: properly fix a crash when an origin has no snapshots

If an origin target has no snapshots, o->split_boundary is set to 0.
This causes BUG_ON(sectors <= 0) in block/bio.c:bio_split().

Fix this by initializing chunk_size, and in turn split_boundary, to
rounddown_pow_of_two(UINT_MAX) -- the largest power of two that fits
into "unsigned" type.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
4 years agodm snapshot: revert "fix a crash when an origin has no snapshots"
Mikulas Patocka [Tue, 25 May 2021 17:16:21 +0000 (13:16 -0400)]
dm snapshot: revert "fix a crash when an origin has no snapshots"

Commit 7ee06ddc4038f936b0d4459d37a7d4d844fb03db ("dm snapshot: fix a
crash when an origin has no snapshots") introduced a regression in
snapshot merging - causing the lvm2 test lvcreate-cache-snapshot.sh
got stuck in an infinite loop.

Even though commit 7ee06ddc4038f936b0d4459d37a7d4d844fb03db was marked
for stable@ the stable team was notified to _not_ backport it.

Fixes: 7ee06ddc4038 ("dm snapshot: fix a crash when an origin has no snapshots")
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
4 years agodm verity: fix require_signatures module_param permissions
John Keeping [Wed, 12 May 2021 11:14:21 +0000 (12:14 +0100)]
dm verity: fix require_signatures module_param permissions

The third parameter of module_param() is permissions for the sysfs node
but it looks like it is being used as the initial value of the parameter
here.  In fact, false here equates to omitting the file from sysfs and
does not affect the value of require_signatures.

Making the parameter writable is not simple because going from
false->true is fine but it should not be possible to remove the
requirement to verify a signature.  But it can be useful to inspect the
value of this parameter from userspace, so change the permissions to
make a read-only file in sysfs.

Signed-off-by: John Keeping <john@metanate.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
4 years agoLinux 5.13-rc3
Linus Torvalds [Sun, 23 May 2021 21:42:48 +0000 (11:42 -1000)]
Linux 5.13-rc3

4 years agoMerge tag 'perf-urgent-2021-05-23' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sun, 23 May 2021 16:32:40 +0000 (06:32 -1000)]
Merge tag 'perf-urgent-2021-05-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf fixes from Thomas Gleixner:
 "Two perf fixes:

   - Do not check the LBR_TOS MSR when setting up unrelated LBR MSRs as
     this can cause malfunction when TOS is not supported

   - Allocate the LBR XSAVE buffers along with the DS buffers upfront
     because allocating them when adding an event can deadlock"

* tag 'perf-urgent-2021-05-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86/lbr: Remove cpuc->lbr_xsave allocation from atomic context
  perf/x86: Avoid touching LBR_TOS MSR for Arch LBR

4 years agoMerge tag 'locking-urgent-2021-05-23' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 23 May 2021 16:30:08 +0000 (06:30 -1000)]
Merge tag 'locking-urgent-2021-05-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull locking fixes from Thomas Gleixner:
 "Two locking fixes:

   - Invoke the lockdep tracepoints in the correct place so the ordering
     is correct again

   - Don't leave the mutex WAITER bit stale when the last waiter is
     dropping out early due to a signal as that forces all subsequent
     lock operations needlessly into the slowpath until it's cleaned up
     again"

* tag 'locking-urgent-2021-05-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  locking/mutex: clear MUTEX_FLAGS if wait_list is empty due to signal
  locking/lockdep: Correct calling tracepoints

4 years agoMerge tag 'irq-urgent-2021-05-23' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sun, 23 May 2021 16:28:20 +0000 (06:28 -1000)]
Merge tag 'irq-urgent-2021-05-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull irq fixes from Thomas Gleixner:
 "A few fixes for irqchip drivers:

   - Allocate interrupt descriptors correctly on Mainstone PXA when
     SPARSE_IRQ is enabled; otherwise the interrupt association fails

   - Make the APPLE AIC chip driver depend on APPLE

   - Remove redundant error output on devm_ioremap_resource() failure"

* tag 'irq-urgent-2021-05-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  irqchip: Remove redundant error printing
  irqchip/apple-aic: APPLE_AIC should depend on ARCH_APPLE
  ARM: PXA: Fix cplds irqdesc allocation when using legacy mode

4 years agoMerge tag 'x86_urgent_for_v5.13_rc3' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 23 May 2021 16:12:25 +0000 (06:12 -1000)]
Merge tag 'x86_urgent_for_v5.13_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixes from Borislav Petkov:

 - Fix how SEV handles MMIO accesses by forwarding potential page faults
   instead of killing the machine and by using the accessors with the
   exact functionality needed when accessing memory.

 - Fix a confusion with Clang LTO compiler switches passed to the it

 - Handle the case gracefully when VMGEXIT has been executed in
   userspace

* tag 'x86_urgent_for_v5.13_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/sev-es: Use __put_user()/__get_user() for data accesses
  x86/sev-es: Forward page-faults which happen during emulation
  x86/sev-es: Don't return NULL from sev_es_get_ghcb()
  x86/build: Fix location of '-plugin-opt=' flags
  x86/sev-es: Invalidate the GHCB after completing VMGEXIT
  x86/sev-es: Move sev_es_put_ghcb() in prep for follow on patch

4 years agoMerge tag 'powerpc-5.13-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc...
Linus Torvalds [Sun, 23 May 2021 16:07:33 +0000 (06:07 -1000)]
Merge tag 'powerpc-5.13-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc fixes from Michael Ellerman:

 - Fix breakage of strace (and other ptracers etc.) when using the new
   scv ABI (Power9 or later with glibc >= 2.33).

 - Fix early_ioremap() on 64-bit, which broke booting on some machines.

Thanks to Dmitry V. Levin, Nicholas Piggin, Alexey Kardashevskiy, and
Christophe Leroy.

* tag 'powerpc-5.13-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/64s/syscall: Fix ptrace syscall info with scv syscalls
  powerpc/64s/syscall: Use pt_regs.trap to distinguish syscall ABI difference between sc and scv syscalls
  powerpc: Fix early setup to make early_ioremap() work

4 years agoMerge tag 'kbuild-fixes-v5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/masah...
Linus Torvalds [Sun, 23 May 2021 05:53:56 +0000 (19:53 -1000)]
Merge tag 'kbuild-fixes-v5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild

Pull Kbuild fixes from Masahiro Yamada:

 - Fix short log indentation for tools builds

 - Fix dummy-tools to adjust to the latest stackprotector check

* tag 'kbuild-fixes-v5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
  kbuild: dummy-tools: adjust to stricter stackprotector check
  scripts/jobserver-exec: Fix a typo ("envirnoment")
  tools build: Fix quiet cmd indentation

4 years agoMerge branch 'akpm' (patches from Andrew)
Linus Torvalds [Sun, 23 May 2021 01:20:20 +0000 (15:20 -1000)]
Merge branch 'akpm' (patches from Andrew)

Merge misc fixes from Andrew Morton:
 "10 patches.

  Subsystems affected by this patch series: mm (pagealloc, gup, kasan,
  and userfaultfd), ipc, selftests, watchdog, bitmap, procfs, and lib"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  userfaultfd: hugetlbfs: fix new flag usage in error path
  lib: kunit: suppress a compilation warning of frame size
  proc: remove Alexey from MAINTAINERS
  linux/bits.h: fix compilation error with GENMASK
  watchdog: reliable handling of timestamps
  kasan: slab: always reset the tag in get_freepointer_safe()
  tools/testing/selftests/exec: fix link error
  ipc/mqueue, msg, sem: avoid relying on a stack reference past its expiry
  Revert "mm/gup: check page posion status for coredump."
  mm/shuffle: fix section mismatch warning

4 years agouserfaultfd: hugetlbfs: fix new flag usage in error path
Mike Kravetz [Sun, 23 May 2021 00:42:11 +0000 (17:42 -0700)]
userfaultfd: hugetlbfs: fix new flag usage in error path

In commit d6995da31122 ("hugetlb: use page.private for hugetlb specific
page flags") the use of PagePrivate to indicate a reservation count
should be restored at free time was changed to the hugetlb specific flag
HPageRestoreReserve.  Changes to a userfaultfd error path as well as a
VM_BUG_ON() in remove_inode_hugepages() were overlooked.

Users could see incorrect hugetlb reserve counts if they experience an
error with a UFFDIO_COPY operation.  Specifically, this would be the
result of an unlikely copy_huge_page_from_user error.  There is not an
increased chance of hitting the VM_BUG_ON.

Link: https://lkml.kernel.org/r/20210521233952.236434-1-mike.kravetz@oracle.com
Fixes: d6995da31122 ("hugetlb: use page.private for hugetlb specific page flags")
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Reviewed-by: Mina Almasry <almasry.mina@google.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Mina Almasry <almasrymina@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years agolib: kunit: suppress a compilation warning of frame size
Zhen Lei [Sun, 23 May 2021 00:42:08 +0000 (17:42 -0700)]
lib: kunit: suppress a compilation warning of frame size

lib/bitfield_kunit.c: In function `test_bitfields_constants':
lib/bitfield_kunit.c:93:1: warning: the frame size of 7456 bytes is larger than 2048 bytes [-Wframe-larger-than=]
 }
 ^

As the description of BITFIELD_KUNIT in lib/Kconfig.debug, it "Only useful
for kernel devs running the KUnit test harness, and not intended for
inclusion into a production build".  Therefore, it is not worth modifying
variable 'test_bitfields_constants' to clear this warning.  Just suppress
it.

Link: https://lkml.kernel.org/r/20210518094533.7652-1-thunder.leizhen@huawei.com
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Cc: Vitor Massaru Iha <vitor@massaru.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years agoproc: remove Alexey from MAINTAINERS
Alexey Dobriyan [Sun, 23 May 2021 00:42:05 +0000 (17:42 -0700)]
proc: remove Alexey from MAINTAINERS

People Cc me and I don't have time.

Link: https://lkml.kernel.org/r/YKarMxHJBIhMHQIh@localhost.localdomain
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years agolinux/bits.h: fix compilation error with GENMASK
Rikard Falkeborn [Sun, 23 May 2021 00:42:02 +0000 (17:42 -0700)]
linux/bits.h: fix compilation error with GENMASK

GENMASK() has an input check which uses __builtin_choose_expr() to
enable a compile time sanity check of its inputs if they are known at
compile time.

However, it turns out that __builtin_constant_p() does not always return
a compile time constant [0].  It was thought this problem was fixed with
gcc 4.9 [1], but apparently this is not the case [2].

Switch to use __is_constexpr() instead which always returns a compile time
constant, regardless of its inputs.

Link: https://lore.kernel.org/lkml/42b4342b-aefc-a16a-0d43-9f9c0d63ba7a@rasmusvillemoes.dk
Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=19449
Link: https://lore.kernel.org/lkml/1ac7bbc2-45d9-26ed-0b33-bf382b8d858b@I-love.SAKURA.ne.jp
Link: https://lkml.kernel.org/r/20210511203716.117010-1-rikard.falkeborn@gmail.com
Signed-off-by: Rikard Falkeborn <rikard.falkeborn@gmail.com>
Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Yury Norov <yury.norov@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years agowatchdog: reliable handling of timestamps
Petr Mladek [Sun, 23 May 2021 00:41:59 +0000 (17:41 -0700)]
watchdog: reliable handling of timestamps

Commit 9bf3bc949f8a ("watchdog: cleanup handling of false positives")
tried to handle a virtual host stopped by the host a more
straightforward and cleaner way.

But it introduced a risk of false softlockup reports.  The virtual host
might be stopped at any time, for example between
kvm_check_and_clear_guest_paused() and is_softlockup().  As a result,
is_softlockup() might read the updated jiffies and detects a softlockup.

A solution might be to put back kvm_check_and_clear_guest_paused() after
is_softlockup() and detect it.  But it would put back the cycle that
complicates the logic.

In fact, the handling of all the timestamps is not reliable.  The code
does not guarantee when and how many times the timestamps are read.  For
example, "period_ts" might be touched anytime also from NMI and re-read in
is_softlockup().  It works just by chance.

Fix all the problems by making the code even more explicit.

1. Make sure that "now" and "period_ts" timestamps are read only once.
   They might be changed at anytime by NMI or when the virtual guest is
   stopped by the host.  Note that "now" timestamp does this implicitly
   because "jiffies" is marked volatile.

2. "now" time must be read first.  The state of "period_ts" will
   decide whether it will be used or the period will get restarted.

3. kvm_check_and_clear_guest_paused() must be called before reading
   "period_ts".  It touches the variable when the guest was stopped.

As a result, "now" timestamp is used only when the watchdog was not
touched and the guest not stopped in the meantime.  "period_ts" is
restarted in all other situations.

Link: https://lkml.kernel.org/r/YKT55gw+RZfyoFf7@alley
Fixes: 9bf3bc949f8aeefeacea4b ("watchdog: cleanup handling of false positives")
Signed-off-by: Petr Mladek <pmladek@suse.com>
Reported-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years agokasan: slab: always reset the tag in get_freepointer_safe()
Alexander Potapenko [Sun, 23 May 2021 00:41:56 +0000 (17:41 -0700)]
kasan: slab: always reset the tag in get_freepointer_safe()

With CONFIG_DEBUG_PAGEALLOC enabled, the kernel should also untag the
object pointer, as done in get_freepointer().

Failing to do so reportedly leads to SLUB freelist corruptions that
manifest as boot-time crashes.

Link: https://lkml.kernel.org/r/20210514072228.534418-1-glider@google.com
Signed-off-by: Alexander Potapenko <glider@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Elliot Berman <eberman@codeaurora.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years agotools/testing/selftests/exec: fix link error
Yang Yingliang [Sun, 23 May 2021 00:41:53 +0000 (17:41 -0700)]
tools/testing/selftests/exec: fix link error

Fix the link error by adding '-static':

  gcc -Wall  -Wl,-z,max-page-size=0x1000 -pie load_address.c -o /home/yang/linux/tools/testing/selftests/exec/load_address_4096
  /usr/bin/ld: /tmp/ccopEGun.o: relocation R_AARCH64_ADR_PREL_PG_HI21 against symbol `stderr@@GLIBC_2.17' which may bind externally can not be used when making a shared object; recompile with -fPIC
  /usr/bin/ld: /tmp/ccopEGun.o(.text+0x158): unresolvable R_AARCH64_ADR_PREL_PG_HI21 relocation against symbol `stderr@@GLIBC_2.17'
  /usr/bin/ld: final link failed: bad value
  collect2: error: ld returned 1 exit status
  make: *** [Makefile:25: tools/testing/selftests/exec/load_address_4096] Error 1

Link: https://lkml.kernel.org/r/20210514092422.2367367-1-yangyingliang@huawei.com
Fixes: 206e22f01941 ("tools/testing/selftests: add self-test for verifying load alignment")
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Cc: Chris Kennelly <ckennelly@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years agoipc/mqueue, msg, sem: avoid relying on a stack reference past its expiry
Varad Gautam [Sun, 23 May 2021 00:41:49 +0000 (17:41 -0700)]
ipc/mqueue, msg, sem: avoid relying on a stack reference past its expiry

do_mq_timedreceive calls wq_sleep with a stack local address.  The
sender (do_mq_timedsend) uses this address to later call pipelined_send.

This leads to a very hard to trigger race where a do_mq_timedreceive
call might return and leave do_mq_timedsend to rely on an invalid
address, causing the following crash:

  RIP: 0010:wake_q_add_safe+0x13/0x60
  Call Trace:
   __x64_sys_mq_timedsend+0x2a9/0x490
   do_syscall_64+0x80/0x680
   entry_SYSCALL_64_after_hwframe+0x44/0xa9
  RIP: 0033:0x7f5928e40343

The race occurs as:

1. do_mq_timedreceive calls wq_sleep with the address of `struct
   ext_wait_queue` on function stack (aliased as `ewq_addr` here) - it
   holds a valid `struct ext_wait_queue *` as long as the stack has not
   been overwritten.

2. `ewq_addr` gets added to info->e_wait_q[RECV].list in wq_add, and
   do_mq_timedsend receives it via wq_get_first_waiter(info, RECV) to call
   __pipelined_op.

3. Sender calls __pipelined_op::smp_store_release(&this->state,
   STATE_READY).  Here is where the race window begins.  (`this` is
   `ewq_addr`.)

4. If the receiver wakes up now in do_mq_timedreceive::wq_sleep, it
   will see `state == STATE_READY` and break.

5. do_mq_timedreceive returns, and `ewq_addr` is no longer guaranteed
   to be a `struct ext_wait_queue *` since it was on do_mq_timedreceive's
   stack.  (Although the address may not get overwritten until another
   function happens to touch it, which means it can persist around for an
   indefinite time.)

6. do_mq_timedsend::__pipelined_op() still believes `ewq_addr` is a
   `struct ext_wait_queue *`, and uses it to find a task_struct to pass to
   the wake_q_add_safe call.  In the lucky case where nothing has
   overwritten `ewq_addr` yet, `ewq_addr->task` is the right task_struct.
   In the unlucky case, __pipelined_op::wake_q_add_safe gets handed a
   bogus address as the receiver's task_struct causing the crash.

do_mq_timedsend::__pipelined_op() should not dereference `this` after
setting STATE_READY, as the receiver counterpart is now free to return.
Change __pipelined_op to call wake_q_add_safe on the receiver's
task_struct returned by get_task_struct, instead of dereferencing `this`
which sits on the receiver's stack.

As Manfred pointed out, the race potentially also exists in
ipc/msg.c::expunge_all and ipc/sem.c::wake_up_sem_queue_prepare.  Fix
those in the same way.

Link: https://lkml.kernel.org/r/20210510102950.12551-1-varad.gautam@suse.com
Fixes: c5b2cbdbdac563 ("ipc/mqueue.c: update/document memory barriers")
Fixes: 8116b54e7e23ef ("ipc/sem.c: document and update memory barriers")
Fixes: 0d97a82ba830d8 ("ipc/msg.c: update and document memory barriers")
Signed-off-by: Varad Gautam <varad.gautam@suse.com>
Reported-by: Matthias von Faber <matthias.vonfaber@aox-tech.de>
Acked-by: Davidlohr Bueso <dbueso@suse.de>
Acked-by: Manfred Spraul <manfred@colorfullife.com>
Cc: Christian Brauner <christian.brauner@ubuntu.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years agoRevert "mm/gup: check page posion status for coredump."
Michal Hocko [Sun, 23 May 2021 00:41:46 +0000 (17:41 -0700)]
Revert "mm/gup: check page posion status for coredump."

While reviewing [1] I came across commit d3378e86d182 ("mm/gup: check
page posion status for coredump.") and noticed that this patch is broken
in two ways.  First it doesn't really prevent hwpoison pages from being
dumped because hwpoison pages can be marked asynchornously at any time
after the check.  Secondly, and more importantly, the patch introduces a
ref count leak because get_dump_page takes a reference on the page which
is not released.

It also seems that the patch was merged incorrectly because there were
follow up changes not included as well as discussions on how to address
the underlying problem [2]

Therefore revert the original patch.

Link: http://lkml.kernel.org/r/20210429122519.15183-4-david@redhat.com
Link: http://lkml.kernel.org/r/57ac524c-b49a-99ec-c1e4-ef5027bfb61b@redhat.com
Link: https://lkml.kernel.org/r/20210505135407.31590-1-mhocko@kernel.org
Fixes: d3378e86d182 ("mm/gup: check page posion status for coredump.")
Signed-off-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: Aili Yao <yaoaili@kingsoft.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years agomm/shuffle: fix section mismatch warning
Arnd Bergmann [Sun, 23 May 2021 00:41:43 +0000 (17:41 -0700)]
mm/shuffle: fix section mismatch warning

clang sometimes decides not to inline shuffle_zone(), but it calls a
__meminit function.  Without the extra __meminit annotation we get this
warning:

  WARNING: modpost: vmlinux.o(.text+0x2a86d4): Section mismatch in reference from the function shuffle_zone() to the function .meminit.text:__shuffle_zone()
  The function shuffle_zone() references
  the function __meminit __shuffle_zone().
  This is often because shuffle_zone lacks a __meminit
  annotation or the annotation of __shuffle_zone is wrong.

shuffle_free_memory() did not show the same problem in my tests, but it
could happen in theory as well, so mark both as __meminit.

Link: https://lkml.kernel.org/r/20210514135952.2928094-1-arnd@kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years agoMerge tag 'block-5.13-2021-05-22' of git://git.kernel.dk/linux-block
Linus Torvalds [Sat, 22 May 2021 17:40:34 +0000 (07:40 -1000)]
Merge tag 'block-5.13-2021-05-22' of git://git.kernel.dk/linux-block

Pull block fixes from Jens Axboe:

 - Fix BLKRRPART and deletion race (Gulam, Christoph)

 - NVMe pull request (Christoph):
      - nvme-tcp corruption and timeout fixes (Sagi Grimberg, Keith
        Busch)
      - nvme-fc teardown fix (James Smart)
      - nvmet/nvme-loop memory leak fixes (Wu Bo)"

* tag 'block-5.13-2021-05-22' of git://git.kernel.dk/linux-block:
  block: fix a race between del_gendisk and BLKRRPART
  block: prevent block device lookups at the beginning of del_gendisk
  nvme-fc: clear q_live at beginning of association teardown
  nvme-tcp: rerun io_work if req_list is not empty
  nvme-tcp: fix possible use-after-completion
  nvme-loop: fix memory leak in nvme_loop_create_ctrl()
  nvmet: fix memory leak in nvmet_alloc_ctrl()

4 years agoMerge tag 'io_uring-5.13-2021-05-22' of git://git.kernel.dk/linux-block
Linus Torvalds [Sat, 22 May 2021 17:36:36 +0000 (07:36 -1000)]
Merge tag 'io_uring-5.13-2021-05-22' of git://git.kernel.dk/linux-block

Pull io_uring fixes from Jens Axboe:
 "One fix for a regression with poll in this merge window, and another
  just hardens the io-wq exit path a bit"

* tag 'io_uring-5.13-2021-05-22' of git://git.kernel.dk/linux-block:
  io_uring: fortify tctx/io_wq cleanup
  io_uring: don't modify req->poll for rw

4 years agoMerge tag 'for-linus-5.13b-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sat, 22 May 2021 17:33:09 +0000 (07:33 -1000)]
Merge tag 'for-linus-5.13b-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip

Pull xen fixes from Juergen Gross:

 - a fix for a boot regression when running as PV guest on hardware
   without NX support

 - a small series fixing a bug in the Xen pciback driver when
   configuring a PCI card with multiple virtual functions

* tag 'for-linus-5.13b-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xen-pciback: reconfigure also from backend watch handler
  xen-pciback: redo VF placement in the virtual topology
  x86/Xen: swap NX determination and GDT setup on BSP

4 years agoMerge tag 'xfs-5.13-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
Linus Torvalds [Sat, 22 May 2021 04:45:09 +0000 (18:45 -1000)]
Merge tag 'xfs-5.13-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux

Pull xfs fixes from Darrick Wong:

 - Fix some math errors in the realtime allocator when extent size hints
   are applied.

 - Fix unnecessary short writes to realtime files when free space is
   fragmented.

 - Fix a crash when using scrub tracepoints.

 - Restore ioctl uapi definitions that were accidentally removed in
   5.13-rc1.

* tag 'xfs-5.13-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
  xfs: restore old ioctl definitions
  xfs: fix deadlock retry tracepoint arguments
  xfs: retry allocations when locality-based search fails
  xfs: adjust rt allocation minlen when extszhint > rtextsize

4 years agoMerge tag 'for-5.13-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave...
Linus Torvalds [Fri, 21 May 2021 23:24:12 +0000 (13:24 -1000)]
Merge tag 'for-5.13-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux

Pull btrfs fixes from David Sterba:
 "A few more fixes:

   - fix unaligned compressed writes in zoned mode

   - fix false positive lockdep warning when cloning inline extent

   - remove wrong BUG_ON in tree-log error handling"

* tag 'for-5.13-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  btrfs: zoned: fix parallel compressed writes
  btrfs: zoned: pass start block to btrfs_use_zone_append
  btrfs: do not BUG_ON in link_to_fixup_dir
  btrfs: release path before starting transaction when cloning inline extent

4 years agoMerge tag '5.13-rc3-smb3' of git://git.samba.org/sfrench/cifs-2.6
Linus Torvalds [Fri, 21 May 2021 23:12:51 +0000 (13:12 -1000)]
Merge tag '5.13-rc3-smb3' of git://git.samba.org/sfrench/cifs-2.6

Pull cifs fixes from Steve French:
 "Seven smb3 fixes: one for stable, three others fix problems found in
  testing handle leases, and a compounded request fix"

* tag '5.13-rc3-smb3' of git://git.samba.org/sfrench/cifs-2.6:
  Fix KASAN identified use-after-free issue.
  Defer close only when lease is enabled.
  Fix kernel oops when CONFIG_DEBUG_ATOMIC_SLEEP is enabled.
  cifs: Fix inconsistent indenting
  cifs: fix memory leak in smb2_copychunk_range
  SMB3: incorrect file id in requests compounded with open
  cifs: remove deadstore in cifs_close_all_deferred_files()

4 years agoMerge tag 'gpio-fixes-for-v5.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Fri, 21 May 2021 23:11:00 +0000 (13:11 -1000)]
Merge tag 'gpio-fixes-for-v5.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux

Pull gpio fixes from Bartosz Golaszewski:

 - add missing MODULE_DEVICE_TABLE in gpio-cadence

 - fix a kernel doc validator error in gpio-xilinx

 - don't set parent IRQ affinity in gpio-tegra186

* tag 'gpio-fixes-for-v5.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
  gpio: tegra186: Don't set parent IRQ affinity
  gpio: xilinx: Correct kernel doc for xgpio_probe()
  gpio: cadence: Add missing MODULE_DEVICE_TABLE

4 years agoMerge tag 'mmc-v5.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc
Linus Torvalds [Fri, 21 May 2021 16:31:34 +0000 (06:31 -1000)]
Merge tag 'mmc-v5.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc

Pull MMC host fixes from Ulf Hansson:

 - Fix SD-card detection on Intel NUC10i3FNK4 (GL9755)

 - Replace WARN_ONCE with dev_warn_once for scatterlist offsets

 - Extend check of scatterlist size alignment with SD_IO_RW_EXTENDED

* tag 'mmc-v5.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
  mmc: sdhci-pci-gli: increase 1.8V regulator wait
  mmc: meson-gx: also check SD_IO_RW_EXTENDED for scatterlist size alignment
  mmc: meson-gx: make replace WARN_ONCE with dev_warn_once about scatterlist offset alignment

4 years agoMerge tag 'devicetree-fixes-for-5.13-2' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Fri, 21 May 2021 16:24:45 +0000 (06:24 -1000)]
Merge tag 'devicetree-fixes-for-5.13-2' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux

Pull devicetree fixes from Rob Herring:

 - Another batch of removing unneeded type references in schemas

 - Fix some out of date filename references

 - Convert renesas,drif schema to use DT graph schema

* tag 'devicetree-fixes-for-5.13-2' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
  dt-bindings: More removals of type references on common properties
  dt-bindings: media: renesas,drif: Use graph schema
  leds: Fix reference file name of documentation
  dt-bindings: phy: cadence-torrent: update reference file of docs

4 years agoMerge branch 'for-v5.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/ebieder...
Linus Torvalds [Fri, 21 May 2021 16:12:52 +0000 (06:12 -1000)]
Merge branch 'for-v5.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace

Pull siginfo fix from Eric Biederman:
 "During the merge window an issue with si_perf and the siginfo ABI came
  up. The alpha and sparc siginfo structure layout had changed with the
  addition of SIGTRAP TRAP_PERF and the new field si_perf.

  The reason only alpha and sparc were affected is that they are the
  only architectures that use si_trapno.

  Looking deeper it was discovered that si_trapno is used for only a few
  select signals on alpha and sparc, and that none of the other
  _sigfault fields past si_addr are used at all. Which means technically
  no regression on alpha and sparc.

  While the alignment concerns might be dismissed the abuse of si_errno
  by SIGTRAP TRAP_PERF does have the potential to cause regressions in
  existing userspace.

  While we still have time before userspace starts using and depending
  on the new definition siginfo for SIGTRAP TRAP_PERF this set of
  changes cleans up siginfo_t.

   - The si_trapno field is demoted from magic alpha and sparc status
     and made an ordinary union member of the _sigfault member of
     siginfo_t. Without moving it of course.

   - si_perf is replaced with si_perf_data and si_perf_type ending the
     abuse of si_errno.

   - Unnecessary additions to signalfd_siginfo are removed"

* 'for-v5.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
  signalfd: Remove SIL_PERF_EVENT fields from signalfd_siginfo
  signal: Deliver all of the siginfo perf data in _perf
  signal: Factor force_sig_perf out of perf_sigtrap
  signal: Implement SIL_FAULT_TRAPNO
  siginfo: Move si_trapno inside the union inside _si_fault

4 years agoMerge tag 'modules-for-v5.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Fri, 21 May 2021 16:09:17 +0000 (06:09 -1000)]
Merge tag 'modules-for-v5.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux

Pull module fix from Jessica Yu:
 "When CONFIG_MODULE_UNLOAD=n, module exit sections get sorted into the
  init region of the module in order to satisfy the requirements of
  jump_labels and static_calls.

  Previously, the exit section check was done in module_init_section(),
  but the solution there is not completely arch-indepedent as ARM is a
  special case and supplies its own module_init_section() function.

  Instead of pushing this logic further to the arch-specific code,
  switch to an arch-independent solution to check for module exit
  sections in the core module loader code in layout_sections() instead"

* tag 'modules-for-v5.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux:
  module: check for exit sections in layout_sections() instead of module_init_section()

4 years agoMerge tag 'for-linus' of git://github.com/openrisc/linux
Linus Torvalds [Fri, 21 May 2021 16:06:19 +0000 (06:06 -1000)]
Merge tag 'for-linus' of git://github.com/openrisc/linux

Pull OpenRISC fixes from Stafford Horne:
 "A few fixes that came in around the time of the merge window"

* tag 'for-linus' of git://github.com/openrisc/linux:
  openrisc: Define memory barrier mb
  openrisc: mm/init.c: remove unused variable 'end' in paging_init()
  openrisc: mm/init.c: remove unused memblock_region variable in map_ram()
  openrisc: Fix a memory leak

4 years agoxen-pciback: reconfigure also from backend watch handler
Jan Beulich [Tue, 18 May 2021 16:14:07 +0000 (18:14 +0200)]
xen-pciback: reconfigure also from backend watch handler

When multiple PCI devices get assigned to a guest right at boot, libxl
incrementally populates the backend tree. The writes for the first of
the devices trigger the backend watch. In turn xen_pcibk_setup_backend()
will set the XenBus state to Initialised, at which point no further
reconfigures would happen unless a device got hotplugged. Arrange for
reconfigure to also get triggered from the backend watch handler.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Cc: stable@vger.kernel.org
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Link: https://lore.kernel.org/r/2337cbd6-94b9-4187-9862-c03ea12e0c61@suse.com
Signed-off-by: Juergen Gross <jgross@suse.com>
4 years agoxen-pciback: redo VF placement in the virtual topology
Jan Beulich [Tue, 18 May 2021 16:13:42 +0000 (18:13 +0200)]
xen-pciback: redo VF placement in the virtual topology

The commit referenced below was incomplete: It merely affected what
would get written to the vdev-<N> xenstore node. The guest would still
find the function at the original function number as long as
__xen_pcibk_get_pci_dev() wouldn't be in sync. The same goes for AER wrt
__xen_pcibk_get_pcifront_dev().

Undo overriding the function to zero and instead make sure that VFs at
function zero remain alone in their slot. This has the added benefit of
improving overall capacity, considering that there's only a total of 32
slots available right now (PCI segment and bus can both only ever be
zero at present).

Fixes: 8a5248fe10b1 ("xen PV passthru: assign SR-IOV virtual functions to separate virtual slots")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Cc: stable@vger.kernel.org
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Link: https://lore.kernel.org/r/8def783b-404c-3452-196d-3f3fd4d72c9e@suse.com
Signed-off-by: Juergen Gross <jgross@suse.com>
4 years agox86/Xen: swap NX determination and GDT setup on BSP
Jan Beulich [Thu, 20 May 2021 11:42:42 +0000 (13:42 +0200)]
x86/Xen: swap NX determination and GDT setup on BSP

xen_setup_gdt(), via xen_load_gdt_boot(), wants to adjust page tables.
For this to work when NX is not available, x86_configure_nx() needs to
be called first.

[jgross] Note that this is a revert of 36104cb9012a82e73 ("x86/xen:
Delay get_cpu_cap until stack canary is established"), which is possible
now that we no longer support running as PV guest in 32-bit mode.

Cc: <stable.vger.kernel.org> # 5.9
Fixes: 36104cb9012a82e73 ("x86/xen: Delay get_cpu_cap until stack canary is established")
Reported-by: Olaf Hering <olaf@aepfle.de>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/12a866b0-9e89-59f7-ebeb-a2a6cec0987a@suse.com
Signed-off-by: Juergen Gross <jgross@suse.com>
4 years agoMerge tag 'drm-fixes-2021-05-21-1' of git://anongit.freedesktop.org/drm/drm
Linus Torvalds [Fri, 21 May 2021 06:15:43 +0000 (20:15 -1000)]
Merge tag 'drm-fixes-2021-05-21-1' of git://anongit.freedesktop.org/drm/drm

Pull drm fixes from Dave Airlie:
 "Usual collection, mostly amdgpu and some i915 regression fixes. I
  nearly managed to hose my build/sign machine this week, but I
  recovered it just in time, and I even got clang12 built.

  dma-buf:
   - WARN fix

  amdgpu:
   - Fix downscaling ratio on DCN3.x
   - Fix for non-4K pages
   - PCO/RV compute hang fix
   - Dongle fix
   - Aldebaran codec query support
   - Refcount leak fix
   - Use after free fix
   - Navi12 golden settings updates
   - GPU reset fixes

  radeon:
   - Fix for imported BO handling

  i915:
   - Pin the L-shape quirked object as unshrinkable to fix crashes
   - Disable HiZ Raw Stall Optimization on broken gen7 to fix glitches,
     gfx corruption
   - GVT: Move mdev attribute groups into kvmgt module to fix kconfig
     deps issue

  exynos:
   - Correct kerneldoc of fimd_shadow_protect_win function
   - Drop redundant error messages"

* tag 'drm-fixes-2021-05-21-1' of git://anongit.freedesktop.org/drm/drm:
  dma-buf: fix unintended pin/unpin warnings
  drm/amdgpu: stop touching sched.ready in the backend
  drm/amd/amdgpu: fix a potential deadlock in gpu reset
  drm/amdgpu: update sdma golden setting for Navi12
  drm/amdgpu: update gc golden setting for Navi12
  drm/amdgpu: Fix a use-after-free
  drm/amdgpu: add video_codecs query support for aldebaran
  drm/amd/amdgpu: fix refcount leak
  drm/amd/display: Disconnect non-DP with no EDID
  drm/amdgpu: disable 3DCGCG on picasso/raven1 to avoid compute hang
  drm/amdgpu: Fix GPU TLB update error when PAGE_SIZE > AMDGPU_PAGE_SIZE
  drm/radeon: use the dummy page for GART if needed
  drm/amd/display: Use the correct max downscaling value for DCN3.x family
  drm/i915/gt: Disable HiZ Raw Stall Optimization on broken gen7
  drm/i915/gem: Pin the L-shape quirked object as unshrinkable
  drm/exynos/decon5433: Remove redundant error printing in exynos5433_decon_probe()
  drm/exynos: Remove redundant error printing in exynos_dsi_probe()
  drm/exynos: correct exynos_drm_fimd kerneldoc
  drm/i915/gvt: Move mdev attribute groups into kvmgt module

4 years agoMerge tag 'amd-drm-fixes-5.13-2021-05-19' of https://gitlab.freedesktop.org/agd5f...
Dave Airlie [Fri, 21 May 2021 04:08:04 +0000 (14:08 +1000)]
Merge tag 'amd-drm-fixes-5.13-2021-05-19' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes

amd-drm-fixes-5.13-2021-05-19:

amdgpu:
- Fix downscaling ratio on DCN3.x
- Fix for non-4K pages
- PCO/RV compute hang fix
- Dongle fix
- Aldebaran codec query support
- Refcount leak fix
- Use after free fix
- Navi12 golden settings updates
- GPU reset fixes

radeon:
- Fix for imported BO handling

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210520022500.4023-1-alexander.deucher@amd.com
4 years agoMerge tag 'drm-intel-fixes-2021-05-20' of git://anongit.freedesktop.org/drm/drm-intel...
Dave Airlie [Fri, 21 May 2021 03:41:41 +0000 (13:41 +1000)]
Merge tag 'drm-intel-fixes-2021-05-20' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes

drm/i915 fixes for v5.13-rc3:
- Pin the L-shape quirked object as unshrinkable to fix crashes
- Disable HiZ Raw Stall Optimization on broken gen7 to fix glitches, gfx corruption
- GVT: Move mdev attribute groups into kvmgt module to fix kconfig deps issue

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/87a6opehx6.fsf@intel.com
4 years agoMerge tag 'drm-misc-fixes-2021-05-20' of git://anongit.freedesktop.org/drm/drm-misc...
Dave Airlie [Fri, 21 May 2021 01:10:04 +0000 (11:10 +1000)]
Merge tag 'drm-misc-fixes-2021-05-20' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes

Just a single fix for a dma-buf related WARN

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/20210520140808.ds6bk6i3oarmiea6@gilmour
4 years agoMerge tag 'exynos-drm-fixes-for-v5.13-rc3' of git://git.kernel.org/pub/scm/linux...
Dave Airlie [Fri, 21 May 2021 01:02:24 +0000 (11:02 +1000)]
Merge tag 'exynos-drm-fixes-for-v5.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/daeinki/drm-exynos into drm-fixes

Fixup
- Correct kerneldoc of fimd_shadow_protect_win function.

Cleanup
- Drop redundant error messages.

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Inki Dae <inki.dae@samsung.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210520034747.257687-1-inki.dae@samsung.com
4 years agoMerge tag 'arm-soc-fixes-5.13-1' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Fri, 21 May 2021 00:46:26 +0000 (14:46 -1000)]
Merge tag 'arm-soc-fixes-5.13-1' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc

Pull ARM SoC fixes from Arnd Bergmann:
 "Only a small number of fixes so far, including some that I had applied
  during the merge window, so this is based on the original merge of the
  other branches.

   - The largest change is a fix for a reference counting bug in the AMD
     TEE driver.

   - Neil Armstrong now co-maintains Amlogic SoC support

   - Two build warning fixes for renesas device tree files

   - A sign expansion bug for optee

   - A DT binding fix for a mismerge"

* tag 'arm-soc-fixes-5.13-1' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
  ARM: npcm: wpcm450: select interrupt controller driver
  MAINTAINERS: ARM/Amlogic SoCs: add Neil as primary maintainer
  tee: amdtee: unload TA only when its refcount becomes 0
  dt-bindings: nvmem: mediatek: remove duplicate mt8192 line
  firmware: arm_scmi: Remove duplicate declaration of struct scmi_protocol_handle
  firmware: arm_scpi: Prevent the ternary sign expansion bug
  arm64: dts: renesas: Add port@0 node for all CSI-2 nodes to dtsi
  arm64: dts: renesas: aistarvision-mipi-adapter-2.1: Fix CSI40 ports

4 years agoMerge branch 'urgent.2021.05.20a' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Fri, 21 May 2021 00:43:33 +0000 (14:43 -1000)]
Merge branch 'urgent.2021.05.20a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu

Pull kcsan fix from Paul McKenney:
 "Fix for a regression introduced in this merge window by commit
  e36299efe7d7 ("kcsan, debugfs: Move debugfs file creation out of early
  init").

  The regression is not easy to trigger, requiring a KCSAN build using
  clang with CONFIG_LTO_CLANG=y. The fix is to simply make the
  kcsan_debugfs_init() function's type initcall-compatible. This has
  been posted to the relevant mailing lists:"

* 'urgent.2021.05.20a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu:
  kcsan: Fix debugfs initcall return type

4 years agoMerge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Linus Torvalds [Fri, 21 May 2021 00:41:35 +0000 (14:41 -1000)]
Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
 "Eight small fixes, all in drivers"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: pm80xx: Fix drives missing during rmmod/insmod loop
  scsi: qla2xxx: Fix error return code in qla82xx_write_flash_dword()
  scsi: qedf: Add pointer checks in qedf_update_link_speed()
  scsi: ufs: core: Increase the usable queue depth
  scsi: BusLogic: Fix 64-bit system enumeration error for Buslogic
  scsi: ufs: ufs-mediatek: Fix power down spec violation

4 years agoMerge tag 'for-5.13/dm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device...
Linus Torvalds [Fri, 21 May 2021 00:36:21 +0000 (14:36 -1000)]
Merge tag 'for-5.13/dm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm

Pull device mapper fixes from Mike Snitzer:

 - Fix a couple DM snapshot target crashes exposed by user-error.

 - Fix DM integrity target to not use discard optimization, introduced
   during 5.13 merge, when recalulating.

 - Fix some sparse warnings in DM integrity target.

* tag 'for-5.13/dm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
  dm integrity: fix sparse warnings
  dm integrity: revert to not using discard filler when recalulating
  dm snapshot: fix crash with transient storage and zero chunk size
  dm snapshot: fix a crash when an origin has no snapshots

4 years agoFix KASAN identified use-after-free issue.
Rohith Surabattula [Thu, 20 May 2021 16:45:01 +0000 (16:45 +0000)]
Fix KASAN identified use-after-free issue.

[  612.157429] ==================================================================
[  612.158275] BUG: KASAN: use-after-free in process_one_work+0x90/0x9b0
[  612.158801] Read of size 8 at addr ffff88810a31ca60 by task kworker/2:9/2382

[  612.159611] CPU: 2 PID: 2382 Comm: kworker/2:9 Tainted: G
OE     5.13.0-rc2+ #98
[  612.159623] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS 1.14.0-1.fc33 04/01/2014
[  612.159640] Workqueue:  0x0 (deferredclose)
[  612.159669] Call Trace:
[  612.159685]  dump_stack+0xbb/0x107
[  612.159711]  print_address_description.constprop.0+0x18/0x140
[  612.159733]  ? process_one_work+0x90/0x9b0
[  612.159743]  ? process_one_work+0x90/0x9b0
[  612.159754]  kasan_report.cold+0x7c/0xd8
[  612.159778]  ? lock_is_held_type+0x80/0x130
[  612.159789]  ? process_one_work+0x90/0x9b0
[  612.159812]  kasan_check_range+0x145/0x1a0
[  612.159834]  process_one_work+0x90/0x9b0
[  612.159877]  ? pwq_dec_nr_in_flight+0x110/0x110
[  612.159914]  ? spin_bug+0x90/0x90
[  612.159967]  worker_thread+0x3b6/0x6c0
[  612.160023]  ? process_one_work+0x9b0/0x9b0
[  612.160038]  kthread+0x1dc/0x200
[  612.160051]  ? kthread_create_worker_on_cpu+0xd0/0xd0
[  612.160092]  ret_from_fork+0x1f/0x30

[  612.160399] Allocated by task 2358:
[  612.160757]  kasan_save_stack+0x1b/0x40
[  612.160768]  __kasan_kmalloc+0x9b/0xd0
[  612.160778]  cifs_new_fileinfo+0xb0/0x960 [cifs]
[  612.161170]  cifs_open+0xadf/0xf20 [cifs]
[  612.161421]  do_dentry_open+0x2aa/0x6b0
[  612.161432]  path_openat+0xbd9/0xfa0
[  612.161441]  do_filp_open+0x11d/0x230
[  612.161450]  do_sys_openat2+0x115/0x240
[  612.161460]  __x64_sys_openat+0xce/0x140

When mod_delayed_work is called to modify the delay of pending work,
it might return false and queue a new work when pending work is
already scheduled or when try to grab pending work failed.

So, Increase the reference count when new work is scheduled to
avoid use-after-free.

Signed-off-by: Rohith Surabattula <rohiths@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
4 years agoMerge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
Linus Torvalds [Thu, 20 May 2021 16:44:04 +0000 (06:44 -1000)]
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma

Pull rdma fixes from Jason Gunthorpe:
 "A mixture of small bug fixes, most for longer standing problems:

   - NULL pointer crash in siw

   - Various error unwind bugs in siw, rxe, cm

   - User triggerable errors in uverbs

   - Minor bugs in mlx5 and rxe drivers"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
  RDMA/uverbs: Fix a NULL vs IS_ERR() bug
  RDMA/mlx5: Fix query DCT via DEVX
  RDMA/core: Don't access cm_id after its destruction
  RDMA/rxe: Return CQE error if invalid lkey was supplied
  RDMA/mlx5: Recover from fatal event in dual port mode
  RDMA/mlx5: Verify that DM operation is reasonable
  RDMA/rxe: Clear all QP fields if creation failed
  RDMA/core: Prevent divide-by-zero error triggered by the user
  RDMA/siw: Release xarray entry
  RDMA/siw: Properly check send and receive CQ pointers

4 years agoMerge tag 'sound-5.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai...
Linus Torvalds [Thu, 20 May 2021 16:42:21 +0000 (06:42 -1000)]
Merge tag 'sound-5.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound

Pull sound fixes from Takashi Iwai:
 "All small device-specific fixes here: a series of FireWire audio
  fixes, UAF and other fixes in USB-audio and co spotted by fuzzer,
  and a few HD-audio quirks as usual"

* tag 'sound-5.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ALSA: line6: Fix racy initialization of LINE6 MIDI
  ALSA: dice: fix stream format for TC Electronic Konnekt Live at high sampling transfer frequency
  ALSA: dice: disable double_pcm_frames mode for M-Audio Profire 610, 2626 and Avid M-Box 3 Pro
  ALSA: intel8x0: Don't update period unless prepared
  ALSA: hda/realtek: Add some CLOVE SSIDs of ALC293
  ALSA: firewire-lib: fix amdtp_packet tracepoints event for packet_index field
  ALSA: firewire-lib: fix calculation for size of IR context payload
  ALSA: firewire-lib: fix check for the size of isochronous packet payload
  ALSA: bebob/oxfw: fix Kconfig entry for Mackie d.2 Pro
  ALSA: dice: fix stream format at middle sampling rate for Alesis iO 26
  ALSA: hda/realtek: Add fixup for HP Spectre x360 15-df0xxx
  ALSA: usb-audio: Fix potential out-of-bounce access in MIDI EP parser
  ALSA: usb-audio: Validate MS endpoint descriptors
  ALSA: hda: fixup headset for ASUS GU502 laptop
  ALSA: hda/realtek: reset eapd coeff to default value for alc287

4 years agoMerge tag 'platform-drivers-x86-v5.13-2' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Thu, 20 May 2021 16:40:20 +0000 (06:40 -1000)]
Merge tag 'platform-drivers-x86-v5.13-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86

Pull x86 platform driver fixes from Hans de Goede:
 "Assorted pdx86 bug-fixes and model-specific quirks for 5.13"

* tag 'platform-drivers-x86-v5.13-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86:
  platform/x86: touchscreen_dmi: Add info for the Chuwi Hi10 Pro (CWI529) tablet
  platform/x86: touchscreen_dmi: Add info for the Mediacom Winpad 7.0 W700 tablet
  platform/x86: intel_punit_ipc: Append MODULE_DEVICE_TABLE for ACPI
  platform/x86: dell-smbios-wmi: Fix oops on rmmod dell_smbios
  platform/x86: hp-wireless: add AMD's hardware id to the supported list
  platform/x86: intel_int0002_vgpio: Only call enable_irq_wake() when using s2idle
  platform/x86: gigabyte-wmi: add support for B550 Aorus Elite
  platform/x86: gigabyte-wmi: add support for X570 UD
  platform/x86: gigabyte-wmi: streamline dmi matching
  platform/mellanox: mlxbf-tmfifo: Fix a memory barrier issue
  platform/surface: dtx: Fix poll function
  platform/surface: aggregator: Add platform-drivers-x86 list to MAINTAINERS entry
  platform/surface: aggregator: avoid clang -Wconstant-conversion warning
  platform/surface: aggregator: Do not mark interrupt as shared
  platform/x86: hp_accel: Avoid invoking _INI to speed up resume
  platform/x86: ideapad-laptop: fix method name typo
  platform/x86: ideapad-laptop: fix a NULL pointer dereference

4 years agoMerge tag 'char-misc-5.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregk...
Linus Torvalds [Thu, 20 May 2021 16:31:52 +0000 (06:31 -1000)]
Merge tag 'char-misc-5.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc

Pull char/misc driver fixes from Greg KH:
 "Here is a big set of char/misc/other driver fixes for 5.13-rc3.

  The majority here is the fallout of the umn.edu re-review of all prior
  submissions. That resulted in a bunch of reverts along with the
  "correct" changes made, such that there is no regression of any of the
  potential fixes that were made by those individuals. I would like to
  thank the over 80 different developers who helped with the review and
  fixes for this mess.

  Other than that, there's a few habanna driver fixes for reported
  issues, and some dyndbg fixes for reported problems.

  All of these have been in linux-next for a while with no reported
  problems"

* tag 'char-misc-5.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (82 commits)
  misc: eeprom: at24: check suspend status before disable regulator
  uio_hv_generic: Fix another memory leak in error handling paths
  uio_hv_generic: Fix a memory leak in error handling paths
  uio/uio_pci_generic: fix return value changed in refactoring
  Revert "Revert "ALSA: usx2y: Fix potential NULL pointer dereference""
  dyndbg: drop uninformative vpr_info
  dyndbg: avoid calling dyndbg_emit_prefix when it has no work
  binder: Return EFAULT if we fail BINDER_ENABLE_ONEWAY_SPAM_DETECTION
  cdrom: gdrom: initialize global variable at init time
  brcmfmac: properly check for bus register errors
  Revert "brcmfmac: add a check for the status of usb_register"
  video: imsttfb: check for ioremap() failures
  Revert "video: imsttfb: fix potential NULL pointer dereferences"
  net: liquidio: Add missing null pointer checks
  Revert "net: liquidio: fix a NULL pointer dereference"
  media: gspca: properly check for errors in po1030_probe()
  Revert "media: gspca: Check the return value of write_bridge for timeout"
  media: gspca: mt9m111: Check write_bridge for timeout
  Revert "media: gspca: mt9m111: Check write_bridge for timeout"
  media: dvb: Add check on sp8870_readreg return
  ...

4 years agoMerge tag 'quota_for_v5.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Thu, 20 May 2021 16:20:15 +0000 (06:20 -1000)]
Merge tag 'quota_for_v5.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs

Pull quota fixes from Jan Kara:
 "The most important part in the pull is disablement of the new syscall
  quotactl_path() which was added in rc1.

  The reason is some people at LWN discussion pointed out dirfd would be
  useful for this path based syscall and Christian Brauner agreed.

  Without dirfd it may be indeed problematic for containers. So let's
  just disable the syscall for now when it doesn't have users yet so
  that we have more time to mull over how to best specify the filesystem
  we want to work on"

* tag 'quota_for_v5.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
  quota: Disable quotactl_path syscall
  quota: Use 'hlist_for_each_entry' to simplify code

4 years agoxfs: restore old ioctl definitions
Darrick J. Wong [Wed, 12 May 2021 23:43:10 +0000 (16:43 -0700)]
xfs: restore old ioctl definitions

These ioctl definitions in xfs_fs.h are part of the userspace ABI and
were mistakenly removed during the 5.13 merge window.

Fixes: 9fefd5db08ce ("xfs: convert to fileattr")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
4 years agoxfs: fix deadlock retry tracepoint arguments
Darrick J. Wong [Wed, 12 May 2021 23:41:13 +0000 (16:41 -0700)]
xfs: fix deadlock retry tracepoint arguments

sc->ip is the inode that's being scrubbed, which means that it's not set
for scrub types that don't involve inodes.  If one of those scrubbers
(e.g. inode btrees) returns EDEADLOCK, we'll trip over the null pointer.
Fix that by reporting either the file being examined or the file that
was used to call scrub.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Brian Foster <bfoster@redhat.com>
4 years agoxfs: retry allocations when locality-based search fails
Darrick J. Wong [Sun, 9 May 2021 23:22:55 +0000 (16:22 -0700)]
xfs: retry allocations when locality-based search fails

If a realtime allocation fails because we can't find a sufficiently
large free extent satisfying locality rules, relax the locality rules
and try again.  This reduces the occurrence of short writes to realtime
files when the write size is large and the free space is fragmented.

This was originally discovered by running generic/186 with the realtime
reflink patchset and a 128k cow extent size hint, but the short write
symptoms can manifest with a 128k extent size hint and no reflink, so
apply the fix now.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
4 years agopowerpc/64s/syscall: Fix ptrace syscall info with scv syscalls
Nicholas Piggin [Thu, 20 May 2021 11:19:31 +0000 (21:19 +1000)]
powerpc/64s/syscall: Fix ptrace syscall info with scv syscalls

The scv implementation missed updating syscall return value and error
value get/set functions to deal with the changed register ABI. This
broke ptrace PTRACE_GET_SYSCALL_INFO as well as some kernel auditing
and tracing functions.

Fix. tools/testing/selftests/ptrace/get_syscall_info now passes when
scv is used.

Fixes: 7fa95f9adaee ("powerpc/64s: system call support for scv/rfscv instructions")
Cc: stable@vger.kernel.org # v5.9+
Reported-by: "Dmitry V. Levin" <ldv@altlinux.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Dmitry V. Levin <ldv@altlinux.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210520111931.2597127-2-npiggin@gmail.com
4 years agopowerpc/64s/syscall: Use pt_regs.trap to distinguish syscall ABI difference between...
Nicholas Piggin [Thu, 20 May 2021 11:19:30 +0000 (21:19 +1000)]
powerpc/64s/syscall: Use pt_regs.trap to distinguish syscall ABI difference between sc and scv syscalls

The sc and scv 0 system calls have different ABI conventions, and
ptracers need to know which system call type is being used if they want
to look at the syscall registers.

Document that pt_regs.trap can be used for this, and fix one in-tree user
to work with scv 0 syscalls.

Fixes: 7fa95f9adaee ("powerpc/64s: system call support for scv/rfscv instructions")
Cc: stable@vger.kernel.org # v5.9+
Reported-by: "Dmitry V. Levin" <ldv@altlinux.org>
Suggested-by: "Dmitry V. Levin" <ldv@altlinux.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210520111931.2597127-1-npiggin@gmail.com
4 years agoblock: fix a race between del_gendisk and BLKRRPART
Gulam Mohamed [Fri, 14 May 2021 13:18:42 +0000 (15:18 +0200)]
block: fix a race between del_gendisk and BLKRRPART

When BLKRRPART is called concurrently with del_gendisk, the partitions
rescan can create a stale partition that will never be be cleaned up.

Fix this by checking the the disk is up before rescanning partitions
while under bd_mutex.

Signed-off-by: Gulam Mohamed <gulam.mohamed@oracle.com>
[hch: split from a larger patch]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20210514131842.1600568-3-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
4 years agoblock: prevent block device lookups at the beginning of del_gendisk
Christoph Hellwig [Fri, 14 May 2021 13:18:41 +0000 (15:18 +0200)]
block: prevent block device lookups at the beginning of del_gendisk

As an artifact of how gendisk lookup used to work in earlier kernels,
GENHD_FL_UP is only cleared very late in del_gendisk, and a global lock
is used to prevent opens from succeeding while del_gendisk is tearing
down the gendisk.  Switch to clearing the flag early and under bd_mutex
so that callers can use bd_mutex to stabilize the flag, which removes
the need for the global mutex.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20210514131842.1600568-2-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
4 years agoMerge tag 'nvme-5.13-2021-05-20' of git://git.infradead.org/nvme into block-5.13
Jens Axboe [Thu, 20 May 2021 13:54:49 +0000 (07:54 -0600)]
Merge tag 'nvme-5.13-2021-05-20' of git://git.infradead.org/nvme into block-5.13

Pull NVMe fixes from Christoph:

"nvme fixes for Linux 5.13:

 - nvme-tcp corruption and timeout fixes (Sagi Grimberg, Keith Busch)
 - nvme-fc teardown fix (James Smart)
 - nvmet/nvme-loop memory leak fixes (Wu Bo)"

* tag 'nvme-5.13-2021-05-20' of git://git.infradead.org/nvme:
  nvme-fc: clear q_live at beginning of association teardown
  nvme-tcp: rerun io_work if req_list is not empty
  nvme-tcp: fix possible use-after-completion
  nvme-loop: fix memory leak in nvme_loop_create_ctrl()
  nvmet: fix memory leak in nvmet_alloc_ctrl()

4 years agobtrfs: zoned: fix parallel compressed writes
Johannes Thumshirn [Tue, 18 May 2021 15:40:28 +0000 (00:40 +0900)]
btrfs: zoned: fix parallel compressed writes

When multiple processes write data to the same block group on a
compressed zoned filesystem, the underlying device could report I/O
errors and data corruption is possible.

This happens because on a zoned file system, compressed data writes
where sent to the device via a REQ_OP_WRITE instead of a
REQ_OP_ZONE_APPEND operation. But with REQ_OP_WRITE and parallel
submission it cannot be guaranteed that the data is always submitted
aligned to the underlying zone's write pointer.

The change to using REQ_OP_ZONE_APPEND instead of REQ_OP_WRITE on a
zoned filesystem is non intrusive on a regular file system or when
submitting to a conventional zone on a zoned filesystem, as it is
guarded by btrfs_use_zone_append.

Reported-by: David Sterba <dsterba@suse.com>
Fixes: 9d294a685fbc ("btrfs: zoned: enable to mount ZONED incompat flag")
CC: stable@vger.kernel.org # 5.12.x: e380adfc213a13: btrfs: zoned: pass start block to btrfs_use_zone_append
CC: stable@vger.kernel.org # 5.12.x
Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
4 years agobtrfs: zoned: pass start block to btrfs_use_zone_append
Johannes Thumshirn [Tue, 18 May 2021 15:40:27 +0000 (00:40 +0900)]
btrfs: zoned: pass start block to btrfs_use_zone_append

btrfs_use_zone_append only needs the passed in extent_map's block_start
member, so there's no need to pass in the full extent map.

This also enables the use of btrfs_use_zone_append in places where we only
have a start byte but no extent_map.

Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
4 years agoio_uring: fortify tctx/io_wq cleanup
Pavel Begunkov [Thu, 20 May 2021 12:21:20 +0000 (13:21 +0100)]
io_uring: fortify tctx/io_wq cleanup

We don't want anyone poking into tctx->io_wq awhile it's being destroyed
by io_wq_put_and_exit(), and even though it shouldn't even happen, if
buggy would be preferable to get a NULL-deref instead of subtle delayed
failure or UAF.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/827b021de17926fd807610b3e53a5a5fa8530856.1621513214.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
4 years agoplatform/x86: touchscreen_dmi: Add info for the Chuwi Hi10 Pro (CWI529) tablet
Hans de Goede [Thu, 20 May 2021 09:32:28 +0000 (11:32 +0200)]
platform/x86: touchscreen_dmi: Add info for the Chuwi Hi10 Pro (CWI529) tablet

Add touchscreen info for the Chuwi Hi10 Pro (CWI529) tablet. This includes
info for getting the firmware directly from the UEFI, so that the user does
not need to manually install the firmware in /lib/firmware/silead.

This change will make the touchscreen on these devices work OOTB,
without requiring any manual setup.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20210520093228.7439-1-hdegoede@redhat.com
4 years agodma-buf: fix unintended pin/unpin warnings
Christian König [Mon, 17 May 2021 11:20:17 +0000 (13:20 +0200)]
dma-buf: fix unintended pin/unpin warnings

DMA-buf internal users call the pin/unpin functions without having a
dynamic attachment. Avoid the warning and backtrace in the logs.

Signed-off-by: Christian König <christian.koenig@amd.com>
Bugs: https://gitlab.freedesktop.org/drm/intel/-/issues/3481
Fixes: c545781e1c55 ("dma-buf: doc polish for pin/unpin")
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
CC: stable@kernel.org
Link: https://patchwork.freedesktop.org/patch/msgid/20210517115705.2141-1-christian.koenig@amd.com
4 years agopowerpc: Fix early setup to make early_ioremap() work
Alexey Kardashevskiy [Thu, 20 May 2021 03:29:19 +0000 (13:29 +1000)]
powerpc: Fix early setup to make early_ioremap() work

The immediate problem is that after commit
0bd3f9e953bd ("powerpc/legacy_serial: Use early_ioremap()") the kernel
silently reboots on some systems.

The reason is that early_ioremap() returns broken addresses as it uses
slot_virt[] array which initialized with offsets from FIXADDR_TOP ==
IOREMAP_END+FIXADDR_SIZE == KERN_IO_END - FIXADDR_SIZ + FIXADDR_SIZE ==
__kernel_io_end which is 0 when early_ioremap_setup() is called.
__kernel_io_end is initialized little bit later in early_init_mmu().

This fixes the initialization by swapping early_ioremap_setup() and
early_init_mmu().

Fixes: 265c3491c4bc ("powerpc: Add support for GENERIC_EARLY_IOREMAP")
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
[mpe: Drop unrelated cleanup & cleanup change log]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210520032919.358935-1-aik@ozlabs.ru
4 years agoDefer close only when lease is enabled.
Rohith Surabattula [Mon, 17 May 2021 11:28:34 +0000 (11:28 +0000)]
Defer close only when lease is enabled.

When smb2 lease parameter is disabled on server. Server grants
batch oplock instead of RHW lease by default on open, inode page cache
needs to be zapped immediatley upon close as cache is not valid.

Signed-off-by: Rohith Surabattula <rohiths@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
4 years agoFix kernel oops when CONFIG_DEBUG_ATOMIC_SLEEP is enabled.
Rohith Surabattula [Wed, 5 May 2021 10:56:47 +0000 (10:56 +0000)]
Fix kernel oops when CONFIG_DEBUG_ATOMIC_SLEEP is enabled.

Removed oplock_break_received flag which was added to achieve
synchronization between oplock handler and open handler by earlier commit.

It is not needed because there is an existing lock open_file_lock to achieve
the same. find_readable_file takes open_file_lock and then traverses the
openFileList. Similarly, cifs_oplock_break while closing the deferred
handle (i.e cifsFileInfo_put) takes open_file_lock and then sends close
to the server.

Added comments for better readability.

Signed-off-by: Rohith Surabattula <rohiths@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
4 years agocifs: Fix inconsistent indenting
Jiapeng Chong [Wed, 19 May 2021 10:47:07 +0000 (18:47 +0800)]
cifs: Fix inconsistent indenting

Eliminate the follow smatch warning:

fs/cifs/fs_context.c:1148 smb3_fs_context_parse_param() warn:
inconsistent indenting.

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
4 years agocifs: fix memory leak in smb2_copychunk_range
Ronnie Sahlberg [Tue, 18 May 2021 22:40:11 +0000 (08:40 +1000)]
cifs: fix memory leak in smb2_copychunk_range

When using smb2_copychunk_range() for large ranges we will
run through several iterations of a loop calling SMB2_ioctl()
but never actually free the returned buffer except for the final
iteration.
This leads to memory leaks everytime a large copychunk is requested.

Fixes: 9bf0c9cd4314 ("CIFS: Fix SMB2/SMB3 Copy offload support (refcopy) for large files")
Cc: <stable@vger.kernel.org>
Reviewed-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
4 years agodrm/amdgpu: stop touching sched.ready in the backend
Christian König [Tue, 18 May 2021 15:48:02 +0000 (17:48 +0200)]
drm/amdgpu: stop touching sched.ready in the backend

This unfortunately comes up in regular intervals and breaks
GPU reset for the engine in question.

The sched.ready flag controls if an engine can't get working
during hw_init, but should never be set to false during hw_fini.

v2: squash in unused variable fix (Alex)

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/amdgpu: fix a potential deadlock in gpu reset
Lang Yu [Mon, 17 May 2021 04:47:20 +0000 (12:47 +0800)]
drm/amd/amdgpu: fix a potential deadlock in gpu reset

When amdgpu_ib_ring_tests failed, the reset logic called
amdgpu_device_ip_suspend twice, then deadlock occurred.
Deadlock log:

[  805.655192] amdgpu 0000:04:00.0: amdgpu: ib ring test failed (-110).
[  806.290952] [drm] free PSP TMR buffer

[  806.319406] ============================================
[  806.320315] WARNING: possible recursive locking detected
[  806.321225] 5.11.0-custom #1 Tainted: G        W  OEL
[  806.322135] --------------------------------------------
[  806.323043] cat/2593 is trying to acquire lock:
[  806.323825] ffff888136b1cdc8 (&adev->dm.dc_lock){+.+.}-{3:3}, at: dm_suspend+0xb8/0x1d0 [amdgpu]
[  806.325668]
               but task is already holding lock:
[  806.326664] ffff888136b1cdc8 (&adev->dm.dc_lock){+.+.}-{3:3}, at: dm_suspend+0xb8/0x1d0 [amdgpu]
[  806.328430]
               other info that might help us debug this:
[  806.329539]  Possible unsafe locking scenario:

[  806.330549]        CPU0
[  806.330983]        ----
[  806.331416]   lock(&adev->dm.dc_lock);
[  806.332086]   lock(&adev->dm.dc_lock);
[  806.332738]
                *** DEADLOCK ***

[  806.333747]  May be due to missing lock nesting notation

[  806.334899] 3 locks held by cat/2593:
[  806.335537]  #0: ffff888100d3f1b8 (&attr->mutex){+.+.}-{3:3}, at: simple_attr_read+0x4e/0x110
[  806.337009]  #1: ffff888136b1fd78 (&adev->reset_sem){++++}-{3:3}, at: amdgpu_device_lock_adev+0x42/0x94 [amdgpu]
[  806.339018]  #2: ffff888136b1cdc8 (&adev->dm.dc_lock){+.+.}-{3:3}, at: dm_suspend+0xb8/0x1d0 [amdgpu]
[  806.340869]
               stack backtrace:
[  806.341621] CPU: 6 PID: 2593 Comm: cat Tainted: G        W  OEL    5.11.0-custom #1
[  806.342921] Hardware name: AMD Celadon-CZN/Celadon-CZN, BIOS WLD0C23N_Weekly_20_12_2 12/23/2020
[  806.344413] Call Trace:
[  806.344849]  dump_stack+0x93/0xbd
[  806.345435]  __lock_acquire.cold+0x18a/0x2cf
[  806.346179]  lock_acquire+0xca/0x390
[  806.346807]  ? dm_suspend+0xb8/0x1d0 [amdgpu]
[  806.347813]  __mutex_lock+0x9b/0x930
[  806.348454]  ? dm_suspend+0xb8/0x1d0 [amdgpu]
[  806.349434]  ? amdgpu_device_indirect_rreg+0x58/0x70 [amdgpu]
[  806.350581]  ? _raw_spin_unlock_irqrestore+0x47/0x50
[  806.351437]  ? dm_suspend+0xb8/0x1d0 [amdgpu]
[  806.352437]  ? rcu_read_lock_sched_held+0x4f/0x80
[  806.353252]  ? rcu_read_lock_sched_held+0x4f/0x80
[  806.354064]  mutex_lock_nested+0x1b/0x20
[  806.354747]  ? mutex_lock_nested+0x1b/0x20
[  806.355457]  dm_suspend+0xb8/0x1d0 [amdgpu]
[  806.356427]  ? soc15_common_set_clockgating_state+0x17d/0x19 [amdgpu]
[  806.357736]  amdgpu_device_ip_suspend_phase1+0x78/0xd0 [amdgpu]
[  806.360394]  amdgpu_device_ip_suspend+0x21/0x70 [amdgpu]
[  806.362926]  amdgpu_device_pre_asic_reset+0xb3/0x270 [amdgpu]
[  806.365560]  amdgpu_device_gpu_recover.cold+0x679/0x8eb [amdgpu]

Signed-off-by: Lang Yu <Lang.Yu@amd.com>
Acked-by: Christian KÃnig <christian.koenig@amd.com>
Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: update sdma golden setting for Navi12
Guchun Chen [Mon, 17 May 2021 08:38:00 +0000 (16:38 +0800)]
drm/amdgpu: update sdma golden setting for Navi12

Current golden setting is out of date.

Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
4 years agodrm/amdgpu: update gc golden setting for Navi12
Guchun Chen [Mon, 17 May 2021 08:35:40 +0000 (16:35 +0800)]
drm/amdgpu: update gc golden setting for Navi12

Current golden setting is out of date.

Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
4 years agodrm/amdgpu: Fix a use-after-free
xinhui pan [Tue, 18 May 2021 02:56:07 +0000 (10:56 +0800)]
drm/amdgpu: Fix a use-after-free

looks like we forget to set ttm->sg to NULL.
Hit panic below

[ 1235.844104] general protection fault, probably for non-canonical address 0x6b6b6b6b6b6b7b4b: 0000 [#1] SMP DEBUG_PAGEALLOC NOPTI
[ 1235.989074] Call Trace:
[ 1235.991751]  sg_free_table+0x17/0x20
[ 1235.995667]  amdgpu_ttm_backend_unbind.cold+0x4d/0xf7 [amdgpu]
[ 1236.002288]  amdgpu_ttm_backend_destroy+0x29/0x130 [amdgpu]
[ 1236.008464]  ttm_tt_destroy+0x1e/0x30 [ttm]
[ 1236.013066]  ttm_bo_cleanup_memtype_use+0x51/0xa0 [ttm]
[ 1236.018783]  ttm_bo_release+0x262/0xa50 [ttm]
[ 1236.023547]  ttm_bo_put+0x82/0xd0 [ttm]
[ 1236.027766]  amdgpu_bo_unref+0x26/0x50 [amdgpu]
[ 1236.032809]  amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu+0x7aa/0xd90 [amdgpu]
[ 1236.040400]  kfd_ioctl_alloc_memory_of_gpu+0xe2/0x330 [amdgpu]
[ 1236.046912]  kfd_ioctl+0x463/0x690 [amdgpu]

Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: add video_codecs query support for aldebaran
James Zhu [Tue, 18 May 2021 12:44:23 +0000 (08:44 -0400)]
drm/amdgpu: add video_codecs query support for aldebaran

Add video_codecs query support for aldebaran.

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/amdgpu: fix refcount leak
Jingwen Chen [Mon, 17 May 2021 08:16:10 +0000 (16:16 +0800)]
drm/amd/amdgpu: fix refcount leak

[Why]
the gem object rfb->base.obj[0] is get according to num_planes
in amdgpufb_create, but is not put according to num_planes

[How]
put rfb->base.obj[0] in amdgpu_fbdev_destroy according to num_planes

Signed-off-by: Jingwen Chen <Jingwen.Chen2@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/display: Disconnect non-DP with no EDID
Chris Park [Tue, 4 May 2021 20:20:55 +0000 (16:20 -0400)]
drm/amd/display: Disconnect non-DP with no EDID

[Why]
Active DP dongles return no EDID when dongle
is connected, but VGA display is taken out.
Current driver behavior does not remove the
active display when this happens, and this is
a gap between dongle DTP and dongle behavior.

[How]
For active DP dongles and non-DP scenario,
disconnect sink on detection when no EDID
is read due to timeout.

Signed-off-by: Chris Park <Chris.Park@amd.com>
Reviewed-by: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com>
Acked-by: Stylon Wang <stylon.wang@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: disable 3DCGCG on picasso/raven1 to avoid compute hang
Changfeng [Fri, 14 May 2021 07:28:25 +0000 (15:28 +0800)]
drm/amdgpu: disable 3DCGCG on picasso/raven1 to avoid compute hang

There is problem with 3DCGCG firmware and it will cause compute test
hang on picasso/raven1. It needs to disable 3DCGCG in driver to avoid
compute hang.

Signed-off-by: Changfeng <Changfeng.Zhu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
4 years agodrm/amdgpu: Fix GPU TLB update error when PAGE_SIZE > AMDGPU_PAGE_SIZE
Yi Li [Fri, 14 May 2021 06:40:39 +0000 (14:40 +0800)]
drm/amdgpu: Fix GPU TLB update error when PAGE_SIZE > AMDGPU_PAGE_SIZE

When PAGE_SIZE is larger than AMDGPU_PAGE_SIZE, the number of GPU TLB
entries which need to update in amdgpu_map_buffer() should be multiplied
by AMDGPU_GPU_PAGES_IN_CPU_PAGE (PAGE_SIZE / AMDGPU_PAGE_SIZE).

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Yi Li <liyi@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
4 years agodrm/radeon: use the dummy page for GART if needed
Christian König [Wed, 12 May 2021 08:36:43 +0000 (10:36 +0200)]
drm/radeon: use the dummy page for GART if needed

Imported BOs don't have a pagelist any more.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Fixes: 0575ff3d33cd ("drm/radeon: stop using pages with drm_prime_sg_to_page_addr_arrays v2")
CC: stable@vger.kernel.org # 5.12