www.infradead.org Git - users/hch/block.git/log

]> www.infradead.org Git - users/hch/block.git/log

projects / users / hch / block.git / log

Christoph Hellwig [Tue, 4 May 2021 16:59:06 +0000 (18:59 +0200)]

ubd: use blk_mq_alloc_disk and blk_cleanup_disk

Use blk_mq_alloc_disk and blk_cleanup_disk to simplify the gendisk and
request_queue allocation.

Signed-off-by: Christoph Hellwig <hch@lst.de>

commit | commitdiff | tree

Christoph Hellwig [Tue, 4 May 2021 16:54:38 +0000 (18:54 +0200)]

ubd: remove the code to register as the legacy IDE driver

With the legacy IDE driver long deprecated, and modern userspace being
much more flexible about dev_t assignments there is no reason to fake
a registration as the legacy IDE driver in ubd. This registeration
is a little problematic as it registers the same request_queue for
multiple gendisks, so just remove it.

Signed-off-by: Christoph Hellwig <hch@lst.de>

commit | commitdiff | tree

Christoph Hellwig [Tue, 4 May 2021 15:44:25 +0000 (17:44 +0200)]

mtip32xx: use blk_mq_alloc_disk and blk_cleanup_disk

Use blk_mq_alloc_disk and blk_cleanup_disk to simplify the gendisk and
request_queue allocation.

Signed-off-by: Christoph Hellwig <hch@lst.de>

commit | commitdiff | tree

Christoph Hellwig [Tue, 4 May 2021 15:48:31 +0000 (17:48 +0200)]

mtip32xx: simplify sysfs setup

Pass the driver specific attributes directly to device_add_disk instead
of manually creating them after the disk registration.

Signed-off-by: Christoph Hellwig <hch@lst.de>

commit | commitdiff | tree

Christoph Hellwig [Tue, 4 May 2021 16:49:18 +0000 (18:49 +0200)]

z2ram: use blk_mq_alloc_disk and blk_cleanup_disk

Use blk_mq_alloc_disk and blk_cleanup_disk to simplify the gendisk and
request_queue allocation.

Signed-off-by: Christoph Hellwig <hch@lst.de>

commit | commitdiff | tree

Christoph Hellwig [Tue, 4 May 2021 16:47:59 +0000 (18:47 +0200)]

ataflop: use blk_mq_alloc_disk and blk_cleanup_disk

Use blk_mq_alloc_disk and blk_cleanup_disk to simplify the gendisk and
request_queue allocation.

Signed-off-by: Christoph Hellwig <hch@lst.de>

commit | commitdiff | tree

Christoph Hellwig [Tue, 4 May 2021 16:47:07 +0000 (18:47 +0200)]

amiflop: use blk_mq_alloc_disk and blk_cleanup_disk

Use blk_mq_alloc_disk and blk_cleanup_disk to simplify the gendisk and
request_queue allocation.

Signed-off-by: Christoph Hellwig <hch@lst.de>

commit | commitdiff | tree

Christoph Hellwig [Tue, 4 May 2021 16:40:58 +0000 (18:40 +0200)]

scm_blk: use blk_mq_alloc_disk and blk_cleanup_disk

Use blk_mq_alloc_disk and blk_cleanup_disk to simplify the gendisk and
request_queue allocation.

Signed-off-by: Christoph Hellwig <hch@lst.de>

commit | commitdiff | tree

Christoph Hellwig [Tue, 4 May 2021 16:22:52 +0000 (18:22 +0200)]

ubi: use blk_mq_alloc_disk and blk_cleanup_disk

Use blk_mq_alloc_disk and blk_cleanup_disk to simplify the gendisk and
request_queue allocation.

Signed-off-by: Christoph Hellwig <hch@lst.de>

commit | commitdiff | tree

Christoph Hellwig [Tue, 4 May 2021 16:17:07 +0000 (18:17 +0200)]

xen-blkfront: use blk_mq_alloc_disk and blk_cleanup_disk

Use blk_mq_alloc_disk and blk_cleanup_disk to simplify the gendisk and
request_queue allocation.

Signed-off-by: Christoph Hellwig <hch@lst.de>

commit | commitdiff | tree

Christoph Hellwig [Tue, 4 May 2021 16:12:00 +0000 (18:12 +0200)]

sx8: use blk_mq_alloc_disk and blk_cleanup_disk

Use blk_mq_alloc_disk and blk_cleanup_disk to simplify the gendisk and
request_queue allocation.

Signed-off-by: Christoph Hellwig <hch@lst.de>

commit | commitdiff | tree

Christoph Hellwig [Tue, 4 May 2021 16:09:14 +0000 (18:09 +0200)]

rnbd: use blk_mq_alloc_disk and blk_cleanup_disk

Use blk_mq_alloc_disk and blk_cleanup_disk to simplify the gendisk and
request_queue allocation.

Signed-off-by: Christoph Hellwig <hch@lst.de>

commit | commitdiff | tree

Christoph Hellwig [Tue, 4 May 2021 16:05:07 +0000 (18:05 +0200)]

rbd: use blk_mq_alloc_disk and blk_cleanup_disk

Use blk_mq_alloc_disk and blk_cleanup_disk to simplify the gendisk and
request_queue allocation.

Signed-off-by: Christoph Hellwig <hch@lst.de>

commit | commitdiff | tree

Christoph Hellwig [Tue, 4 May 2021 16:01:47 +0000 (18:01 +0200)]

pd: use blk_mq_alloc_disk and blk_cleanup_disk

Use blk_mq_alloc_disk and blk_cleanup_disk to simplify the gendisk and
request_queue allocation.

Signed-off-by: Christoph Hellwig <hch@lst.de>

commit | commitdiff | tree

Christoph Hellwig [Tue, 4 May 2021 15:54:54 +0000 (17:54 +0200)]

nullb: use blk_mq_alloc_disk

Use blk_mq_alloc_disk and blk_cleanup_disk to simplify the gendisk and
request_queue allocation.

Signed-off-by: Christoph Hellwig <hch@lst.de>

commit | commitdiff | tree

Christoph Hellwig [Tue, 4 May 2021 15:52:31 +0000 (17:52 +0200)]

nbd: use blk_mq_alloc_disk and blk_cleanup_disk

Use blk_mq_alloc_disk and blk_cleanup_disk to simplify the gendisk and
request_queue allocation.

Signed-off-by: Christoph Hellwig <hch@lst.de>

commit | commitdiff | tree

Christoph Hellwig [Tue, 4 May 2021 15:39:11 +0000 (17:39 +0200)]

loop: use blk_mq_alloc_disk and blk_cleanup_disk

Use blk_mq_alloc_disk and blk_cleanup_disk to simplify the gendisk and
request_queue allocation.

Signed-off-by: Christoph Hellwig <hch@lst.de>

commit | commitdiff | tree

Christoph Hellwig [Tue, 4 May 2021 15:36:47 +0000 (17:36 +0200)]

floppy: use blk_mq_alloc_disk and blk_cleanup_disk

Use blk_mq_alloc_disk and blk_cleanup_disk to simplify the gendisk and
request_queue allocation.

Signed-off-by: Christoph Hellwig <hch@lst.de>

commit | commitdiff | tree

Christoph Hellwig [Tue, 4 May 2021 15:33:15 +0000 (17:33 +0200)]

aoe: use blk_mq_alloc_disk and blk_cleanup_disk

Use blk_mq_alloc_disk and blk_cleanup_disk to simplify the gendisk and
request_queue allocation.

Signed-off-by: Christoph Hellwig <hch@lst.de>

commit | commitdiff | tree

Christoph Hellwig [Thu, 29 Apr 2021 17:29:03 +0000 (19:29 +0200)]

blk-mq: remove blk_mq_init_sq_queue

All users are gone now.

Signed-off-by: Christoph Hellwig <hch@lst.de>

commit | commitdiff | tree

Christoph Hellwig [Thu, 29 Apr 2021 17:28:40 +0000 (19:28 +0200)]

gdrom: use blk_mq_alloc_disk

Use the blk_mq_alloc_disk API to simplify the gendisk and request_queue
allocation.

Signed-off-by: Christoph Hellwig <hch@lst.de>

commit | commitdiff | tree

Christoph Hellwig [Thu, 29 Apr 2021 17:19:02 +0000 (19:19 +0200)]

sunvdc: use blk_mq_alloc_disk

Use the blk_mq_alloc_disk API to simplify the gendisk and request_queue
allocation.

Signed-off-by: Christoph Hellwig <hch@lst.de>

commit | commitdiff | tree

Christoph Hellwig [Thu, 29 Apr 2021 17:17:00 +0000 (19:17 +0200)]

swim: use blk_mq_alloc_disk

Use the blk_mq_alloc_disk API to simplify the gendisk and request_queue
allocation.

Signed-off-by: Christoph Hellwig <hch@lst.de>

commit | commitdiff | tree

Christoph Hellwig [Thu, 29 Apr 2021 17:13:01 +0000 (19:13 +0200)]

swim3: use blk_mq_alloc_disk

Use the blk_mq_alloc_disk API to simplify the gendisk and request_queue
allocation.

Signed-off-by: Christoph Hellwig <hch@lst.de>

commit | commitdiff | tree

Christoph Hellwig [Thu, 29 Apr 2021 17:06:40 +0000 (19:06 +0200)]

ps3disk: use blk_mq_alloc_disk

Use the blk_mq_alloc_disk API to simplify the gendisk and request_queue
allocation.

Signed-off-by: Christoph Hellwig <hch@lst.de>

commit | commitdiff | tree

Christoph Hellwig [Thu, 29 Apr 2021 16:58:49 +0000 (18:58 +0200)]

mtd_blkdevs: use blk_mq_alloc_disk

Use the blk_mq_alloc_disk API to simplify the gendisk and request_queue
allocation.

Signed-off-by: Christoph Hellwig <hch@lst.de>

commit | commitdiff | tree

Christoph Hellwig [Thu, 29 Apr 2021 16:37:20 +0000 (18:37 +0200)]

mspro: use blk_mq_alloc_disk

Use the blk_mq_alloc_disk API to simplify the gendisk and request_queue
allocation.

Signed-off-by: Christoph Hellwig <hch@lst.de>

commit | commitdiff | tree

Christoph Hellwig [Thu, 29 Apr 2021 16:35:00 +0000 (18:35 +0200)]

ms_block: use blk_mq_alloc_disk

Use the blk_mq_alloc_disk API to simplify the gendisk and request_queue
allocation.

Signed-off-by: Christoph Hellwig <hch@lst.de>

commit | commitdiff | tree

Christoph Hellwig [Thu, 29 Apr 2021 16:32:22 +0000 (18:32 +0200)]

pf: use blk_mq_alloc_disk

Use the blk_mq_alloc_disk API to simplify the gendisk and request_queue
allocation.

Signed-off-by: Christoph Hellwig <hch@lst.de>

commit | commitdiff | tree

Christoph Hellwig [Thu, 29 Apr 2021 16:29:29 +0000 (18:29 +0200)]

pcd: use blk_mq_alloc_disk

Use the blk_mq_alloc_disk API to simplify the gendisk and request_queue
allocation.

Signed-off-by: Christoph Hellwig <hch@lst.de>

commit | commitdiff | tree

Christoph Hellwig [Thu, 29 Apr 2021 16:20:58 +0000 (18:20 +0200)]

virtio-blk: use blk_mq_alloc_disk

Use the blk_mq_alloc_disk API to simplify the gendisk and request_queue
allocation.

Signed-off-by: Christoph Hellwig <hch@lst.de>

commit | commitdiff | tree

Christoph Hellwig [Thu, 29 Apr 2021 16:01:02 +0000 (18:01 +0200)]

blk-mq: add the blk_mq_alloc_disk APIs

Add a new API to allocate a gendisk including the request_queue for use
with blk-mq based drivers. This is to avoid boilerplate code in drivers.

Signed-off-by: Christoph Hellwig <hch@lst.de>

commit | commitdiff | tree

Christoph Hellwig [Thu, 29 Apr 2021 16:09:24 +0000 (18:09 +0200)]

blk-mq: improve the blk_mq_init_allocated_queue interface

Don't return the passed in request_queue but a normal error code, and
drop the elevator_init argument in favor of just calling elevator_init_mq
directly from dm-rq.

Signed-off-by: Christoph Hellwig <hch@lst.de>

commit | commitdiff | tree

Christoph Hellwig [Thu, 29 Apr 2021 15:58:44 +0000 (17:58 +0200)]

blk-mq: factor out a blk_mq_alloc_sq_tag_set helper

Factour out a helper to initialize a simple single hw queue tag_set from
blk_mq_init_sq_queue. This will allow to phase out blk_mq_init_sq_queue
in favor of a more symmetric and general API.

Signed-off-by: Christoph Hellwig <hch@lst.de>

commit | commitdiff | tree

Christoph Hellwig [Thu, 29 Apr 2021 15:44:29 +0000 (17:44 +0200)]

block: unexport blk_alloc_queue

blk_alloc_queue is just an internal helper now, unexport it and remove
it from the public header.

Signed-off-by: Christoph Hellwig <hch@lst.de>

commit | commitdiff | tree

Christoph Hellwig [Thu, 29 Apr 2021 15:35:23 +0000 (17:35 +0200)]

null_blk: convert to blk_alloc_disk/blk_cleanup_disk

Convert the null_blk driver to use the blk_alloc_disk and blk_cleanup_disk
helpers to simplify gendisk and request_queue allocation. Note that the
blk-mq mode is left with its own allocations scheme, to be handled later.

Signed-off-by: Christoph Hellwig <hch@lst.de>

commit | commitdiff | tree

Christoph Hellwig [Thu, 29 Apr 2021 15:08:40 +0000 (17:08 +0200)]

xpram: convert to blk_alloc_disk/blk_cleanup_disk

Convert the xpram driver to use the blk_alloc_disk and blk_cleanup_disk
helpers to simplify gendisk and request_queue allocation.

Signed-off-by: Christoph Hellwig <hch@lst.de>

commit | commitdiff | tree

Christoph Hellwig [Thu, 29 Apr 2021 15:05:38 +0000 (17:05 +0200)]

dcssblk: convert to blk_alloc_disk/blk_cleanup_disk

Convert the dcssblk driver to use the blk_alloc_disk and blk_cleanup_disk
helpers to simplify gendisk and request_queue allocation.

Signed-off-by: Christoph Hellwig <hch@lst.de>

commit | commitdiff | tree

Christoph Hellwig [Thu, 29 Apr 2021 15:01:18 +0000 (17:01 +0200)]

ps3vram: convert to blk_alloc_disk/blk_cleanup_disk

Convert the ps3vram driver to use the blk_alloc_disk and blk_cleanup_disk
helpers to simplify gendisk and request_queue allocation.

Signed-off-by: Christoph Hellwig <hch@lst.de>

commit | commitdiff | tree

Christoph Hellwig [Thu, 29 Apr 2021 14:58:13 +0000 (16:58 +0200)]

n64cart: convert to blk_alloc_disk

Convert the n64cart driver to use the blk_alloc_disk helper to simplify
gendisk and request_queue allocation.

Signed-off-by: Christoph Hellwig <hch@lst.de>

commit | commitdiff | tree

Christoph Hellwig [Thu, 29 Apr 2021 14:56:53 +0000 (16:56 +0200)]

simdisk: convert to blk_alloc_disk/blk_cleanup_disk

Convert the simdisk driver to use the blk_alloc_disk and blk_cleanup_disk
helpers to simplify gendisk and request_queue allocation.

Signed-off-by: Christoph Hellwig <hch@lst.de>

commit | commitdiff | tree

Christoph Hellwig [Thu, 29 Apr 2021 14:53:14 +0000 (16:53 +0200)]

nfblock: convert to blk_alloc_disk/blk_cleanup_disk

Convert the nfblock driver to use the blk_alloc_disk and blk_cleanup_disk
helpers to simplify gendisk and request_queue allocation.

Signed-off-by: Christoph Hellwig <hch@lst.de>

commit | commitdiff | tree

Christoph Hellwig [Thu, 29 Apr 2021 14:49:32 +0000 (16:49 +0200)]

nvme-multipath: convert to blk_alloc_disk/blk_cleanup_disk

Convert the nvme-multipath driver to use the blk_alloc_disk and
blk_cleanup_disk helpers to simplify gendisk and request_queue
allocation.

Signed-off-by: Christoph Hellwig <hch@lst.de>

commit | commitdiff | tree

Christoph Hellwig [Thu, 29 Apr 2021 14:45:20 +0000 (16:45 +0200)]

nvdimm-pmem: convert to blk_alloc_disk/blk_cleanup_disk

Convert the nvdimm-pmem driver to use the blk_alloc_disk and
blk_cleanup_disk helpers to simplify gendisk and request_queue
allocation.

Signed-off-by: Christoph Hellwig <hch@lst.de>

commit | commitdiff | tree

Christoph Hellwig [Thu, 29 Apr 2021 14:40:49 +0000 (16:40 +0200)]

nvdimm-btt: convert to blk_alloc_disk/blk_cleanup_disk

Convert the nvdimm-btt driver to use the blk_alloc_disk and
blk_cleanup_disk helpers to simplify gendisk and request_queue
allocation.

Signed-off-by: Christoph Hellwig <hch@lst.de>

commit | commitdiff | tree

Christoph Hellwig [Thu, 29 Apr 2021 14:37:57 +0000 (16:37 +0200)]

nvdimm-blk: convert to blk_alloc_disk/blk_cleanup_disk

Convert the nvdimm-blk driver to use the blk_alloc_disk and
blk_cleanup_disk helpers to simplify gendisk and request_queue
allocation.

Signed-off-by: Christoph Hellwig <hch@lst.de>

commit | commitdiff | tree

Christoph Hellwig [Thu, 29 Apr 2021 14:34:39 +0000 (16:34 +0200)]

md: convert to blk_alloc_disk/blk_cleanup_disk

Convert the md driver to use the blk_alloc_disk and blk_cleanup_disk
helpers to simplify gendisk and request_queue allocation.

Signed-off-by: Christoph Hellwig <hch@lst.de>

commit | commitdiff | tree

Christoph Hellwig [Thu, 29 Apr 2021 15:13:17 +0000 (17:13 +0200)]

dm: convert to blk_alloc_disk/blk_cleanup_disk

Convert the dm driver to use the blk_alloc_disk and blk_cleanup_disk
helpers to simplify gendisk and request_queue allocation.

Signed-off-by: Christoph Hellwig <hch@lst.de>

commit | commitdiff | tree

Christoph Hellwig [Thu, 29 Apr 2021 14:31:37 +0000 (16:31 +0200)]

bcache: convert to blk_alloc_disk/blk_cleanup_disk

Convert the bcache driver to use the blk_alloc_disk and blk_cleanup_disk
helpers to simplify gendisk and request_queue allocation.

Signed-off-by: Christoph Hellwig <hch@lst.de>

commit | commitdiff | tree

Christoph Hellwig [Thu, 29 Apr 2021 14:26:54 +0000 (16:26 +0200)]

lightnvm: convert to blk_alloc_disk/blk_cleanup_disk

Convert the lightnvm driver to use the blk_alloc_disk and blk_cleanup_disk
helpers to simplify gendisk and request_queue allocation.

Signed-off-by: Christoph Hellwig <hch@lst.de>

commit | commitdiff | tree

Christoph Hellwig [Thu, 29 Apr 2021 14:24:06 +0000 (16:24 +0200)]

zram: convert to blk_alloc_disk/blk_cleanup_disk

Convert the zram driver to use the blk_alloc_disk and blk_cleanup_disk
helpers to simplify gendisk and request_queue allocation.

Signed-off-by: Christoph Hellwig <hch@lst.de>

commit | commitdiff | tree

Christoph Hellwig [Thu, 29 Apr 2021 14:21:09 +0000 (16:21 +0200)]

rsxx: convert to blk_alloc_disk/blk_cleanup_disk

Convert the rsxx driver to use the blk_alloc_disk and blk_cleanup_disk
helpers to simplify gendisk and request_queue allocation.

Signed-off-by: Christoph Hellwig <hch@lst.de>

commit | commitdiff | tree

Christoph Hellwig [Thu, 29 Apr 2021 14:14:33 +0000 (16:14 +0200)]

pktcdvd: convert to blk_alloc_disk/blk_cleanup_disk

Convert the pktcdvd driver to use the blk_alloc_disk and blk_cleanup_disk
helpers to simplify gendisk and request_queue allocation.

Signed-off-by: Christoph Hellwig <hch@lst.de>

commit | commitdiff | tree

Christoph Hellwig [Thu, 29 Apr 2021 14:10:10 +0000 (16:10 +0200)]

drbd: convert to blk_alloc_disk/blk_cleanup_disk

Convert the drbd driver to use the blk_alloc_disk and blk_cleanup_disk
helpers to simplify gendisk and request_queue allocation.

Signed-off-by: Christoph Hellwig <hch@lst.de>

commit | commitdiff | tree

Christoph Hellwig [Thu, 29 Apr 2021 14:05:36 +0000 (16:05 +0200)]

brd: convert to blk_alloc_disk/blk_cleanup_disk

Convert the brd driver to use the blk_alloc_disk and blk_cleanup_disk
helpers to simplify gendisk and request_queue allocation. This also
allows to remove the request_queue pointer in struct request_queue,
and to simplify the initialization as blk_cleanup_disk can be called
on any disk returned from blk_alloc_disk.

Signed-off-by: Christoph Hellwig <hch@lst.de>

commit | commitdiff | tree

Christoph Hellwig [Thu, 29 Apr 2021 13:41:59 +0000 (15:41 +0200)]

block: add blk_alloc_disk and blk_cleanup_disk APIs

Add two new APIs to allocate and free a gendisk including the
request_queue for use with BIO based drivers. This is to avoid
boilerplate code in drivers.

Signed-off-by: Christoph Hellwig <hch@lst.de>

commit | commitdiff | tree

Christoph Hellwig [Thu, 29 Apr 2021 13:47:03 +0000 (15:47 +0200)]

block: add a flag to make put_disk on partially initalized disks safer

Add a flag to indicate that __device_add_disk did grab a queue reference
so that disk_release only drops it if we actually had it. This sort
out one of the major pitfals with partially initialized gendisk that
a lot of drivers did get wrong or still do.

Signed-off-by: Christoph Hellwig <hch@lst.de>

commit | commitdiff | tree

Christoph Hellwig [Thu, 29 Apr 2021 13:26:50 +0000 (15:26 +0200)]

block: automatically enable GENHD_FL_EXT_DEVT

Automatically set the GENHD_FL_EXT_DEVT flag for all disks allocated
without an explicit number of minors. This is what all new block
drivers should do, so make sure it is the default without boilerplate
code.

Signed-off-by: Christoph Hellwig <hch@lst.de>

commit | commitdiff | tree

Christoph Hellwig [Thu, 29 Apr 2021 13:26:16 +0000 (15:26 +0200)]

block: move the DISK_MAX_PARTS sanity check into __device_add_disk

Keep this together with the first place that actually looks at
->minors and prepare for not passing a minors argument to
alloc_disk.

Signed-off-by: Christoph Hellwig <hch@lst.de>

commit | commitdiff | tree

Christoph Hellwig [Thu, 29 Apr 2021 12:54:20 +0000 (14:54 +0200)]

block: refactor device number setup in __device_add_disk

Untangle the mess around blk_alloc_devt by moving the check for
the used allocation scheme into the callers.

Signed-off-by: Christoph Hellwig <hch@lst.de>

commit | commitdiff | tree

John Garry [Thu, 13 May 2021 12:00:58 +0000 (20:00 +0800)]

blk-mq: Use request queue-wide tags for tagset-wide sbitmap

The tags used for an IO scheduler are currently per hctx.

As such, when q->nr_hw_queues grows, so does the request queue total IO
scheduler tag depth.

This may cause problems for SCSI MQ HBAs whose total driver depth is
fixed.

Ming and Yanhui report higher CPU usage and lower throughput in scenarios
where the fixed total driver tag depth is appreciably lower than the total
scheduler tag depth:
https://lore.kernel.org/linux-block/440dfcfc-1a2c-bd98-1161-cec4d78c6dfc@huawei.com/T/#mc0d6d4f95275a2743d1c8c3e4dc9ff6c9aa3a76b

In that scenario, since the scheduler tag is got first, much contention
is introduced since a driver tag may not be available after we have got
the sched tag.

Improve this scenario by introducing request queue-wide tags for when
a tagset-wide sbitmap is used. The static sched requests are still
allocated per hctx, as requests are initialised per hctx, as in
blk_mq_init_request(..., hctx_idx, ...) ->
set->ops->init_request(.., hctx_idx, ...).

For simplicity of resizing the request queue sbitmap when updating the
request queue depth, just init at the max possible size, so we don't need
to deal with the possibly with swapping out a new sbitmap for old if
we need to grow.

Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/1620907258-30910-3-git-send-email-john.garry@huawei.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

commit | commitdiff | tree

John Garry [Thu, 13 May 2021 12:00:57 +0000 (20:00 +0800)]

blk-mq: Some tag allocation code refactoring

The tag allocation code to alloc the sbitmap pairs is common for regular
bitmaps tags and shared sbitmap, so refactor into a common function.

Also remove superfluous "flags" argument from blk_mq_init_shared_sbitmap().

Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/1620907258-30910-2-git-send-email-john.garry@huawei.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

commit | commitdiff | tree

Ming Lei [Tue, 11 May 2021 15:22:36 +0000 (23:22 +0800)]

blk-mq: clearing flush request reference in tags->rqs[]

Before we free request queue, clearing flush request reference in
tags->rqs[], so that potential UAF can be avoided.

Based on one patch written by David Jeffery.

Tested-by: John Garry <john.garry@huawei.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: David Jeffery <djeffery@redhat.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20210511152236.763464-5-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

commit | commitdiff | tree

Ming Lei [Tue, 11 May 2021 15:22:35 +0000 (23:22 +0800)]

blk-mq: clear stale request in tags->rq[] before freeing one request pool

refcount_inc_not_zero() in bt_tags_iter() still may read one freed
request.

Fix the issue by the following approach:

1) hold a per-tags spinlock when reading ->rqs[tag] and calling
refcount_inc_not_zero in bt_tags_iter()

2) clearing stale request referred via ->rqs[tag] before freeing
request pool, the per-tags spinlock is held for clearing stale
->rq[tag]

So after we cleared stale requests, bt_tags_iter() won't observe
freed request any more, also the clearing will wait for pending
request reference.

The idea of clearing ->rqs[] is borrowed from John Garry's previous
patch and one recent David's patch.

Tested-by: John Garry <john.garry@huawei.com>
Reviewed-by: David Jeffery <djeffery@redhat.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20210511152236.763464-4-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

commit | commitdiff | tree

Ming Lei [Tue, 11 May 2021 15:22:34 +0000 (23:22 +0800)]

blk-mq: grab rq->refcount before calling ->fn in blk_mq_tagset_busy_iter

Grab rq->refcount before calling ->fn in blk_mq_tagset_busy_iter(), and
this way will prevent the request from being re-used when ->fn is
running. The approach is same as what we do during handling timeout.

Fix request use-after-free(UAF) related with completion race or queue
releasing:

- If one rq is referred before rq->q is frozen, then queue won't be
frozen before the request is released during iteration.

- If one rq is referred after rq->q is frozen, refcount_inc_not_zero()
will return false, and we won't iterate over this request.

However, still one request UAF not covered: refcount_inc_not_zero() may
read one freed request, and it will be handled in next patch.

Tested-by: John Garry <john.garry@huawei.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20210511152236.763464-3-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

commit | commitdiff | tree

Ming Lei [Tue, 11 May 2021 15:22:33 +0000 (23:22 +0800)]

block: avoid double io accounting for flush request

For flush request, rq->end_io() may be called two times, one is from
timeout handling(blk_mq_check_expired()), another is from normal
completion(__blk_mq_end_request()).

Move blk_account_io_flush() after flush_rq->ref drops to zero, so
io accounting can be done just once for flush request.

Fixes: b68663186577 ("block: add iostat counters for flush requests")
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Tested-by: John Garry <john.garry@huawei.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20210511152236.763464-2-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

commit | commitdiff | tree

Max Gurtovoy [Tue, 11 May 2021 15:53:19 +0000 (15:53 +0000)]

block: remove unneeded parenthesis from blk-sysfs

Align to common code conventions.

Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Link: https://lore.kernel.org/r/20210511155319.1885277-1-mgurtovoy@nvidia.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

commit | commitdiff | tree

Tejun Heo [Tue, 11 May 2021 18:58:04 +0000 (14:58 -0400)]

blkcg: drop CLONE_IO check in blkcg_can_attach()

blkcg has always rejected to attach if any of the member tasks has shared
io_context. The rationale was that io_contexts can be shared across
different cgroups making it impossible to define what the appropriate
control behavior should be. However, this check causes more problems than it
solves:

* The check prevents controller enable and migrations but not CLONE_IO
  itself, which can lead to surprises as the outcome changes depending on
  the order of operations.

* Sharing within a cgroup is fine but the check can't distinguish that. This
  leads to unnecessary conflicts with the recent CLONE_IO usage in io_uring.

io_context sharing doesn't make any difference for rq_qos based controllers
and the way it's used is safe as long as tasks aren't migrated dynamically
which is the vast majority of use cases. While we can try to make the check
more precise to avoid false positives, the added complexity doesn't seem
worthwhile. Let's just drop blkcg_can_attach().

Signed-off-by: Tejun Heo <tj@kernel.org>
Link: https://lore.kernel.org/r/YJrTvHbrRDbJjw+S@slm.duckdns.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>

commit | commitdiff | tree

Yang Yingliang [Tue, 11 May 2021 11:34:40 +0000 (19:34 +0800)]

aoe: remove unnecessary mutex_init()

The mutex ktio_spawn_lock is initialized statically.
It is unnecessary to initialize by mutex_init().

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Link: https://lore.kernel.org/r/20210511113440.3772053-1-yangyingliang@huawei.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

commit | commitdiff | tree

zhangyi (F) [Sat, 13 Mar 2021 03:01:46 +0000 (11:01 +0800)]

block_dump: remove comments in docs

Now block_dump feature is gone, remove all comments in docs.

Signed-off-by: zhangyi (F) <yi.zhang@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20210313030146.2882027-4-yi.zhang@huawei.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

commit | commitdiff | tree

zhangyi (F) [Sat, 13 Mar 2021 03:01:45 +0000 (11:01 +0800)]

block_dump: remove block_dump feature

We have already delete block_dump feature in mark_inode_dirty() because
it can be replaced by tracepoints, now we also remove the part in
submit_bio() for the same reason. The part of block dump feature in
submit_bio() dump the write process, write region and sectors on the
target disk into kernel message. it can be replaced by
block_bio_queue tracepoint in submit_bio_checks(), so we do not need
block_dump anymore, remove the whole block_dump feature.

Signed-off-by: zhangyi (F) <yi.zhang@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20210313030146.2882027-3-yi.zhang@huawei.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

commit | commitdiff | tree

zhangyi (F) [Sat, 13 Mar 2021 03:01:44 +0000 (11:01 +0800)]

block_dump: remove block_dump feature in mark_inode_dirty()

block_dump is an old debugging interface, one of it's functions is used
to print the information about who write which file on disk. If we
enable block_dump through /proc/sys/vm/block_dump and turn on debug log
level, we can gather information about write process name, target file
name and disk from kernel message. This feature is realized in
block_dump___mark_inode_dirty(), it print above information into kernel
message directly when marking inode dirty, so it is noisy and can easily
trigger log storm. At the same time, get the dentry refcount is also not
safe, we found it will lead to deadlock on ext4 file system with
data=journal mode.

After tracepoints has been introduced into the kernel, we got a
tracepoint in __mark_inode_dirty(), which is a better replacement of
block_dump___mark_inode_dirty(). The only downside is that it only trace
the inode number and not a file name, but it probably doesn't matter
because the original printed file name in block_dump is not accurate in
some cases, and we can still find it through the inode number and device
id. So this patch delete the dirting inode part of block_dump feature.

Signed-off-by: zhangyi (F) <yi.zhang@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20210313030146.2882027-2-yi.zhang@huawei.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

commit | commitdiff | tree

Linus Torvalds [Sun, 16 May 2021 22:27:44 +0000 (15:27 -0700)]

Linux 5.13-rc2

commit | commitdiff | tree

Linus Torvalds [Sun, 16 May 2021 17:13:14 +0000 (10:13 -0700)]

Merge tag 'driver-core-5.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull driver core fixes from Greg KH:
"Here are two driver fixes for driver core changes that happened in
  5.13-rc1.

  The clk driver fix resolves a many-reported issue with booting some
  devices, and the USB typec fix resolves the reported problem of USB
  systems on some embedded boards.

  Both of these have been in linux-next this week with no reported
  issues"

* tag 'driver-core-5.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
  clk: Skip clk provider registration when np is NULL
  usb: typec: tcpm: Don't block probing of consumers of "connector" nodes

commit | commitdiff | tree

Linus Torvalds [Sun, 16 May 2021 17:06:19 +0000 (10:06 -0700)]

Merge tag 'staging-5.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging

Pull staging and IIO driver fixes from Greg KH:
"Here are some small IIO driver fixes and one Staging driver fix for
  5.13-rc2.

  Nothing major, just some resolutions for reported problems:

   - gcc-11 bogus warning fix for rtl8723bs

   - iio driver tiny fixes

  All of these have been in linux-next for many days with no reported
  issues"

* tag 'staging-5.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
  iio: tsl2583: Fix division by a zero lux_val
  iio: core: return ENODEV if ioctl is unknown
  iio: core: fix ioctl handlers removal
  iio: gyro: mpu3050: Fix reported temperature value
  iio: hid-sensors: select IIO_TRIGGERED_BUFFER under HID_SENSOR_IIO_TRIGGER
  iio: proximity: pulsedlight: Fix rumtime PM imbalance on error
  iio: light: gp2ap002: Fix rumtime PM imbalance on error
  staging: rtl8723bs: avoid bogus gcc warning

commit | commitdiff | tree

Linus Torvalds [Sun, 16 May 2021 16:55:05 +0000 (09:55 -0700)]

Merge tag 'usb-5.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb

Pull USB fixes from Greg KH:
"Here are some small USB fixes for 5.13-rc2. They consist of a number
  of resolutions for reported issues:

   - typec fixes for found problems

   - xhci fixes and quirk additions

   - dwc3 driver fixes

   - minor fixes found by Coverity

   - cdc-wdm fixes for reported problems

  All of these have been in linux-next for a few days with no reported
  issues"

* tag 'usb-5.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (28 commits)
  usb: core: hub: fix race condition about TRSMRCY of resume
  usb: typec: tcpm: Fix SINK_DISCOVERY current limit for Rp-default
  xhci: Add reset resume quirk for AMD xhci controller.
  usb: xhci: Increase timeout for HC halt
  xhci: Do not use GFP_KERNEL in (potentially) atomic context
  xhci: Fix giving back cancelled URBs even if halted endpoint can't reset
  xhci-pci: Allow host runtime PM as default for Intel Alder Lake xHCI
  usb: musb: Fix an error message
  usb: typec: tcpm: Fix wrong handling for Not_Supported in VDM AMS
  usb: typec: tcpm: Send DISCOVER_IDENTITY from dedicated work
  usb: typec: ucsi: Retrieve all the PDOs instead of just the first 4
  usb: fotg210-hcd: Fix an error message
  docs: usb: function: Modify path name
  usb: dwc3: omap: improve extcon initialization
  usb: typec: ucsi: Put fwnode in any case during ->probe()
  usb: typec: tcpm: Fix wrong handling in GET_SINK_CAP
  usb: dwc2: Remove obsolete MODULE_ constants from platform.c
  usb: dwc3: imx8mp: fix error return code in dwc3_imx8mp_probe()
  usb: dwc3: imx8mp: detect dwc3 core node via compatible string
  usb: dwc3: gadget: Return success always for kick transfer in ep queue
  ...

commit | commitdiff | tree

Linus Torvalds [Sun, 16 May 2021 16:42:13 +0000 (09:42 -0700)]

Merge tag 'timers-urgent-2021-05-16' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull timer fixes from Thomas Gleixner:
"Two fixes for timers:

   - Use the ALARM feature check in the alarmtimer core code insted of
     the old method of checking for the set_alarm() callback.

     Drivers can have that callback set but the feature bit cleared. If
     such a RTC device is selected then alarms wont work.

   - Use a proper define to let the preprocessor check whether Hyper-V
     VDSO clocksource should be active.

     The code used a constant in an enum with #ifdef, which evaluates to
     always false and disabled the clocksource for VDSO"

* tag 'timers-urgent-2021-05-16' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  clocksource/drivers/hyper-v: Re-enable VDSO_CLOCKMODE_HVCLOCK on X86
  alarmtimer: Check RTC features instead of ops

commit | commitdiff | tree

Linus Torvalds [Sun, 16 May 2021 16:39:04 +0000 (09:39 -0700)]

Merge tag 'for-linus-5.13b-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip

Pull xen fixes from Juergen Gross:

- two patches for error path fixes

- a small series for fixing a regression with swiotlb with Xen on Arm

* tag 'for-linus-5.13b-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xen/swiotlb: check if the swiotlb has already been initialized
  arm64: do not set SWIOTLB_NO_FORCE when swiotlb is required
  xen/arm: move xen_swiotlb_detect to arm/swiotlb-xen.h
  xen/unpopulated-alloc: fix error return code in fill_list()
  xen/gntdev: fix gntdev_mmap() error exit path

commit | commitdiff | tree

Linus Torvalds [Sun, 16 May 2021 16:31:06 +0000 (09:31 -0700)]

Merge tag 'x86_urgent_for_v5.13_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixes from Borislav Petkov:
"The three SEV commits are not really urgent material. But we figured
  since getting them in now will avoid a huge amount of conflicts
  between future SEV changes touching tip, the kvm and probably other
  trees, sending them to you now would be best.

  The idea is that the tip, kvm etc branches for 5.14 will all base
  ontop of -rc2 and thus everything will be peachy. What is more, those
  changes are purely mechanical and defines movement so they should be
  fine to go now (famous last words).

  Summary:

   - Enable -Wundef for the compressed kernel build stage

   - Reorganize SEV code to streamline and simplify future development"

* tag 'x86_urgent_for_v5.13_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/boot/compressed: Enable -Wundef
  x86/msr: Rename MSR_K8_SYSCFG to MSR_AMD64_SYSCFG
  x86/sev: Move GHCB MSR protocol and NAE definitions in a common header
  x86/sev-es: Rename sev-es.{ch} to sev.{ch}

commit | commitdiff | tree

Linus Torvalds [Sat, 15 May 2021 23:39:45 +0000 (16:39 -0700)]

Merge tag 'powerpc-5.13-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc fixes from Michael Ellerman:

- Fix a regression in the conversion of the 64-bit BookE interrupt
   entry to C.

- Fix KVM hosts running with the hash MMU since the recent KVM gfn
   changes.

- Fix a deadlock in our paravirt spinlocks when hcall tracing is
   enabled.

- Several fixes for oopses in our runtime code patching for security
   mitigations.

- A couple of minor fixes for the recent conversion of 32-bit interrupt
   entry/exit to C.

- Fix __get_user() causing spurious crashes in sigreturn due to a bad
   inline asm constraint, spotted with GCC 11.

- A fix for the way we track IRQ masking state vs NMI interrupts when
   using the new scv system call entry path.

- A couple more minor fixes.

Thanks to Cédric Le Goater, Christian Zigotzky, Christophe Leroy,
Naveen N. Rao, Nicholas Piggin Paul Menzel, and Sean Christopherson.

* tag 'powerpc-5.13-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/64e/interrupt: Fix nvgprs being clobbered
  powerpc/64s: Make NMI record implicitly soft-masked code as irqs disabled
  powerpc/64s: Fix stf mitigation patching w/strict RWX & hash
  powerpc/64s: Fix entry flush patching w/strict RWX & hash
  powerpc/64s: Fix crashes when toggling entry flush barrier
  powerpc/64s: Fix crashes when toggling stf barrier
  KVM: PPC: Book3S HV: Fix kvm_unmap_gfn_range_hv() for Hash MMU
  powerpc/legacy_serial: Fix UBSAN: array-index-out-of-bounds
  powerpc/signal: Fix possible build failure with unsafe_copy_fpr_{to/from}_user
  powerpc/uaccess: Fix __get_user() with CONFIG_CC_HAS_ASM_GOTO_OUTPUT
  powerpc/pseries: warn if recursing into the hcall tracing code
  powerpc/pseries: use notrace hcall variant for H_CEDE idle
  powerpc/pseries: Don't trace hcall tracing wrapper
  powerpc/pseries: Fix hcall tracing recursion in pv queued spinlocks
  powerpc/syscall: Calling kuap_save_and_lock() is wrong
  powerpc/interrupts: Fix kuep_unlock() call

commit | commitdiff | tree

Linus Torvalds [Sat, 15 May 2021 17:24:48 +0000 (10:24 -0700)]

Merge tag 'sched-urgent-2021-05-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull scheduler fixes from Ingo Molnar:
"Fix an idle CPU selection bug, and an AMD Ryzen maximum frequency
  enumeration bug"

* tag 'sched-urgent-2021-05-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86, sched: Fix the AMD CPPC maximum performance value on certain AMD Ryzen generations
  sched/fair: Fix clearing of has_idle_cores flag in select_idle_cpu()

commit | commitdiff | tree

Linus Torvalds [Sat, 15 May 2021 17:18:23 +0000 (10:18 -0700)]

Merge tag 'objtool-urgent-2021-05-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull objtool fixes from Ingo Molnar:
"Fix a couple of endianness bugs that crept in"

* tag 'objtool-urgent-2021-05-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
objtool/x86: Fix elf_add_alternative() endianness
objtool: Fix elf_create_undef_symbol() endianness

commit | commitdiff | tree

Linus Torvalds [Sat, 15 May 2021 17:13:42 +0000 (10:13 -0700)]

Merge tag 'irq-urgent-2021-05-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull irq fix from Ingo Molnar:
"Fix build warning on SH"

* tag 'irq-urgent-2021-05-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sh: Remove unused variable

commit | commitdiff | tree

Linus Torvalds [Sat, 15 May 2021 17:00:35 +0000 (10:00 -0700)]

Merge tag 'core-urgent-2021-05-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 stack randomization fix from Ingo Molnar:
"Fix an assembly constraint that affected LLVM up to version 12"

* tag 'core-urgent-2021-05-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
stack: Replace "o" output with "r" input constraint

commit | commitdiff | tree

Linus Torvalds [Sat, 15 May 2021 16:42:27 +0000 (09:42 -0700)]

Merge branch 'akpm' (patches from Andrew)

Merge misc fixes from Andrew Morton:
"13 patches.

  Subsystems affected by this patch series: resource, squashfs, hfsplus,
  modprobe, and mm (hugetlb, slub, userfaultfd, ksm, pagealloc, kasan,
  pagemap, and ioremap)"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  mm/ioremap: fix iomap_max_page_shift
  docs: admin-guide: update description for kernel.modprobe sysctl
  hfsplus: prevent corruption in shrinking truncate
  mm/filemap: fix readahead return types
  kasan: fix unit tests with CONFIG_UBSAN_LOCAL_BOUNDS enabled
  mm: fix struct page layout on 32-bit systems
  ksm: revert "use GET_KSM_PAGE_NOLOCK to get ksm page in remove_rmap_item_from_tree()"
  userfaultfd: release page in error path to avoid BUG_ON
  squashfs: fix divide error in calculate_skip()
  kernel/resource: fix return code check in __request_free_mem_region
  mm, slub: move slub_debug static key enabling outside slab_mutex
  mm/hugetlb: fix cow where page writtable in child
  mm/hugetlb: fix F_SEAL_FUTURE_WRITE

commit | commitdiff | tree

Linus Torvalds [Sat, 15 May 2021 16:01:45 +0000 (09:01 -0700)]

Merge tag 'arc-5.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc

Pull ARC fixes from Vineet Gupta:

- PAE fixes

- syscall num check off-by-one bug

- misc fixes

* tag 'arc-5.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
  ARC: mm: Use max_high_pfn as a HIGHMEM zone border
  ARC: mm: PAE: use 40-bit physical page mask
  ARC: entry: fix off-by-one error in syscall number validation
  ARC: kgdb: add 'fallthrough' to prevent a warning
  arc: Fix typos/spellos

commit | commitdiff | tree

Linus Torvalds [Sat, 15 May 2021 15:52:30 +0000 (08:52 -0700)]

Merge tag 'block-5.13-2021-05-14' of git://git.kernel.dk/linux-block

Pull block fixes from Jens Axboe:

- Fix for shared tag set exit (Bart)

- Correct ioctl range for zoned ioctls (Damien)

- Removed dead/unused function (Lin)

- Fix perf regression for shared tags (Ming)

- Fix out-of-bounds issue with kyber and preemption (Omar)

- BFQ merge fix (Paolo)

- Two error handling fixes for nbd (Sun)

- Fix weight update in blk-iocost (Tejun)

- NVMe pull request (Christoph):
      - correct the check for using the inline bio in nvmet (Chaitanya
        Kulkarni)
      - demote unsupported command warnings (Chaitanya Kulkarni)
      - fix corruption due to double initializing ANA state (me, Hou Pu)
      - reset ns->file when open fails (Daniel Wagner)
      - fix a NULL deref when SEND is completed with error in nvmet-rdma
        (Michal Kalderon)

- Fix kernel-doc warning (Bart)

* tag 'block-5.13-2021-05-14' of git://git.kernel.dk/linux-block:
  block/partitions/efi.c: Fix the efi_partition() kernel-doc header
  blk-mq: Swap two calls in blk_mq_exit_queue()
  blk-mq: plug request for shared sbitmap
  nvmet: use new ana_log_size instead the old one
  nvmet: seset ns->file when open fails
  nbd: share nbd_put and return by goto put_nbd
  nbd: Fix NULL pointer in flush_workqueue
  blkdev.h: remove unused codes blk_account_rq
  block, bfq: avoid circular stable merges
  blk-iocost: fix weight updates of inner active iocgs
  nvmet: demote fabrics cmd parse err msg to debug
  nvmet: use helper to remove the duplicate code
  nvmet: demote discovery cmd parse err msg to debug
  nvmet-rdma: Fix NULL deref when SEND is completed with error
  nvmet: fix inline bio check for passthru
  nvmet: fix inline bio check for bdev-ns
  nvme-multipath: fix double initialization of ANA state
  kyber: fix out of bounds access when preempted
  block: uapi: fix comment about block device ioctl

commit | commitdiff | tree

Linus Torvalds [Sat, 15 May 2021 15:43:44 +0000 (08:43 -0700)]

Merge tag 'io_uring-5.13-2021-05-14' of git://git.kernel.dk/linux-block

Pull io_uring fixes from Jens Axboe:
"Just a few minor fixes/changes:

   - Fix issue with double free race for linked timeout completions

   - Fix reference issue with timeouts

   - Remove last few places that make SQPOLL special, since it's just an
     io thread now.

   - Bump maximum allowed registered buffers, as we don't allocate as
     much anymore"

* tag 'io_uring-5.13-2021-05-14' of git://git.kernel.dk/linux-block:
  io_uring: increase max number of reg buffers
  io_uring: further remove sqpoll limits on opcodes
  io_uring: fix ltout double free on completion race
  io_uring: fix link timeout refs

commit | commitdiff | tree

Linus Torvalds [Sat, 15 May 2021 15:37:21 +0000 (08:37 -0700)]

Merge tag 'erofs-for-5.13-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs

Pull erofs fixes from Gao Xiang:
"This mainly fixes 1 lcluster-sized pclusters for the big pcluster
  feature, which can be forcely generated by mkfs as a specific on-disk
  case for per-(sub)file compression strategies but missed to handle in
  runtime properly.

  Also, documentation updates are included to fix the broken
  illustration due to the ReST conversion by accident and complete the
  big pcluster introduction.

  Summary:

   - update documentation to fix the broken illustration due to ReST
     conversion by accident at that time and complete the big pcluster
     introduction

   - fix 1 lcluster-sized pclusters for the big pcluster feature"

* tag 'erofs-for-5.13-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs:
  erofs: fix 1 lcluster-sized pcluster for big pcluster
  erofs: update documentation about data compression
  erofs: fix broken illustration in documentation

commit | commitdiff | tree

Linus Torvalds [Sat, 15 May 2021 15:32:51 +0000 (08:32 -0700)]

Merge tag 'libnvdimm-fixes-5.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm

Pull libnvdimm fixes from Dan Williams:
"A regression fix for a bootup crash condition introduced in this merge
  window and some other minor fixups:

   - Fix regression in ACPI NFIT table handling leading to crashes and
     driver load failures.

   - Move the nvdimm mailing list

   - Miscellaneous minor fixups"

* tag 'libnvdimm-fixes-5.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
  ACPI: NFIT: Fix support for variable 'SPA' structure size
  MAINTAINERS: Move nvdimm mailing list
  tools/testing/nvdimm: Make symbol '__nfit_test_ioremap' static
  libnvdimm: Remove duplicate struct declaration

commit | commitdiff | tree

Linus Torvalds [Sat, 15 May 2021 15:28:08 +0000 (08:28 -0700)]

Merge tag 'dax-fixes-5.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm

Pull dax fixes from Dan Williams:
"A fix for a hang condition due to missed wakeups in the filesystem-dax
  core when exercised by virtiofs.

  This bug has been there from the beginning, but the condition has
  not triggered on other filesystems since they hold a lock over
  invalidation events"

* tag 'dax-fixes-5.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
  dax: Wake up all waiters after invalidating dax entry
  dax: Add a wakeup mode parameter to put_unlocked_entry()
  dax: Add an enum for specifying dax wakup mode

commit | commitdiff | tree

Linus Torvalds [Sat, 15 May 2021 15:18:29 +0000 (08:18 -0700)]

Merge tag 'drm-fixes-2021-05-15' of git://anongit.freedesktop.org/drm/drm

Pull more drm fixes from Dave Airlie:
"Looks like I wasn't the only one not fully switched on this week. The
  msm pull has a missing tag so I missed it, and i915 team were a bit
  late. In my defence I did have a day with the roof of my home office
  removed, so was sitting at my kids desk.

  msm:
   - dsi regression fix
   - dma-buf pinning fix
   - displayport fixes
   - llc fix

  i915:
   - Fix active callback alignment annotations and subsequent crashes
   - Retract link training strategy to slow and wide, again
   - Avoid division by zero on gen2
   - Use correct width reads for C0DRB3/C1DRB3 registers
   - Fix double free in pdp allocation failure path
   - Fix HDMI 2.1 PCON downstream caps check"

* tag 'drm-fixes-2021-05-15' of git://anongit.freedesktop.org/drm/drm:
  drm/i915: Use correct downstream caps for check Src-Ctl mode for PCON
  drm/i915/overlay: Fix active retire callback alignment
  drm/i915: Fix crash in auto_retire
  drm/i915/gt: Fix a double free in gen8_preallocate_top_level_pdp
  drm/i915: Read C0DRB3/C1DRB3 as 16 bits again
  drm/i915: Avoid div-by-zero on gen2
  drm/i915/dp: Use slow and wide link training for everything
  drm/msm/dp: initialize audio_comp when audio starts
  drm/msm/dp: check sink_count before update is_connected status
  drm/msm: fix minor version to indicate MSM_PARAM_SUSPENDS support
  drm/msm/dsi: fix msm_dsi_phy_get_clk_provider return code
  drm/msm/dsi: dsi_phy_28nm_8960: fix uninitialized variable access
  drm/msm: fix LLC not being enabled for mmu500 targets
  drm/msm: Do not unpin/evict exported dma-buf's

commit | commitdiff | tree

Tetsuo Handa [Sat, 15 May 2021 03:00:37 +0000 (03:00 +0000)]

tty: vt: always invoke vc->vc_sw->con_resize callback

syzbot is reporting OOB write at vga16fb_imageblit() [1], for
resize_screen() from ioctl(VT_RESIZE) returns 0 without checking whether
requested rows/columns fit the amount of memory reserved for the graphical
screen if current mode is KD_GRAPHICS.

----------
  #include <sys/types.h>
  #include <sys/stat.h>
  #include <fcntl.h>
  #include <sys/ioctl.h>
  #include <linux/kd.h>
  #include <linux/vt.h>

  int main(int argc, char *argv[])
  {
        const int fd = open("/dev/char/4:1", O_RDWR);
        struct vt_sizes vt = { 0x4100, 2 };

        ioctl(fd, KDSETMODE, KD_GRAPHICS);
        ioctl(fd, VT_RESIZE, &vt);
        ioctl(fd, KDSETMODE, KD_TEXT);
        return 0;
  }
----------

Allow framebuffer drivers to return -EINVAL, by moving vc->vc_mode !=
KD_GRAPHICS check from resize_screen() to fbcon_resize().

Link: https://syzkaller.appspot.com/bug?extid=1f29e126cf461c4de3b3
Reported-by: syzbot <syzbot+1f29e126cf461c4de3b3@syzkaller.appspotmail.com>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Tested-by: syzbot <syzbot+1f29e126cf461c4de3b3@syzkaller.appspotmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

commit | commitdiff | tree

Christophe Leroy [Sat, 15 May 2021 00:27:39 +0000 (17:27 -0700)]

mm/ioremap: fix iomap_max_page_shift

iomap_max_page_shift is expected to contain a page shift, so it can't be a
'bool', has to be an 'unsigned int'

And fix the default values: P4D_SHIFT is when huge iomap is allowed.

However, on some architectures (eg: powerpc book3s/64), P4D_SHIFT is not a
constant so it can't be used to initialise a static variable. So,
initialise iomap_max_page_shift with a maximum shift supported by the
architecture, it is gated by P4D_SHIFT in vmap_try_huge_p4d() anyway.

Link: https://lkml.kernel.org/r/ad2d366015794a9f21320dcbdd0a8eb98979e9df.1620898113.git.christophe.leroy@csgroup.eu
Fixes: bbc180a5adb0 ("mm: HUGE_VMAP arch support cleanup")
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

commit | commitdiff | tree

Rasmus Villemoes [Sat, 15 May 2021 00:27:36 +0000 (17:27 -0700)]

docs: admin-guide: update description for kernel.modprobe sysctl

When I added CONFIG_MODPROBE_PATH, I neglected to update Documentation/.
It's still true that this defaults to /sbin/modprobe, but now via a level
of indirection. So document that the kernel might have been built with
something other than /sbin/modprobe as the initial value.

Link: https://lkml.kernel.org/r/20210420125324.1246826-1-linux@rasmusvillemoes.dk
Fixes: 17652f4240f7a ("modules: add CONFIG_MODPROBE_PATH")
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jessica Yu <jeyu@kernel.org>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

commit | commitdiff | tree

Jouni Roivas [Sat, 15 May 2021 00:27:33 +0000 (17:27 -0700)]

hfsplus: prevent corruption in shrinking truncate

I believe there are some issues introduced by commit 31651c607151
("hfsplus: avoid deadlock on file truncation")

HFS+ has extent records which always contains 8 extents.  In case the
first extent record in catalog file gets full, new ones are allocated from
extents overflow file.

In case shrinking truncate happens to middle of an extent record which
locates in extents overflow file, the logic in hfsplus_file_truncate() was
changed so that call to hfs_brec_remove() is not guarded any more.

Right action would be just freeing the extents that exceed the new size
inside extent record by calling hfsplus_free_extents(), and then check if
the whole extent record should be removed.  However since the guard
(blk_cnt > start) is now after the call to hfs_brec_remove(), this has
unfortunate effect that the last matching extent record is removed
unconditionally.

To reproduce this issue, create a file which has at least 10 extents, and
then perform shrinking truncate into middle of the last extent record, so
that the number of remaining extents is not under or divisible by 8.  This
causes the last extent record (8 extents) to be removed totally instead of
truncating into middle of it.  Thus this causes corruption, and lost data.

Fix for this is simply checking if the new truncated end is below the
start of this extent record, making it safe to remove the full extent
record.  However call to hfs_brec_remove() can't be moved to it's previous
place since we're dropping ->tree_lock and it can cause a race condition
and the cached info being invalidated possibly corrupting the node data.

Another issue is related to this one.  When entering into the block
(blk_cnt > start) we are not holding the ->tree_lock.  We break out from
the loop not holding the lock, but hfs_find_exit() does unlock it.  Not
sure if it's possible for someone else to take the lock under our feet,
but it can cause hard to debug errors and premature unlocking.  Even if
there's no real risk of it, the locking should still always be kept in
balance.  Thus taking the lock now just before the check.

Link: https://lkml.kernel.org/r/20210429165139.3082828-1-jouni.roivas@tuxera.com
Fixes: 31651c607151f ("hfsplus: avoid deadlock on file truncation")
Signed-off-by: Jouni Roivas <jouni.roivas@tuxera.com>
Reviewed-by: Anton Altaparmakov <anton@tuxera.com>
Cc: Anatoly Trosinenko <anatoly.trosinenko@gmail.com>
Cc: Viacheslav Dubeyko <slava@dubeyko.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

commit | commitdiff | tree

Matthew Wilcox (Oracle) [Sat, 15 May 2021 00:27:30 +0000 (17:27 -0700)]

mm/filemap: fix readahead return types

A readahead request will not allocate more memory than can be represented
by a size_t, even on systems that have HIGHMEM available. Change the
length functions from returning an loff_t to a size_t.

Link: https://lkml.kernel.org/r/20210510201201.1558972-1-willy@infradead.org
Fixes: 32c0a6bcaa1f57 ("btrfs: add and use readahead_batch_length")
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

commit | commitdiff | tree

Peter Collingbourne [Sat, 15 May 2021 00:27:27 +0000 (17:27 -0700)]

kasan: fix unit tests with CONFIG_UBSAN_LOCAL_BOUNDS enabled

These tests deliberately access these arrays out of bounds, which will
cause the dynamic local bounds checks inserted by
CONFIG_UBSAN_LOCAL_BOUNDS to fail and panic the kernel. To avoid this
problem, access the arrays via volatile pointers, which will prevent the
compiler from being able to determine the array bounds.

These accesses use volatile pointers to char (char *volatile) rather than
the more conventional pointers to volatile char (volatile char *) because
we want to prevent the compiler from making inferences about the pointer
itself (i.e. its array bounds), not the data that it refers to.

Link: https://lkml.kernel.org/r/20210507025915.1464056-1-pcc@google.com
Link: https://linux-review.googlesource.com/id/I90b1713fbfa1bf68ff895aef099ea77b98a7c3b9
Signed-off-by: Peter Collingbourne <pcc@google.com>
Tested-by: Alexander Potapenko <glider@google.com>
Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Peter Collingbourne <pcc@google.com>
Cc: George Popescu <georgepope@android.com>
Cc: Elena Petrova <lenaptr@google.com>
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

commit | commitdiff | tree

Matthew Wilcox (Oracle) [Sat, 15 May 2021 00:27:24 +0000 (17:27 -0700)]

mm: fix struct page layout on 32-bit systems

32-bit architectures which expect 8-byte alignment for 8-byte integers and
need 64-bit DMA addresses (arm, mips, ppc) had their struct page
inadvertently expanded in 2019.  When the dma_addr_t was added, it forced
the alignment of the union to 8 bytes, which inserted a 4 byte gap between
'flags' and the union.

Fix this by storing the dma_addr_t in one or two adjacent unsigned longs.
This restores the alignment to that of an unsigned long.  We always
store the low bits in the first word to prevent the PageTail bit from
being inadvertently set on a big endian platform.  If that happened,
get_user_pages_fast() racing against a page which was freed and
reallocated to the page_pool could dereference a bogus compound_head(),
which would be hard to trace back to this cause.

Link: https://lkml.kernel.org/r/20210510153211.1504886-1-willy@infradead.org
Fixes: c25fff7171be ("mm: add dma_addr_t to struct page")
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Acked-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Tested-by: Matteo Croce <mcroce@linux.microsoft.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

commit | commitdiff | tree

Hugh Dickins [Sat, 15 May 2021 00:27:22 +0000 (17:27 -0700)]

ksm: revert "use GET_KSM_PAGE_NOLOCK to get ksm page in remove_rmap_item_from_tree()"

This reverts commit 3e96b6a2e9ad929a3230a22f4d64a74671a0720b. General
Protection Fault in rmap_walk_ksm() under memory pressure:
remove_rmap_item_from_tree() needs to take page lock, of course.

Link: https://lkml.kernel.org/r/alpine.LSU.2.11.2105092253500.1127@eggly.anvils
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Unnamed repository; edit this file 'description' to name the repository.

RSS Atom