Christoph Hellwig [Fri, 4 Mar 2022 17:55:55 +0000 (18:55 +0100)]
nvme: remove support or stream based temperature hint
This support was added for RocksDB, but RocksDB ended up not using it.
At the same time drives on the open marked (vs those build for OEMs
for non-Linux support) that actually support streams are extremly
rare. Don't bloat the nvme driver for it.
Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Jens Axboe <axboe@kernel.dk> Reviewed-by: Keith Busch <kbusch@kernel.org> Link: https://lore.kernel.org/r/20220304175556.407719-1-hch@lst.de
[axboe: fold in ctrl->nr_streams removal from Keith] Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jens Axboe [Fri, 4 Mar 2022 19:29:50 +0000 (12:29 -0700)]
Merge branch 'for-5.18/drivers' into for-next
* for-5.18/drivers:
floppy: use memcpy_{to,from}_bvec
drbd: use bvec_kmap_local in recv_dless_read
drbd: use bvec_kmap_local in drbd_csum_bio
bcache: use bvec_kmap_local in bio_csum
nvdimm-btt: use bvec_kmap_local in btt_rw_integrity
nvdimm-blk: use bvec_kmap_local in nd_blk_rw_integrity
zram: use memcpy_from_bvec in zram_bvec_write
zram: use memcpy_to_bvec in zram_bvec_read
aoe: use bvec_kmap_local in bvcpy
iss-simdisk: use bvec_kmap_local in simdisk_submit_bio
Jens Axboe [Fri, 4 Mar 2022 15:22:22 +0000 (08:22 -0700)]
io_uring: add support for registering ring file descriptors
Lots of workloads use multiple threads, in which case the file table is
shared between them. This makes getting and putting the ring file
descriptor for each io_uring_enter(2) system call more expensive, as it
involves an atomic get and put for each call.
Similarly to how we allow registering normal file descriptors to avoid
this overhead, add support for an io_uring_register(2) API that allows
to register the ring fds themselves:
1) IORING_REGISTER_RING_FDS - takes an array of io_uring_rsrc_update
structs, and registers them with the task.
2) IORING_UNREGISTER_RING_FDS - takes an array of io_uring_src_update
structs, and unregisters them.
When a ring fd is registered, it is internally represented by an offset.
This offset is returned to the application, and the application then
uses this offset and sets IORING_ENTER_REGISTERED_RING for the
io_uring_enter(2) system call. This works just like using a registered
file descriptor, rather than a real one, in an SQE, where
IOSQE_FIXED_FILE gets set to tell io_uring that we're using an internal
offset/descriptor rather than a real file descriptor.
In initial testing, this provides a nice bump in performance for
threaded applications in real world cases where the batch count (eg
number of requests submitted per io_uring_enter(2) invocation) is low.
In a microbenchmark, submitting NOP requests, we see the following
increases in performance:
Jens Axboe [Thu, 3 Mar 2022 15:52:17 +0000 (08:52 -0700)]
Merge branch 'for-5.18/drivers' into for-next
* for-5.18/drivers: (27 commits)
nvme: check that EUI/GUID/UUID are globally unique
nvme: check for duplicate identifiers earlier
nvme: fix the check for duplicate unique identifiers
nvme: cleanup __nvme_check_ids
nvme: remove nssa from struct nvme_ctrl
nvme: explicitly set non-error for directives
nvme: expose cntrltype and dctype through sysfs
nvme: send uevent on connection up
nvme: add vectored-io support for user-passthrough
nvme: add verbose error logging
nvme: add a helper to initialize connect_q
nvme-rdma: add helpers for mapping/unmapping request
nvmet-tcp: replace ida_simple[get|remove] with the simler ida_[alloc|free]
nvmet-rdma: replace ida_simple[get|remove] with the simler ida_[alloc|free]
nvmet-fc: replace ida_simple[get|remove] with the simler ida_[alloc|free]
nvmet: replace ida_simple[get|remove] with the simler ida_[alloc|free]
nvme-fc: replace ida_simple[get|remove] with the simler ida_[alloc|free]
nvme: replace ida_simple[get|remove] with the simler ida_[alloc|free]
nvmet: allow bdev in buffered_io mode
nvmet: use i_size_read() to set size for file-ns
...
Jens Axboe [Thu, 3 Mar 2022 15:51:48 +0000 (08:51 -0700)]
Merge tag 'nvme-5.18-2022-03-03' of git://git.infradead.org/nvme into for-5.18/drivers
Pull NVMe updates from Christoph:
"nvme updates for Linux 5.18
- add vectored-io support for user-passthrough (Kanchan Joshi)
- add verbose error logging (Alan Adamson)
- support buffered I/O on block devices in nvmet (Chaitanya Kulkarni)
- central discovery controller support (Martin Belanger)
- fix and extended the globally unique idenfier validation (me)
- move away from the deprecated IDA APIs (Sagi Grimberg)
- misc code cleanup (Keith Busch, Max Gurtovoy, Qinghua Jin,
Chaitanya Kulkarni)"
* tag 'nvme-5.18-2022-03-03' of git://git.infradead.org/nvme: (27 commits)
nvme: check that EUI/GUID/UUID are globally unique
nvme: check for duplicate identifiers earlier
nvme: fix the check for duplicate unique identifiers
nvme: cleanup __nvme_check_ids
nvme: remove nssa from struct nvme_ctrl
nvme: explicitly set non-error for directives
nvme: expose cntrltype and dctype through sysfs
nvme: send uevent on connection up
nvme: add vectored-io support for user-passthrough
nvme: add verbose error logging
nvme: add a helper to initialize connect_q
nvme-rdma: add helpers for mapping/unmapping request
nvmet-tcp: replace ida_simple[get|remove] with the simler ida_[alloc|free]
nvmet-rdma: replace ida_simple[get|remove] with the simler ida_[alloc|free]
nvmet-fc: replace ida_simple[get|remove] with the simler ida_[alloc|free]
nvmet: replace ida_simple[get|remove] with the simler ida_[alloc|free]
nvme-fc: replace ida_simple[get|remove] with the simler ida_[alloc|free]
nvme: replace ida_simple[get|remove] with the simler ida_[alloc|free]
nvmet: allow bdev in buffered_io mode
nvmet: use i_size_read() to set size for file-ns
...
Jens Axboe [Mon, 28 Feb 2022 13:40:29 +0000 (06:40 -0700)]
Merge branch 'for-5.18/block' into for-next
* for-5.18/block:
blk-crypto: show crypto capabilities in sysfs
block: don't delete queue kobject before its children
block: simplify calling convention of elv_unregister_queue()
Userspace can use these new files to decide what encryption settings to
use, or whether to use inline encryption at all. This also brings the
crypto capabilities in line with the other queue properties, which are
already discoverable via the queue directory in sysfs.
Design notes:
- Place the new files in a new subdirectory "crypto" to group them
together and to avoid complicating the main "queue" directory. This
also makes it possible to replace "crypto" with a symlink later if
we ever make the blk_crypto_profiles into real kobjects (see below).
- It was necessary to define a new kobject that corresponds to the
crypto subdirectory. For now, this kobject just contains a pointer
to the blk_crypto_profile. Note that multiple queues (and hence
multiple such kobjects) may refer to the same blk_crypto_profile.
An alternative design would more closely match the current kernel
data structures: the blk_crypto_profile could be a kobject itself,
located directly under the host controller device's kobject, while
/sys/block/$disk/queue/crypto would be a symlink to it.
I decided not to do that for now because it would require a lot more
changes, such as no longer embedding blk_crypto_profile in other
structures, and also because I'm not sure we can rule out moving the
crypto capabilities into 'struct queue_limits' in the future. (Even
if multiple queues share the same crypto engine, maybe the supported
data unit sizes could differ due to other queue properties.) It
would also still be possible to switch to that design later without
breaking userspace, by replacing the directory with a symlink.
- Use "max_dun_bits" instead of "max_dun_bytes". Currently, the
kernel internally stores this value in bytes, but that's an
implementation detail. It probably makes more sense to talk about
this value in bits, and choosing bits is more future-proof.
- "modes" is a sub-subdirectory, since there may be multiple supported
crypto modes, sysfs is supposed to have one value per file, and it
makes sense to group all the mode files together.
- Each mode had to be named. The crypto API names like "xts(aes)" are
not appropriate because they don't specify the key size. Therefore,
I assigned new names. The exact names chosen are arbitrary, but
they happen to match the names used in log messages in fs/crypto/.
- The "num_keyslots" file is a bit different from the others in that
it is only useful to know for performance reasons. However, it's
included as it can still be useful. For example, a user might not
want to use inline encryption if there aren't very many keyslots.
Eric Biggers [Mon, 24 Jan 2022 21:59:37 +0000 (13:59 -0800)]
block: don't delete queue kobject before its children
kobjects aren't supposed to be deleted before their child kobjects are
deleted. Apparently this is usually benign; however, a WARN will be
triggered if one of the child kobjects has a named attribute group:
sysfs group 'modes' not found for kobject 'crypto'
WARNING: CPU: 0 PID: 1 at fs/sysfs/group.c:278 sysfs_remove_group+0x72/0x80
...
Call Trace:
sysfs_remove_groups+0x29/0x40 fs/sysfs/group.c:312
__kobject_del+0x20/0x80 lib/kobject.c:611
kobject_cleanup+0xa4/0x140 lib/kobject.c:696
kobject_release lib/kobject.c:736 [inline]
kref_put include/linux/kref.h:65 [inline]
kobject_put+0x53/0x70 lib/kobject.c:753
blk_crypto_sysfs_unregister+0x10/0x20 block/blk-crypto-sysfs.c:159
blk_unregister_queue+0xb0/0x110 block/blk-sysfs.c:962
del_gendisk+0x117/0x250 block/genhd.c:610
Fix this by moving the kobject_del() and the corresponding
kobject_uevent() to the correct place.
Fixes: 2c2086afc2b8 ("block: Protect less code with sysfs_lock in blk_{un,}register_queue()") Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Eric Biggers <ebiggers@google.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20220124215938.2769-3-ebiggers@kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
Christoph Hellwig [Thu, 24 Feb 2022 16:48:32 +0000 (17:48 +0100)]
nvme: check that EUI/GUID/UUID are globally unique
Add a check to verify that the unique identifiers are unique globally
in addition to the existing check that verifies that they are unique
inside a single subsystem.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Christoph Hellwig [Thu, 24 Feb 2022 16:46:50 +0000 (17:46 +0100)]
nvme: check for duplicate identifiers earlier
Lift the check for duplicate identifiers into nvme_init_ns_head, which
avoids pointless error unwinding in case they don't match, and also
matches where we check identifier validity for the multipath case.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Christoph Hellwig [Thu, 24 Feb 2022 10:32:58 +0000 (11:32 +0100)]
nvme: fix the check for duplicate unique identifiers
nvme_subsys_check_duplicate_ids should needs to return an error if any of
the identifiers matches, not just if all of them match. But it does not
need to and should not look at the CSI value for this sanity check.
Rewrite the logic to be separate from nvme_ns_ids_equal and optimize it
by reducing duplicate checks for non-present identifiers.
Fixes: ed754e5deeb1 ("nvme: track shared namespaces") Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Keith Busch [Tue, 15 Feb 2022 15:37:19 +0000 (07:37 -0800)]
nvme: remove nssa from struct nvme_ctrl
The reported number of streams is not used outside the function that
gets it, so no need to stash it in the controller structure. Use a local
variable instead.
Signed-off-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
Keith Busch [Tue, 15 Feb 2022 15:03:08 +0000 (07:03 -0800)]
nvme: explicitly set non-error for directives
Stream directives is an optional feature. It is not an error if a
controller doesn't support as many as the kernel can optionally use.
Explicitly set the non-error return value on this condition with a
comment explaining why.
Note, the return value was already 0 in this condition, so the setting
is redundant. This patch should just silence bots that falsely believe
the condition contains an error omission.
Signed-off-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
Martin Belanger [Tue, 8 Feb 2022 19:33:46 +0000 (14:33 -0500)]
nvme: expose cntrltype and dctype through sysfs
TP8010 introduces the Discovery Controller Type attribute (dctype).
The dctype is returned in the response to the Identify command. This
patch exposes the dctype through the sysfs. Since the dctype depends on
the Controller Type (cntrltype), another attribute of the Identify
response, the patch also exposes the cntrltype as well. The dctype will
only be displayed for discovery controllers.
A note about the naming of this attribute:
Although TP8010 calls this attribute the Discovery Controller Type,
note that the dctype is now part of the response to the Identify
command for all controller types. I/O, Discovery, and Admin controllers
all share the same Identify response PDU structure. Non-discovery
controllers as well as pre-TP8010 discovery controllers will continue
to set this field to 0 (which has always been the default for reserved
bytes). Per TP8010, the value 0 now means "Discovery controller type is
not reported" instead of "Reserved". One could argue that this
definition is correct even for non-discovery controllers, and by
extension, exposing it in the sysfs for non-discovery controllers is
appropriate.
Signed-off-by: Martin Belanger <martin.belanger@dell.com> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: John Meneghini <jmeneghi@redhat.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de>
Martin Belanger [Tue, 8 Feb 2022 19:33:45 +0000 (14:33 -0500)]
nvme: send uevent on connection up
When connectivity with a controller is lost, the driver will keep
trying to reconnect once every 10 sec. When connection is restored,
user-space apps need to be informed so that they can take proper
action. For example, TP8010 introduces the DIM PDU, which is used to
register with a discovery controller (DC). The DIM PDU is sent from
user-space. The DIM PDU must be sent every time a connection is
established with a DC. Therefore, the kernel must tell user-space apps
when connection is restored so that registration can happen.
The uevent sent is a "change" uevent with environmental data
set to: "NVME_EVENT=connected".
Signed-off-by: Martin Belanger <martin.belanger@dell.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: John Meneghini <jmeneghi@redhat.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de>
Kanchan Joshi [Thu, 10 Feb 2022 05:37:55 +0000 (11:07 +0530)]
nvme: add vectored-io support for user-passthrough
Add a new NVME_IOCTL_IO64_CMD_VEC ioctl that works like the existing
NVME_IOCTL_IO64_CMD ioctl except that it takes and array of iovecs
and thus supports vectored I/O.
- cmd.addr is base address of user iovec array
- cmd.vec_cnt is count of iovec array elements
This patch does not include vectored-variant for admin-commands as most
of them are light on buffers and likely to have low invocation frequency.
Alan Adamson [Thu, 3 Feb 2022 08:11:53 +0000 (00:11 -0800)]
nvme: add verbose error logging
Improves logging of NVMe errors. If NVME_VERBOSE_ERRORS is configured,
a verbose description of the error is logged, otherwise only status
codes/bits is logged.
Signed-off-by: Chaitanya Kulkarni <kch@nvidia.com>
[kch]: fix several nits, cosmetics, and trim down code. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Alan Adamson <alan.adamson@oracle.com> Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Christoph Hellwig <hch@lst.de>
Chaitanya Kulkarni [Wed, 2 Feb 2022 09:04:45 +0000 (01:04 -0800)]
nvmet: allow bdev in buffered_io mode
Allow block device to be configured in the buffered I/O mode by using
the file backend. In this way now we can use cache for the block
device namespace which shows significant performance improvement.
We update the block device ns enable function and return early when
buffered_io flag is set.
Signed-off-by: Chaitanya Kulkarni <kch@nvidia.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
Chaitanya Kulkarni [Wed, 2 Feb 2022 09:04:44 +0000 (01:04 -0800)]
nvmet: use i_size_read() to set size for file-ns
Instead of calling vfs_getattr() use i_size_read() to read the size of
file so we can read the size of not only file type but also block type
with one call. This is needed to implement buffered_io support for the
NVMeOF block device backend.
We also change return type of function nvmet_file_ns_revalidate() from
int to void, since this function does not return any meaning value.
Signed-off-by: Chaitanya Kulkarni <kch@nvidia.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
Chaitanya Kulkarni [Wed, 12 Jan 2022 06:20:57 +0000 (22:20 -0800)]
nvme-fabrics: use consistent zeroout pattern
Remove zeroout memeset call & zeroout local variable cmd at the time
of declaration in nvmf_ref_read32() similar to what we have done in
nvmf_reg_read64(), nvmf_reg_write32(), nvmf_connect_admin_queue(), and
nvmf_connect_io_queue().
Signed-off-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Christoph Hellwig <hch@lst.de>
Jens Axboe [Sun, 27 Feb 2022 21:50:28 +0000 (14:50 -0700)]
Merge branch 'for-5.18/drivers' into for-next
* for-5.18/drivers:
null_blk: null_alloc_page() cleanup
null_blk: remove hardcoded null_alloc_page() param
null_blk: remove hardcoded alloc_cmd() parameter
loop: allow user to set the queue depth
loop: remove extra variable in lo_req_flush
loop: remove extra variable in lo_fallocate()
loop: use sysfs_emit() in the sysfs xxx show()
null_blk: fix return value from null_add_dev()
loop: clean up grammar in warning message
block/rnbd: Remove a useless mutex
block/rnbd: client device does not care queue/rotational
block/rnbd-clt: fix CHECK:BRACES warning
Jens Axboe [Sun, 27 Feb 2022 21:50:25 +0000 (14:50 -0700)]
Merge branch 'for-5.18/block' into for-next
* for-5.18/block: (81 commits)
block: default BLOCK_LEGACY_AUTOLOAD to y
block: update io_ticks when io hang
block, bfq: don't move oom_bfqq
block, bfq: avoid moving bfqq to it's parent bfqg
block, bfq: cleanup bfq_bfqq_to_bfqg()
block/bfq_wf2q: correct weight to ioprio
blk-mq: avoid extending delays of active hctx from blk_mq_delay_run_hw_queues
virtio_blk: simplify refcounting
memstick/mspro_block: simplify refcounting
memstick/mspro_block: fix handling of read-only devices
memstick/ms_block: simplify refcounting
block: add a ->free_disk method
block: revert 4f1e9630afe6 ("blk-throtl: optimize IOPS throttle for large IO scenarios")
block: don't try to throttle split bio if iops limit isn't set
block: throttle split bio in case of iops limit
block: merge submit_bio_checks() into submit_bio_noacct
block: don't check bio in blk_throtl_dispatch_work_fn
block: don't declare submit_bio_checks in local header
block: move blk_crypto_bio_prep() out of blk-mq.c
block: move submit_bio_checks() into submit_bio_noacct
...
Chaitanya Kulkarni [Tue, 22 Feb 2022 15:28:52 +0000 (07:28 -0800)]
null_blk: null_alloc_page() cleanup
Remove goto labels and use direct returns as error unwinding code only
needs to free t_page variable if we alloc_pages() call fails as having
two labels for one kfree() can be avoided easily.
Only caller of null_alloc_page() is null_insert_page() unconditionally
sets only parameter to GFP_NOIO and that is statically hard-coded in
null_blk. There is no point in having statically hardcoded function
parameter.
Remove the unnecessary parameter gfp_flags and adjust the code, so it
can retain existing behavior null_alloc_page() with GFP_NOIO.
Chaitanya Kulkarni [Wed, 16 Feb 2022 17:29:45 +0000 (09:29 -0800)]
null_blk: remove hardcoded alloc_cmd() parameter
Only caller of alloc_cmd() is null_submit_bio() unconditionally sets
second parameter to true and that is statically hard-coded in null_blk.
There is no point in having statically hardcoded function parameter.
Remove the unnecessary parameter can_wait and adjust the code so it
can retain existing behavior of waiting when we don't get valid
nullb_cmd from __alloc_cmd() in alloc_cmd().
The restructured code avoids multiple return statements, multiple
calls to __alloc_cmd() and resulting a fast path call to
prepare_to_wait() due to removal of first alloc_cmd() call.
Follow the pattern that we have in bio_alloc() to set the structure
members in the structure allocation function in alloc_cmd() and pass
bio to initialize newly allocated cmd->bio member.
Follow the pattern in copy_to_nullb() to use result of one function call
(null_cache_active()) to be used as a parameter to another function call
(null_insert_page()), use result of alloc_cmd() as a first parameter to
the null_handle_cmd() in null_submit_bio() function. This allow us to
remove the local variable cmd on stack in null_submit_bio() that is in
fast path.
Chaitanya Kulkarni [Tue, 15 Feb 2022 21:33:10 +0000 (13:33 -0800)]
loop: allow user to set the queue depth
Instead of hardcoding queue depth allow user to set the hw queue depth
using module parameter. Set default value to 128 to retain the existing
behavior.
Chaitanya Kulkarni [Tue, 15 Feb 2022 21:33:09 +0000 (13:33 -0800)]
loop: remove extra variable in lo_req_flush
The local variable file is used to pass it to the vfs_fsync(). We can
get away with using lo->lo_backing_file instead of storing in a local
variable which is not used anywhere else.
Chaitanya Kulkarni [Tue, 15 Feb 2022 21:33:08 +0000 (13:33 -0800)]
loop: remove extra variable in lo_fallocate()
The local variable q is used to pass it to the blk_queue_discard(). We
can get away with using lo->lo_queue instead of storing in a local
variable which is not used anywhere else.
Chaitanya Kulkarni [Tue, 15 Feb 2022 21:33:07 +0000 (13:33 -0800)]
loop: use sysfs_emit() in the sysfs xxx show()
sprintf does not know the PAGE_SIZE maximum of the temporary buffer
used for outputting sysfs content and it's possible to overrun the
PAGE_SIZE buffer length.
Use a generic sysfs_emit function that knows the size of the
temporary buffer and ensures that no overrun is done for offset
attribute in
loop_attr_[offset|sizelimit|autoclear|partscan|dio]_show() callbacks.
Chaitanya Kulkarni [Tue, 15 Feb 2022 11:59:51 +0000 (03:59 -0800)]
null_blk: fix return value from null_add_dev()
The function nullb_device_power_store() returns -ENOMEM when
null_add_dev() fails. null_add_dev() can fail with return value
other than -ENOMEM such as -EINVAL when Zoned Block Device option
is used, see :
Christoph Hellwig [Fri, 25 Feb 2022 18:14:40 +0000 (19:14 +0100)]
block: default BLOCK_LEGACY_AUTOLOAD to y
As Luis reported, losetup currently doesn't properly create the loop
device without this if the device node already exists because old
scripts created it manually. So default to y for now and remove the
aggressive removal schedule.
Reported-by: Luis Chamberlain <mcgrof@kernel.org> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Link: https://lore.kernel.org/r/20220225181440.1351591-1-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jens Axboe [Sat, 26 Feb 2022 14:09:10 +0000 (07:09 -0700)]
Merge branch 'for-5.18/io_uring' into for-next
* for-5.18/io_uring:
io_uring: documentation fixup
io_uring: do not recalculate ppos unnecessarily
io_uring: update kiocb->ki_pos at execution time
io_uring: remove duplicated calls to io_kiocb_ppos
io_uring: Remove unneeded test in io_run_task_work_sig()
io-uring: Make tracepoints consistent.
io-uring: add __fill_cqe function
io-wq: use IO_WQ_ACCT_NR rather than hardcoded number
io-wq: reduce acct->lock crossing functions lock/unlock
io-wq: decouple work_list protection from the big wqe->lock
io_uring: Fix use of uninitialized ret in io_eventfd_register()
io_uring: remove ring quiesce for io_uring_register
io_uring: avoid ring quiesce while registering restrictions and enabling rings
io_uring: avoid ring quiesce while registering async eventfd
io_uring: avoid ring quiesce while registering/unregistering eventfd
io_uring: remove trace for eventfd
Stefan Roesch [Fri, 25 Feb 2022 18:53:26 +0000 (10:53 -0800)]
io-uring: Make statx API stable
One of the key architectual tenets is to keep the parameters for
io-uring stable. After the call has been submitted, its value can
be changed. Unfortunaltely this is not the case for the current statx
implementation.
IO-Uring change:
This changes replaces the const char * filename pointer in the io_statx
structure with a struct filename *. In addition it also creates the
filename object during the prepare phase.
With this change, the opcode also needs to invoke cleanup, so the
filename object gets freed after processing the request.
fs change:
This replaces the const char* __user filename parameter in the two
functions do_statx and vfs_statx with a struct filename *. In addition
to be able to correctly construct a filename object a new helper
function getname_statx_lookup_flags is introduced. The function makes
sure that do_statx and vfs_statx is invoked with the correct lookup flags.
Linus Torvalds [Fri, 25 Feb 2022 20:12:06 +0000 (12:12 -0800)]
Merge tag 'char-misc-5.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc driver fixes from Greg KH:
"Here are a few small driver fixes for 5.17-rc6 for reported issues.
The majority of these are IIO fixes for small things, and the other
two are a mvmem and mtd core conflict fix.
All of these have been in linux-next with no reported issues"
* tag 'char-misc-5.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
mtd: core: Fix a conflict between MTD and NVMEM on wp-gpios property
nvmem: core: Fix a conflict between MTD and NVMEM on wp-gpios property
iio: imu: st_lsm6dsx: wait for settling time in st_lsm6dsx_read_oneshot
iio: Fix error handling for PM
iio: addac: ad74413r: correct comparator gpio getters mask usage
iio: addac: ad74413r: use ngpio size when iterating over mask
iio: addac: ad74413r: Do not reference negative array offsets
iio: adc: men_z188_adc: Fix a resource leak in an error handling path
iio: frequency: admv1013: remove the always true condition
iio: accel: fxls8962af: add padding to regmap for SPI
iio:imu:adis16480: fix buffering for devices with no burst mode
iio: adc: ad7124: fix mask used for setting AIN_BUFP & AIN_BUFM bits
iio: adc: tsc2046: fix memory corruption by preventing array overflow
Linus Torvalds [Fri, 25 Feb 2022 20:05:40 +0000 (12:05 -0800)]
Merge tag 'driver-core-5.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core fix from Greg KH:
"Here is a single driver core fix for 5.17-rc6. It resolves a reported
problem when the DMA map of a device is not properly released.
It has been in linux-next with no reported problems"
* tag 'driver-core-5.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
driver core: Free DMA range map when device is released
Linus Torvalds [Fri, 25 Feb 2022 19:56:16 +0000 (11:56 -0800)]
Merge tag 'staging-5.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
Pull staging driver fix from Greg KH:
"Here is a single staging driver fix for 5.17-rc6.
It resolves a reported problem in the fbtft fb_st7789v.c driver that
could cause the display to be flipped in cold weather.
It has been in linux-next with no reported problems"
* tag 'staging-5.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
staging: fbtft: fb_st7789v: reset display before initialization
Linus Torvalds [Fri, 25 Feb 2022 19:45:29 +0000 (11:45 -0800)]
Merge tag 'tty-5.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Pull tty/serial driver fixes from Greg KH:
"Here are some small n_gsm and sc16is7xx serial driver fixes for
5.17-rc6.
The n_gsm fixes are from Siemens as it seems they are using the line
discipline and fixing up a number of issues they found in their
testing. The sc16is7xx serial driver fix is for a reported problem
with that chip.
All of these have been in linux-next with no reported problems"
* tag 'tty-5.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
sc16is7xx: Fix for incorrect data being transmitted
tty: n_gsm: fix deadlock in gsmtty_open()
tty: n_gsm: fix wrong modem processing in convergence layer type 2
tty: n_gsm: fix wrong tty control line for flow control
tty: n_gsm: fix NULL pointer access due to DLCI release
tty: n_gsm: fix proper link termination after failed open
tty: n_gsm: fix encoding of command/response bit
tty: n_gsm: fix encoding of control signal octet bit DV
Linus Torvalds [Fri, 25 Feb 2022 19:36:31 +0000 (11:36 -0800)]
Merge tag 'usb-5.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB fixes from Greg KH:
"Here are a number of small USB driver fixes for 5.17-rc6 to resolve
reported problems and add new device ids. They include:
All of these have been in linux-next with no reported problems"
* tag 'usb-5.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
usb: gadget: rndis: add spinlock for rndis response list
usb: dwc3: gadget: Let the interrupt handler disable bottom halves.
USB: gadget: validate endpoint index for xilinx udc
USB: serial: option: add Telit LE910R1 compositions
USB: serial: option: add support for DW5829e
Revert "USB: serial: ch341: add new Product ID for CH341A"
usb: dwc2: drd: fix soft connect when gadget is unconfigured
usb: dwc3: pci: Fix Bay Trail phy GPIO mappings
tps6598x: clear int mask on probe failure
xhci: Prevent futile URB re-submissions due to incorrect return value.
xhci: re-initialize the HC during resume if HCE was set
usb: dwc3: pci: Add "snps,dis_u2_susphy_quirk" for Intel Bay Trail
usb: dwc3: pci: add support for the Intel Raptor Lake-S
amdgpu:
- Display FP fix
- PCO powergating fix
- RDNA2 OEM SKU stability fixes
- Display PSR fix
- PCI ASPM fix
- Display link encoder fix for TEST_COMMIT
- Raven2 suspend/resume fix
- Fix a regression in virtual display support
- GPUVM eviction fix
i915:
- Fix QGV handling on ADL-P+
- Fix bw atomic check when switching between SAGV vs. no SAGV
- Disconnect PHYs left connected by BIOS on disabled ports
- Fix SAVG to no SAGV transitions on TGL+
- Print PHY name properly on calibration error (DG2)
* tag 'drm-fixes-2022-02-25' of git://anongit.freedesktop.org/drm/drm: (24 commits)
drm/amdgpu: check vm ready by amdgpu_vm->evicting flag
drm/amdgpu: bypass tiling flag check in virtual display case (v2)
Revert "drm/amdgpu: add modifiers in amdgpu_vkms_plane_init()"
drm/amdgpu: do not enable asic reset for raven2
drm/amd/display: Fix stream->link_enc unassigned during stream removal
drm/amd: Check if ASPM is enabled from PCIe subsystem
drm/edid: Always set RGB444
drm/tegra: dpaux: Populate AUX bus
drm/radeon: fix variable type
drm/amd/display: For vblank_disable_immediate, check PSR is really used
drm/amd/pm: fix some OEM SKU specific stability issues
drm/amdgpu: disable MMHUB PG for Picasso
drm/amd/display: Protect update_bw_bounding_box FPU code.
drm/i915/dg2: Print PHY name properly on calibration error
drm/i915: Fix bw atomic check when switching between SAGV vs. no SAGV
drm/i915: Correctly populate use_sagv_wm for all pipes
drm/i915: Disconnect PHYs left connected by BIOS on disabled ports
drm/i915: Widen the QGV point mask
drm/imx/dcss: i.MX8MQ DCSS select DRM_GEM_CMA_HELPER
drm/vc4: crtc: Fix runtime_pm reference counting
...
Linus Torvalds [Thu, 24 Feb 2022 22:36:38 +0000 (14:36 -0800)]
Merge tag 'perf-tools-fixes-for-v5.17-2022-02-24' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
Pull perf tools fixes from Arnaldo Carvalho de Melo:
- Fix double free in in the error path when opening perf.data from
multiple files in a directory instead of from a single file
- Sync the msr-index.h copy with the kernel sources
- Fix error when printing 'weight' field in 'perf script'
- Skip failing sigtrap test for arm+aarch64 in 'perf test'
- Fix failure to use a cpu list for uncore events in hybrid systems,
e.g. Intel Alder Lake
* tag 'perf-tools-fixes-for-v5.17-2022-02-24' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
perf script: Fix error when printing 'weight' field
tools arch x86: Sync the msr-index.h copy with the kernel sources
perf data: Fix double free in perf_session__delete()
perf evlist: Fix failed to use cpu list for uncore events
perf test: Skip failing sigtrap test for arm+aarch64
Linus Torvalds [Thu, 24 Feb 2022 22:05:49 +0000 (14:05 -0800)]
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
"x86 host:
- Expose KVM_CAP_ENABLE_CAP since it is supported
- Disable KVM_HC_CLOCK_PAIRING in TSC catchup mode
- Ensure async page fault token is nonzero
- Fix lockdep false negative
- Fix FPU migration regression from the AMX changes
x86 guest:
- Don't use PV TLB/IPI/yield on uniprocessor guests
PPC:
- reserve capability id (topic branch for ppc/kvm)"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: x86: nSVM: disallow userspace setting of MSR_AMD64_TSC_RATIO to non default value when tsc scaling disabled
KVM: x86/mmu: make apf token non-zero to fix bug
KVM: PPC: reserve capability 210 for KVM_CAP_PPC_AIL_MODE_3
x86/kvm: Don't use pv tlb/ipi/sched_yield if on 1 vCPU
x86/kvm: Fix compilation warning in non-x86_64 builds
x86/kvm/fpu: Remove kvm_vcpu_arch.guest_supported_xcr0
x86/kvm/fpu: Limit guest user_xfeatures to supported bits of XCR0
kvm: x86: Disable KVM_HC_CLOCK_PAIRING if tsc is in always catchup mode
KVM: Fix lockdep false negative during host resume
KVM: x86: Add KVM_CAP_ENABLE_CAP to x86
Linus Torvalds [Thu, 24 Feb 2022 21:19:57 +0000 (13:19 -0800)]
Merge tag 'pci-v5.17-fixes-5' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull pci fixes from Bjorn Helgaas:
- Fix a merge error that broke PCI device enumeration on mvebu
platforms, including Turris Omnia (Armada 385) (Pali Rohár)
- Avoid using ATS on all AMD Navi10 and Navi14 GPUs because some
VBIOSes don't account for "harvested" (disabled) parts of the chip
when initializing caches (Alex Deucher)
* tag 'pci-v5.17-fixes-5' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
PCI: Mark all AMD Navi10 and Navi14 GPU ATS as broken
PCI: mvebu: Fix device enumeration regression
Linus Torvalds [Thu, 24 Feb 2022 20:45:32 +0000 (12:45 -0800)]
Merge tag 'net-5.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Including fixes from bpf and netfilter.
Current release - regressions:
- bpf: fix crash due to out of bounds access into reg2btf_ids
- mvpp2: always set port pcs ops, avoid null-deref
- eth: marvell: fix driver load from initrd
- eth: intel: revert "Fix reset bw limit when DCB enabled with 1 TC"
Current release - new code bugs:
- mptcp: fix race in overlapping signal events
Previous releases - regressions:
- xen-netback: revert hotplug-status changes causing devices to not
be configured
- dsa:
- avoid call to __dev_set_promiscuity() while rtnl_mutex isn't
held
- fix panic when removing unoffloaded port from bridge
- dsa: microchip: fix bridging with more than two member ports
Previous releases - always broken:
- bpf:
- fix crash due to incorrect copy_map_value when both spin lock
and timer are present in a single value
- fix a bpf_timer initialization issue with clang
- do not try bpf_msg_push_data with len 0
- add schedule points in batch ops
- nf_tables:
- unregister flowtable hooks on netns exit
- correct flow offload action array size
- fix a couple of memory leaks
- vsock: don't check owner in vhost_vsock_stop() while releasing
- gso: do not skip outer ip header in case of ipip and net_failover
- smc: use a mutex for locking "struct smc_pnettable"
- mptcp: fix race in incoming ADD_ADDR option processing
- sysfs: add check for netdevice being present to speed_show
- sched: act_ct: fix flow table lookup after ct clear or switching
zones
- eth: intel: fixes for SR-IOV forwarding offloads
- eth: broadcom: fixes for selftests and error recovery
- eth: mellanox: flow steering and SR-IOV forwarding fixes
Misc:
- make __pskb_pull_tail() & pskb_carve_frag_list() drop_monitor
friends not report freed skbs as drops
- force inlining of checksum functions in net/checksum.h"
* tag 'net-5.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (85 commits)
net: mv643xx_eth: process retval from of_get_mac_address
ping: remove pr_err from ping_lookup
Revert "i40e: Fix reset bw limit when DCB enabled with 1 TC"
openvswitch: Fix setting ipv6 fields causing hw csum failure
ipv6: prevent a possible race condition with lifetimes
net/smc: Use a mutex for locking "struct smc_pnettable"
bnx2x: fix driver load from initrd
Revert "xen-netback: Check for hotplug-status existence before watching"
Revert "xen-netback: remove 'hotplug-status' once it has served its purpose"
net/mlx5e: Fix VF min/max rate parameters interchange mistake
net/mlx5e: Add missing increment of count
net/mlx5e: MPLSoUDP decap, fix check for unsupported matches
net/mlx5e: Fix MPLSoUDP encap to use MPLS action information
net/mlx5e: Add feature check for set fec counters
net/mlx5e: TC, Skip redundant ct clear actions
net/mlx5e: TC, Reject rules with forward and drop actions
net/mlx5e: TC, Reject rules with drop and modify hdr action
net/mlx5e: kTLS, Use CHECKSUM_UNNECESSARY for device-offloaded packets
net/mlx5e: Fix wrong return value on ioctl EEPROM query failure
net/mlx5: Fix possible deadlock on rule deletion
...
Dave Airlie [Thu, 24 Feb 2022 19:51:04 +0000 (05:51 +1000)]
Merge tag 'drm-intel-fixes-2022-02-24' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
- Fix QGV handling on ADL-P+ (Ville Syrjälä)
- Fix bw atomic check when switching between SAGV vs. no SAGV (Ville Syrjälä)
- Disconnect PHYs left connected by BIOS on disabled ports (Imre Deak)
- Fix SAVG to no SAGV transitions on TGL+ (Ville Syrjälä)
- Print PHY name properly on calibration error (DG2) (Matt Roper)
Linus Torvalds [Thu, 24 Feb 2022 19:15:10 +0000 (11:15 -0800)]
Merge tag 'block-5.17-2022-02-24' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
- NVMe pull request:
- send H2CData PDUs based on MAXH2CDATA (Varun Prakash)
- fix passthrough to namespaces with unsupported features (Christoph
Hellwig)
- Clear iocb->private at poll completion (Stefano)
* tag 'block-5.17-2022-02-24' of git://git.kernel.dk/linux-block:
nvme-tcp: send H2CData PDUs based on MAXH2CDATA
nvme: also mark passthrough-only namespaces ready in nvme_update_ns_info
nvme: don't return an error from nvme_configure_metadata
block: clear iocb->private in blkdev_bio_end_io_async()
Linus Torvalds [Thu, 24 Feb 2022 19:08:15 +0000 (11:08 -0800)]
Merge tag 'io_uring-5.17-2022-02-23' of git://git.kernel.dk/linux-block
Pull io_uring fixes from Jens Axboe:
- Add a conditional schedule point in io_add_buffers() (Eric)
- Fix for a quiesce speedup merged in this release (Dylan)
- Don't convert to jiffies for event timeout waiting, it's way too
coarse when we accept a timespec as input (me)
* tag 'io_uring-5.17-2022-02-23' of git://git.kernel.dk/linux-block:
io_uring: disallow modification of rsrc_data during quiesce
io_uring: don't convert to jiffies for waiting on timeouts
io_uring: add a schedule point in io_add_buffers()
Linus Torvalds [Thu, 24 Feb 2022 18:42:20 +0000 (10:42 -0800)]
Merge tag 'platform-drivers-x86-v5.17-4' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86
Pull more x86 platform driver fixes from Hans de Goede:
"Two more fixes:
- Fix suspend/resume regression on AMD Cezanne APUs in >= 5.16
- Fix Microsoft Surface 3 battery readings"
* tag 'platform-drivers-x86-v5.17-4' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86:
surface: surface3_power: Fix battery readings on batteries without a serial number
platform/x86: amd-pmc: Set QOS during suspend on CZN w/ timer wakeup
Mauri Sandberg [Wed, 23 Feb 2022 14:23:37 +0000 (16:23 +0200)]
net: mv643xx_eth: process retval from of_get_mac_address
Obtaining a MAC address may be deferred in cases when the MAC is stored
in an NVMEM block, for example, and it may not be ready upon the first
retrieval attempt and return EPROBE_DEFER.
It is also possible that a port that does not rely on NVMEM has been
already created when getting the defer request. Thus, also the resources
allocated previously must be freed when doing a roll-back.
Fixes: 76723bca2802 ("net: mv643xx_eth: add DT parsing support") Signed-off-by: Mauri Sandberg <maukka@ext.kapsi.fi> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/20220223142337.41757-1-maukka@ext.kapsi.fi Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Maxim Levitsky [Wed, 23 Feb 2022 11:56:49 +0000 (13:56 +0200)]
KVM: x86: nSVM: disallow userspace setting of MSR_AMD64_TSC_RATIO to non default value when tsc scaling disabled
If nested tsc scaling is disabled, MSR_AMD64_TSC_RATIO should
never have non default value.
Due to way nested tsc scaling support was implmented in qemu,
it would set this msr to 0 when nested tsc scaling was disabled.
Ignore that value for now, as it causes no harm.
Liang Zhang [Tue, 22 Feb 2022 03:12:39 +0000 (11:12 +0800)]
KVM: x86/mmu: make apf token non-zero to fix bug
In current async pagefault logic, when a page is ready, KVM relies on
kvm_arch_can_dequeue_async_page_present() to determine whether to deliver
a READY event to the Guest. This function test token value of struct
kvm_vcpu_pv_apf_data, which must be reset to zero by Guest kernel when a
READY event is finished by Guest. If value is zero meaning that a READY
event is done, so the KVM can deliver another.
But the kvm_arch_setup_async_pf() may produce a valid token with zero
value, which is confused with previous mention and may lead the loss of
this READY event.
Xin Long [Thu, 24 Feb 2022 03:41:08 +0000 (22:41 -0500)]
ping: remove pr_err from ping_lookup
As Jakub noticed, prints should be avoided on the datapath.
Also, as packets would never come to the else branch in
ping_lookup(), remove pr_err() from ping_lookup().
Mateusz Palczewski [Wed, 23 Feb 2022 17:53:47 +0000 (09:53 -0800)]
Revert "i40e: Fix reset bw limit when DCB enabled with 1 TC"
Revert of a patch that instead of fixing a AQ error when trying
to reset BW limit introduced several regressions related to
creation and managing TC. Currently there are errors when creating
a TC on both PF and VF.
Error log:
[17428.783095] i40e 0000:3b:00.1: AQ command Config VSI BW allocation per TC failed = 14
[17428.783107] i40e 0000:3b:00.1: Failed configuring TC map 0 for VSI 391
[17428.783254] i40e 0000:3b:00.1: AQ command Config VSI BW allocation per TC failed = 14
[17428.783259] i40e 0000:3b:00.1: Unable to configure TC map 0 for VSI 391
Ipv6 ttl, label and tos fields are modified without first
pulling/pushing the ipv6 header, which would have updated
the hw csum (if available). This might cause csum validation
when sending the packet to the stack, as can be seen in
the trace below.
Niels Dossche [Wed, 23 Feb 2022 13:19:56 +0000 (14:19 +0100)]
ipv6: prevent a possible race condition with lifetimes
valid_lft, prefered_lft and tstamp are always accessed under the lock
"lock" in other places. Reading these without taking the lock may result
in inconsistencies regarding the calculation of the valid and preferred
variables since decisions are taken on these fields for those variables.
Fabio M. De Francesco [Wed, 23 Feb 2022 10:02:52 +0000 (11:02 +0100)]
net/smc: Use a mutex for locking "struct smc_pnettable"
smc_pnetid_by_table_ib() uses read_lock() and then it calls smc_pnet_apply_ib()
which, in turn, calls mutex_lock(&smc_ib_devices.mutex).
read_lock() disables preemption. Therefore, the code acquires a mutex while in
atomic context and it leads to a SAC bug.
Fix this bug by replacing the rwlock with a mutex.
Reported-and-tested-by: syzbot+4f322a6d84e991c38775@syzkaller.appspotmail.com Fixes: 64e28b52c7a6 ("net/smc: add pnet table namespace support") Confirmed-by: Tony Lu <tonylu@linux.alibaba.com> Signed-off-by: Fabio M. De Francesco <fmdefrancesco@gmail.com> Acked-by: Karsten Graul <kgraul@linux.ibm.com> Link: https://lore.kernel.org/r/20220223100252.22562-1-fmdefrancesco@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Manish Chopra [Wed, 23 Feb 2022 08:57:20 +0000 (00:57 -0800)]
bnx2x: fix driver load from initrd
Commit b7a49f73059f ("bnx2x: Utilize firmware 7.13.21.0") added
new firmware support in the driver with maintaining older firmware
compatibility. However, older firmware was not added in MODULE_FIRMWARE()
which caused missing firmware files in initrd image leading to driver load
failure from initrd. This patch adds MODULE_FIRMWARE() for older firmware
version to have firmware files included in initrd.
The reasoning in the commit was wrong - the code expected to setup the
watch even if 'hotplug-status' didn't exist. In fact, it relied on the
watch being fired the first time - to check if maybe 'hotplug-status' is
already set to 'connected'. Not registering a watch for non-existing
path (which is the case if hotplug script hasn't been executed yet),
made the backend not waiting for the hotplug script to execute. This in
turns, made the netfront think the interface is fully operational, while
in fact it was not (the vif interface on xen-netback side might not be
configured yet).
This was a workaround for 'hotplug-status' erroneously being removed.
But since that is reverted now, the workaround is not necessary either.
More discussion at
https://lore.kernel.org/xen-devel/afedd7cb-a291-e773-8b0d-4db9b291fa98@ipxe.org/T/#u