www.infradead.org Git - users/sagi/blktests.git/log

nvme/rc,srp/rc,common/multipath-over-rdma: rename use_rxe to USE_RXE

To follow uppercase letter guide of environment variables, rename
use_rxe to USE_RXE.

Reviewed-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>

nvme/{rc,016,017}: rename nvme_num_iter to NVME_NUM_ITER

To follow uppercase letter guide of environment variables, rename
nvme_num_iter to NVME_NUM_ITER.

Reviewed-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>

nvme/{rc,010,017,031,034,035}: rename nvme_img_size to NVME_IMG_SIZE

To follow uppercase letter guide of environment variables, rename
nvme_img_size to NVME_IMG_SIZE.

Reviewed-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>

nvme/{021,022,025,026,027,028}: do not hard code target blkdev type

There is no need to hardcode the target blkdev type. This allows
the user to select different blkdev types via the nvmet_blkdev_type
environment variable. Also modify set_conditions() hooks to cover
combinations of NVMET_TRTYPES and NVMET_BLKDEV_TYPES.

Signed-off-by: Daniel Wagner <dwagner@suse.de>
[Shin'ichiro: modified set_conditions()]
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>

nvme/{007,009,011,013,015,020,024}: drop duplicate nvmet blkdev type tests

There are various tests which differ only in the blkdev type of the
target. With the newly added feature which allows controlling the target
blkdev type via the environment, these duplicate tests are no longer
necessary. Removing them reduces the maintenance overhead.

The removed tests are covered by the other test cases nvme/006, 008,
010, 012, 014, 019 and 023 using the 'file' blkdev type.

Signed-off-by: Daniel Wagner <dwagner@suse.de>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Acked-by: Nitesh Shetty <nj.shetty@samsung.com>
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>

nvme/{006,008,010,012,014,019,023}: support NVMET_BLKDEV_TYPES

Enable repeated test runs for the listed test cases for
NVMET_BLKDEV_TYPES. The default value of NVMET_BLKDEV_TYPES is
"device file". With this default setup, each of the listed test cases
is run twice. The second runs of the test cases, for the 'file' blkdev
type, do the exact same tests as the other test cases nvme/007, 009, 011,
013, 015, 020 and 024.

The test cases already support the repetition for NVMET_TRTYPES. Modify
the set_conditions() hooks to call both NVMET_BLKDEV_TYPES and
NVMET_TRTYPES using _set_combined_conditions(). When NVMET_BLKDEV_TYPES
and NVMET_TRTYPES are set as follows, the test cases are repeated
2 x 3 = 6 times each.

NVMET_BLKDEV_TYPES="device file"
NVMET_TRTYPES="loop rdma tcp"

Reviewed-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
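
As a rough illustration of the 2 x 3 = 6 count in the commit message
above, the nested loop below lists every combination of the two
settings. The function name list_combinations is hypothetical and is not
part of blktests; only the two variable values are taken from the commit
message.

```shell
# Values copied from the commit message above.
NVMET_BLKDEV_TYPES="device file"
NVMET_TRTYPES="loop rdma tcp"

# Print one line per (blkdev type, transport type) combination.
# Hypothetical helper, for illustration only.
list_combinations() {
	local blkdev trtype
	for blkdev in $NVMET_BLKDEV_TYPES; do
		for trtype in $NVMET_TRTYPES; do
			echo "blkdev=$blkdev trtype=$trtype"
		done
	done
}

list_combinations
```

Each printed line corresponds to one repetition of a test case.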

nvme/{002-031,033-038,040-045,047,048}: support NVMET_TRTYPES

Add a set_conditions() hook and call _set_nvme_trtype() so that the test
cases are repeated for NVMET_TRTYPES.

Reviewed-by: Daniel Wagner <dwagner@suse.de>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Acked-by: Nitesh Shetty <nj.shetty@samsung.com>
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>

nvme/rc: introduce NVMET_BLKDEV_TYPES

Some of the test cases in the nvme test group do the exact same test for
two blkdev types: device type and file type. Except for this difference,
the test cases are pure duplication, which is best avoided. Once the
duplication is removed, a way is needed to control which blkdev type the
test runs with.

To avoid the duplication and also to allow the blkdev type control,
introduce a new configuration parameter NVMET_BLKDEV_TYPES. This
parameter specifies which blkdev types to set up for the tests. It can
take one of the blkdev types, or it can take both types, which is the
default. When both types are specified, the test cases are run twice to
cover the types.

Also add the helper function _set_nvmet_blkdev_type(). It sets up
nvmet_blkdev_type variable for each test case run from
NVMET_BLKDEV_TYPES.

Reviewed-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>

nvme/rc: add blkdev type environment variable

Introduce the nvmet_blkdev_type environment variable, which allows
controlling the target setup. This allows us to drop duplicate tests
which differ only in how the target is set up.

Signed-off-by: Daniel Wagner <dwagner@suse.de>
[Shin'ichiro: dropped description in Documentation/running-tests.md]
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Acked-by: Nitesh Shetty <nj.shetty@samsung.com>
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>

nvme/rc: introduce NVMET_TRTYPES

Some of the test cases in the nvme test group can be run under various
nvme target transport types. The configuration parameter nvme_trtype
specifies the transport to use. But this configuration method has two
drawbacks. Firstly, the blktests check script needs to be invoked
multiple times to cover multiple transport types. Secondly, the test
cases irrelevant to the transport types are executed under exactly the
same conditions in each of the multiple blktests runs.

To avoid the drawbacks, allow setting multiple transport types. Taking
this chance, rename the parameter from nvme_trtype to NVMET_TRTYPES to
follow the uppercase letter naming guide for environment variables.
NVMET_TRTYPES can take multiple transport types like:

NVMET_TRTYPES="loop tcp"

Introduce _nvmet_set_nvme_trtype() which can be called from the
set_conditions() hook of the transport type dependent test cases.
Blktests will repeat the test case as many times as the number of
elements in NVMET_TRTYPES, and set nvme_trtype for each test case run.

Also introduce _NVMET_TRTYPES_is_valid() to check NVMET_TRTYPES value
before test run.

Reviewed-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
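
A value check of the kind described in the commit message above could
look like the sketch below. The function name trtypes_are_valid and the
accepted list (loop, rdma, tcp, fc) are assumptions made for this
example; the real _NVMET_TRTYPES_is_valid() may accept a different set
and report errors differently.

```shell
# Return 0 if every word in $1 is a known transport type.
# Hypothetical helper; the accepted list below is an assumption.
trtypes_are_valid() {
	local trtype
	for trtype in $1; do
		case "$trtype" in
		loop|rdma|tcp|fc) ;;
		*)
			echo "invalid transport type: $trtype" >&2
			return 1
			;;
		esac
	done
	return 0
}

trtypes_are_valid "loop tcp" && echo "NVMET_TRTYPES looks valid"
```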

common/rc: introduce _check_conflict_and_set_default()

The following commits are going to rename some config option parameters
from lowercase letters to uppercase. The old lowercase options will be
deprecated but kept usable to avoid confusion. When these changes are
made, it will be required to check that the new and old parameters are
not both set at once, so that they cannot have two different values.

To simplify the code to check the two parameters, introduce the helper
_check_conflict_and_set_default(). If both parameters are set, it errors
out. If only the old option is set, it propagates the old option value
to the new option. When neither of them is set, it sets the default
value to the new option.

Reviewed-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
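
The three cases described in the commit message above can be sketched
as follows. This is not the blktests implementation: the function name,
the argument order, the use of "return 1" for the error case and the
example default "1G" are all assumptions made for illustration.

```shell
# Arguments: new variable name, old variable name, default value.
# Sketch only; name, argument order and error handling are assumptions.
check_conflict_and_set_default_sketch() {
	local new_name="$1" old_name="$2" default_val="$3"
	local new_val old_val

	# Read both variables indirectly by name.
	eval "new_val=\${$new_name:-}"
	eval "old_val=\${$old_name:-}"

	if [ -n "$new_val" ] && [ -n "$old_val" ]; then
		echo "Both $old_name and $new_name are set" >&2
		return 1
	fi
	if [ -z "$new_val" ]; then
		# The old value wins if present; otherwise use the default.
		eval "$new_name=\${old_val:-\$default_val}"
	fi
}

unset NVME_IMG_SIZE
nvme_img_size=350M
check_conflict_and_set_default_sketch NVME_IMG_SIZE nvme_img_size 1G
echo "NVME_IMG_SIZE=$NVME_IMG_SIZE"
```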

meta/{018,019}: add test cases to check _set_combined_conditions

Add test cases to confirm that the helper _set_combined_conditions is
working. meta/018 combines two hooks, and meta/019 combines three hooks.

Reviewed-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>

common/rc: introduce _set_combined_conditions

When the test case has the "set_conditions" hook, blktests repeats the
test case multiple times. This allows repeating the test changing one
condition parameter. However, it is often desired to run the test for
multiple condition parameters. For example, some test cases in the nvme
test group are required to run for different "transport types" as well
as different "backend block device types". In this case, it is required
to iterate over all combinations of the two condition parameters
"transport types" and "backend block device types".

To cover such iteration for the multiple condition parameters, introduce
the helper function _set_combined_conditions. It takes multiple
_set_conditions hooks as its arguments, combines them and works as the
set_conditions() hook. When the hook x iterates x1 and x2, and the other
hook y iterates y1 and y2, the function iterates (x1, y1), (x2, y1),
(x1, y2) and (x2, y2). In other words, it iterates over the Cartesian
product of the given condition sets.

Reviewed-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
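
One way to picture the iteration order in the commit message above is
to split a single combined index into one index per hook, with the first
hook varying fastest. The helper split_combined_index below is
hypothetical and only shows the arithmetic; it does not claim to mirror
how _set_combined_conditions is written.

```shell
# Arguments: combined index, then the number of conditions of each hook.
# Prints one zero-based index per hook, first hook varying fastest.
# Hypothetical helper, for illustration only.
split_combined_index() {
	local index="$1" count out=""
	shift
	for count in "$@"; do
		out="$out $((index % count))"
		index=$((index / count))
	done
	echo "${out# }"
}

# Two hooks with two conditions each: x = {x1, x2}, y = {y1, y2}.
for i in 0 1 2 3; do
	split_combined_index "$i" 2 2
done
```

With zero-based indices, the four printed pairs correspond to (x1, y1),
(x2, y1), (x1, y2) and (x2, y2), the order given in the commit message.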

meta/{016,017}: add test cases to check repeated test case runs

Add test cases to confirm the feature to repeat test case runs with
different conditions is working.

Reviewed-by: Daniel Wagner <dwagner@suse.de>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Acked-by: Nitesh Shetty <nj.shetty@samsung.com>
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>

check: use set_conditions() for the CAN_BE_ZONED test cases

When a test case with the test() function is marked as CAN_BE_ZONED,
blktests runs the test case twice: once for a non-zoned device, and once
for a zoned device. This is currently implemented as special logic in
the check script.

To simplify the implementation, use the feature to repeat test cases
with different conditions. Use set_conditions() and move out the special
logic from the check script to the common/zoned script file.

Reviewed-by: Daniel Wagner <dwagner@suse.de>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Acked-by: Nitesh Shetty <nj.shetty@samsung.com>
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>

check: support test case repeat by different conditions

It is often required to run the same test with slightly different test
conditions. If we create a separate test case for each test condition,
those test cases are almost the same and duplicate code. Such duplication
is seen in many of the nvme test cases that set up nvme transport.

To avoid the code duplication, introduce a new feature to support test
case repetition with different conditions. When a test case implements
the function set_conditions(), blktests repeats the test case. When
set_conditions() is called without an argument, it returns how many
times the test case is to be repeated. Before each test case run,
blktests calls set_conditions() with an argument number from 0 to the
number of repetitions minus 1. set_conditions() sets up the condition
for each test run referring to the argument as the index of the
condition to set up. set_conditions() also sets up a short string in
the COND_DESC variable. This string is printed to stdout to identify the
condition of each run. It is also used as the directory path name to
hold result files.

Document the usage of set_conditions() in the new script. Separate out
shellcheck command line for the new script to avoid a false-positive
warning unique to the file.

Reviewed-by: Daniel Wagner <dwagner@suse.de>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Acked-by: Nitesh Shetty <nj.shetty@samsung.com>
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
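
The calling convention described in the commit message above might be
sketched as below. The hook body, the condition descriptions and the
driver loop run_all_conditions are made up for this example, and it
assumes that the hook reports the repeat count by printing it; the check
script itself is more involved.

```shell
# Hypothetical test-case hook following the protocol described above.
set_conditions() {
	local index="$1"

	# No argument: report the number of repetitions.
	if [ -z "$index" ]; then
		echo 2
		return
	fi

	# Argument given: set up the condition for that run.
	case "$index" in
	0) COND_DESC="condition A" ;;
	1) COND_DESC="condition B" ;;
	esac
}

# Simplified stand-in for the loop in the check script.
run_all_conditions() {
	local total i=0
	total=$(set_conditions)
	while [ "$i" -lt "$total" ]; do
		set_conditions "$i"
		echo "run $i: $COND_DESC"
		i=$((i + 1))
	done
}

run_all_conditions
```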

check: factor out _run_test()

The function _run_test() is rather complex and has deep nests. Before
modifying it for repeated test case runs, simplify it. Factor out some
part of the function to the new functions _check_and_call_test() and
_check_and_call_test_device().

Reviewed-by: Nitesh Shetty <nj.shetty@samsung.com>
Reviewed-by: Daniel Wagner <dwagner@suse.de>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Acked-by: Nitesh Shetty <nj.shetty@samsung.com>
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>

nvme/039: adjust to util-linux v2.40 dmesg format change

Since util-linux version 2.40, dmesg supports "caller ID". When the
Linux kernel supports CONFIG_PRINTK_CALLER, dmesg adds the thread ID or
CPU ID in square brackets, such as [ T123] or [ C16], to each message.
This made the dmesg string check of the test case nvme/039 fail. Fix this
by filtering out the added caller ID field.

Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
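
A filter of the kind described in the commit message above could be
written with sed as in the sketch below. The function name
strip_caller_id, the sample lines and the assumed layout (a timestamp
field followed by the caller ID field) are illustrative guesses; the
expression used in nvme/039 may differ.

```shell
# Drop a "[ T123]" or "[ C16]" caller ID field that follows the
# timestamp. Lines without such a field pass through unchanged.
# Hypothetical helper; the assumed line layout is a guess.
strip_caller_id() {
	sed -E 's/^(\[ *[0-9]+\.[0-9]+\]) \[ *[TC][0-9]+\]/\1/'
}

printf '%s\n' \
	'[  101.000001] [   T123] sample message one' \
	'[  101.000002] sample message two' |
	strip_caller_id
```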

nbd/002: repeat partition existence check for ioctl interface

When nbd-client is set up with the ioctl interface, it takes some time
for the nbd driver and the block layer to complete the partition read.
The test script calls the stat command for the /dev/nbd0p1 device to
check that the partition exists as expected. However, this stat command
is often called before the partition read completes, which causes the
test case to fail.

To avoid the test case failure, repeat the partition check a few times
with a one second wait.

Tested-by: Yi Zhang <yi.zhang@redhat.com>
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
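
The retry described in the commit message above follows a common
pattern, sketched below. The helper wait_for_path and its parameters are
hypothetical, and it uses the shell's own existence test rather than the
stat command; the test case would use a small retry count with a one
second interval.

```shell
# Arguments: path, number of attempts, seconds to sleep between attempts.
# Returns 0 as soon as the path exists, 1 if it never shows up.
# Hypothetical helper, for illustration only.
wait_for_path() {
	local path="$1" retries="$2" interval="$3" i=0
	while [ "$i" -lt "$retries" ]; do
		[ -e "$path" ] && return 0
		sleep "$interval"
		i=$((i + 1))
	done
	return 1
}

# Example: "/" always exists, so this returns at the first attempt.
wait_for_path / 3 1 && echo "found"
```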

nbd/002: fix wrong -L/-nonetlink option usage

As the commit 3c014acd5171 ("nbd/001: use -L for nbd-client") explains,
the nbd-client command uses the netlink interface by default instead of
the ioctl interface. The default interface changed at nbd version 3.17
in March 2018. Before that, the default was ioctl. After the change, the
nbd-client command requires the -L or -nonetlink option to use the ioctl
interface.

The commit 3c014acd5171 adjusted the nbd/001 test script to the default
interface change. However, the change was not reflected in nbd/002. This
caused a mismatch between the comments in the test case and the actual
test. The comments describe the first half as "Do it with ioctls", and
the last half as "Do it with netlink". However, the test script does the
opposite. It specifies no option for the first half, and so tests with
the netlink interface. It specifies the -L option for the last half, and
so tests with the ioctl interface.

This makes it difficult to debug the failure of the test case. Fix the
nbd-client command option to match the comments. Also, use the long
option -nonetlink instead of -L for easier reading.

Tested-by: Yi Zhang <yi.zhang@redhat.com>
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>

Merge pull request #139 from yizhanglinux/dev-240419-misc-fix

blktests misc fix

block/033: additional fix

The previous commit moved the UBLK_PROG definition from tests/ublk/rc to
the end of common/ublk. Move it again from the end of common/ublk to the
start. Also, the UBLK_PROG local variable in block/033 is no longer
required. Remove it.

Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>

Merge pull request #138 from yizhang/dev-240426-ublk-fix

block/033 fix

block/034: add the missing +x mode

Signed-off-by: Yi Zhang <yi.zhang@redhat.com>

nvme/{003,006,007}: remove the blank line

Signed-off-by: Yi Zhang <yi.zhang@redhat.com>

block/033: fix the output when ublk prog is not available

UBLK_PROG was not defined when calling _have_ublk, so move the definition
to common/ublk.
Replace ublk_prog with UBLK_PROG in block/033.
common/fio is already included in common/rc, so remove the duplicate
inclusion.

$ ./check block/033
block/033 (add & delete ublk device and test if gendisk is leaked) [not run]
driver ublk_drv is not available
is not available

Signed-off-by: Yi Zhang <yi.zhang@redhat.com>

block/037: add test to cover blk-cgroup vs. disk rebind

It was recently observed that list corruption is triggered when running
scsi disk rebind while blk-cgroup is in use.

Add a test case to cover this unusual operation.

Cc: Changhui Zhong <czhong@redhat.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
[Shin'ichiro: changed the test case number from block/035 to block/037]
[Shin'ichiro: removed the _have_fio call and improved test description]
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>

block/036: test return EIO from BLKRRPART

When we fail to reread the partition superblock from the disk, due to
a bad sector, a bad disk, etc., BLKRRPART should fail with EIO.
Simulate failure for the entire block device, run
"blockdev --rereadpt", and expect it to fail and return EIO instead of
passing.

Link: https://lore.kernel.org/all/20240405014253.748627-1-saranyamohan@google.com/
Signed-off-by: Saranya Muruganandam <saranyamohan@google.com>
[Shin'ichiro: changed the test case number from block/035 to block/036]
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>

Merge pull request #137 from bvanassche/master

block/035: Report IOPS

block/035: Report IOPS

Make it easier to retrieve the IOPS results by reporting these on stdout.

Suggested-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>

Merge pull request #136 from bvanassche/data-lifetime

scsi/008: Add a data lifetime test

Merge pull request #135 from bvanassche/master

block/035: Test shared queue fairness

block/035: Test shared queue fairness

Test whether both request queues process I/O if the tag set is shared
and if the completion times of the two request queues differ
significantly.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>

scsi/008: Test SCSI disk data lifetime support

SCSI disk data lifetime support is available since kernel v6.9-rc1.
See also https://lore.kernel.org/linux-scsi/3b789eacddd6265921be9da6e15257908f29b186.camel@HansenPartnership.com/.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>

common/fio: Fix the _run_fio() return code

Make _run_fio() return the fio exit code such that tests can use that
exit code to verify whether or not a fio run completed successfully.

Fixes: 3891768d9d6b ("blktests: add fio data verification routine")
Signed-off-by: Bart Van Assche <bvanassche@acm.org>

nvme/rc: fix shellcheck warning SC2086

tests/nvme/rc:1056:7: note: Double quote to prevent globbing and word splitting. [SC2086]
tests/nvme/rc:1057:7: note: Double quote to prevent globbing and word splitting. [SC2086]

The warnings are observed with ShellCheck version 0.8.0. They are not
observed with ShellCheck version 0.9.0 and 0.10.0.

Fixes: 369d310 ("nvme: Add passthru error logging tests to nvme/039")
Signed-off-by: Yi Zhang <yi.zhang@redhat.com>
[Shin'ichiro: noted ShellCheck version dependency]
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>

nvme/{013,014}: fix device filename

Fixes: e55c4e09e457 ("nvme: don't assume namespace id")
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>

nvme/011: fix filename path

Fixes: e55c4e0 ("nvme: don't assume namespace id")
Signed-off-by: Yi Zhang <yi.zhang@redhat.com>
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>

nvme/028: drop unused nvmedev

Nothing uses nvmedev, so just remove it.

Signed-off-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>

nvme: don't assume namespace id

The tests assume that the namespace id is always 1. This might not be
correct in the future (e.g. when running real targets), thus harden the
tests by using the uuid to look up the correct namespace id.

The passthru tests already do this, so it makes sense to update the
other tests as well.

Signed-off-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
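
A lookup by uuid, as described in the commit message above, might be
sketched as follows. The helper find_nsid_by_uuid is hypothetical. It
assumes a directory tree in which each namespace directory holds a
"uuid" file and an "nsid" file, modelled on sysfs; the helper in nvme/rc
may locate namespaces differently. The base directory is a parameter so
that the sketch can be tried on a scratch directory.

```shell
# Arguments: base directory, uuid to search for.
# Prints the content of the nsid file of the matching namespace.
# Hypothetical helper; the directory layout is an assumption.
find_nsid_by_uuid() {
	local base="$1" uuid="$2" ns
	for ns in "$base"/nvme*n*; do
		[ -e "$ns/uuid" ] || continue
		if [ "$(cat "$ns/uuid" 2>/dev/null)" = "$uuid" ]; then
			cat "$ns/nsid"
			return 0
		fi
	done
	return 1
}
```

For example, find_nsid_by_uuid "$scratch_dir" "$some_uuid" prints the
namespace id stored next to the matching uuid, where both variable names
are placeholders.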

nvme/{041,042,043,044,045,048}: do not pass default host{nqn|id} to _nvme_connect_subsys

There is no point in passing the default values to
_nvme_connect_subsys, thus drop these arguments.

Signed-off-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>

nvme: drop default subsysnqn argument from _nvmet_passthru_target_connect

Remove the last positional argument for _nvmet_passthru_target_connect,
for which most tests pass in the default subsysnqn anyway. There is
little point in cluttering all the tests with textual noise.

While at it, also use subsysnqn as variable name everywhere, instead of
subsys_name.

Signed-off-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>

nvme: drop default subsysnqn argument from _nvme_passthru_target_{setup|cleanup}

Remove the last positional argument for
_nvme_passthru_target_{setup|cleanup}, for which most tests pass in the
default subsysnqn anyway. There is little point in cluttering all the
tests with textual noise.

While at it, also use subsysnqn as variable name everywhere, instead of
subsys_name.

Signed-off-by: Daniel Wagner <dwagner@suse.de>
[Shin'ichiro: dropped the change for _nvme_disconnect_subsys in nvme/037]
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>

nvme: drop default subsysnqn argument from _nvme_{connect|disconnect}_subsys

Remove the last positional argument for
_nvme_{connect|disconnect}_subsys, for which most tests pass in the
default subsysnqn anyway. There is little point in cluttering all the
tests with textual noise.

Signed-off-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>

nvme: drop default trtype argument for _nvmet_passthru_target_connect

Every invocation of _nvmet_passthru_target_connect passes in the default
nvme_trtype argument. The argument is not evaluated anymore, thus just
remove it.

Signed-off-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>

nvme: drop default trtype argument for _nvmet_connect_subsys

Every invocation of _nvmet_connect_subsys passes in the default
nvme_trtype argument. nvme/rc also assumes the test is always using
nvme_trtype for trtype (e.g. cleanup code paths), thus just drop
this argument.

Signed-off-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>

nvme/031: do not open code target setup/cleanup

No need to open code the target setup and cleanup step. Just use the
common helper to setup and cleanup the target.

Signed-off-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>

nvme/rc: do not clean up externally managed loop device

If the test sets up a loop device itself (not created by
_nvmet_target_setup), _nvmet_target_cleanup should not clean up the
block device automatically.

Because _nvmet_target_cleanup has no way to figure out by itself whether
the device is managed or not, the caller needs to pass in the block
device type.

Signed-off-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>

nvme/rc: remove unused connect options

These options are not used, thus remove them.

Signed-off-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>

nvme/rc: add nqn/uuid args to target setup/cleanup helper

Make these helpers a bit more flexible, so that the caller
can set up not just the default subsysnqn.

Signed-off-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>

nvme/rc: connect subsys only supports long options

There are no users of the short command line options, thus
remove the short options to reduce the parsing overhead.

Signed-off-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>

nvme/{014,015,018,019,020,023,024,026,045,046}: use long command line option for nvme

The long formats of the command line options are more descriptive and
more likely to stay stable.

Signed-off-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>

nvme/rc: use long command line option for nvme

The long formats of the command line options are more descriptive and
more likely to stay stable.

Signed-off-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>

nvme/{012,013,035}: check return value of _xfs_run_fio_verify_io

When _xfs_run_fio_verify_io fails we should log the error. Currently, no
failure is detected when this function fails.

Signed-off-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>

common/xfs: propagate errors from _xfs_run_fio_verify_io

If _xfs_mkfs_and_mount fails, _xfs_run_fio_verify_io will continue to
execute and fio will run against the local file system instead of the
block device.

Propagate all errors back to the caller.

Signed-off-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
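
The failure mode in the commit message above, and the usual way to
avoid it, can be sketched with stand-in functions. Nothing below is
taken from common/xfs: prepare_fs, run_io and verify_io are hypothetical
placeholders for the mkfs/mount step, the fio step and the wrapper.

```shell
# Hypothetical stand-ins for the mkfs/mount step and the fio step.
prepare_fs() { [ -d "$1" ]; }
run_io() { echo "running I/O in $1"; }

verify_io() {
	local mount_dir="$1"
	# Stop here if preparation failed, so the I/O step never runs
	# against the wrong location.
	prepare_fs "$mount_dir" || return $?
	run_io "$mount_dir" || return $?
}

verify_io / || echo "verify_io failed"
```

A caller can then check the return value of verify_io and report the
failure instead of silently continuing.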

nvme/rc: log error if stale configuration is found

It's possible that a previous run of blktests left some stale
configuration behind, e.g. when the module unload doesn't work (the bug
might be in the kernel we are testing). In this case error out and avoid
confusion.

Signed-off-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>

nvme/rc: silence fcloop cleanup failures

When the ctl file is missing we are logging

  tests/nvme/rc: line 265: /sys/class/fcloop/ctl/del_target_port: No such file or directory
  tests/nvme/rc: line 257: /sys/class/fcloop/ctl/del_local_port: No such file or directory
  tests/nvme/rc: line 249: /sys/class/fcloop/ctl/del_remote_port: No such file or directory

because the first redirect operator fails. Also it's not possible to
redirect the 'echo' error to /dev/null, because it's a builtin command
which escapes the stderr redirect operator (why?).

Anyway, the simplest way to catch this error is to first check if the
control file exists before attempting to write to it.

Signed-off-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
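
The existence check proposed in the commit message above could look
like the sketch below. The "No such file or directory" message is
reported by the shell when it fails to open the redirection target,
which happens before echo runs, so checking first avoids it. The helper
name write_ctl_if_present is hypothetical and the path is a parameter;
the fcloop helpers in nvme/rc are structured differently.

```shell
# Arguments: control file path, value to write.
# Silently does nothing when the control file is missing.
# Hypothetical helper, for illustration only.
write_ctl_if_present() {
	local ctl="$1" value="$2"
	[ -f "$ctl" ] || return 0
	echo "$value" > "$ctl"
}

# A missing file is skipped without any error message.
write_ctl_if_present /nonexistent-dir-for-sketch/ctl "value"
```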

nvme/rc: silence error on module unload for fc

The other transports silence the error output when trying to unload the
module. Do the same for FC.

Signed-off-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>

nvme/048: make queue count check retry-able

We are racing with the reset path of the controller. That means, when we
set a new queue count, we might not observe the resetting state in time.
Thus, first check if we see the correct queue count and then the
controller state.

Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Daniel Wagner <dwagner@suse.de>
[Shin'ichiro: removed unnecessary if block in nvmf_check_queue_count()]
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>

nvme/048: remove unused argument for set_qid_max

The port argument is unused, thus remove it.

Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>

nbd/001: wait for the device node to show up before running parted

The parted call can happen before the device is settled and thus fail.
Currently this happens very rarely for me (about 1 in 500 runs), but
a pending change to freeze the queues for updating the limits will make
it much more likely to hit.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>

Merge pull request #133 from yizhanglinux/dev-240219-nbd-001-fix

nbd/001: change to use lsblk raw output format

nvme: Add passthru error logging tests to nvme/039

Tests the ability to enable and disable error logging for passthru admin
commands issued to the controller and passthru IO commands issued to a
namespace.

These tests will only be run on kernels that export the
passthru_err_log_enabled attribute.

Signed-off-by: Alan Adamson <alan.adamson@oracle.com>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>

nvme/029: reserve hugepages for larger allocations

The test issues a larger IO workload. This depends on being able to
allocate larger chunks of linear memory. nvme-cli used to use libhugetlb
to automatically allocate the HugeTLB pool. However, nvme-cli dropped
the dependency on the library, thus the test should try to provision the
system accordingly.

Link: https://github.com/linux-nvme/nvme-cli/issues/2218
Reported-by: Yi Zhang <yi.zhang@redhat.com>
Tested-by: Yi Zhang <yi.zhang@redhat.com>
Signed-off-by: Daniel Wagner <dwagner@suse.de>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>

nbd/001: change to use lsblk raw output format

The lsblk -n output format changed due to the substantial changes
in libsmartcols, which lsblk relies on for generating output.
Fix it by using the raw format.

Fixes: #132
Signed-off-by: Yi Zhang <yi.zhang@redhat.com>

nvme/rc: revert nvme-cli context tracking

This feature is not needed anymore after fixing nvmet-fc. The nvmet
target code is able to handle parallel operations and doesn't crash
anymore. Furthermore, it can't prevent discovery controllers from being
created by the udev rules, so let's rip it out.

Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>

nvme/rc: do not issue errors when disconnecting when using fc transport

When running the tests with FC as transport and the udev auto connect
enabled, discovery controllers are created and destroyed while the tests
are running.

The cleanup code expects that all devices are under blktests control,
but this isn't the case. Thus filter out disconnect failures as well.

Signed-off-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>

nvme/rc: do not issue warnings on cleanup when using fc transport

When running the tests with FC as transport and the udev auto connect
enabled, discovery controllers are created and destroyed while the tests
are running.

The cleanup code expects that all devices are under blktests control,
but this isn't the case. So just disable the warning as it is reporting
a lot of false positives.

Signed-off-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>

nvme/rc: filter out errors from cat when reading files

When running the tests with FC as transport and the udev auto connect
enabled, discovery controllers are created and destroyed while the tests
are running. This races with the cleanup code and also with
_find_nvme_dev(), which iterates over all device entries and tries to
read the content of the transport and subsysnqn sysfs attributes. Since
these steps are not locked in any way, the resources can go away in
between.

Thus filter out errors from 'cat' about nonexistent subsysnqn or
transport attributes. The tests will still fail if they can't find the
device etc. But without filtering these errors out the tests fail
randomly.

Signed-off-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>

nvme/029: fix local variable declarations

The syntax for local variable declarations uses whitespace as the
separator, not commas:

tests/nvme/029: line 24: local: `bs,': not a valid identifier
tests/nvme/029: line 24: local: `size,': not a valid identifier
tests/nvme/029: line 24: local: `img,': not a valid identifier

Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
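
For reference, a declaration that avoids the errors quoted in the commit
message above separates the names with whitespace. The function name
show_locals and the values are made up for this sketch; only the
variable names bs, size and img come from the error output.

```shell
show_locals() {
	# Whitespace separates the names; "local bs, size, img" would
	# make the shell treat "bs," as an (invalid) identifier.
	local bs size img
	bs=4096
	size=1M
	img=/tmp/example.img
	echo "$bs $size $img"
}

show_locals
```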

nvme/046: change nvme io-passthru command option from -o to --opcode

A recent commit in nvme-cli v2.6 changed the single letter of the
--opcode option from -o to -O. This caused the failure of nvme/046. To
make the test case work regardless of the nvme-cli version, replace -o
with --opcode.

Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Reviewed-by: Daniel Wagner <dwagner@suse.de>
Reviewed-by: Kanchan Joshi <joshi.k@samsung.com>
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>

nvme: add nvme pci timeout testcase

Trigger and test nvme-pci timeout with concurrent fio jobs.

Signed-off-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>

block/031: allow to run with built-in null_blk driver

The test case block/031 sets the null_blk parameter shared_tag_bitmap=1
for testing. The parameter has been set as a module parameter, so the
null_blk driver must be loadable. However, null_blk allows you to set
shared_tag_bitmap as a configfs parameter since the kernel commit
7012eef520cb ("null_blk: add configfs variables for 2 options"). The
test case can now be run with the built-in null_blk driver by specifying
shared_tag_bitmap through configfs.

Modify the test case for that purpose. Refer to the null_blk feature
list and check if shared_tag_bitmap can be specified through configfs.
If so, specify the parameter as an option of _configure_null_blk and set
it through configfs. If not, check in requires() that shared_tag_bitmap
can be specified as a module parameter. Then call _init_null_blk() in
test() and specify shared_tag_bitmap=1 at null_blk module load.

Also, change the null_blk device name from nullb0 to nullb1 since the
default null_blk device name nullb0 is not usable with the built-in
null_blk driver.

Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>

common/null_blk: introduce _have_null_blk_feature

Introduce a helper function _have_null_blk_feature which checks
/sys/kernel/config/nullb/features. It allows test cases to adapt to the
null_blk feature support status.
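
A rough sketch of how such a helper could look; the real implementation may
differ. It assumes the features file sits in the nullb configfs directory
and lists feature names as comma-separated words. The optional second
argument is a hypothetical hook so the function can be tried on a plain
file.

```bash
# Sketch only, not the code in common/null_blk.
_have_null_blk_feature() {
	local feature="$1"
	# Assumed default location; the override argument is hypothetical.
	local features_file="${2:-/sys/kernel/config/nullb/features}"

	[ -r "$features_file" ] || return 1
	# Put each feature on its own line and look for an exact match.
	tr ',' '\n' < "$features_file" | grep -qx -e "$feature"
}

# Possible use in a test case:
#   if _have_null_blk_feature shared_tag_bitmap; then
#           ... set the parameter through configfs ...
#   fi
```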

Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>

Merge pull request #131 from yizhanglinux/dev-blktests-fix-blktests-failure-with-latest-nvme-cli

nvme/rc: don't print the nvme connect msg

nvme/rc: don't print the nvme connect msg

With commit [1], the nvme connect command prints a connect message,
which breaks the blktests nvme test cases [2]. Fix it on the blktests
side.

[1]
https://github.com/linux-nvme/nvme-cli/commit/0b8d1e03049c5092d705bcd3ce369f02a9472f95
[2]
$ ./check nvme/003
nvme/003 (test if we're sending keep-alives to a discovery controller) [failed]
    runtime  11.344s  ...  11.346s
    --- tests/nvme/003.out 2024-01-10 04:03:21.975035862 +0100
    +++ /root/blktests/results/nodev/nvme/003.out.bad 2024-01-10 07:19:46.978193215 +0100
    @@ -1,3 +1,4 @@
     Running nvme/003
    +connecting to device: nvme0
     disconnected 1 controller(s)
     Test complete

Signed-off-by: Yi Zhang <yi.zhang@redhat.com>

block/007: skip hybrid polling tests when kernel does not support it

Since the kernel commit 54bdd67d0f88 ("blk-mq: remove hybrid polling"),
the kernel no longer supports hybrid polling. The test case block/007
specifies auto-hybrid and fixed-hybrid polling for testing, which is
confusing and meaningless when the kernel does not support it. Check if
the kernel supports hybrid polling. If not, skip the hybrid polling tests.

Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>

block/011: set default timeout to 20 minutes

The test case runs fio while disabling and enabling the PCI device of the
test target block device. Depending on the device type, it takes a very
long time to re-enable the device. In the worst case, it takes 4 hours to
complete the test case.

To avoid the meaninglessly long test runtime, set a default timeout limit.
I ran the test case on various devices: real NVMe SSDs, QEMU NVMe
emulation, HDDs with AHCI, HDDs with SAS-HBA. Many of them take less
than 20 minutes to complete and pass the test case. Hence, choose 20
minutes as the timeout duration.
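
One way to express such a default, as a sketch only: it assumes the limit
is carried in a TIMEOUT variable counted in seconds, and the function name
is hypothetical.

```bash
# Hypothetical helper: apply the 20 minute default only when the user has
# not provided a timeout of their own.
set_default_timeout() {
	# ":=" assigns the default when TIMEOUT is unset or empty.
	: "${TIMEOUT:=1200}"
}
```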

Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>

block/011: recover test target devices to online or live status

The test case runs fio while disabling and enabling the PCI device of the
test target block device. This often leaves the devices in offline or
dead status. For example, when the block device is an HDD connected to an
HBA, the kernel puts the device into offline mode with this message:

sd x:x:x:x Device offlined - not ready after error recovery

This causes the following test cases to fail. To avoid the failure, remove
and rescan the devices to get them back to online or live status. This
improvement is similar to the commit f8f33218eca7 ("block/011: recover
test target NVME device capacity"). While at it, improve the code
comments for the commit f8f33218eca7, and add missing local variable
declarations.

Of note is that the added rescan operation triggers a lockdep WARN if
the system has devices which depend on P2SB [1].

[1] https://lore.kernel.org/linux-pci/6xb24fjmptxxn5js2fjrrddjae6twex5bjaftwqsuawuqqqydx@7cl3uik5ef6j/

Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>

loop/009: require --option of udevadm control command

The test case loop/009 calls the udevadm control command with the --ping
option. When the systemd version is older than 241, the udevadm control
command does not support the option, and the test case fails. Check the
availability of the option to avoid the failure.
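
A possible shape for such an availability check, as a sketch. The helper
name is hypothetical, and it assumes that a udevadm version which supports
--ping also lists it in the output of "udevadm control --help"; the actual
check in blktests may work differently.

```bash
# Hypothetical helper: succeed if the help text of a command mentions the
# given option. Usage: _help_mentions_option OPTION COMMAND [ARGS...]
_help_mentions_option() {
	local option="$1"

	shift
	# -w avoids matching a longer option that merely starts the same way.
	"$@" --help 2>&1 | grep -qw -e "$option"
}

# Intended use:
#   _help_mentions_option --ping udevadm control || echo "skip loop/009"
```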

Link: https://github.com/osandov/blktests/issues/129
Reported-by: Disha Goel <disgoel@linux.ibm.com>
Tested-by: Disha Goel <disgoel@linux.ibm.com>
Reviewed-by: Alyssa Ross <hi@alyssa.is>
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>

nvme/{041,042,043,044,045}: check dhchap_ctrl_secret support by nvme-fabrics

The kernel commit d68006348288 ("nvme: rework NVME_AUTH Kconfig
selection") in v6.7-rc1 introduced a new kernel config option
NVME_HOST_AUTH. When the option is disabled, nvme test cases from 041 to
045 fail because nvme-fabrics module does not support the feature
dhchap_ctrl_secret.

To check the requirement, add _require_kernel_nvme_fabrics_feature(),
which reads /dev/nvme-fabrics and checks whether the specified feature
string is found. Call it to check dhchap_ctrl_secret support in
requires() of the test cases.

This change relies on the kernel commit 1697d7d4c5ef ("nvme: blank out
authentication fabrics options if not configured").

Suggested-by: Daniel Wagner <dwagner@suse.de>
Suggested-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>

nvme: do not print subsystem NQN to stdout

The subsystem NQN might be changed from the default value, but
that shouldn't cause the tests to fail. So don't register the
subsystem NQN in the 'out' files to avoid a false positive.

Signed-off-by: Hannes Reinecke <hare@suse.de>
[Shin'ichiro: remove only subsystem NQN from nvme disconnect message]
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>

nvme: do not print UUID to log files

The UUID/wwid of a namespace might be assigned externally, so
we shouldn't register it in the 'out' files.
The current checks for UUID/wwid are just there to ensure that
if a UUID is present, it matches the wwid setting.
So add a function _check_uuid() which does exactly that
and don't register the actual UUID in the 'out' files.

[Shin'ichiro: added check against def_subsys_uuid in _check_uuid()]

Signed-off-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>

common/ublk: allow to run ublk test without building miniublk

`rublk` is now sufficient to run the ublk tests, so building miniublk is
no longer necessary.

Convert ublk common helpers into ${UBLK_PROG}.

Signed-off-by: Ming Lei <ming.lei@redhat.com>
[Shin'ichiro: fixed a shellcheck warning]
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>

ublk/rc: prefer rublk over miniublk

Add a wrapper script for using rublk to run ublk tests, and prefer
rublk because it is well implemented and more reliable.

This approach has been running for months in rublk's GitHub CI tests.

https://github.com/ming1/rublk

Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>

Merge pull request #128 from yizhanglinux/codespell-fix

Codespell fix and remove unused srp/015.out

Fix common misspellings from codespell project

Signed-off-by: Yi Zhang <yi.zhang@redhat.com>

tests/srp: remove the unused file 015.out

Signed-off-by: Yi Zhang <yi.zhang@redhat.com>

src/miniublk: fix logical block size setting

miniublk always sets the logical block size to 512 bytes when setting up
a loop target backed by a regular file.
A test fails if the regular file is on a filesystem built on a block
device with a logical block size of 4KB.

$ cd blktests
$ modprobe -r scsi_debug
$ modprobe scsi_debug sector_size=4096 dev_size_mb=2048
$ mkfs.ext4 /dev/sdX
$ mount /dev/sdX results/
$ ./check ublk/003

The logical block size of the ublk block device is set to 512 bytes,
so a request that is not 4KB aligned may occur, and the miniublk will
attempt to process it with direct IO and fail.

The original ublk program already fixed this problem by determining
the logical block size to set based on the block device to which the
target regular file belongs.

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>

block/002: fix TMPDIR path

The TMPDIR variable name was misspelled, so blktrace files were created
in an unexpected place. Fix the typo.

Reviewed-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>

nvme/{rc,017,031}: replace def_file_path with _nvme_def_file_path()

The commit b6356f6 ("nvme/rc: Add common file_path name define") defined
a global variable 'def_file_path' in nvme/rc, which refers to TMPDIR.
However, when nvme/rc is sourced and def_file_path is defined for the
nvme test group, TMPDIR is not yet defined since TMPDIR is defined for
each test case. As a result, an unexpected path is set to def_file_path
and temporary files are created at that unexpected path.

Fix this by replacing the global variable def_file_path with a helper
function _nvme_def_file_path(). This helper function refers to TMPDIR
not when nvme/rc is sourced but in the test() or test_device() context
of each test case.
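
The difference in evaluation time can be sketched as follows. The "img"
file name and the directory value are assumptions made for the example.

```bash
# When nvme/rc is sourced, TMPDIR has not been set for the test case yet.
# An empty value stands in for that state here.
TMPDIR=""
def_file_path="${TMPDIR}/img"	# evaluated once, too early

# A function body is evaluated on each call, after TMPDIR has been set.
_nvme_def_file_path() {
	echo "${TMPDIR}/img"
}

# Later, the per-test directory becomes known (hypothetical value).
TMPDIR="/tmp"

echo "global:   ${def_file_path}"
echo "function: $(_nvme_def_file_path)"
```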

Reported-by: Yi Zhang <yi.zhang@redhat.com>
Fixes: b6356f6 ("nvme/rc: Add common file_path name define")
Reviewed-by: Daniel Wagner <dwagner@suse.de>
Reviewed-by: Yi Zhang <yi.zhang@redhat.com>
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>

nvme/rc: fix rdma driver check

Since the commit 4824ac3f5c4a ("Skip tests based on SKIP_REASON, not
return value"), blktests no longer checks return values of _have_foo
helpers. Instead, it checks if _have_foo helpers set SKIP_REASON, which
was renamed to SKIP_REASONS later, to decide whether to skip a test case.
If two _have_foo helpers are chained with ||, the skip check does not work
as expected, since one of the helpers may set SKIP_REASONS even when the
other does not. Such a chain with || is used in _nvme_requires() to
check rdma drivers.

To fix the check, do not chain the helper functions with the || operator.
Instead, refer to $use_rxe to call only the required function.
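
The problem and the fix can be modeled with a simplified sketch. Here
SKIP_REASONS is a plain string and the two helper names are hypothetical
stand-ins for the rdma driver checks.

```bash
# Simplified model, not the blktests framework code.
SKIP_REASONS=""
_have_driver_a() { SKIP_REASONS="${SKIP_REASONS}driver_a missing;"; return 1; }
_have_driver_b() { return 0; }

# Chained with ||: driver_b is available, yet the reason recorded by the
# failed driver_a check remains, so the test case would be skipped.
_have_driver_a || _have_driver_b
chained="$SKIP_REASONS"

# Fixed: use $use_rxe to select the single helper that matters and call
# only that one.
SKIP_REASONS=""
use_rxe=""
if [ -n "$use_rxe" ]; then
	_have_driver_a
else
	_have_driver_b
fi
selected="$SKIP_REASONS"

echo "chained:  '${chained}'"
echo "selected: '${selected}'"
```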

Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>

nbd/004: avoid leftover connection

The test case nbd/004 disconnects /dev/nbd0 in most cases, but sometimes
leaves it connected. Because the test case stops the nbd server,
/dev/nbd0 does not work even though it is still in connected status. This
makes the "udevadm settle" command wait for nbd udev events indefinitely
and causes the following test cases to fail.

There are two causes of the leftover connection. The first is a leftover
nbd-client process. The test case waits for its child process
connect_and_disconnect to complete, but not for the nbd-client process
that connect_and_disconnect spawns. After the test case ends, the
leftover nbd-client process establishes the connection to /dev/nbd0. The
second is a missing disconnect operation. The connect_and_disconnect
process repeats _netlink_connect and _netlink_disconnect. When this
process is killed after _netlink_connect and before _netlink_disconnect,
the device stays connected.

To avoid the leftover connection, wait for the nbd-client process to
complete and call _netlink_disconnect at the end of the test case.

Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>

Change the default RDMA driver from rdma_rxe to siw

Since the siw driver is more stable than the rdma_rxe driver, change the
default into siw. See e.g.
https://lore.kernel.org/all/c3d1a966-b9b0-d015-38ec-86270b5045fc@acm.org/.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>

srp/015: Remove this test

Remove this test because except for the RDMA driver choice, it is a duplicate
of test srp/002.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>

tests/srp/rc: Reduce the number of channels

Login failures have been observed with the default number of channels
(ch_count) and dynamic debug enabled on a system with a large number of
CPU cores (72). Hence reduce the number of channels.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>

tests/srp/rc: Rework use_blk_mq()

Prepare for adding an additional kernel module parameter. This patch does
not change any functionality.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>

tests/nvme/031: fix connecting failure

Since allow_any_host is disabled during _create_nvmet_subsystem, call
_create_nvmet_host before connecting to allow the host to connect.

[76096.420586] nvmet: adding nsid 1 to subsystem blktests-subsystem-0
[76096.440595] nvmet_tcp: enabling port 0 (127.0.0.1:4420)
[76096.491344] nvmet: connect by host nqn.2014-08.org.nvmexpress:uuid:0f01fb42-9f7f-4856-b0b3-51e60b8de349 for subsystem blktests-subsystem-0 not allowed
[76096.505049] nvme nvme2: Connect for subsystem blktests-subsystem-0 is not allowed, hostnqn: nqn.2014-08.org.nvmexpress:uuid:0f01fb42-9f7f-4856-b0b3-51e60b8de349
[76096.519609] nvme nvme2: failed to connect queue: 0 ret=16772

Signed-off-by: Yi Zhang <yi.zhang@redhat.com>
Fixes: c32b233b7dd6 ("nvme/rc: Add helper for adding/removing to allow list")
Link: https://lore.kernel.org/linux-block/20230907034423.3928010-1-yi.zhang@redhat.com/
Reviewed-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>

block/002,scsi/007,zbd/003: remove options for shellcheck SC2119

The commits 852996fea4f1 and 45b203cce8ba added options of a few
function calls to avoid the shellcheck warning SC2119. After that,
SC2119 was disabled with the commit 3d1c0fe2677d. Then the added options
are no longer needed. Remove them to clean up.

Link: https://lore.kernel.org/linux-nvme/o5xnqvujzakhrudv7p64owiuzgozmean6blxow4vdxhdqozg5v@qznf2tzmey7k/
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>

nvme: introduce nvmet_target_{setup/cleanup} common code

Almost all fabric tests have identical code for
setting up and cleaning up the target side. Introduce
two new helpers.

Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Daniel Wagner <dwagner@suse.de>
[Shin'ichiro: added missing "--blkdev file" option in nvme/018]
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>