Qu Wenruo [Tue, 7 May 2024 07:06:06 +0000 (16:36 +0930)]
fstests: btrfs/301: handle auto-removed qgroups
There are always attempts to auto-remove empty qgroups after dropping a
subvolume.
For squota mode, not all qgroups can or should be dropped, as there are
common cases where the dropped subvolume is still referenced by other
snapshots.
In that case, the numbers can only be freed when the last referencer
is dropped.
The latest kernel attempt would only try to drop empty qgroups for
squota mode.
But even with such a safe change, the test case still needs to handle
auto-removed qgroups by explicitly echoing "0"; otherwise the later
calculation would be invalid bash syntax.
This patch would add extra handling for such removed qgroups, to be
future proof for qgroup auto-removal behavior change.
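A minimal sketch of the failure mode described above, assuming a
hypothetical helper get_qgroup_usage() and an illustrative two-column
listing (neither is taken from btrfs/301):

```shell
#!/bin/bash
# Hypothetical helper: print the usage of qgroup $1 from the listing in $2.
# Prints "0" when the qgroup is absent (e.g. auto-removed by the kernel),
# so callers can always use the result in arithmetic.
get_qgroup_usage()
{
	local qgroupid="$1"
	local listing="$2"
	local usage

	usage=$(echo "$listing" | awk -v id="$qgroupid" '$1 == id { print $2 }')
	if [ -z "$usage" ]; then
		echo "0"
	else
		echo "$usage"
	fi
}

# Illustrative listing: qgroup 0/257 was auto-removed and is missing.
listing="0/5 16384
0/256 1048576"

present=$(get_qgroup_usage "0/256" "$listing")
removed=$(get_qgroup_usage "0/257" "$listing")

# If $removed were empty instead of "0", the expression below would expand
# to "$(( 1048576 + ))", which bash rejects as a syntax error.
total=$(( $present + $removed ))
echo "total usage: $total"
```

Echoing "0" for a missing qgroup keeps every later arithmetic expansion
well-formed, whatever the kernel decides to auto-remove.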
Josef Bacik [Thu, 16 May 2024 22:12:53 +0000 (00:12 +0200)]
btrfs/{140,141}: verify read-repair test data by md5sum
For validating that read repair works properly we corrupt one mirror and
then read back the physical location after we do a direct or buffered
read on the mounted file system and then unmount the file system. The
golden output expects all a's, however with encryption this will
obviously not be the case.
However I still broke read repair, so these tests are quite valuable.
Fix them to dump the on disk values to a temporary file and then md5sum
the files, and then validate the md5sum to make sure the read repair
worked properly.
Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Anand Jain <anand.jain@oracle.com>
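The checksum comparison can be sketched as follows. Regular files stand
in for the device read, and md5_of() is a hypothetical helper; a real
test would dump the bytes from the physical location instead:

```shell
#!/bin/bash
# Hypothetical helper: print only the md5 hash of a file.
md5_of()
{
	md5sum "$1" | cut -d ' ' -f 1
}

tmpdir=$(mktemp -d)

# Stand-in for the data originally written by the test (4096 'a' bytes).
head -c 4096 /dev/zero | tr '\0' 'a' > "$tmpdir/expected"

# Stand-in for the bytes dumped from disk after read repair. Copying the
# file here is an assumption that simulates a successful repair.
cp "$tmpdir/expected" "$tmpdir/ondisk"

if [ "$(md5_of "$tmpdir/expected")" = "$(md5_of "$tmpdir/ondisk")" ]; then
	result="repaired"
else
	result="still corrupted"
fi
echo "$result"
rm -rf "$tmpdir"
```

Comparing hashes instead of the literal contents keeps the golden output
independent of how the bytes look on disk.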
Josef Bacik [Thu, 16 May 2024 22:12:48 +0000 (00:12 +0200)]
generic/269: require no compression
This is meant to test ENOSPC, but we're dd'ing /dev/zero, which won't
fill up anything with compression on.
Additionally we're killing dd and then immediately trying to unmount.
With compression we could have references to the inode being held by the
async compression workers, so sometimes this will fail with EBUSY on the
unmount.
A better test would be to use slightly compressible data; use _ddt.
Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Anand Jain <anand.jain@oracle.com>
[ changed to use _ddt ]
Josef Bacik [Thu, 16 May 2024 22:12:42 +0000 (00:12 +0200)]
generic/027: require no compression
This test creates a small file and then a giant file and then tries to
create a bunch of small files in a loop to exercise ENOSPC. The problem
is that with compression the giant file isn't actually giant, so it can
make this test take forever. Simply disable it for compression.
Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: David Sterba <dsterba@suse.com> Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: Anand Jain <anand.jain@oracle.com>
Josef Bacik [Thu, 16 May 2024 22:12:39 +0000 (00:12 +0200)]
generic/352: require no compression
Our CI has been failing on this test for compression since 0fc226e7
("fstests: generic/352 should accomodate other pwrite behaviors"). This
is because we changed the size of the initial write down to 4k, and we
write a repeatable pattern. With compression on btrfs this results in
an inline extent, and when you reflink an inline extent this just turns
it into full on copies instead of a reflink.
As this isn't a bug with compression, it's just not well aligned with
how compression interacts with the allocation of space, simply exclude
this test from running when you have compression enabled.
Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: David Sterba <dsterba@suse.com> Reviewed-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Anand Jain <anand.jain@oracle.com>
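The three "require no compression" changes above all skip a test when
compression is in use. A rough sketch of such a check follows, assuming
it is driven by a plain comma-separated mount option string. Both
function names are hypothetical and not the fstests helpers:

```shell
#!/bin/bash
# Hypothetical check: succeed if the comma-separated mount options in $1
# enable compression ("compress" or "compress-force", with or without
# an "=<algorithm>" suffix).
mount_opts_use_compression()
{
	local opts="$1"

	echo "$opts" | tr ',' '\n' | grep -qE '^compress(-force)?(=.*)?$'
}

# Hypothetical stand-in for a "require" helper: report that the test
# should be skipped and return non-zero when compression is requested.
require_no_compression()
{
	if mount_opts_use_compression "$1"; then
		echo "skipped: test does not support compression"
		return 1
	fi
	return 0
}

require_no_compression "noatime,compress=zstd"
require_no_compression "noatime,space_cache=v2" && echo "running test"
```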
Hans Holmberg [Mon, 15 Apr 2024 11:23:24 +0000 (11:23 +0000)]
generic: add gc stress test
This test stresses garbage collection for file systems by first filling
up a scratch mount to a specific usage point with files of random size,
then doing overwrites in parallel with deletes to fragment the backing
storage, forcing reclaim.
Signed-off-by: Hans Holmberg <hans.holmberg@wdc.com> Reviewed-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
Zorro Lang [Fri, 10 May 2024 04:33:39 +0000 (12:33 +0800)]
common/tracing: use /sys/kernel/tracing at first
To avoid a dependency on debugfs, tracefs is now mounted at its own
location, /sys/kernel/tracing. For compatibility,
/sys/kernel/debug/tracing is still there. So change the _require_ftrace
helper to try the new /sys/kernel/tracing path first, and fall back
to the old one if it's not supported.
xfs/499 uses ftrace, so call _require_ftrace in it.
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Zorro Lang <zlang@kernel.org>
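The "new path first, old path as fallback" logic can be sketched like
this. pick_tracing_dir() is a hypothetical helper, not the actual
_require_ftrace implementation:

```shell
#!/bin/bash
# Hypothetical helper: print the first directory from the argument list
# that exists, or return 1 if none does.
pick_tracing_dir()
{
	local dir

	for dir in "$@"; do
		if [ -d "$dir" ]; then
			echo "$dir"
			return 0
		fi
	done
	return 1
}

# Prefer the tracefs mount point, fall back to the debugfs-based path.
if tracing_dir=$(pick_tracing_dir /sys/kernel/tracing /sys/kernel/debug/tracing); then
	echo "using $tracing_dir"
else
	echo "ftrace not available, test would be skipped"
fi
```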
Zorro Lang [Fri, 10 May 2024 04:29:45 +0000 (12:29 +0800)]
fstests: fix _require_debugfs and call it properly
The old _require_debugfs helper no longer works; fix it to check that
the system supports debugfs. Then call this helper in the test cases
which need $DEBUGFS_MNT.
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Zorro Lang <zlang@kernel.org>
David Sterba [Tue, 7 May 2024 19:07:47 +0000 (21:07 +0200)]
fstests: remove the rest of shared
All tests from shared/ have been moved to generic/, remove the Makefile
and the reference from the 'check' scripts.
Signed-off-by: David Sterba <dsterba@suse.com> Acked-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
David Sterba [Fri, 10 May 2024 03:43:45 +0000 (11:43 +0800)]
fstests: move shared/298 to generic directory
The shared/ directory was supposed to host tests that apply to a subset
of all supported filesystems but this is not utilized much and creates a
split from the generic tests. Move the test to generic.
Signed-off-by: David Sterba <dsterba@suse.com> Acked-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
David Sterba [Fri, 10 May 2024 03:41:00 +0000 (11:41 +0800)]
fstests: move shared/002 to generic directory
The shared/ directory was supposed to host tests that apply to a subset
of all supported filesystems but this is not utilized much and creates a
split from the generic tests. Move the test to generic.
Signed-off-by: David Sterba <dsterba@suse.com> Acked-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
David Sterba [Fri, 10 May 2024 03:36:29 +0000 (11:36 +0800)]
fstests: move shared/032 to generic directory
The shared/ directory was supposed to host tests that apply to a subset
of all supported filesystems but this is not utilized much and creates a
split from the generic tests. Move the test to generic.
Signed-off-by: David Sterba <dsterba@suse.com> Acked-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
Christoph Hellwig [Sat, 27 Apr 2024 07:55:30 +0000 (09:55 +0200)]
generic/095: add to the quick group
generic/095 doesn't take more than 4 seconds on any of my test setups,
but it exercises code that handles buffered write iterations interrupted
by concurrent direct I/O that no other test in the quick group does.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
Christoph Hellwig [Mon, 29 Apr 2024 17:05:48 +0000 (19:05 +0200)]
xfs/077: remove _require_meta_uuid
_require_meta_uuid tries to check if the configuration supports the
metauuid feature. It assumes a scratch fs has already been created,
which in the past was accidentally true due to a _require_xfs_crc call
that was removed in commit 39afc0aa237d ("xfs: remove support for tools
and kernels without v5 support").
As v5 file systems always support meta uuids, and xfs/077 forces a v5
file system, we can just remove the check.
Reported-by: Chandan Babu R <chandanbabu@kernel.org> Signed-off-by: Christoph Hellwig <hch@lst.de> Tested-by: Chandan Babu R <chandanbabu@kernel.org> Reviewed-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
David Disseldorp [Mon, 29 Apr 2024 15:37:52 +0000 (01:37 +1000)]
tests: _fail on _scratch_mkfs_sized failure
If _scratch_mkfs_sized() fails, e.g. due to an FS not supporting the
provided size, tests may subsequently mount and run atop a previously
created (e.g. non-size-bound) filesystem.
This can lead to difficult to debug failures, or for some -ENOSPC
exercising tests, near infinite runtimes. Avoid this by renaming the
current function to _try_scratch_mkfs_sized() and _fail in the parent
_scratch_mkfs_sized() wrapper.
Suggested-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: David Disseldorp <ddiss@suse.de> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Zorro Lang <zlang@kernel.org>
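The try/fail split can be sketched as below. Both functions are
hypothetical stand-ins that only simulate a size check; they are not the
fstests helpers themselves:

```shell
#!/bin/bash
# Hypothetical stand-in for the real mkfs call: fails for sizes below
# 1 MiB to simulate a filesystem that rejects the requested size.
try_mkfs_sized()
{
	local size_bytes="$1"

	[ "$size_bytes" -ge $((1024 * 1024)) ]
}

# Wrapper for callers that cannot continue without a fresh filesystem:
# abort instead of silently reusing a previously created filesystem.
mkfs_sized()
{
	if ! try_mkfs_sized "$@"; then
		echo "mkfs_sized failed with ($*)" >&2
		exit 1
	fi
}

# Callers that want to handle the failure themselves use the "try" variant.
if try_mkfs_sized 4096; then
	echo "unexpected success"
else
	echo "size rejected, caller decides what to do"
fi

# The wrapper terminates the (sub)shell on failure.
( mkfs_sized 4096 ) 2>/dev/null || echo "wrapper aborted the test"
( mkfs_sized $((256 * 1024 * 1024)) ) && echo "mkfs ok"
```

Keeping a non-fatal "try" variant lets tests that probe for size support
keep working, while every other caller gets a hard failure by default.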
David Disseldorp [Thu, 11 Apr 2024 06:32:33 +0000 (16:32 +1000)]
common/config: export TEST_DEV for mkfs.xfs
As of xfsprogs commit 6e0ed3d1 ("mkfs: stop allowing tiny filesystems")
attempts to create XFS filesystems sized under 300M fail, unless
TEST_DIR, TEST_DEV and QA_CHECK_FS environment variables are exported
(or a --unsupported mkfs parameter is provided).
TEST_DIR and QA_CHECK_FS are already exported, while TEST_DEV may only
be locally set if provided via e.g. configs/$HOSTNAME.config. Explicitly
export TEST_DEV to ensure that tests which call _scratch_mkfs_sized()
with an fssize under 300M run normally.
Signed-off-by: David Disseldorp <ddiss@suse.de> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Zorro Lang <zlang@kernel.org>
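Why a locally set TEST_DEV is not enough can be seen with a child
process standing in for mkfs.xfs. The paths below are illustrative
assumptions:

```shell
#!/bin/bash
# Start from a clean state so inherited exports do not affect the demo.
unset TEST_DEV TEST_DIR

TEST_DEV=/dev/example-test-dev      # set, but not exported
export TEST_DIR=/mnt/example-test   # exported

# A child process only sees exported variables.
child_view_before=$(sh -c 'echo "TEST_DIR=${TEST_DIR:-unset} TEST_DEV=${TEST_DEV:-unset}"')
echo "before export: $child_view_before"

export TEST_DEV
child_view_after=$(sh -c 'echo "TEST_DIR=${TEST_DIR:-unset} TEST_DEV=${TEST_DEV:-unset}"')
echo "after export:  $child_view_after"
```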
When building xfstests on some platforms, the build fails with a
no-return-in-nonvoid-function error in dio-buf-fault.c:83 and
fake-dump-rootino.c:224. Add return values to fix the issue.
Signed-off-by: Yong Sun <yosun@suse.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
Anand Jain [Sat, 9 Mar 2024 06:45:24 +0000 (14:45 +0800)]
generic: move btrfs clone device testcase to the generic group
Given that ext4 also allows mounting of a cloned filesystem, the btrfs
test case btrfs/312, which assesses the functionality of cloned
filesystem support, can be refactored to be under the generic group.
So add _require_duplicated_fsid helper, then move btrfs/312 to generic.
[zlang: remove "quick" group, change the cleanup of g/744 a bit]
Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
Anand Jain [Fri, 22 Mar 2024 06:32:15 +0000 (14:32 +0800)]
common/verity: fix btrfs-corrupt-block -v option
The btrfs-corrupt-block -v has been replaced with --value so fix it.
_fsv_scratch_corrupt_merkle_tree() uses the btrfs-corrupt-block
--value option, so add the "value" prerequisite in the function
_require_fsverity_corruption.
Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
Anand Jain [Fri, 22 Mar 2024 06:22:57 +0000 (14:22 +0800)]
btrfs/290: fix btrfs_corrupt_block options
Checks if the running btrfs-corrupt-block also has the options value and
offset.
Remove the redirection of the btrfs-corrupt-block command's STDOUT and STDERR
to /dev/null, as it made debugging impossible. I also noticed that the
command is quiet when successful, so no redirect to $seqres.full is required.
Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
Anand Jain [Fri, 22 Mar 2024 07:05:46 +0000 (15:05 +0800)]
common/btrfs: refactor _require_btrfs_corrupt_block to check option
The -v and -o short options in btrfs-corrupt-block were introduced and
replaced with the long options --value and --offset in the same
btrfs-progs release 5.19 by the following commits:
b2ada0594116 ("btrfs-progs: corrupt-block: corrupt generic item data")
22ffee3c6cf2 ("btrfs-progs: corrupt-block: use only long options for value and offset")
We hope that if these commits are backported, they are both backported at
the same time.
Use only the long options of btrfs-corrupt-block in the test cases. Also,
check if btrfs-corrupt-block has the options --value and --offset.
[zlang: use -w option for grep, and remove "ret" local value]
Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
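A sketch of the option check with "grep -w", run against sample help
text. help_has_option() and the help text are hypothetical and do not
reproduce real btrfs-corrupt-block output:

```shell
#!/bin/bash
# Hypothetical helper: succeed if the help text in $1 lists the long
# option $2 as a whole word, so "--value" does not match "--values".
help_has_option()
{
	local help_text="$1"
	local opt="$2"

	# "--" ends grep's own option parsing so "$opt" is taken as the pattern.
	printf '%s\n' "$help_text" | grep -qw -- "$opt"
}

# Illustrative help text only.
sample_help="usage: example-tool [options] device
    --value VALUE    value to write
    --offset OFFSET  offset to write at"

for opt in --value --offset --bogus; do
	if help_has_option "$sample_help" "$opt"; then
		echo "$opt: supported"
	else
		echo "$opt: missing, test would be skipped"
	fi
done
```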
fstests: btrfs: use _btrfs for 'subvolume snapshot' command
[BUG]
All the touched test cases would fail after btrfs-progs commit
5f87b467a9e7 ("btrfs-progs: subvolume: output the prompt line only when
the ioctl succeeded") due to golden output mismatch.
[CAUSE]
Although the patch I sent to the mailing list only changes the timing
and not the output, David used this patch to unify the output
of "btrfs subvolume create" and "btrfs subvolume snapshot".
Unfortunately this changes the output and causes a mismatch with the
golden output.
[FIX]
Just use the recommended way to run simple btrfs commands, _btrfs, for
all those "btrfs subvolume snapshot" call sites, and remove the line
from the golden output.
The only case not utilizing `_btrfs` is btrfs/300, which utilizes
user_do(), which doesn't have the fstests functions.
The "_btrfs()" helper has the following advantages:
- Save the command line arguments and output into $seqres.full
  For easier debugging
- Check the return value of the btrfs command
This ensures that future changes to informative output will not
trigger this situation again.
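The two advantages listed above can be sketched with a hypothetical
run_logged() function. It only mirrors the described behaviour (log the
command and its output, report the return value) and is not the _btrfs
implementation, which may abort the test on failure instead:

```shell
#!/bin/bash
# Hypothetical stand-in: append the command line and its output to the
# log file $1, and pass the command's exit status back to the caller.
run_logged()
{
	local logfile="$1"
	shift

	echo "# $*" >> "$logfile"
	"$@" >> "$logfile" 2>&1
	local ret=$?
	if [ $ret -ne 0 ]; then
		echo "command failed (exit $ret): $*" >> "$logfile"
	fi
	return $ret
}

logfile=$(mktemp)

# Informative output goes to the log, not to the test's golden output.
run_logged "$logfile" echo "Create snapshot of 'a' in 'b'"
run_logged "$logfile" false || echo "failure detected"

rm -f "$logfile"
```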
fstests: btrfs: rename _run_btrfs_util_prog to _btrfs
For simple btrfs commands like "btrfs subvolume create", the output is
only informative, meanwhile the output format may still change in the
future.
We already have quite a few test cases that just redirect the output
to /dev/null or $seqres.full, without knowing that a more suitable
function, `_run_btrfs_util_prog()`, already exists.
This patch first renames the function to a much shorter name, `_btrfs`,
then move it to the top of `common/btrfs`, and add a comment
recommending to use it when possible.
The use of `_btrfs` mostly matches the real world usage of btrfs-progs
(just "btrfs" command), and no need to do any filtering or redirection,
and would be the recommended way for future test cases.
David Sterba [Tue, 9 Apr 2024 13:32:34 +0000 (15:32 +0200)]
btrfs: remove useless comments
Remove comments from the new test template that are not relevant once
the test case is written:
- commented out common.filters (no filters used)
- Import common functions.
- real QA test starts here
- Modify as appropriate.
- get standard environment, filters and checks
Use SCRATCH_DEV_NAME[n] to provide the device path for each device from
the scratch device pool. Also, in btrfs/197, remove common/filter since
it calls common/filter.btrfs.
Reviewed-by: David Disseldorp <ddiss@suse.de> Signed-off-by: Anand Jain <anand.jain@oracle.com>
Josef Bacik [Fri, 5 Apr 2024 19:56:14 +0000 (15:56 -0400)]
fstests: update tests to skip unsupported raid profile types
Tests btrfs/197, btrfs/198, and btrfs/297 test multiple raid types in
their workout() function. We may not support some of the raid types, so
add a check in the workout() function to skip any incompatible raid
profiles.
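Skipping unsupported profiles inside a loop can be sketched as follows.
raid_type_supported() is a hypothetical helper, and the colon-separated
pair format of the config list is an assumption for illustration:

```shell
#!/bin/bash
# Hypothetical helper: succeed if raid type $1 appears in the
# space-separated list of "profile:profile" configs in $2.
raid_type_supported()
{
	local raid="$1"
	local configs="$2"
	local cfg

	for cfg in $configs; do
		# Either side of the colon-separated pair may name the profile.
		if [ "${cfg%%:*}" = "$raid" ] || [ "${cfg##*:}" = "$raid" ]; then
			return 0
		fi
	done
	return 1
}

# Illustrative config list with raid5/raid6 left out.
configs="single:single raid0:raid0 raid1:raid1 raid10:raid10"

for raid in raid1 raid5 raid10; do
	if ! raid_type_supported "$raid" "$configs"; then
		echo "skipping $raid"
		continue
	fi
	echo "testing $raid"
done
```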
Josef Bacik [Fri, 5 Apr 2024 19:56:13 +0000 (15:56 -0400)]
fstests: change how we test for supported raid configs
In btrfs there are a few ways we limit the RAID profiles we'll use. We
have the raid56 feature that can be compiled out, zoned devices don't
support certain raid configurations, and you can manually set
BTRFS_PROFILE_CONFIGS to limit what you're testing.
To handle all of these different scenarios in the same way, update
_btrfs_get_profile_configs() to check for RAID56 support and remove it
if it is not there, and then add _require_btrfs_raid_type and
_check_btrfs_raid_type to get all the settings and then check if the
requested raid type is available.
From there I've updated all of the existing tests that use
Josef Bacik [Fri, 5 Apr 2024 19:56:12 +0000 (15:56 -0400)]
fstests: change btrfs/197 and btrfs/198 golden output
Both btrfs/197 and btrfs/198 check several raid types. We may not have
support for raid5/6 for our available profiles, but we'd like to be able
to test the other profiles. In order to enable this, update the golden
output to have no output, and simply have the test check for the device
we removed to see if it still exists in the device list output. This
will allow us to add a check to skip unsupported raid configurations in
our config.
Boris Burkov [Mon, 11 Mar 2024 19:13:44 +0000 (12:13 -0700)]
btrfs: new test for devt change between mounts
It is possible to confuse the btrfs device cache (fs_devices) by
starting with a multi-device filesystem, then removing and re-adding a
device in a way which changes its dev_t while the filesystem is
unmounted. After this procedure, if we remount, then we are in a funny
state where struct btrfs_device's "devt" field does not match the bd_dev
of the "bdev" field. I would say this is bad enough, as we have violated
a pretty clear invariant.
But for style points, we can then remove the extra device from the fs,
making it a single device fs, which enables the "temp_fsid" feature,
which permits multiple separate mounts of different devices with the
same fsid. Since btrfs is confused and *thinks* there are different
devices (based on device->devt), it allows a second redundant mount of
the same device (not a bind mount!). This then allows us to corrupt the
original mount by doing stuff to the one that should be a bind mount.
Reviewed-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: Boris Burkov <boris@bur.io> Signed-off-by: Anand Jain <anand.jain@oracle.com>
[ use _create_loop_device, renamed $MNT $BIND and rm them before mkdir ] Signed-off-by: Zorro Lang <zlang@kernel.org>
[ update the commit id of _fixed_by_kernel_commit ]
Christoph Hellwig [Thu, 18 Apr 2024 07:40:46 +0000 (09:40 +0200)]
xfs: don't run tests that require v4 file systems when not supported
Add a _require_xfs_nocrc helper that checks that we can mkfs and mount
a crc=0 file system before running tests that rely on it to avoid failures
on kernels with CONFIG_XFS_SUPPORT_V4 disabled.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Zorro Lang <zlang@kernel.org>
Taylor Jackson [Wed, 17 Apr 2024 19:34:24 +0000 (19:34 +0000)]
generic/645: Add hint for expected failure with old kernel
The following hint is added to reflect that any old kernel
without kernel commit dacfd001eaf2 (“fs/mnt_idmapping.c: Return
-EINVAL when no map is written”) is expected to fail this generic
645 test since without that commit, mount_setattr won’t return
EINVAL when attempting to create an idmapped mount using a user
namespace with no mappings.
Reported-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Taylor Jackson <tjackson9431@gmail.com> Reviewed-by: Zorro Lang <zlang@redhat.com> Reviewed-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Zorro Lang <zlang@kernel.org>
Christoph Hellwig [Mon, 8 Apr 2024 13:32:40 +0000 (15:32 +0200)]
xfs/078: remove the 512 byte block size sub-case
512 byte block sizes are only supported for v4 file systems, and
xfs/078 crudely forces use of v4 file systems for it. This doesn't
work if the kernel is built without v4 support. Given that v4
support is slowly being phased out and 512 byte block sizes have never
been common, drop this part of the test.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Zorro Lang <zlang@kernel.org>
Creating an ext4 filesystem using '-O journal' will fail with:
Invalid filesystem option set: journal
Fix it by replacing it by '-O has_journal', which ensures the filesystem
(ext3 or ext4) is created with a journal. While there, also redirect stderr
and stdout to the full log.
Signed-off-by: "Luis Henriques (SUSE)" <luis.henriques@linux.dev> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Reviewed-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Zorro Lang <zlang@kernel.org>
Filipe Manana [Wed, 27 Mar 2024 17:11:44 +0000 (17:11 +0000)]
btrfs/06[0-9]..07[0-4]: kill all background tasks when test is killed/interrupted
Test cases btrfs/06[0-9] and btrfs/07[0-4] exercise multiple concurrent
operations while fsstress is running in parallel, and all these are left
as child processes running in the background, which are correctly stopped
if the tests are not interrupted/killed. However if any of these tests is
interrupted/killed, it often leaves child processes still running in the
background, which prevents fstests from being run again. For example:
    our local _scratch_mkfs routine ...
    btrfs-progs v6.6.2
    See https://btrfs.readthedocs.io for more information.
    ERROR: unable to open /dev/sdb: Device or resource busy
    check: failed to mkfs $SCRATCH_DEV using specified options
    Interrupted!
    Passed all 0 tests
In this case there was still a process running _btrfs_stress_subvolume()
from common/btrfs.
This is a bit annoying because it requires manually finding out which
process is preventing unmounting the scratch device and then properly
stop/kill it.
So fix this by adding a _cleanup() function to all these tests that
stops all the child processes the test spawned and that are still
running in the background.
All these tests have the same structure as they were part of the same
patchset and from the same author.
Filipe Manana [Wed, 27 Mar 2024 17:11:43 +0000 (17:11 +0000)]
btrfs: remove stop file early at _btrfs_stress_subvolume
Instead of having every test case that uses _btrfs_stress_subvolume()
remove the stop file before calling that function, remove the file
in _btrfs_stress_subvolume() itself. There's no point in doing it in
every single test case.
Filipe Manana [Wed, 27 Mar 2024 17:11:41 +0000 (17:11 +0000)]
btrfs: add helper to kill background process running _btrfs_stress_replace
Killing a background process running _btrfs_stress_replace() is not as
simple as sending a signal to the process and waiting for it to die.
Therefore we have the following logic to terminate such process:
    kill $pid
    wait $pid
    while ps aux | grep "replace start" | grep -qv grep; do
        sleep 1
    done
Since this is repeated in several test cases, move this logic to a common
helper and use it in all affected test cases. This will help to avoid
repeating the same code again several times in upcoming changes.
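A generic form of such a helper might look like the sketch below.
kill_and_wait_for() is hypothetical, and the retry limit is an addition
that the logic quoted above does not have:

```shell
#!/bin/bash
# Hypothetical common helper: kill background process $1, reap it, then
# poll until no process matching the regex $2 is left. Gives up after $3
# polls (default 60) so a stuck command cannot hang the caller forever.
kill_and_wait_for()
{
	local pid="$1"
	local pattern="$2"
	local tries="${3:-60}"

	kill "$pid" 2>/dev/null
	wait "$pid" 2>/dev/null

	# The killed shell may have left a child command running; poll for it.
	while ps aux | grep -- "$pattern" | grep -qv grep; do
		tries=$((tries - 1))
		[ "$tries" -le 0 ] && return 1
		sleep 1
	done
	return 0
}

# Demo with a harmless background job standing in for a stress loop.
# The bracketed first letter keeps the pattern from matching itself.
sleep 300 &
stress_pid=$!
kill_and_wait_for "$stress_pid" "[e]xample-operation start" 5
echo "helper returned $?"
```

Passing the pattern as an argument would let one function serve the
replace, scrub, defrag, balance and remount cases. The per-operation
helpers added by these commits are a separate design choice.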
Filipe Manana [Wed, 27 Mar 2024 17:11:40 +0000 (17:11 +0000)]
btrfs: add helper to kill background process running _btrfs_stress_remount_compress
Killing a background process running _btrfs_stress_remount_compress() is
not as simple as sending a signal to the process and waiting for it to
die. Therefore we have the following logic to terminate such process:
    kill $pid
    wait $pid
    while ps aux | grep "mount.*$SCRATCH_MNT" | grep -qv grep; do
        sleep 1
    done
Since this is repeated in several test cases, move this logic to a common
helper and use it in all affected test cases. This will help to avoid
repeating the same code again several times in upcoming changes.
Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Anand Jain <anand.jain@oracle.com>
[ Restore 'wait $fsstress_pid' before 'kill $replace_pid' ]
Filipe Manana [Wed, 27 Mar 2024 17:11:39 +0000 (17:11 +0000)]
btrfs: add helper to kill background process running _btrfs_stress_defrag
Killing a background process running _btrfs_stress_defrag() is not as
simple as sending a signal to the process and waiting for it to die.
Therefore we have the following logic to terminate such process:
    kill $pid
    wait $pid
    while ps aux | grep "btrfs filesystem defrag" | grep -qv grep; do
        sleep 1
    done
Since this is repeated in several test cases, move this logic to a common
helper and use it in all affected test cases. This will help to avoid
repeating the same code again several times in upcoming changes.
Filipe Manana [Wed, 27 Mar 2024 17:11:38 +0000 (17:11 +0000)]
btrfs: add helper to kill background process running _btrfs_stress_scrub
Killing a background process running _btrfs_stress_scrub() is not as
simple as sending a signal to the process and waiting for it to die.
Therefore we have the following logic to terminate such process:
    kill $pid
    wait $pid
    while ps aux | grep "scrub start" | grep -qv grep; do
        sleep 1
    done
Since this is repeated in several test cases, move this logic to a common
helper and use it in all affected test cases. This will help to avoid
repeating the same code again several times in upcoming changes.
Filipe Manana [Wed, 27 Mar 2024 17:11:37 +0000 (17:11 +0000)]
btrfs/028: removed redundant sync and scratch filesystem unmount
There's no need to have an explicit scratch filesystem sync and unmount
at the end of the test, as the fstests framework automatically unmounts the
filesystem and the unmount naturally syncs any data and metadata.
So remove them and update the comment to be more clear.
Filipe Manana [Wed, 27 Mar 2024 17:11:36 +0000 (17:11 +0000)]
btrfs/028: use the helper _btrfs_kill_stress_balance_pid
Now that there's a helper to kill a background process that is running
_btrfs_stress_balance(), use it in btrfs/028. It's equivalent to the
existing code in btrfs/028.
Filipe Manana [Wed, 27 Mar 2024 17:11:35 +0000 (17:11 +0000)]
btrfs: add helper to kill background process running _btrfs_stress_balance
Killing a background process running _btrfs_stress_balance() is not as
simple as sending a signal to the process and waiting for it to die.
Therefore we have the following logic to terminate such process:
    kill $pid
    wait $pid
    # Wait for the balance operation to finish.
    while ps aux | grep "balance start" | grep -qv grep; do
        sleep 1
    done
Since this is repeated in several test cases, move this logic to a common
helper and use it in all affected test cases. This will help to avoid
repeating the same code again several times in upcoming changes.
Darrick J. Wong [Wed, 27 Mar 2024 02:43:41 +0000 (19:43 -0700)]
generic: test MADV_POPULATE_READ with IO errors
This is a regression test for "mm/madvise: make
MADV_POPULATE_(READ|WRITE) handle VM_FAULT_RETRY properly".
Cc: David Hildenbrand <david@redhat.com> Signed-off-by: "Darrick J. Wong" <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
Darrick J. Wong [Wed, 27 Mar 2024 02:43:30 +0000 (19:43 -0700)]
xfs/176: fix stupid failure
Create the $SCRATCH_MNT/urk directory before we fill the filesystem so
that its creation won't fail and result in find spraying ENOENT errors
all over the golden output.
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Zorro Lang <zlang@kernel.org>
Darrick J. Wong [Wed, 27 Mar 2024 02:43:23 +0000 (19:43 -0700)]
xfs/270: fix rocompat regex
This test fails with the fsverity patchset because the rocompat feature
bit for verity is 0x10. The regular expression used to check if the
output is hexadecimal requires a single-digit answer, which is no longer
the case.
Fixes: 5bb78c56ef ("xfs/270: Fix ro mount failure when nrext64 option is enabled") Signed-off-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Zorro Lang <zlang@kernel.org>
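The difference between a single-digit and a multi-digit hexadecimal
check can be shown with two patterns. Both are assumptions for
illustration; the exact expressions in xfs/270 may differ:

```shell
#!/bin/bash
# Assumed patterns, not copied from the test.
single_digit_re='^0x[0-9a-f]$'
multi_digit_re='^0x[0-9a-f]+$'

# Succeed if value $1 matches the extended regex $2.
is_hex()
{
	local value="$1"
	local re="$2"

	[[ "$value" =~ $re ]]
}

for value in 0x8 0x10; do
	is_hex "$value" "$single_digit_re" && old="match" || old="no match"
	is_hex "$value" "$multi_digit_re" && new="match" || new="no match"
	echo "$value: single-digit pattern: $old, multi-digit pattern: $new"
done
```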
Disha Goel [Tue, 19 Mar 2024 11:16:13 +0000 (16:46 +0530)]
generic/735: improve test by incorporating extra hints
On power systems with 64k block size (where default page size is 64k) we
encountered a kernel oops due to an integer overflow issue when writing
near the last logical block of a file. The allocator could allocate a
range where the end exceeds the maximum supported logical block
(UINT32_MAX), leading to a subsequent BUG_ON. This issue has been
addressed in the upstream kernel with commit 2dcf5fde6dff
("ext4: prevent the normalized size from exceeding EXT_MAX_BLOCKS").
Luis Henriques (SUSE) [Fri, 15 Mar 2024 17:13:25 +0000 (17:13 +0000)]
ext4/006: take into account updates to _scratch_fuzz_modify()
Test ext4/006 takes into account the number of lines produced by its own
output. However, changes introduced to function _scratch_fuzz_modify() by
commit 9bab148bb3c7 ("common/fuzzy: exercise the filesystem a little harder
after repairing"), modified the output. Namely, the following three lines
were removed:
Luis Henriques (SUSE) [Fri, 15 Mar 2024 17:13:24 +0000 (17:13 +0000)]
common/fuzzy: make _scratch_fuzz_modify work for non-xfs filesystems
Since commit 9bab148bb3c7 ("common/fuzzy: exercise the filesystem a little
harder after repairing") function _scratch_fuzz_modify() has become
xfs-specific due to the use of some functions that assume this filesystem,
namely _xfs_force_bdev() and _xfs_has_feature().
Ensure _scratch_fuzz_modify() works again with other filesystems by using
these functions only when testing xfs.
Signed-off-by: "Luis Henriques (SUSE)" <luis.henriques@linux.dev> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Zorro Lang <zlang@kernel.org>
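The guard can be sketched like this, with hypothetical stand-ins for the
xfs-only and generic steps. The real function calls different helpers:

```shell
#!/bin/bash
# Hypothetical stand-ins for filesystem-specific and generic steps.
xfs_only_step()
{
	echo "xfs-specific setup"
}

generic_step()
{
	echo "generic modification"
}

# Sketch of a modify routine that only calls xfs helpers when testing xfs.
fuzz_modify()
{
	local fstyp="$1"

	if [ "$fstyp" = "xfs" ]; then
		xfs_only_step
	fi
	generic_step
}

fuzz_modify xfs
fuzz_modify ext4
```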
Josef Bacik [Wed, 20 Mar 2024 15:46:50 +0000 (11:46 -0400)]
generic: add a regression test for fiemap into an mmap range
Btrfs had a deadlock that you could trigger by mmap'ing a large file and
using that as the buffer for fiemap. This test adds a C program to do
this, and the fstest creates a large enough file and then runs the
reproducer on the file. Without the fix btrfs deadlocks, with the fix
we pass fine.
Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Reviewed-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
In kernel commit dacfd001eaf2 (“fs/mnt_idmapping.c: Return -EINVAL
when no map is written”), the behavior of mount_setattr changed to
return EINVAL when attempting to create an idmapped mount when using
a user namespace with no mappings. The following commit updates the test
to expect no mount to be created in that case. And since no mount is created,
this commit also removes the check for overflow IDs because it does not make
sense to check for overflow IDs for a mount that was not created.
Signed-off-by: Taylor Jackson <tjackson9431@gmail.com> Reviewed-by: Zorro Lang <zlang@redhat.com> Reviewed-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Zorro Lang <zlang@kernel.org>
Taylor Jackson [Tue, 26 Mar 2024 20:33:51 +0000 (20:33 +0000)]
vfs/idmapped_mounts.c: Incorrect array index for nested user ns
Within the vfs test for idmapped mounts, the function nested_userns()
is using an incorrect array index when attempting to set up the mapping
for the 4th nested user ns within hierarchy[4]. The correct index that
belongs to the 4th nested user ns is actually hierarchy[3].
And hierarchy[4] is reserved for the dummy entry that marks the end
of the array.
Signed-off-by: Taylor Jackson <tjackson9431@gmail.com> Reviewed-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Zorro Lang <zlang@kernel.org>
Pankaj Raghav [Wed, 13 Mar 2024 20:38:18 +0000 (21:38 +0100)]
xfs/558: scale blk IO size based on the filesystem blksz
This test fails for 64k filesystem block size on a 4k PAGE_SIZE
system. Scale the `blksz` based on the filesystem block size instead of
fixing it as 64k so that we do get some iomap invalidations while doing
concurrent writes.
Keep the blksz at a minimum of 64k to retain the same behaviour as before
for smaller filesystem block sizes.
This fixes the "Expected to hear about writeback iomap invalidations?"
message for 64k filesystems.
Signed-off-by: Pankaj Raghav <p.raghav@samsung.com> Tested-by: "Darrick J. Wong" <djwong@kernel.org> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Zorro Lang <zlang@kernel.org>
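The scaling with a 64k floor amounts to the arithmetic below.
scaled_blksz() is a hypothetical helper and the multiplier is an
assumption for illustration, not the value used by xfs/558:

```shell
#!/bin/bash
# Hypothetical helper: derive the I/O size from the filesystem block
# size $1 times an assumed factor $2 (default 8), never below 64k.
scaled_blksz()
{
	local fs_blksz="$1"
	local factor="${2:-8}"
	local min_blksz=65536
	local blksz=$((fs_blksz * factor))

	if [ "$blksz" -lt "$min_blksz" ]; then
		blksz=$min_blksz
	fi
	echo "$blksz"
}

for fs_blksz in 1024 4096 65536; do
	echo "fs block size $fs_blksz -> blksz $(scaled_blksz "$fs_blksz")"
done
```

With these assumed numbers, 1k and 4k filesystems keep the previous 64k
I/O size, while a 64k filesystem gets a larger one.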
Zorro Lang [Sun, 24 Mar 2024 14:17:07 +0000 (22:17 +0800)]
common/rc: fix unknown _xfs_repair_test_fs function name
Sometimes I hit the errors below:
./common/rc: line 1293: _xfs_repair_test_fs: command not found
./common/rc: line 1298: _xfs_repair_test_fs: command not found
_repair_test_fs tries to call _xfs_repair_test_fs(), but there is no
such function in fstests. Commit c7d81cdecbef brought in
_test_xfs_repair, but called it by the wrong name. So fix it.
Fixes: c7d81cdecbef ("check: try to fix the test device if it gets corrupted") Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Zorro Lang <zlang@kernel.org>
Josef Bacik [Tue, 19 Mar 2024 16:55:57 +0000 (12:55 -0400)]
fstests: btrfs/195: skip raid setups not in the profile configs
You can specify a custom BTRFS_PROFILE_CONFIGS to skip certain raid
configurations in the tests, however btrfs/195 doesn't honor this
currently. Fix this up by getting the profile configs and skipping any
configurations that are not listed in BTRFS_PROFILE_CONFIGS.
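A minimal sketch of that kind of filtering follows. profile_listed is a hypothetical helper, not the code in btrfs/195, and the whitespace-separated "data:metadata" pair format of the list is an assumption of this sketch.

```shell
#!/bin/bash
# Hypothetical helper: succeed if the given raid profile appears in a
# whitespace-separated list of "data:metadata" profile pairs.
# ASSUMPTION: the pair format is chosen for illustration.
profile_listed() {
	local profile=$1
	local configs=$2
	local pair

	for pair in $configs; do
		# Wrap in colons so "raid1" does not match "raid10".
		case ":$pair:" in
		*":$profile:"*) return 0 ;;
		esac
	done
	return 1
}

configs="single:single raid1:raid1 raid0:raid0"
for p in single raid1 raid5 raid10; do
	if profile_listed "$p" "$configs"; then
		echo "run $p"
	else
		echo "skip $p"
	fi
done
```

Matching on whole colon-delimited fields avoids treating raid1 as a prefix match for raid10 or raid1c3.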
Anand Jain [Sat, 16 Mar 2024 17:02:34 +0000 (22:32 +0530)]
generic: test mount fails on physical device with configured dm volume
When a dm-flakey device is configured (or a similar dm target where both
the physical and dm devices are accessible), we have access to both the
physical device and the dm-flakey device. Ensure that mounting the
physical device fails.
Josef Bacik [Tue, 19 Mar 2024 18:12:05 +0000 (19:12 +0100)]
btrfs/330: add test to validate ro/rw subvol mounting
Btrfs has had the ability for almost a decade to allow ro and rw
mounting of subvols. Specifically, this behavior:
mount -o subvol=foo,ro /some/dir
mount -o subvol=bar,rw /some/other/dir
This seems simple, but because of the limitations of how we did mounting
in ye olde days we would mark both the super block and the mount as RO if
we mounted RO first. In the case above /some/dir would instantiate both
the super block and the mount point as read only. So the second mount
command under the covers would convert the super block to RW, and then
allow the mount to continue.
The results were still consistent, /some/dir was still read only because
the mount was marked read only, but /some/other/dir could be written to.
This is a test to make sure we maintain this behavior, as I almost
regressed this behavior while converting us to the new mount API.
Josef Bacik [Tue, 19 Mar 2024 18:12:03 +0000 (19:12 +0100)]
btrfs/131,btrfs/172,btrfs/206: add check for block-group-tree feature in btrfs
A new disk format option will make the no-holes option a requirement, so
add a helper to make sure that we aren't creating a fs with
BLOCK_GROUP_TREE by default, and skip the tests that require turning off
no-holes.
Boris Burkov [Wed, 13 Mar 2024 23:46:30 +0000 (16:46 -0700)]
btrfs/316: use rescan wrapper
btrfs/316 is broken on the squota configuration because it uses a raw
rescan call which fails, instead of using the rescan wrapper. The test
passes with squota, so run it (instead of requiring rescan) though I
suspect it isn't the most meaningful test.
Reviewed-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: Boris Burkov <boris@bur.io> Signed-off-by: Anand Jain <anand.jain@oracle.com>
Boris Burkov [Wed, 13 Mar 2024 23:46:29 +0000 (16:46 -0700)]
btrfs/277: specify protocol version 3 for verity send
This test uses btrfs send with fs-verity which relies on protocol
version 3. The default in progs is version 2, so we need to explicitly
specify the protocol version. Note that the max protocol version in
progs is also currently broken (not properly gated by EXPERIMENTAL) so
that needs fixing as well.
Filipe Manana [Wed, 13 Mar 2024 15:41:36 +0000 (15:41 +0000)]
fstests: add missing commit IDs to some tests
Some tests are still using a 'xxx...' commit ID but the respective patches
were already merged to Linus' tree or btrfs-progs, so update them with the
correct commit IDs and in two cases update the subject as well, because it
was modified after the test case was added and before being sent to Linus
(btrfs/317 and generic/707).
Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
Darrick J. Wong [Tue, 12 Mar 2024 14:57:20 +0000 (07:57 -0700)]
generic/574: don't fail the test on intentional coredump
Don't fail this test just because the mmap read of a corrupt verity file
causes xfs_io to segfault and then dump core.
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org> Reviewed-by: Bill O'Donnell <bodonnel@redhat.com> Reviewed-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
Darrick J. Wong [Tue, 27 Feb 2024 17:39:45 +0000 (18:39 +0100)]
misc: fix test that fail formatting with 64k blocksize
There's a bunch of tests that fail the formatting step when the test run
is configured to use XFS with a 64k blocksize. This happens because XFS
doesn't really support that combination due to minimum log size
constraints. Fix the tests to format larger devices in that case.
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org> Co-developed-by: Pankaj Raghav <p.raghav@samsung.com> Signed-off-by: Pankaj Raghav <p.raghav@samsung.com> Reviewed-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
Zorro Lang [Mon, 11 Mar 2024 16:20:29 +0000 (00:20 +0800)]
common/rc: notrun if io_uring is disabled by sysctl
Even if the kernel supports io_uring, userspace can still disable that
support by setting /proc/sys/kernel/io_uring_disabled=2. Let's notrun
if io_uring is disabled that way.
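A minimal sketch of such a check, not the actual common/rc code: io_uring_blocked is a hypothetical helper, and the sysctl path is taken as a parameter so the logic can be tried without touching /proc.

```shell
#!/bin/bash
# Hypothetical helper: succeed if the io_uring_disabled sysctl says that
# io_uring creation is disabled for all processes (value 2).
io_uring_blocked() {
	local sysctl_file=${1:-/proc/sys/kernel/io_uring_disabled}
	local val

	# Kernels without this sysctl have no such file; treat that as
	# "not disabled by sysctl".
	[ -r "$sysctl_file" ] || return 1
	val=$(cat "$sysctl_file")
	[ "$val" = "2" ]
}

if io_uring_blocked; then
	echo "io_uring disabled by sysctl, skipping"
fi
```

In a real test the echo would be replaced by whatever mechanism the harness uses to mark a test as not run.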
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Zorro Lang <zlang@kernel.org>
Zorro Lang [Mon, 11 Mar 2024 16:20:28 +0000 (00:20 +0800)]
fsstress: bypass io_uring testing if io_uring_queue_init returns EPERM
I found the io_uring testing still fails with:
io_uring_queue_init failed
even if the kernel supports the io_uring feature.
That is because /proc/sys/kernel/io_uring_disabled isn't 0.
The different values mean:
0 All processes can create io_uring instances as normal.
1 io_uring creation is disabled (io_uring_setup() will fail with
-EPERM) for unprivileged processes not in the io_uring_group
group. Existing io_uring instances can still be used. See the
documentation for io_uring_group for more information.
2 io_uring creation is disabled for all processes. io_uring_setup()
always fails with -EPERM. Existing io_uring instances can still
be used.
So besides the CONFIG_IO_URING kernel config, there's another switch
that can turn io_uring support on or off. And "2" or "1" might be
the default on some systems.
In this situation io_uring_queue_init returns -EPERM, so I change
fsstress to skip io_uring testing if io_uring_queue_init returns
-ENOSYS or -EPERM, and print a different verbose message for debugging.
Signed-off-by: Zorro Lang <zlang@redhat.com> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
As the manual of io_uring_queue_init says "io_uring_queue_init(3)
returns 0 on success and -errno on failure". We should check if the
return value is -ENOSYS, not the errno.
Fixes: d15b1721f284 ("ltp/fsstress: don't fail on io_uring ENOSYS") Signed-off-by: Zorro Lang <zlang@redhat.com> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Christoph Hellwig [Mon, 26 Feb 2024 10:03:19 +0000 (11:03 +0100)]
generic/392: stop checking st_blocks
st_blocks is a rather vaguely defined field. To quote the Linux stat(2)
man page:
Use of the st_blocks and st_blksize fields may be less portable.
(They were introduced in BSD. The interpretation differs between
systems, and possibly on a single system when NFS mounts are
involved.)
or the FreeBSD one:
st_blocks Actual number of blocks allocated for the file in
512-byte units. As short symbolic links are stored in
the inode, this number may be zero.
and at least for XFS they include speculative preallocations and
in-flight COW fork allocations, and the numbers can change when the way
data is stored is reorganized. Because of that it doesn't make sense
to require st_blocks to not change after a crash even when fsync or
fdatasync was involved.
Remove the st_blocks checks and the now superfluous XFS always_cow
workaround.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Zorro Lang <zlang@kernel.org>
Su Yue [Thu, 7 Mar 2024 04:04:08 +0000 (12:04 +0800)]
btrfs/172,206: call _log_writes_cleanup in _cleanup
Because block group tree requires the no-holes feature,
_log_writes_mkfs "-O ^no-holes" fails when "-O block-group-tree" is
given in MKFS_OPTION.
Without an explicit _log_writes_cleanup, the two tests fail with the
logwrites-test device left behind, and all subsequent tests fail with
SCRATCH DEVICE EBUSY.
Fix it by overriding _cleanup to call _log_writes_cleanup.
Reviewed-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Su Yue <glass.su@suse.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
Qu Wenruo [Sun, 3 Mar 2024 06:52:51 +0000 (17:22 +1030)]
fstests: btrfs/121: allow snapshot with invalid qgroup to return error
[BUG]
After incoming kernel commit "btrfs: qgroup: verify btrfs_qgroup_inherit
parameter", test case btrfs/121 would fail like this:
btrfs/121 1s ... [failed, exit status 1]- output mismatch (see /xfstests/results//btrfs/121.out.bad)
--- tests/btrfs/121.out 2022-05-11 09:55:30.739999997 +0800
+++ /xfstests/results//btrfs/121.out.bad 2024-03-03 13:33:38.076666665 +0800
@@ -1,2 +1,3 @@
QA output created by 121
-Silence is golden
+failed: '/usr/bin/btrfs subvolume snapshot -i 1/10 /mnt/scratch /mnt/scratch/snap1'
+(see /xfstests/results//btrfs/121.full for details)
...
(Run 'diff -u /xfstests/tests/btrfs/121.out /xfstests/results//btrfs/121.out.bad' to see the entire diff)
[CAUSE]
The incoming kernel commit would do early qgroups validation before
subvolume/snapshot creation, and reject invalid qgroups immediately.
Meanwhile the test case itself still assumes the ioctl would go on
without any error, thus the new behavior would break the test case.
[FIX]
Instead of relying on the snapshot creation ioctl return value, we just
completely ignore the output of that snapshot creation.
Then manually check if the fs is still read-write.
For different kernels (3 cases), they would lead to the following
results:
- Older unpatched kernel
The filesystem would trigger a transaction abort (would be caught by
dmesg filter), and also fail the "touch" command.
- Older but patched kernel
The filesystem continues to create the snapshot, while still keeping the
fs read-write.
- Latest kernel with qgroup validation
The filesystem refuses to create the snapshot, while still keeping the
fs read-write.
Both "older but patched" and "latest" kernels would still pass the test
case, even with different behaviors.
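The "ignore the result, then check the fs is still read-write" idea can be sketched as below. fs_is_writable is a hypothetical helper and the temporary directory stands in for the scratch mount; this is not the code in btrfs/121.

```shell
#!/bin/bash
# Hypothetical helper: report whether a directory is still writable by
# trying to create (and then remove) a probe file in it.
fs_is_writable() {
	local dir=$1
	local probe="$dir/.rw_probe.$$"

	if touch "$probe" 2>/dev/null; then
		rm -f "$probe"
		return 0
	fi
	return 1
}

dir=$(mktemp -d)    # stand-in for the scratch mount point
# The outcome of the earlier operation is deliberately not checked;
# only the read-write state afterwards matters.
if fs_is_writable "$dir"; then
	echo "still read-write"
else
	echo "filesystem went read-only"
fi
rmdir "$dir"
```

Checking the end state instead of the command's return value is what lets kernels with different behaviors pass the same test.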
Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
Christoph Hellwig [Wed, 21 Feb 2024 06:37:41 +0000 (07:37 +0100)]
common: dm-error now supports zoned devices
Since kernel commit a951104333bd ("dm error: Add support for zoned block
devices") dm-error fully supports zoned devices. Make use of that to
also run error injection tests for zoned devices.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
Christoph Hellwig [Fri, 1 Mar 2024 15:28:20 +0000 (08:28 -0700)]
shared/298: run xfs_db against the loop device instead of the image file
xfs_db fails to properly detect the device sector size and thus segfaults
when run against an image file with 4k sector size. While that's something
we should fix in xfs_db it will require a fair amount of refactoring of
the libxfs init code. For now just change shared/298 to run xfs_db
against the loop device created on the image file that is used for I/O,
which feels like the right thing to do anyway to avoid cache coherency
issues.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Zorro Lang <zlang@kernel.org>
Christoph Hellwig [Wed, 6 Mar 2024 01:22:46 +0000 (18:22 -0700)]
shared/298: call fs commands on the loop device
In general calling fs tools is best done on the block device used for
the file system and not the backing device of a loop file. Thus switch
shared/298 to call all fs commands on the loop device. Also add a
comment on why the xfs_io fiemap command is called on the backing file,
and to have a good place for the comment stop passing the backing file
as the argument to the get_holes function and just use it implicitly as
the other helpers do with the loop device.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Zorro Lang <zlang@kernel.org>
Disha Goel [Fri, 1 Mar 2024 08:15:40 +0000 (13:45 +0530)]
xfstest: add detection for ext4.h presence in configure.ac
In some distributions, __u64 is already defined in system header files,
causing compilation errors when building xfstest.
# make
[CC] ext4_resize
ext4_resize.c:17:28: error: conflicting types for '__u64'
typedef unsigned long long __u64;
^~~~~
In file included from /usr/include/asm/types.h:26:0,
from /usr/include/linux/types.h:5,
from /usr/include/linux/mount.h:4,
from /usr/include/sys/mount.h:32,
from ext4_resize.c:15:
/usr/include/asm-generic/int-l64.h:30:23: note: previous declaration of '__u64' was here
typedef unsigned long __u64;
^~~~~
To address this issue, configure.ac now checks for the presence and
compilability of <linux/ext4.h>. If found and compilable, the macro
HAVE_LINUX_EXT4_H is defined. The commit also updates src/ext4_resize.c
to conditionally include <linux/ext4.h> based on the presence of the
header, ensuring compatibility with systems where ext4.h is either
present or not. Also include <linux/types.h> which gets __u64
definition on systems where ext4.h is not present. This change
enhances the configure process and improves code consistency.
The changes were tested on various distributions on Power
architecture, by successfully compiling xfstest. Additionally,
verified the compatibility by running ext4/033 and ext4/056
tests, both of which use ext4_resize and observed successful
test execution.
# make
checking linux/ext4.h usability... yes
checking linux/ext4.h presence... yes
checking for linux/ext4.h... yes
[CC] detached_mounts_propagation
[CC] ext4_resize
[CC] t_readdir_3
# make
checking linux/ext4.h usability... no
checking linux/ext4.h presence... no
checking for linux/ext4.h... no
[CC] detached_mounts_propagation
[CC] ext4_resize
[CC] t_snapshot_deleted_subvolume
Signed-off-by: Disha Goel <disgoel@linux.ibm.com> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Zorro Lang <zlang@kernel.org>
Darrick J. Wong [Thu, 7 Mar 2024 23:22:55 +0000 (15:22 -0800)]
xfs: test for premature ENOSPC with large cow delalloc extents
On a highly fragmented filesystem a Direct IO write can fail with -ENOSPC error
even though the filesystem has sufficient number of free blocks.
This occurs if the file offset range on which the write operation is being
performed has a delalloc extent in the cow fork and this delalloc extent
begins much before the Direct IO range.
In such a scenario, xfs_reflink_allocate_cow() invokes xfs_bmapi_write() to
allocate the blocks mapped by the delalloc extent. The extent thus allocated
may not cover the beginning of file offset range on which the Direct IO write
was issued. Hence xfs_reflink_allocate_cow() ends up returning -ENOSPC.
This test addresses this issue.
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org> Reviewed-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
Darrick J. Wong [Fri, 1 Mar 2024 17:51:24 +0000 (09:51 -0800)]
xfs/43[4-6]: make module reloading optional
These three tests examine two things -- first, can xfs CoW staging
extent recovery handle corruptions in the refcount btree gracefully; and
second, can we avoid leaking incore inodes and dquots.
The only cheap way to check the second condition is to rmmod and
modprobe the XFS module, which triggers leak detection when rmmod tears
down the caches. Currently, the entire test is _notrun if module
reloading doesn't work.
Unfortunately, these tests never run for the majority of XFS developers
because their testbeds either compile the xfs kernel driver into vmlinux
statically or the rootfs is xfs so the module cannot be reloaded. The
author's testbed boots from NFS and does not have this limitation.
Because we've had repeated instances of CoW recovery regressions not
being caught by testing until for-next hits my machine, let's make the
module reloading optional in all three tests to improve coverage.
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org> Reviewed-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
Darrick J. Wong [Tue, 27 Feb 2024 02:01:50 +0000 (18:01 -0800)]
xfs/599: reduce the amount of attrs created here
Luis Chamberlain reported insane runtimes in this test:
"xfs/599 takes a long time on LBS, but it passes. The amount of time it
takes, however, begs the question if the test is could be trimmed to do
less work because the larger the block size the larger the number of
dirents and xattrs are used to create. The large dirents are not a
problem. The amount of time it takes to create xattrs with hashcol
however grows exponentially in time.
"Do we really need so many xattrs for larger block sizes for this test?"
No, we don't. The goal of this test is to create a two-level dabtree of
xattrs having identical hashes. However, the test author (me)
apparently forgot that if a dabtree is created in the attr fork, there
will be a dabtree entry for each extended attribute, not each attr leaf
block. Hence it's a waste of time to multiply da_records_per_block by
attr_records_per_block.
Reported-by: Luis Chamberlain <mcgrof@kernel.org> Fixes: 1cd6b61299 ("xfs: add a couple more tests for ascii-ci problems") Signed-off-by: "Darrick J. Wong" <djwong@kernel.org> Reviewed-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
Darrick J. Wong [Tue, 27 Feb 2024 02:01:34 +0000 (18:01 -0800)]
generic/491: increase test timeout
Bump the read timeout in this test to a few seconds just in case it
actually takes the IO system more than a second to retrieve the data
(e.g. cloud storage network lag).
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org> Reviewed-by: Zorro Lang <zlang@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Zorro Lang <zlang@kernel.org>
Darrick J. Wong [Tue, 27 Feb 2024 02:01:19 +0000 (18:01 -0800)]
generic/192: fix spurious timeout
I have a theory that when the nfs server that hosts the root fs for my
testing VMs gets backed up, it can take a while for path resolution and
loading of echo, cat, or tee to finish. That delays the test enough to
result in:
--- /tmp/fstests/tests/generic/192.out 2023-11-29 15:40:52.715517458 -0800
+++ /var/tmp/fstests/generic/192.out.bad 2023-12-15 21:28:02.860000000 -0800
@@ -1,5 +1,6 @@
QA output created by 192
sleep for 5 seconds
test
-delta1 is in range
+delta1 has value of 12
+delta1 is NOT in range 5 .. 7
delta2 is in range
Therefore, invoke all these utilities with --help before the critical
section to make sure they're all in memory.
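A rough sketch of the warm-up step, assuming the utilities accept --help; preload_utils is a hypothetical helper, not the code added to generic/192.

```shell
#!/bin/bash
# Hypothetical helper: run each external utility once so that its binary
# and shared libraries are already cached before the timing-sensitive
# part of a test starts.
preload_utils() {
	local util path

	for util in "$@"; do
		# type -P finds the on-disk binary even when a shell builtin
		# of the same name exists (as with echo).
		path=$(type -P "$util") || continue
		"$path" --help >/dev/null 2>&1
	done
	return 0
}

preload_utils echo cat tee
# ... the timing-sensitive section would start here ...
```

The output and exit status of the warm-up runs are discarded on purpose; only the side effect of loading the binaries matters.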
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org> Reviewed-by: Zorro Lang <zlang@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Zorro Lang <zlang@kernel.org>
Darrick J. Wong [Tue, 27 Feb 2024 04:41:00 +0000 (20:41 -0800)]
xfs/155: fail the test if xfs_repair hangs for too long
There are a few hard to reproduce bugs in xfs_repair where it can
deadlock trying to lock a buffer that it already owns. These stalls
cause fstests never to finish, which is annoying! To fix this, set up
the xfs_repair run to abort after 10 minutes, which will affect the
golden output and capture a core file.
This doesn't fix xfs_repair, obviously.
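The deadline idea can be sketched with timeout(1) from coreutils. run_with_deadline is a hypothetical helper and sleep stands in for xfs_repair; the exact invocation used in xfs/155 is not shown in this message.

```shell
#!/bin/bash
# Hypothetical helper: run a command under a deadline. If the deadline
# passes, timeout(1) sends SIGABRT so the stuck process can dump core.
run_with_deadline() {
	local limit=$1
	shift
	timeout -s ABRT "$limit" "$@"
}

# Demo with a short deadline and sleep as a stand-in for a hung program.
# A 10 minute limit, as described above, would be written as "10m".
run_with_deadline 1 sleep 5
echo "exit status: $?"
```

GNU timeout documents exit status 124 for a command that hit the deadline, so a non-zero status here distinguishes a stall from a normal run.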
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org> Reviewed-by: Zorro Lang <zlang@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Zorro Lang <zlang@kernel.org>
Darrick J. Wong [Tue, 27 Feb 2024 04:40:21 +0000 (20:40 -0800)]
generic/604: try to make race occur reliably
This test will occasionally fail like so:
--- /tmp/fstests/tests/generic/604.out 2024-02-03 12:08:52.349924277 -0800
+++ /var/tmp/fstests/generic/604.out.bad 2024-02-05 04:35:55.020000000 -0800
@@ -1,2 +1,5 @@
QA output created by 604
-Silence is golden
+mount: /opt: /dev/sda4 already mounted on /opt.
+ dmesg(1) may have more information after failed mount system call.
+mount -o usrquota,grpquota,prjquota, /dev/sda4 /opt failed
+(see /var/tmp/fstests/generic/604.full for details)
As far as I can tell, the cause of this seems to be _scratch_mount
getting forked and exec'd before the backgrounded umount process has a
chance to enter the kernel. When this occurs, the mount() system call
will return -EBUSY because this isn't an attempt to make a bind mount.
Slow things down slightly by stalling the mount by 10ms.
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org> Reviewed-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Zorro Lang <zlang@kernel.org>
Josef Bacik [Tue, 5 Mar 2024 18:52:24 +0000 (19:52 +0100)]
btrfs: test normal qgroup operations in a compress friendly way
btrfs/022 currently fails if you are testing with -o compress because it
does a limit exceed test which will pass with compression on.
However the other functionality this test tests is completely acceptable
with compression enabled. Handle this by breaking the test into two
tests, one that simply tests the qgroup exceed limits test that requires
no compression, and the rest of the tests that do not have the no
compression restriction.
Josef Bacik [Tue, 5 Mar 2024 18:52:19 +0000 (19:52 +0100)]
btrfs/287,btrfs/293: filter all btrfs subvolume delete calls
Some of our btrfs subvolume delete calls get put into the golden output,
and many of them simply _filter_scratch. This works fine, but we
recently changed btrfs subvolume delete output, and it would have been
nice to simply filter this in one place. We have a
_filter_btrfs_subvol_delete helper, but it's only used in one place.
Fix all of these uses to call _filter_btrfs_subvol_delete. This will
allow for follow up fixes against _filter_btrfs_subvol_delete itself to
deal with changed output.
Josef Bacik [Tue, 5 Mar 2024 18:52:17 +0000 (19:52 +0100)]
btrfs/271: adjust failure condition
btrfs/271 was failing with the subpage blocksize VM's. This is because
there's an assumption made that the device error counters are
per-sector, but they're per-io. With a 16kib pagesize and a 4k
sectorsize/nodesize the threshold was expecting 16 failed IO's, but
instead we were getting 5.
The other gotcha here is that with the tree log we will write the log
tree first, and then update the log root tree with the location of the
log tree root node. With pagesize == nodesize this is fine, we will
only write the log tree root node. However with subpage blocksize both
of these nodes could be on the same page, and thus they are both written
out during that initial write. When we update the pointer for the log
root tree we will COW the log root tree root node and submit another IO,
resulting in 3 metadata IO's instead of 2.
Fix the failure case to be < 4 blocks, which is the minimum number of
IO's we should be seeing.
Josef Bacik [Tue, 5 Mar 2024 18:52:14 +0000 (19:52 +0100)]
btrfs/213: make the test more reliable
This test will write for 8 seconds and then try to balance, but for some
setups 8 seconds may be enough to fill the disk. Instead figure out
what half the size of the disk is and write at most that many bytes, or
for 8 seconds, whichever comes first. Then use the amount of time it
took to do the write to determine how long we should allow the balance
to continue before we attempt to cancel it.
Additionally the macro is '_notrun' not '_not_run'. With this change
this test now does the correct thing on my ARM CI VM.
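The "half the disk or 8 seconds, whichever comes first" rule can be sketched as follows. bounded_write is a hypothetical helper, head reading /dev/zero stands in for the real writer, and the 8 MiB "disk size" is a made-up value for the demo.

```shell
#!/bin/bash
# Hypothetical helper: write at most max_bytes or for at most max_secs,
# whichever limit is reached first, and print the elapsed seconds.
bounded_write() {
	local file=$1 max_bytes=$2 max_secs=$3
	local start=$SECONDS

	# head stops at the byte limit; timeout stops it at the time limit.
	timeout "$max_secs" head -c "$max_bytes" /dev/zero > "$file"
	echo $((SECONDS - start))
}

disk_bytes=$((8 * 1024 * 1024))    # ASSUMPTION: stand-in device size
half=$((disk_bytes / 2))
target=$(mktemp)

elapsed=$(bounded_write "$target" "$half" 8)
echo "wrote $(stat -c %s "$target") bytes in ${elapsed}s"
# The elapsed time can then be used to size the delay before cancelling
# the balance.
rm -f "$target"
```

Bounding the write by bytes as well as by time keeps fast setups from filling the disk before the balance starts.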