www.infradead.org Git - users/hch/xfstests-dev.git/log

generic/759,760: fix MADV_COLLAPSE detection and inclusion

On systems with "old" C libraries such as glibc 2.36 in Debian 12, the
MADV_COLLAPSE flag might not be defined in any of the header files
pulled in by sys/mman.h, which means that the fsx binary might not get
built with any of the MADV_COLLAPSE code.  If the kernel supports THP,
the test will fail with:

>  QA output created by 760
>  fsx -N 10000 -l 500000 -r PSIZE -t BSIZE -w BSIZE -Z -R -W -h
> -fsx -N 10000 -o 8192 -l 500000 -r PSIZE -t BSIZE -w BSIZE -Z -R -W -h
> -fsx -N 10000 -o 128000 -l 500000 -r PSIZE -t BSIZE -w BSIZE -Z -R -W -h
> +mapped writes DISABLED
> +MADV_COLLAPSE not supported. Can't support -h

Fix both tests to detect fsx binaries that don't support MADV_COLLAPSE,
then fix fsx.c to include the mman.h from the kernel headers (aka
linux/mman.h) so that we can actually test IOs to and from THPs if the
kernel is newer than the rest of userspace.

Cc: <fstests@vger.kernel.org> # v2025.02.02
Cc: joannelkoong@gmail.com
Fixes: 627289232371e3 ("generic: add tests for read/writes from hugepages-backed buffers")
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Joanne Koong <joannelkoong@gmail.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

common/populate: correct the parent pointer name creation formulae

The formulae used to compute the number of parent pointers that we have
to create in a child file in order to generate a particular xattr
structure are not even close to correct -- the first one needs a bit of
adjustment, but the second one is way off and creates far too many
files.

Fix the computation, and document where the magic numbers come from.

Cc: <fstests@vger.kernel.org> # v2024.06.27
Fixes: 0c02207d61af9a ("populate: create hardlinks for parent pointers")
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

common/dump: don't replace pids arbitrarily

In the next patch we'll run tests in a pid namespace, which means that
the test process will have a very low pid.  This low pid situation
causes problems with the dump tests because they unconditionally replace
it with the word "PID", which causes unnecessary test failures.

Initially I was going to fix it by bracketing the regexp with a
whitespace/punctuation/eol/sol detector, but then I decided to remove it
see how many sheep came barreling through.  None did, so I removed it
entirely.  The commit adding it (linked below) was not insightful at
all.

Fixes: 19beb54c96e363 ("Extra filtering as part of IRIX/Linux xfstests reconciliation for dump.")
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

common/rc: revert recursive unmount in _clear_mount_stack

Zorro complained about the following regression in generic/589 that I
can't reproduce here on Debian 12:

>  --- /dev/fd/63  2025-01-08 19:53:49.621421512 -0500
>  +++ generic/589.out.bad 2025-01-08 19:53:49.244099127 -0500
>  @@ -30,6 +30,7 @@
>   mpC SCRATCH_DEV
>   mpC SCRATCH_DEV
>   ======
>  +umount: /mnt/xfstests/test/589-dst: not mounted
>
> It fails on generic/589 -> end_test -> _clear_mount_stack
>
>   ...
>   + end_test
>   + _clear_mount_stack
>   + '[' -n '/mnt/test/589-dst/201227_mpC /mnt/test/589-src/201227_mpA /mnt/test/589-dst /mnt/test/589-dst /mnt/test/589-src' ']'
>   + _unmount -R /mnt/test/589-dst/201227_mpC /mnt/test/589-src/201227_mpA /mnt/test/589-dst /mnt/test/589-dst /mnt/test/589-src
>   + local outlog=/tmp/201227.201227.umount
>   + local errlog=/tmp/201227.201227.umount.err
>   + rm -f /tmp/201227.201227.umount /tmp/201227.201227.umount.err
>   + /usr/bin/umount -R /mnt/test/589-dst/201227_mpC /mnt/test/589-src/201227_mpA /mnt/test/589-dst /mnt/test/589-dst /mnt/test/589-src
>   + local res=1
>   + '[' -s /tmp/201227.201227.umount ']'
>   + '[' -s /tmp/201227.201227.umount.err ']'
>   + cat /tmp/201227.201227.umount.err
>   + cat /tmp/201227.201227.umount.err
>   umount: /mnt/test/589-dst: not mounted
>
> The _get_mount already save all mount points into MOUNTED_POINT_STACK,
> MOUNTED_POINT_STACK="/mnt/test/589-dst/201227_mpC /mnt/test/589-src/201227_mpA /mnt/test/589-dst /mnt/test/589-dst /mnt/test/589-src"
>
> '-/mnt/test                                     /dev/sda5                                xfs        rw,relatime,seclabel,attr2,inode64,logbufs=8,logbsize=64k,sunit=128,swidth=256,noquota
>   |-/mnt/test/589-src                           /dev/sda6                                xfs        rw,relatime,seclabel,attr2,inode64,logbufs=8,logbsize=64k,sunit=128,swidth=256,noquota
>   | '-/mnt/test/589-src/223262_mpA              /dev/sda6                                xfs        rw,relatime,seclabel,attr2,inode64,logbufs=8,logbsize=64k,sunit=128,swidth=256,noquota
>   '-/mnt/test/589-dst                           /dev/sda6                                xfs        rw,relatime,seclabel,attr2,inode64,logbufs=8,logbsize=64k,sunit=128,swidth=256,noquota
>     |-/mnt/test/589-dst                         /dev/sda6                                xfs        rw,relatime,seclabel,attr2,inode64,logbufs=8,logbsize=64k,sunit=128,swidth=256,noquota
>     | '-/mnt/test/589-dst/223262_mpC            /dev/sda6                                xfs        rw,relatime,seclabel,attr2,inode64,logbufs=8,logbsize=64k,sunit=128,swidth=256,noquota
>     '-/mnt/test/589-dst/223262_mpC              /dev/sda6                                xfs        rw,relatime,seclabel,attr2,inode64,logbufs=8,logbsize=64k,sunit=128,swidth=256,noquota
>
> Orignal _clear_mount_stack trys to umount all of them. But Dave gave -R option to
> umount command, so
>   `umount -R /mnt/test/589-dst/201227_mpC` and `umount -R /mnt/test/589-src/201227_mpA`
> already umount /mnt/test/589-dst and /mnt/test/589-src. recursively.
> Then `umount -R /mnt/test/589-dst` shows "not mounted".

I /think/ this is a result of commit 4c6bc4565105e6 performing this
"conversion" in _clear_mount_stack:

- $UMOUNT_PROG $MOUNTED_POINT_STACK
+ _unmount -R $MOUNTED_POINT_STACK

This is not a 1:1 conversion here -- previously there was no
-R(ecursive) unmount option, and now there is.  It looks as though
umount parses /proc/self/mountinfo to figure out what to unmount, and
maybe on Zorro's system it balks if the argument passed is not present
in that file?  Debian 12's umount doesn't care.

Regardless, there was no justification for this change in behavior that
was buried in what is otherwise a hoist patch, so revert it.  The author
can resubmit the change with proper documentation.

Cc: <fstests@vger.kernel.org> # v2024.12.08
Fixes: 4c6bc4565105e6 ("fstests: clean up mount and unmount operations")
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

fuzzy: do not set _FSSTRESS_PID when exercising fsx

If we're not running fsstress as the scrub exerciser, don't set
_FSSTRESS_PID because the _kill_fsstress call in the cleanup function
will think that it has to wait for a nonexistant fsstress process.
This fixes the problem of xfs/565 runtime increasing from 30s to 800s
because it tries to kill a nonexistent "565.fsstress" process and then
waits for the fsx loop control process, which hasn't been sent any
signals.

Cc: <fstests@vger.kernel.org> # v2024.12.08
Fixes: 8973af00ec212f ("fstests: cleanup fsstress process management")
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

generic/019: don't fail if fio crashes while shutting down

My system (Debian 12) has fio 3.33.  Once in a while, fio crashes while
shutting down after it receives a SIGBUS on account of the filesystem
going down.  This causes the test to fail with:

generic/019       - output mismatch (see /var/tmp/fstests/generic/019.out.bad)
    --- tests/generic/019.out   2024-02-28 16:20:24.130889521 -0800
    +++ /var/tmp/fstests/generic/019.out.bad    2025-01-03 15:00:35.903564431 -0800
    @@ -5,5 +5,6 @@

     Start fio..
     Force SCRATCH_DEV device failure
    +/tmp/fstests/tests/generic/019: line 112: 90841 Segmentation fault      $FIO_PROG $fio_config >> $seqres.full 2>&1
     Make SCRATCH_DEV device operable again
     Disallow global fail_make_request feature
    ...
    (Run 'diff -u /tmp/fstests/tests/generic/019.out /var/tmp/fstests/generic/019.out.bad'  to see the entire diff)

because the wait command will dutifully report fatal signals that kill
the fio process.  Unfortunately, a core dump shows that we blew up in
some library's exit handler somewhere:

(gdb) where
#0  unlink_chunk (p=p@entry=0x55b31cb9a430, av=0x7f8b4475ec60 <main_arena>) at ./malloc/malloc.c:1628
#1  0x00007f8b446222ff in _int_free (av=0x7f8b4475ec60 <main_arena>, p=0x55b31cb9a430, have_lock=<optimized out>, have_lock@entry=0) at ./malloc/malloc.c:4603
#2  0x00007f8b44624f1f in __GI___libc_free (mem=<optimized out>) at ./malloc/malloc.c:3385
#3  0x00007f8b3a71cf0e in ?? () from /lib/x86_64-linux-gnu/libtasn1.so.6
#4  0x00007f8b4426447c in ?? () from /lib/x86_64-linux-gnu/libgnutls.so.30
#5  0x00007f8b4542212a in _dl_call_fini (closure_map=closure_map@entry=0x7f8b44465620) at ./elf/dl-call_fini.c:43
#6  0x00007f8b4542581e in _dl_fini () at ./elf/dl-fini.c:114
#7  0x00007f8b445ca55d in __run_exit_handlers (status=0, listp=0x7f8b4475e820 <__exit_funcs>, run_list_atexit=run_list_atexit@entry=true, run_dtors=run_dtors@entry=true)
    at ./stdlib/exit.c:116
#8  0x00007f8b445ca69a in __GI_exit (status=<optimized out>) at ./stdlib/exit.c:146
#9  0x00007f8b445b3251 in __libc_start_call_main (main=main@entry=0x55b319278e10 <main>, argc=argc@entry=2, argv=argv@entry=0x7ffec6f8b468) at ../sysdeps/nptl/libc_start_call_main.h:74
#10 0x00007f8b445b3305 in __libc_start_main_impl (main=0x55b319278e10 <main>, argc=2, argv=0x7ffec6f8b468, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>,
    stack_end=0x7ffec6f8b458) at ../csu/libc-start.c:360
#11 0x000055b319278ed1 in _start ()

This isn't a filesystem failure, so mask this by shovelling the output
to seqres.full.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

metadump: fix cleanup for v1 metadump testing

In commit ce79de11337e38, the metadump v2 tests were updated to leave
the names of loop devices in some global variables so that the cleanup
method can find them and remove the loop devices. Inexplicably, the
metadump v1 test function was not upgraded. Do so now.

Cc: <fstests@vger.kernel.org> # v2024.12.08
Fixes: ce79de11337e38 ("fstests: clean up loop device instantiation")
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

metadump: make non-local function variables more obvious

In _xfs_verify_metadump_v2(), we want to set up some loop devices,
record the names of those loop devices, and then leave the variables in
the global namespace so that _xfs_cleanup_verify_metadump can dispose of
them.

Elsewhere in fstests the convention for global variables is to put them
in all caps to make it obvious that they're global and not local
variables, so do that here too.

Cc: <fstests@vger.kernel.org> # v2024.12.08
Fixes: ce79de11337e38 ("fstests: clean up loop device instantiation")
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

generic/476: fix fsstress process management

generic/476 never used to run fsstress in the background, but
8973af00ec212f made it do that. This is incorrect, because now 476 runs
for three seconds (i.e. long enough to fall out the bottom of the test
and end up in _cleanup), ignoring any SOAK_DURATION/TIME_FACTOR
settings.

Cc: <fstests@vger.kernel.org> # v2024.12.08
Fixes: 8973af00ec212f ("fstests: cleanup fsstress process management")
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

btrfs: skip tests that exercise compression property when using nodatasum

A couple tests exercise the compression property and that fails when an
inode has the nodatasum flag set. So skip the tests when running under the
nodatasum mount option.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

btrfs/281: skip test when running with nodatasum mount option

The test exercises compression and compression doesn't happen on inodes
with checksums disabled (nodatasum), making the test fail the expectations
if getting compressed extents. So skip the test if nodatasum is present.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

btrfs: skip tests exercising data corruption and repair when using nodatasum

Several tests exercise corrupting data and then checking that on read the
data is repaired, but this requires using checksums, so the tests fail
when running with the nodatasum mount option.

So add a _require_btrfs_no_nodatasum call to these tests.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

btrfs/205: skip test when running with nodatasum mount option

Currently the test fails when we pass "-o nodatasum" in MOUNT_OPTIONS and
the reason is because we enable compression, with "chattr +c", on a file
and then try to clone from it to a file with nodatasum inherited from the
mount options, which results in the clone ioctl to fail with -EINVAL since
it's not possible to clone from datasum to nodatasum and vice-versa.

So skip the test if nodatasum is a mount option.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

btrfs/333: skip the test when running with nodatacow or nodatasum

Encoded writes are reject for inodes with the NODATASUM flag, so we must skip
the test if running with either the nodatasum or nodatacow (which implies
nodatasum) mount options.

So skip the test when under nodatacow and nodatasum mount options.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

common/btrfs: add a _require_btrfs_no_nodatasum helper

Add a _require_btrfs_no_nodatasum helper to skip a test if the nodatasum
mount option is give, as we do have several tests that fail, for several
reasons, when that mount option is passed.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

btrfs/290: skip test if we are running with nodatacow mount option

We exercise corrupting an inline extent and inline extents can't be created
with nodatacow, we get instead a regular file extent item and if we attempt
to corrupt its disk_bytenr field with btrfs-corrupt-block we fail tree-checker
validation at mount time resulting in failure to mount and the following in
dmesg:

[514127.759739] BTRFS critical (device sdc): corrupt leaf: root=5 \
        block=30408704 slot=8 ino=257 file_offset=0, invalid disk_bytenr for \
        file extent, have 7416089308958521981, should be aligned to 4096
[514127.762715] BTRFS error (device sdc): read time tree block corruption \
        detected on logical 30408704 mirror 1

So add a _require_btrfs_no_nodatacow call to the test.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

btrfs: skip tests incompatible with compression when compression is enabled

We have several tests that fail when compression is enabled (MOUNT_OPTIONS
has "-o compress" or "-o compress-force"") because they expect a fixed
extent size and they trigger corruption by writing directly to a device,
therefore making them incompatible with compression.

So add a _require_no_compress call to them.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

generic: suggest fs specific fix only if the tested filesystem matches

It's odd when a test fails on a filesystem and a specific fix is suggested
for another filesystem. Some generic tests are suggesting filesystem
specific fixes without checking if the running filesystem matches, so
update them.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

fstests: add a generic test to verify direct IO writes with buffer contents change

There is a long existing btrfs problem that if some one is modifying the
buffer of an on-going direct IO write, it has a very high chance causing
permanent data checksum mismatch.

This is caused by the following factors:

- Direct IO buffer is out of the control of filesystem
  Thus user space can modify the contents during writeback.

- Btrfs generate its data checksum just before submitting the bio
  This means if the contents happens after the checksum is generated,
  the data written to disk will no longer match the checksum.

Btrfs later fixes the problem by forcing the direct IO to fallback to
buffered IO (if the inode requires data checksum), so that btrfs can
have a consistent view of the buffer.

This test case will verify the behavior by:

- Create a helper program 'dio-writeback-race'
  Which does direct IO writes block-by-block, and the buffer is always
  initialized to all 0xff before write,
  Then starting two threads:
  - One to submit the direct IO
  - One to modify the buffer to 0x00

  The program uses 4K as default block size, and 64MiB as the default
  file size.
  Which is more than enough to trigger tons of btrfs checksum errors
  on unpatched kernels.

- New test case generic/761
  Which will:

  * Use above program to create a 64MiB file

  * Do buffered read on that file
    Since above program is doing direct IO, there is no page cache
    populated.
    And the buffered read will need to read out all data from the disk,
    and if the filesystem supports data checksum, then the data checksum
    will also be verified against the data.

The test case passes on the following fses:
- ext4
- xfs
- btrfs with "nodatasum" mount option
  No data checksum to bother.

- btrfs with default "datasum" mount option and the fix "btrfs: always
  fallback to buffered write if the inode requires checksum"
  This makes btrfs to fallback on buffered IO so the contents won't
  change during writeback of page cache.

And fails on the following fses:

- btrfs with default "datasum" mount option and without the fix
  Expected.

Suggested-by: Christoph Hellwig <hch@infradead.org>
Reviewed-by: David Disseldorp <ddiss@suse.de>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

fscrypt-crypt-util: fix KDF contexts for SM8650

Update the KDF contexts to match those actually used on SM8650. This
turns out to be needed for the hardware-wrapped key tests generic/368
and generic/369 to pass on the SM8650 HDK (now that I have one to
actually test it). Apparently the contexts changed between the
prototype version I tested a couple years ago and the final version.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

xfs/614: remove the _require_loop call

This test only creates file system images as regular files, but never
actually uses the kernel loop driver. Remove the extra requirement.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Zorro Lang <zlang@kernel.org>

fstests: btrfs/226: use nodatasum mount option to prevent false alerts

[BUG]
With recent kernel patch "btrfs: always fallback to buffered write if the
inode requires checksum", the test case btrfs/226 will fail with the
following error:

FSTYP         -- btrfs
PLATFORM      -- Linux/x86_64 btrfs-vm 6.13.0-rc6-custom+ #209 SMP PREEMPT_DYNAMIC Fri Jan 24 17:23:03 ACDT 2025
MKFS_OPTIONS  -- /dev/mapper/test-scratch1
MOUNT_OPTIONS -- /dev/mapper/test-scratch1 /mnt/scratch

btrfs/226 1s ... - output mismatch (see /home/adam/xfstests/results//btrfs/226.out.bad)
    --- tests/btrfs/226.out 2024-04-12 14:04:03.080000035 +0930
    +++ /home/adam/xfstests/results//btrfs/226.out.bad 2025-02-06 08:23:42.564298585 +1030
    @@ -39,14 +39,11 @@
     Testing write against prealloc extent at eof
     wrote 65536/65536 bytes at offset 0
     XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
    -wrote 65536/65536 bytes at offset 65536
    -XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
    +pwrite: Resource temporarily unavailable
     File after write:
    ...
    (Run 'diff -u /home/adam/xfstests/tests/btrfs/226.out /home/adam/xfstests/results//btrfs/226.out.bad'  to see the entire diff)
Ran: btrfs/226
Failures: btrfs/226
Failed 1 of 1 tests

[CAUSE]
That kernel patch makes btrfs to always fallback to buffered IO if the
target inode requires data checksum.

This is to avoid more deadly problems of mismatched data checksum.

But this also means, for inodes with data checksum, RWF_NOWAIT will
always fail, because we will wait writing back the page cache, thus
breaking the RWF_NOWAIT requirement.

[FIX]
Update the test case to utilize nodatasum mount option, so that the
direct-IO will not fallback to buffered ones unconditionally.

Reported-by: Filipe Manana <fdmanana@kernel.org>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>

common: fix a spelling error in _require_zoned_device

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

check: Fix fs specfic imports when $FSTYPE!=$OLD_FSTYPE

Bug Description:

_test_mount function is failing with the following error:
./common/rc: line 4716: _xfs_prepare_for_eio_shutdown: command not found
check: failed to mount /dev/loop0 on /mnt1/test

when the second section in local.config file is xfs and the first section
is non-xfs.

It can be easily reproduced with the following local.config file

[s2]
export FSTYP=ext4
export TEST_DEV=/dev/loop0
export TEST_DIR=/mnt1/test
export SCRATCH_DEV=/dev/loop1
export SCRATCH_MNT=/mnt1/scratch

[s1]
export FSTYP=xfs
export TEST_DEV=/dev/loop0
export TEST_DIR=/mnt1/test
export SCRATCH_DEV=/dev/loop1
export SCRATCH_MNT=/mnt1/scratch

./check selftest/001

Root cause:
When _test_mount() is executed for the second section, the FSTYPE has
already changed but the new fs specific common/$FSTYP has not yet
been done. Hence _xfs_prepare_for_eio_shutdown() is not found and
the test run fails.

Fix:
Remove the additional _test_mount in check file just before ". common/rc"
since ". common/rc" is already sourcing fs specific imports and doing a
_test_mount.

Fixes: 1a49022fab9b4 ("fstests: always use fail-at-unmount semantics for XFS")
Signed-off-by: Nirjhar Roy (IBM) <nirjhar.roy.lists@gmail.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

fix vfs/utils.c for big-endian systems

generic/633 was failing with EINVAL on the fsxgetattr call on s390.
Looks like this is due to a failure to properly endian swap the
arguments to the syscall, so fix that, and the magic_etc compare
in expected_dummy_vfs_caps_uid() as well while we're at it.

Fixes: 0d1af68e ("generic: add fstests for idmapped mounts")
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Zorro Lang <zlang@kernel.org>

common: skip tests using LVM when the device is no known

LVM has a lot of elaborate code to make the users life painful. This
includes claiming the device type is unknown if it doesn't match it's
elaborately crafted internal list instead of just letting the user use
it. Skip tests using LVM if this is the case to avoid arcane failures
due to missing lvm devices when using null_blk.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Zorro Lang <zlang@kernel.org>

generic/418: use min_dio_alignment

Otherwise this test uses the wrong sector size on the RT device.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Zorro Lang <zlang@kernel.org>

replace _supported_fs with _exclude_fs

Tests don't require a list of supported file systems, as that is deducted
from the test directory name. Instead we exclude specific file systems
from a few common tests, and specify which of ext2 and ext3 should
actually also be tested after oddly multiplexing them into the ext4
directory full of tests only working for ext4.

Replace _supported_fs with a new _exclude_fs that takes only a single
file systems as the argument, making it easier to explain why the file
system is not supported for the common test. For ext4 this increases
the existing mess even further, but the maintainers have a plan to
move it to feature checks instead that are hopefully easier to
understand.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org> # common and tests/generic
Acked-by: Darrick J. Wong <djwong@kernel.org> # tests/ext4
Signed-off-by: Zorro Lang <zlang@kernel.org>

common: remove the $FSYP check in _cleanup_dump

Despite the comment, _cleanup_dump is only called from xfs specific
tests, so this is superfluous.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Zorro Lang <zlang@kernel.org>

generic/363: remove _supported_fs xfs

Run this test for all file systems. Just because they are broken doesn't
mean that zeroing should not be tested.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Zorro Lang <zlang@kernel.org>

xfs/541: _notrun if the file system can't mount

A file system created without an RT section might not be able to mount
with an rtdev specified if the RT device has a larger LBA size.

Instead of letting the test fail, _notrun it for that case.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
Signed-off-by: Zorro Lang <zlang@kernel.org>

generic: add tests for read/writes from hugepages-backed buffers

Add generic tests 758 and 759 for testing reads/writes from buffers
backed by hugepages.

Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Zorro Lang <zlang@kernel.org>

fsx: support reads/writes from buffers backed by hugepages

Add support for reads/writes from buffers backed by hugepages.
This can be enabled through the '-h' flag. This flag should only be used
on systems where THP capabilities are enabled.

This is motivated by a recent bug that was due to faulty handling of
userspace buffers backed by hugepages. This patch is a mitigation
against problems like this in the future.

Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Zorro Lang <zlang@kernel.org>

xfs/032: fix test failure on kernels which don't support bs > ps

When trying to mount a file system with a block size > page size on
kernel which doesn't support this, suppress the error messages from
showing up in the output file lest it cause test failures.

Fixes: 0b66f6efd669 ("xfs/032: try running on blocksize > pagesize filesystems")
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
Signed-off-by: Zorro Lang <zlang@kernel.org>

generic: test swap activation on file that used to have clones

Test that we are able to activate a swap file on a file that used to have
its extents shared multiple times.

This exercises a bug on btrfs' extent sharedness detection during swap
file activation, which is fixed by the following kernel commit:

  03018e5d8508 ("btrfs: fix swap file activation failure due to extents that used to be shared")

The fails sporadically on XFS and the bug was already reported to the XFS
mailing list:

   https://lore.kernel.org/linux-xfs/CAL3q7H7cURmnkJfUUx44HM3q=xKmqHb80eRdisErD_x8rU4+0Q@mail.gmail.com/

   https://lore.kernel.org/fstests/dca49a16a7aacdab831b8895bdecbbb52c0e609c.1733928765.git.fdmanana@suse.com/

So for now skip the test on XFS and add comments with references to these
threads.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

fsx: fix compile error for preadv2()

Commit d6b9d8eff076 ("fsx: add support for RWF_DONTCACHE") introduced
preadv2() calls in ltp/fsx.c. However, sys/uio.h is not included to the
source file, which causes compile errors with the gcc option
-Werror-implicit-function-declaration:

fsx.c: In function 'test_dontcache_io':
fsx.c:1956:15: error: implicit declaration of function 'preadv2'; did you mean 'pread64'? [-Wimplicit-function-declaration]
1956 |         ret = preadv2(fd, &iov, 1, 0, RWF_DONTCACHE);
      |               ^~~~~~~
      |               pread64
fsx.c: In function 'fsx_rw':
fsx.c:2836:31: error: implicit declaration of function 'pwritev2'; did you mean 'pwrite64'? [-Wimplicit-function-declaration]
2836 |                         ret = pwritev2(fd, &iov, 1, offset, flags);
      |                               ^~~~~~~~
      |                               pwrite64

To fix it, add the include directive.

Fixes: d6b9d8eff076 ("fsx: add support for RWF_DONTCACHE")
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

btrfs: test cycle mounting a filesystem right after enabling simple quotas

Test that if we enable simple quotas on a filesystem and unmount it right
after without doing any other changes to the filesystem, we are able to
mount again the filesystem.

This is a regression test for the following kernel commit:

f2363e6fcc79 ("btrfs: fix transaction atomicity bug when enabling simple quotas")

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Anand Jain <anand.jain@oracle.com>

btrfs/326: update _fixed_by_kernel_commit

The original fix is already in the upstream, meanwhile a new but much
harder to hit bug is also exposed by this test case.

Add a new _fixed_by_kernel_commit for that bug, and special thanks to
Christian for pointing out the fix.

Cc: Christian Brauner <brauner@kernel.org>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Anand Jain <anand.jain@oracle.com>

btrfs: add test for encoded reads

Add btrfs/333 and its helper programs btrfs_encoded_read and
btrfs_encoded_write, in order to test encoded reads.

We use the BTRFS_IOC_ENCODED_WRITE ioctl to write random data into a
compressed extent, then use the BTRFS_IOC_ENCODED_READ ioctl to check
that it matches what we've written. If the new io_uring interface for
encoded reads is supported, we also check that that matches the ioctl.

Note that what we write isn't valid compressed data, so any non-encoded
reads on these files will fail.

Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Mark Harmstone <maharmstone@fb.com>
Signed-off-by: Anand Jain <anand.jain@oracle.com>

configure: use pkg-config to find liburing

Change our autoconf macros so that instead of checking for the presence
of liburing.h, we use pkg-config.

The benefit of this is that we can then check the version of liburing,
and do conditional compilation based on this. There's a macro
IO_URING_CHECK_VERSION already, but it's only in relatively recent
versions of liburing.h.

This replaces HAVE_URING_H, defined by AC_CHECK_HEADERS, with
HAVE_URING. I also had to rename PKG_{MAJOR,MINOR,REVISION,BUILD} to
start with PACKAGE_, to avoid "possibly undefined macro" errors; it
looks like pkg-config assumes that anything called PKG_* is for its own
use.

Signed-off-by: Mark Harmstone <maharmstone@fb.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Anand Jain <anand.jain@oracle.com>

common/quota: filter out option projquota in _qmount_option for ocfs2

ocfs2 doesn't support projquota but does support usrquota an grpquota.
For now, two tests generic/594 and generic/603 are for testing
usrquota,grpquota and projquota. The mount option 'projquota' causes
failure of ocfs2 mount.

To make things simple, just skip the tests for ocfs2.
However, we can't just put '_require_prjquota $SCRATCH_DEV' before
_qmount because f2fs and xfs need runtime after mount of SCRATCH_DEV.

For ocfs2, filter out option 'projquota' in _qmount_option() to make
_qmount successful. Then in _require_prjquota(), ocfs2 will fallthrough
as a no kernel projquota support fs type.

Signed-off-by: Su Yue <glass.su@suse.com>
Reviewed-by: David Disseldorp <ddiss@suse.de>
Signed-off-by: Zorro Lang <zlang@kernel.org>

common/rc: add ocfs2 supported timesptamp range

ocfs2 supports timestamp ranging from s64min to s64max.
Add it to _filesystem_timestamp_range then generic/402 runs.

Signed-off-by: Su Yue <glass.su@suse.com>
Reviewed-by: David Disseldorp <ddiss@suse.de>
Signed-off-by: Zorro Lang <zlang@kernel.org>

generic: add a partial pages zeroing out test

This addresses a data corruption issue encountered during partial page
zeroing in ext4 which the block size is smaller than the page size [1].
Add a new test which is expanded upon generic/567, this test performs a
zeroing range test that spans two partial pages to cover this case, and
also generalize it to work for non-4k page sizes.

Link: https://lore.kernel.org/linux-ext4/20241220011637.1157197-2-yi.zhang@huaweicloud.com/
Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

fsx: add support for RWF_DONTCACHE

Using RWF_DONTCACHE tells the kernel that any page cache instantiated
by this operation should get pruned once the operation completes. If
data is in cache prior to the operation it will remain there.

Add ops for testing both the read and write side of this. At startup,
kernel support for this feature is probed. If support isn't available,
uncached/dontcache IO is performed as regular buffered IO. If -Z is
used to turn on O_DIRECT, then uncached/dontcache IO isn't performed.
Defaults to on if available, and adds a -T parameter to turn it off.

See the kernel posting adding support:

https://lore.kernel.org/linux-fsdevel/20241220154831.1086649-1-axboe@kernel.dk/

Signed-off-by: Jens Axboe <axboe@kernel.dk>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Zorro Lang <zlang@kernel.org>

fsstress: add support for RWF_DONTCACHE

Using RWF_DONTCACHE tells the kernel that any page cache instantiated
by this operation should get pruned once the operation completes. If
data is in cache prior to the operation it will remain there.

Add ops for testing both the read and write side of this. If the kernel
being tested doesn't support RWF_DONTCACHE, then operations are performed
with regular read/write buffered IO.

See the kernel posting adding support:

https://lore.kernel.org/linux-fsdevel/20241220154831.1086649-1-axboe@kernel.dk/

Signed-off-by: Jens Axboe <axboe@kernel.dk>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Zorro Lang <zlang@kernel.org>

xfs/43[4-6]: implement impatient module reloading

These three tests try to reload the xfs module as a cheap way to detect
leaked inode and dquot objects when the slabs for those object are torn
down during rmmod.  Removal might not succeed, and we don't really care
for that case because we still want to exercise the log recovery code.

However, if (say) the root filesystem is xfs, then removal will never
succeed.  There's no way that waiting 50 seconds(!) per test is going
to change that.  Add a silly helper to do it fast or go home.

Reported-by: sandeen@sandeen.net
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

xfs/032: try running on blocksize > pagesize filesystems

Now that we're no longer limited to blocksize <= pagesize, let's make
sure that mkfs, fsstress, and copy work on such things. This is also a
subtle way to get more people running at least one test with that
config.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

generic: verify ciphertext with hardware-wrapped keys

Add two tests which verify that encrypted files are encrypted correctly
when a hardware-wrapped inline encryption key is used. The two tests
are identical except that one uses FSCRYPT_POLICY_FLAG_IV_INO_LBLK_64
and the other uses FSCRYPT_POLICY_FLAG_IV_INO_LBLK_32. These cover both
of the settings where hardware-wrapped keys may be used.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

common/encrypt: support hardware-wrapped key testing

To support testing the kernel's support for hardware-wrapped inline
encryption keys, update _verify_ciphertext_for_encryption_policy() to
support a hw_wrapped_key option.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

fscrypt-crypt-util: add hardware KDF support

Add support to fscrypt-crypt-util for replicating the extra KDF (Key
Derivation Function) step that is required when a hardware-wrapped
inline encryption key is used. This step normally occurs in hardware,
but we need to replicate it for testing purposes.

Note, some care was needed to handle the fact that both inlinecrypt_key
and sw_secret can be needed in a single run of fscrypt-crypt-util.
Namely, with --iv-ino-lblk-32, inlinecrypt_key is needed for the
en/decryption while sw_secret is needed for hash_inode_number().

Signed-off-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

f2fs/008: test snapshot creation/deletion on lvm device

This patch introduce a regression testcase to check whether
f2fs can handle discard correctly once underlying lvm device
changes to not support discard after user creates snapshot
on it.

Related bug was fixed by commit bc8aeb04fd80 ("f2fs: fix to
drop all discards after creating snapshot on lvm device")

Cc: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Chao Yu <chao@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

common/rc: support f2fs in _mkfs_dev()

Then, f2fs/* testcases can use this wrapped common helper instead of
raw codes.

Suggested-by: Zorro Lang <zlang@kernel.org>
Signed-off-by: Chao Yu <chao@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

generic/530: only use xfs-specific mkfs options when testing on xfs

This fixes a regression introduced by commit 000813899afb ("fstests:
scale some tests for high CPU count sanity") where this test
unconditionally tried to use the mkfs option "-l size=256m" which
would break when testing any file sytem other than xfs.

Fix this the same way commit 000813899afb dealt with this for
generic/531; so this was just an oversight.

Fixes: 000813899afb ("fstests: scale some tests for high CPU count sanity")
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

generic/135: don't try to rm $SCRATCH_MNT/*

This fixes a regression for ext4 introduced by commit 32e14cb90b88
("fstests: don't use directory stacks"), which replaced a number of
files at the top-level of the scratch file system:

    async_file sync_file direct_file trunc_file

with "rm $SCRATCH_MNT/*".  This causes a test failure on ext4 file
systems caused by a failed attempt to unlink the lost+found directory.
The thing, is these files are all super small (4k or 16k) and the
scratch file system is going to get reformatted before it gets used
again.  So just dropping the delete is the simplest way to solve the
problem.

Fixes: 32e14cb90b88 ("fstests: don't use directory stacks")
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

generic/590: fix test failure when running against fs other than xfs

With commit ce79de11337e ("fstests: clean up loop device instantiation")
we started to always call _destroy_loop_device at the very end of the
test, but we only create a loop device if we are running against xfs,
so the call to _destroy_loop_device results in an error since no loop
device was setup.

For example running this test against btrfs or ext4 results in this
failure:

   $ ./check generic/590
   FSTYP         -- btrfs
   PLATFORM      -- Linux/x86_64 debian0 6.13.0-rc1-btrfs-next-181+ #1 SMP PREEMPT_DYNAMIC Tue Dec  3 13:03:23 WET 2024
   MKFS_OPTIONS  -- /dev/sdc
   MOUNT_OPTIONS -- /dev/sdc /home/fdmanana/btrfs-tests/scratch_1

   generic/590 29s ... [failed, exit status 1]- output mismatch (see /home/fdmanana/git/hub/xfstests/results//generic/590.out.bad)
      --- tests/generic/590.out 2020-06-10 19:29:03.858520038 +0100
      +++ /home/fdmanana/git/hub/xfstests/results//generic/590.out.bad 2024-12-11 10:48:43.946205346 +0000
       @@ -1,2 +1,5 @@
       QA output created by 590
      -Silence is golden
      +losetup: option requires an argument -- 'd'
      +Try 'losetup --help' for more information.
      +Cannot destroy loop device
      +(see /home/fdmanana/git/hub/xfstests/results//generic/590.full for details)
      ...
      (Run 'diff -u /home/fdmanana/git/hub/xfstests/tests/generic/590.out /home/fdmanana/git/hub/xfstests/results//generic/590.out.bad'  to see the entire diff)
   Ran: generic/590
   Failures: generic/590
   Failed 1 of 1 tests

Fix this by removing the call to _destroy_loop_device at the end of the
test, as it's unnecessary because we call it in the _cleanup function if
we have setup a loop device.

Fixes: ce79de11337e ("fstests: clean up loop device instantiation")
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Zorro Lang <zlang@kernel.org>

generic/442: fix failure due to missing test number argument for fsync-err

After commit 88c0291d297c ("fstests: per-test dmerror instances") the
script src/dmerror now has an extra argument, corresponding to a test's
sequence number, but generic/442 isn't passing that argument so the test
fails like this:

  $ ./check generic/442
  FSTYP         -- btrfs
  PLATFORM      -- Linux/x86_64 debian0 6.13.0-rc1-btrfs-next-181+ #1 SMP PREEMPT_DYNAMIC Tue Dec  3 13:03:23 WET 2024
  MKFS_OPTIONS  -- /dev/sdc
  MOUNT_OPTIONS -- /dev/sdc /home/fdmanana/btrfs-tests/scratch_1

  generic/442 4s ... - output mismatch (see /home/fdmanana/git/hub/xfstests/results//generic/442.out.bad)
      --- tests/generic/442.out 2020-06-10 19:29:03.850519863 +0100
      +++ /home/fdmanana/git/hub/xfstests/results//generic/442.out.bad 2024-12-10 17:35:59.746597468 +0000
      @@ -1,2 +1,3 @@
       QA output created by 442
      -Test passed!
      +Usage: /home/fdmanana/git/hub/xfstests/src/dmerror {load_error_table|load_working_table}
      +system: program exited: 1
      ...
      (Run 'diff -u /home/fdmanana/git/hub/xfstests/tests/generic/442.out /home/fdmanana/git/hub/xfstests/results//generic/442.out.bad'  to see the entire diff)
  Ran: generic/442
  Failures: generic/442
  Failed 1 of 1 tests

Fix this by passing the test's sequence number as an argument.

Fixes: 88c0291d297c ("fstests: per-test dmerror instances")
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Zorro Lang <zlang@kernel.org>

btrfs/146: fix failure due to missing test number argument for fsync-err

After commit 88c0291d297c ("fstests: per-test dmerror instances") the
script src/dmerror now has an extra argument, corresponding to a test's
sequence number, but btrfs/146 isn't passing that argument so the test
fails like this:

  $ ./check btrfs/146
  FSTYP         -- btrfs
  PLATFORM      -- Linux/x86_64 debian0 6.13.0-rc1-btrfs-next-181+ #1 SMP PREEMPT_DYNAMIC Tue Dec  3 13:03:23 WET 2024
  MKFS_OPTIONS  -- /dev/sdc
  MOUNT_OPTIONS -- /dev/sdc /home/fdmanana/btrfs-tests/scratch_1

  btrfs/146 3s ... - output mismatch (see /home/fdmanana/git/hub/xfstests/results//btrfs/146.out.bad)
      --- tests/btrfs/146.out 2020-06-10 19:29:03.818519162 +0100
      +++ /home/fdmanana/git/hub/xfstests/results//btrfs/146.out.bad 2024-12-10 17:19:40.363498130 +0000
      @@ -1,3 +1,4 @@
       QA output created by 146
       Format and mount
      -Test passed!
      +Usage: /home/fdmanana/git/hub/xfstests/src/dmerror {load_error_table|load_working_table}
      +system: program exited: 1
      ...
      (Run 'diff -u /home/fdmanana/git/hub/xfstests/tests/btrfs/146.out /home/fdmanana/git/hub/xfstests/results//btrfs/146.out.bad'  to see the entire diff)
  Ran: btrfs/146
  Failures: btrfs/146
  Failed 1 of 1 tests

Fix this by passing the test's sequence number as an argument.

Fixes: 88c0291d297c ("fstests: per-test dmerror instances")
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

btrfs/142, btrfs/143: fix dmdust device names

After commit aaa132777476 ("fstests: per-test dmdust instances") the
dmdust device names are no longer a plain 'dust-test', they now have
a suffix matching '.N', where N is the test's sequence number.
The test cases btrf/142 and btrfs/143 are still using the old device
names, so they fail. Fix this my making them refer to 'dust-test.$seq'
instead of 'dust-test'.

Fixes: aaa132777476 ("fstests: per-test dmdust instances")
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

btrfs/100, btrfs/101: fix device name in their golden output

After commit 88c0291d297c ("fstests: per-test dmerror instances"), the dm
error device name changed so that it's no longer a plain 'error-test' but
instead it's that plus a prefix matching the test number, such as
'error-test.100' for example. The tests btrfs/100 and btrfs/101 are still
using the plain old name 'error-test' in their golden output, which makes
them fail. So update them to use 'error-test.100' and 'error-test.101'.

Fixes: 88c0291d297c ("fstests: per-test dmerror instances")
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

fstests: fix argument passing to _run_fsstress and _run_fsstress_bg

After commit 8973af00ec21 ("fstests: cleanup fsstress process management")
test cases btrfs/007 and btrfs/284 started to fail. For example:

  $ ./check btrfs/007
  FSTYP         -- btrfs
  PLATFORM      -- Linux/x86_64 debian0 6.13.0-rc1-btrfs-next-181+ #1 SMP PREEMPT_DYNAMIC Tue Dec  3 13:03:23 WET 2024
  MKFS_OPTIONS  -- /dev/sdc
  MOUNT_OPTIONS -- /dev/sdc /home/fdmanana/btrfs-tests/scratch_1

  btrfs/007 1s ... [failed, exit status 1]- output mismatch (see /home/fdmanana/git/hub/xfstests/results//btrfs/007.out.bad)
      --- tests/btrfs/007.out 2020-06-10 19:29:03.810518987 +0100
      +++ /home/fdmanana/git/hub/xfstests/results//btrfs/007.out.bad 2024-12-10 16:09:56.345937619 +0000
      @@ -1,3 +1,4 @@
       QA output created by 007
       *** test send / receive
      -*** done
      +failed: '2097152000 200'
      +(see /home/fdmanana/git/hub/xfstests/results//btrfs/007.full for details)
      ...
      (Run 'diff -u /home/fdmanana/git/hub/xfstests/tests/btrfs/007.out /home/fdmanana/git/hub/xfstests/results//btrfs/007.out.bad'  to see the entire diff)
  Ran: btrfs/007

The problem comes from _run_fsstress and _run_fsstress_bg using $*, which
splits the string argument for the -x command used by btrfs/007, so that
fsstress gets the argument for -x as simply:

   "btrfs"

Instead of:

   "btrfs subvolume snapshot -r $SCRATCH_MNT $SCRATCH_MNT/base"

Fix this by using "$@" instead of $*.

Fixes: 8973af00ec21 ("fstests: cleanup fsstress process management")
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

common: call _require_scratch_dedupe from _require_scratch_duperemove

_require_scratch_duperemove doesn't check if the scratch file system
actually supports dedup, so add the proper call for that.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

common: cleanup scratch_mkfs_sized

Move the XFS RT specific code into the file system type switch
statement.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Zorro Lang <zlang@kernel.org>

common: loop device work on zoned file systems

Remove the call to _require_non_zoned_device from _require_loop because
there is nothing that prevents loop device to work on file systems on
zoned devices. But loop devices are not supported directly on top of
zoned block devices, so add a _require_non_zoned_device to generic/563
to not run that test on zoned block devices.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Zorro Lang <zlang@redhat.com>

f2fs: add commit id to _fixed_by_kernel_commit

The bugs related to f2fs/00[5-7] regression testcases have been fixed
by below commits:

- d5c367ef8287 ("f2fs: fix f2fs_bug_on when uninstalling filesystem call
f2fs_evict_inode.")

- 1acd73edbbfe ("f2fs: fix to account dirty data in __get_secs_required()")

- 26413ce18e85 ("f2fs: compress: fix inconsistent update of i_blocks in
release_compress_blocks and reserve_compress_blocks")

Let's add commit id to _fixed_by_kernel_commit in f2fs/00[5-7].

Cc: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Chao Yu <chao@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

fstests: check-parallel

Runs tests in parallel runner threads. Each runner thread has it's
own set of tests to run, and runs a separate instance of check
to run those tests.

check-parallel sets up loop devices, mount points, results
directories, etc for each instance and divides the tests up between
the runner threads.

It currently hard codes the XFS and generic test lists, and then
gives each check invocation an explicit list of tests to run. It
also passes through exclusions so that test exclude filtering is
still done by check.

This is far from ideal, but I didn't want to have to embark on a
major refactoring of check to be able to run stuff in parallel.
It was quite the challenge just to get all the tests and test
infrastructure up to the point where they can run reliably in
parallel.

Hence I've left the actual factoring of test selection and setup
out of the patchset for the moment. The plan is to factor both the
test setup and the test list runner loop out of check and share them
between check and check-parallel, hence not requiring check-parallel
to run check directly. That is future work, however.

With the current test runner setup, it is not uncommon to see >5000%
cpu usage, 150-200kiops and 4-5GB/s of disk bandwidth being used
when running 64 runners. This is a serious stress load as it is
constantly mounting and unmounting dozens of filesystems, creating
and destroying devices, dropping caches, running sync, running CPU
hot plug, running page cache migration, etc.

The massive amount of IO that load generates causes qemu hosts to
abort (i.e. crash) because they run out of vm map segments. Hence
bumping up the max_map_count on the host like so:

echo 1048576 > /proc/sys/vm/max_map_count

is necessary.

There is no significant memory pressure to speak of from running the
tests like this. I've seen a maximum of about 50GB of RAM used when
running tests like this, so running on a 64p/64GB VM the additional
concurrency doesn't really stress memory capacity like it does CPU
and IO.

All the runners are executed in private mount namespaces. This is
to prevent ephemeral mount namespace clones from taking a reference
to every mounted filesystem in the machine and so causing random
"device busy after unmount" failures in the tests that are running
concurrently with the mount namespace setup and teardown.

A typical `pstree -N mnt` looks like:

$ pstree -N mnt
[4026531841]
bash
bash───pstree
[0]
sudo───sudo───check-parallel─┬─check-parallel───nsexec───check───311─┬─cut
                             │                                       └─md5sum
                             ├─check-parallel───nsexec───check───750─┬─750───sleep
                             │                                       └─750.fsstress───4*[750.fsstress───{750.fsstress}]
                             ├─check-parallel───nsexec───check───013───013───sed
                             ├─check-parallel───nsexec───check───251───cp
                             ├─check-parallel───nsexec───check───467───open_by_handle
                             ├─check-parallel───nsexec───check───650─┬─650───sleep
                             │                                       └─650.fsstress─┬─61*[650.fsstress───{650.fsstress}]
                             │                                                      └─2*[650.fsstress]
                             ├─check-parallel───nsexec───check───707
                             ├─check-parallel───nsexec───check───705
                             ├─check-parallel───nsexec───check───416
                             ├─check-parallel───nsexec───check───477───2*[open_by_handle]
                             ├─check-parallel───nsexec───check───140───140
                             ├─check-parallel───nsexec───check───562
                             ├─check-parallel───nsexec───check───415───xfs_io───{xfs_io}
                             ├─check-parallel───nsexec───check───291
                             ├─check-parallel───nsexec───check───017
                             ├─check-parallel───nsexec───check───016
                             ├─check-parallel───nsexec───check───168───2*[168───168]
                             ├─check-parallel───nsexec───check───672───2*[672───672]
                             ├─check-parallel───nsexec───check───170─┬─170───170───170
                             │                                       └─170───170
                             ├─check-parallel───nsexec───check───531───122*[t_open_tmpfiles]
                             ├─check-parallel───nsexec───check───387
                             ├─check-parallel───nsexec───check───748
                             ├─check-parallel───nsexec───check───388─┬─388.fsstress───4*[388.fsstress───{388.fsstress}]
                             │                                       └─sleep
                             ├─check-parallel───nsexec───check───328───328
                             ├─check-parallel───nsexec───check───352
                             ├─check-parallel───nsexec───check───042
                             ├─check-parallel───nsexec───check───426───open_by_handle
                             ├─check-parallel───nsexec───check───756───2*[open_by_handle]
                             ├─check-parallel───nsexec───check───227
                             ├─check-parallel───nsexec───check───208───aio-dio-invalid───2*[aio-dio-invalid]
                             ├─check-parallel───nsexec───check───746───cp
                             ├─check-parallel───nsexec───check───187───187
                             ├─check-parallel───nsexec───check───027───8*[027]
                             ├─check-parallel───nsexec───check───045───xfs_io───{xfs_io}
                             ├─check-parallel───nsexec───check───044
                             ├─check-parallel───nsexec───check───204
                             ├─check-parallel───nsexec───check───186───186
                             ├─check-parallel───nsexec───check───449
                             ├─check-parallel───nsexec───check───231───su───fsx
                             ├─check-parallel───nsexec───check───509
                             ├─check-parallel───nsexec───check───127───5*[127───fsx]
                             ├─check-parallel───nsexec───check───047
                             ├─check-parallel───nsexec───check───043
                             ├─check-parallel───nsexec───check───475───pkill
                             ├─check-parallel───nsexec───check───299─┬─fio─┬─4*[fio]
                             │                                       │     ├─2*[fio───4*[{fio}]]
                             │                                       │     └─{fio}
                             │                                       └─pgrep
                             ├─check-parallel───nsexec───check───551───aio-dio-write-v
                             ├─check-parallel───nsexec───check───323───aio-last-ref-he───100*[{aio-last-ref-he}]
                             ├─check-parallel───nsexec───check───648───sleep
                             ├─check-parallel───nsexec───check───046
                             ├─check-parallel───nsexec───check───753─┬─753.fsstress───4*[753.fsstress]
                             │                                       └─pkill
                             ├─check-parallel───nsexec───check───507───507
                             ├─check-parallel───nsexec───check───629─┬─3*[629───xfs_io───{xfs_io}]
                             │                                       └─5*[629]
                             ├─check-parallel───nsexec───check───073───umount
                             ├─check-parallel───nsexec───check───615───615
                             ├─check-parallel───nsexec───check───176───punch-alternati
                             ├─check-parallel───nsexec───check───294
                             ├─check-parallel───nsexec───check───236───236
                             ├─check-parallel───nsexec───check───165─┬─165─┬─165─┬─cut
                             │                                       │     │     └─xfs_io───{xfs_io}
                             │                                       │     └─165───grep
                             │                                       └─165
                             ├─check-parallel───nsexec───check───259───sync
                             ├─check-parallel───nsexec───check───442───442.fsstress───4*[442.fsstress───{442.fsstress}]
                             ├─check-parallel───nsexec───check───558───255*[558]
                             ├─check-parallel───nsexec───check───358───358───358
                             ├─check-parallel───nsexec───check───169───169
                             └─check-parallel───nsexec───check───297─┬─297.fsstress─┬─284*[297.fsstress───{297.fsstress}]
                                                                     │              └─716*[297.fsstress]
                                                                     └─sleep

A typical test run looks like:

$ time sudo ./check-parallel /mnt/xfs -s xfs -x dump
Runner 63 Failures:  xfs/170
Runner 36 Failures:  xfs/050
Runner 30 Failures:  xfs/273
Runner 29 Failures:  generic/135
Runner 25 Failures:  generic/603
Tests run: 1140
Failure count: 5

Ten slowest tests - runtime in seconds:
xfs/013 454
generic/707 414
generic/017 398
generic/387 395
generic/748 390
xfs/140 351
generic/562 351
generic/705 347
generic/251 344
xfs/016 343

Cleanup on Aisle 5?

total 0
crw-------. 1 root root 10, 236 Nov 27 09:27 control
lrwxrwxrwx. 1 root root       7 Nov 27 09:27 fast -> ../dm-0
/dev/mapper/fast  1.4T  192G  1.2T  14% /mnt/xfs

real    9m29.056s
user    0m0.005s
sys     0m0.022s
$

Yeah, that runtime is real - under 10 minutes for a full XFS auto
group test run. When running this normally (i.e. via check) on this
machine, it usually takes just under 4 hours to run the same set
of tests. i.e. I can run ./check-parallel roughly 25x times on this
machine in the same time it takes to run ./check.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Zorro lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

fstests: quota grace periods unreliable under load

Starting the quota grace period doesn't necessary happen predictably
when the system is under heavy load. This results in random test
failures where grace period timeouts are expected.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Zorro lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

generic/062: don't leave debug files in $here on failure

Move them to $seqres.<debug-file> so they don't pollute the source
tree that fstests is running from.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Zorro lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

fstests: always use fail-at-unmount semantics for XFS

Rather than require every test that tests unmount in failure
conditions have to set up fail-at-unmount semantics for the
underlying filesystem, use these semantics for all test and scratch
device mounts.

This currently only affects XFS filesystems, and helps prevent
unexpected unmount hangs in EIO tests because metadata writes are
configured to try forever by default.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Zorro lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

fstests: capture some failures to seqres.full

Whilst trying to debug test failures, I found a few places where
errors needed to be redirected to $seqres.full rather than
/dev/null. This is a collection of all the conversions that haven't
been captured by some other bug fix patch.

Note that calling _check_filesystems() after removing the
require_test/scratch files means it is a no-op, so I removed that
call at the same time as capturing unmount failures after the test
has failed.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Zorro lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

xfs/076: fix broken mkfs filtering

The test does not do what it says on the packet - the mkfs output is
not actually passed to the mkfs filter, so it doesn't know what
inode size mkfs actually used. Hence CHUNK_SIZE ends up being
calculated as 0, and that means it enters an endless loop because
offset never decreases.

Fix it by adding the missing line continuation.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Zorro lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

xfs/629: single extent files should be within tolerance

The test passes if we have between 2 and 40 extents (despite what
the comment says!), with the target being 20. There is absolutely no
reason for considering a single extent file a failure - that
indicates the filesystem completely defeated the fragmentation
behaviour the test was trying to cause. Hence expand the range of
"test pass" tolerance to 1-41 extents.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Zorro lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

vfstests: some tests require the testdir to be shared

This ensures that these tests will run successfully when the
parallel check infrastructure makes all the scratch and test
mounts private.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Zorro lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

fstests: clean up termination of various tests

Accumulated minor fixes to improve reliablity of the termination
of various tests when interrupted.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Zorro lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

fstests: clean up a couple of dm-flakey tests

Just little things I've found that should be cleaned up.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Zorro lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

fstests: don't use directory stacks

Using bash directory stacking (pushd, popd, etc) seems to be
somewhat unreliable. I've been seeing occasional random failures
from both pushd and popd commands that cause the test to fail, and
there does not appear to be any reason for the failures occurring.

Rather than wasting time chasing ghosts, just get rid of the
directory stacking altogether.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Zorro lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

generic/085: general cleanup for reliability and debugging

This test was quite unreliable during development of the parallel
check runner. It redirects all errors to /dev/null, so there was no
way to debug it when it failed.

Use common mount/unmount helpers, redirect errors to $seqres.full,
make sure the cleanup code is always run at test exit and only
attempt to kill processes if they are still running during cleanup.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Zorro lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

filters: add a filter that accepts EIO instead of other errors

Running a dm-flakey or dm-error test that loads a table that returns
EIO to all IO and then running a command that is expected to fail
with a specific error is racy.

If there is memory pressure at the same time that the table is
loaded, cached inodes can be turfed from memory and the command then
needs to read the inode it is about to act on from disk again. This
results in the inode read getting EIO and failing (e.g. xfs_io will
return a stat() error) rather than having the desired operation
fail.

This results in spurious test failures that look like this:

generic/331       - output mismatch (see /mnt/xfs/runner-41/results-2024-11-20-10:57:31/xfs/generic/331.out.bad)
    --- tests/generic/331.out   2022-12-21 15:53:25.487044098 +1100
    +++ /mnt/xfs/runner-41/results-2024-11-20-10:57:31/xfs/generic/331.out.bad  2024-11-20 11:02:12.123572607 +1100
    @@ -5,7 +5,8 @@
     1886e67cf8783e89ce6ddc5bb09a3944  SCRATCH_MNT/test-331/file1
     1886e67cf8783e89ce6ddc5bb09a3944  SCRATCH_MNT/test-331/file2
     CoW and unmount
    -fdatasync: Input/output error
    +/mnt/xfs/runner-41/scratch/test-331/file2: Input/output error
    +stat: Input/output error
     Compare files
    ...

Add a new "flakey EIO filter" that will catch -any- EIO error from
the command and change it to the error we expected to see.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Zorro lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

filter: handle mount errors from CONFIG_BLK_DEV_WRITE_MOUNTED=y

Kernels post 6.x may have CONFIG_BLK_DEV_WRITE_MOUNTED=y which
prevents mount from opening the block device on a mounted
filesystem. This results in an error such as:

mount: <dev>: Can't open blockdev

which is not the error that callers of _filter_error_mount() are
looking for. It is, however, a direct result of the test trying
to mount an alreayd mounted filesystem, so it is reflecting the same
error case. Hence this mismatch in errors should not fail the test.

Catch this mount error and convert it to the expected
"already mounted" error for the tests that exercise this behaviour.

There is also a minor test change here to push mount failure
information to $seqres.full in the cases where mount errors occur.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Zorro lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

generic/310: cleanup killing background processes

Use the trick we used with fsstress of copying the binary to a test
specific name so that we can simply use pkill to reliably kill the
background processes this test runs. Also use SIGPIPE to avoid
bash from throwing out "Killed" errors.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Zorro lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

fstests: scale some tests for high CPU count sanity

Several tests use lots of processes to stress the filesystem. many
of them haven't really considered what this means for running the
test on high CPU machines (e.g. >32p) and the potential contention
and performance issues this might trigger.

Some of these tests simply need to increase the size of the journal.
Some need to run on filesystems with high inherent concurrency (e.g.
larger AG count). Some need more efficient/faster file creation. And
so on.

This commit is a collection of those sorts of changes to improve
runtimes on high CPU count machines.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Zorro lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

fstests: stop using /tmp directly

Tests should be using $tmp, not /tmp. this causes problems when
multiple tests all use /tmp/foo as a temporary test state file
and then step on each other.

Note that there are some tests that use /tmp to store "test stop"
files for background processes. Those that have proven to be
unreliable at stopping tests when interrupted by ctrl-c are also
updated to track and kill background processes in the cleanup
function.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Zorro lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

dmesg: reduce noise from other tests

dmesg records everything from every test concurrently running, so
noise from other tests can cause multiple other tests to fail
because they detect something from another test. Update the filter
behaviour to minimise this crosstalk problem.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Zorro lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

quota: system project quota files need to be shared

Tests that treat them as exclusively owned end up tripping over
other tests that do the same. Fix this by using append and filter
techniques to update the files, then using different project quota
ids for each test.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Zorro lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

generic/127: reduce runtime

...
generic/127 684
...

This takes a long time to run because it runs 6 individual
invocations of fsx sequentially. Make them run concurrently
as they can operate on separate files.

...
generic/127 168
...

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Zorro lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

fstests: remove uses of killall where possible

there are many unnecessary uses of killall and stale checks for it's
existence. Parallel check execution means killall is considered
harmful, so get rid of these unneccesary uses.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Zorro lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

xfs/177: remove unused slab object count location checks

Stale code; we count XFS inodes through the sysfs stats code now
so remove it.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Zorro lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

xfs/176: fix broken setup code

The test does not pass the mkfs output through the mkfs filter, so
the inode size is not set up correctly. Hence it calculates the
CHUNK_SIZE as 0, and it ends up getting stuck in an endless loop
throwing ENOSPC errors because the offset never changes.

While there, use 'echo -n' rather than 'touch' to create zero length
files much faster.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Zorro lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

xfs/442: rescale load so it's not exponential

....
xfs/442 491
....

xfs/442 takes a long time to run because it is scaling the load
by the number of processes it is going to run on twice. It scales
the number of operations by the number of processes it is going to
run, meaning that doubling the number of processes quadruples the
runtime.

Reduce it to scale linearly by fixing the number of ops it runs per
process.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Zorro lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

fstests: use udevadm wait in preference to settle

When running lots of tests in parallel, there are lots of
filesystems and block devices changing state. This generates a lot
of udev events when means the udev event queue is rarely empty.
Unfortunately, an empty event queue is what udev settling waits
upon. Hence calling UDEV_SETTLE_PROG can mean waiting for a lot of
time for other tests to stop generating udev events.

For the majority of cases, what we care about is that udev has
performed device node addition or removal, not that there are no
udev events pending. Recent(-ish) systemd releases support 'udevadm
wait' to wait for a specific file to be created or unlinked rather
than waiting for the event that does that work to be completed.

Hence we don't have to wait for the udev event queue to empty,
just for the udev event that does the device node manipulation to
complete.

Introduce detection of 'udevadm wait' support and a _udev_wait()
wrapper function to use it if it is available. If it isn't, the use
the existing UDEV_SETTLE_PROG behaviour.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Zorro lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

fstests: mark tests that are unreliable when run in parallel

Add a group named "unreliable_in_parallel" to mark tests that
do not give reliable results when multiple tests are run in
parallel. Generally this happens with tests that are reliant on
caching in some way, such as generating specific file layouts using
buffered IO or expecting inodes to be cached in memory. These are
perturbed by other tests running sync(), generating memory pressure,
dropping caches, etc.

Hence whether these tests pass or fail is wholly dependent on what
tests are running at the same time, and hence randomly fail when
nothing has actually gone wrong. Hence they are unreliable as
regression tests when running tests in parallel, so we add them to
the "unreliable_in_parallel" group and a parallel check can exclude
this group.

As tests are updated to be robust against external interference,
they can be removed from the unreliable_in_parallel group.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Zorro lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

fstests: xfs/227 is really slow

The slowest test tto run on my test VMs is xfs/227:

...
xfs/227 826
...

It is doing nested iteration on created filesets that are explicitly
defined, so separate the inner loop filesets and run the outer loops
in parallel.

Also reduce the number of times we have to execute setfattr and
xfs_io to once per created file instead of once per xattr/extent
count per file.

The result is test runtime reduction of ~60%.

....
xfs/227 336
....

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Zorro lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

fstests: clean up loop device instantiation

Lots of tests do there own special thing with loop devices rather
than using _create_loop_device() and _destroy_loop_device(). This
oftens means they do not clean up after themselves properly,
leaving stale loop devices around that result in unmountable test or
scratch devices. This is common when tests are killed by user
interrupt.

Even the tests that do use _destroy_loop_device and try to clean up
often do it incorrectly, leading to spurious error messages.

Some tests try to use dynamic instantiation via "mount -o loop",
but then don't clean up in the correct order or hack around to find
the loop device that was instantiated because the test needs to know
the instantiated device name

Clean this up by converting all the tests to use
_create_loop_device() and _destroy_loop_device(). In all the tests,
use the variable "loop_dev" for the device consistently. In
_destroy_loop_device(), test that a device name has been passed
so that we don't try to clean up the same device twice (e.g. once
before test exit and again from the _cleanup() function). When we
destroy a loop device, unset the variable used to hold the loop
device name so that we don't try to destroy it twice.

This results in much more reliable cleanup and clean exit from
fstests when killed by the user.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Zorro lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

fstests: clean up mount and unmount operations

The way tests run unmount is, at times, completely random.
Sometimes they call the correct _scratch_unmount function, sometimes
they open code it with a direct call to UMOUNT_PROG <dir>, sometimes
they run umount directly.

This makes it really hard to instrument unmount operations when
trying to work out why transient, unpredictable failures like
this occur randomly during a test run:

umount: /mnt/xfs/runner-17/test: target is busy.

Sometimes it happens on a test device mount, sometimes a scratch
device mount. Sometimes it happens to a test specific dm or loop
device mount. But without instrumenting every single unmount call in
every test, it's impossible to capture these failures easily.

Solve this problem by introducing the _unmount() wrapper. It is
simply a call to UMOUNT_PROG <dir>, but it provides a single point
were -every- unmount operation funnels through.

We already have a _mount wrapper for this reason. However, in trying
to work out why mounts were failing (because unmounts were failing),
I discovered that that_mount() is used inconsistently as well.

Sort this all out by adding and _unmount() wrapper to go with
_mount() and use them everywhere consistently.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Zorro lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

fstests: use syncfs rather than sync

sync(1) is a system wide sync and is implemented by iterating all
the superblocks in the system. In most cases, fstests require just
the filesystem under test to be synced - we require syncfs(2)
semantics but what we use is sync(2) semantics.

The result of this is that when running many concurrent fstests at
the same time, we can have *hundreds* of concurrent sync operations
in progress (thanks fsstress!) and this causes excessive
interference with other tests that are running on other filesystems.

For example, some tests try to specifically control extent layout
via specific write and fsync patterns. All these global syncs
perturb them and cause them to spuriously fail.

A random snapshot of running concurrent tests shows just how many
tests are explicitly blocked in sync(1):

check-parallel───check───077───077─┬─cut
      │                                    ├─du
      │                                    └─tail
      ├─check-parallel───check───311───xfs_scrub───{xfs_scrub}
      ├─check-parallel───check───531───128*[t_open_tmpfiles]
      ├─check-parallel───check───227
      ├─check-parallel───check───388
      ├─check-parallel───check───070───fsstress───fsstress───{fsstress+
      ├─check-parallel───check───232───fsstress───7*[fsstress───{fsstr+
      ├─check-parallel───check───648───sleep
      ├─check-parallel───check───409───sync
      ├─check-parallel───check───683───sync
      ├─check-parallel───check───013─┬─013───sleep
      │                              └─fsstress───2*[fsstress───{fsstr+
      ├─check-parallel───check───684───sync
      ├─check-parallel───check───673───sync
      ├─check-parallel───check───118───dd
      ├─check-parallel───check───467───open_by_handle
      ├─check-parallel───check─┬─622
      │                        └─check
      ├─check-parallel───check───685───sync
      ├─check-parallel───check───049───fsstress───fsstress───{fsstress+
      ├─check-parallel───check───599
      ├─check-parallel───check───426───open_by_handle
      ├─check-parallel───check───057───umount
      ├─check-parallel───check───390───fsstress─┬─18*[fsstress───{fsst+
      │                                         └─fsstress
      ├─check-parallel───check───158───fsstress───fsstress───{fsstress+
      ├─check-parallel───check───017
      ├─check-parallel───check───032───fsstress───fsstress
      ├─check-parallel───check───076
      ├─check-parallel───check───477───open_by_handle
      ├─check-parallel───check───170───2*[170───170]
      ├─check-parallel───check───112
      ├─check-parallel───check───686───sync
      ├─4*[check-parallel───check───check───xfs_scrub───{xfs_scrub}]
      ├─check-parallel───check───387───xfs_io───{xfs_io}
      ├─check-parallel───check───615───615
      ├─check-parallel───check─┬─051
      │                        └─check───xfs_repair
      ├─check-parallel───check───049
      ├─check-parallel───check───247
      ├─check-parallel───check───674───sync
      ├─check-parallel───check───040
      ├─check-parallel───check───560───fsstress───fsstress───{fsstress+
      ├─check-parallel───check───030─┬─030─┬─030───xfs_repair
      │                              │     └─030───perl
      │                              ├─sed
      │                              └─uniq
      ├─check-parallel───check───055───055
      ├─2*[check-parallel───check───check]
      ├─check-parallel───check───042
      ├─check-parallel───check───204
      ├─check-parallel───check───271─┬─271───sed
      │                              └─md5sum
      ├─check-parallel───check───091─┬─fsx
      │                              └─tee
      ├─check-parallel───check───063───sleep
      ├─check-parallel───check───026
      ├─check-parallel───check───459───lvm
      ├─check-parallel───check───495
      ├─check-parallel───check───141───fsstress───4*[fsstress]
      ├─check-parallel───check───011─┬─fsstress─┬─fsstress───{fsstress+
      │                              │          └─fsstress
      │                              └─sleep
      ├─check-parallel───check───328───sync
      ├─check-parallel───check───507───507
      ├─check-parallel───check
      ├─check-parallel───check───687───sync
      ├─check-parallel───check───109───mkfs.xfs
      ├─check-parallel───check───324
      ├─check-parallel───check───114───aio-dio-eof-rac
      └─check-parallel───check───503───xfs_scrub───2*[{xfs_scrub}]

There are ~10 sync(1) calls blocked and at least half of the 50-odd
fsstress processes currently running are also going to be stuck in
sync(2) calls.

They are stuck because the superblock iteration has to wait for
mount, unmount, freeze, thaw and any other operation that locks a
superblock exclusively. When running dozens of tests concurrently,
there can be tens of superblocks that are locked exclusively for
every second for significant lengths of time.

Hence the use of sync has impact on both performance and test
behaviour and we need to minimise the amount of sync(1) and
sync(2) usage as much as possible.

Introduce _test_sync() and _scratch_sync() so we can implement
a syncfs mechanism with a fallback to sync(1) if it is not supported
without dirtying all the test code unnecessarily. Then convert
fsstress to use syncfs(2) in preference to sync(2).

This commit changes all the generic and XFS tests to use the new
sync functions, other filesystem specific tests will eventually
need to be converted to avoid similar problems.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Zorro lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

fstests: fix DM device creation/removal vs udev races

When there is load on the system, newly created DM devices don't
seem to be created consistently. When a new device is created,
it is supposed to be created as /dev/dm-X, and then a udev rule
creates the symlink from /dev/mapper/<dev name> to /dev/dm-X.

Unfortunately, a lot of the tests that use dynamically created dm
devices (dmerror, dmflakey) are not being created with this device
node structure. This is resulting in getting the wrong short device
name for the block device and hence we can't find the filesystem
sysfs attribute directory for the filesystem on that block device.

For example, with added debug to check what device name was being
passed around and resolved:

eneric/489       - output mismatch (see /mnt/xfs/runner-10/results/xfs/generic/489.out.bad)
    --- tests/generic/489.out   2022-12-21 15:53:25.503043574 +1100
    +++ /mnt/xfs/runner-10/results/xfs/generic/489.out.bad      2024-10-24 10:27:29.767196340 +1100
    @@ -1,4 +1,10 @@
     QA output created by 489
    +./common/rc: line 4955: /sys/fs/xfs/flakey-test.489/error/fail_at_unmount: No such file or directory
    +dev: /dev/mapper/flakey-test.489
    +resolved dev: /dev/mapper/flakey-test.489
    +brw-rw----. 1 root disk 251, 5 Oct 24 10:27 /dev/mapper/flakey-test.489
    +./common/rc: line 4955: /sys/fs/xfs/flakey-test.489/error/metadata/EIO/max_retries: No such file or directory
    +./common/rc: line 4955: /sys/fs/xfs/flakey-test.489/error/metadata/EIO/retry_timeout_seconds: No such file or directory
    ...
    (Run 'diff -u /home/dave/src/xfstests-dev/tests/generic/489.out /mnt/xfs/runner-10/results/xfs/generic/489.out.bad'  to see the entire diff)

Here we see that the block device node is actually at
/dev/mapper/flakey-test.489, not a link to a /dev/dm-X device node.

This implies that the udev rule to create the /dev/dm-X node and
the symlink to it at /dev/mapper/flakey-test.489 has not run, and
something else created the device node.

That looks like a bug in _dmsetup_create(). It creates the new DM
device, then runs 'dmsetup mknodes', then waits for udev to settle.
This means the mknodes command - which makes sure the dm device
nodes exist - is racing with udev to create the device nodes. They
don't use the same rules to create nodes, so we end up with this
broken situation.

'dmsetup mknodes' is considered legacy functionality, intended for
systems that have no udev capability. For systems that have udev
enabled (i.e. all modern distros), mknodes should not be run because
it creates a different device node structure to what udev creates
and can race with udev as we see here.

Fix it by removing the 'dmsetup mknodes' as it is unnecessary to
create the correct device node layout the rest of the system is
expecting to see.

Additionally,_dmsetup_remove() calls 'dmsetup mknodes' and that can
also race with udev and cause issues. Hence we need to remove that
call from the remove operation as well.

Further, 'dmsetup remove' is also subject to races with udev which
results in device remove failing.  This problem is documented in the
dmsetup man page and suggests the use of the "--retry" option. This
means dmsetup will retry several times over a few seconds before
failing the removal.

This reduces the remove failure rate substantially,
but it can still occasionally fail when the system is under heavy
load and udev processing is very slow. This is fixable, but requires
fstests udev infrastructure changes as it requires udevadm
functionality that is relatively new. Hence that will be done as
a separate fix.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Zorro lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

fstests: per-test dmdelay instances

We can't run two tests that use dmdelay at the same time because
the device name is the same. hence they interfere with each other.
Give dmdelay devices their own per-test names to avoid this
problem.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Zorro lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

fstests: per-test dmdust instances

We can't run two tests that use dmdust at the same time because
the device name is the same. hence they interfere with each other.
Give dmdust devices their own per-test names to avoid this
problem.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Zorro lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

fstests: per-test dmthin instances

We can't run two tests that use dmthin at the same time because
the device name is the same. hence they interfere with each other.
Give dmthin devices their own per-test names to avoid this
problem.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Zorro lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

fstests: per-test dmhuge instances

We can't run two tests that use dmhuge at the same time because
the device name is the same. hence they interfere with each other.
Give dmhuge devices their own per-test names to avoid this
problem.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Zorro lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

fstests: per-test dmerror instances

We can't run two tests that use dmerror at the same time because
the device name is the same. hence they interfere with each other.
Give dmerror devices their own per-test names to avoid this
problem.

Note that we need a hack to pass the test sequence number through
to src/dmerror as used by generic/441 so that it can construct the
dmerror name correctly.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Zorro lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>