www.infradead.org Git - users/hch/xfstests-dev.git/log

fstests/btrfs: use _random_file() helper

Use _random_file() helper to choose a random file in a directory.

Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

common/rc: introduce _random_file() helper

Currently, we use "ls ... | sort -R | head -n1" (or tail) to choose a
random file in a directory.It sorts the files with "ls", sort it randomly
and pick the first line, which wastes the "ls" sort.

Also, using "sort -R | head -n1" is inefficient. For example, in a
directory with 1000000 files, it takes more than 15 seconds to pick a file.

  $ time bash -c "ls -U | sort -R | head -n 1 >/dev/null"
  bash -c "ls -U | sort -R | head -n 1 >/dev/null"  15.38s user 0.14s system 99% cpu 15.536 total

  $ time bash -c "ls -U | shuf -n 1 >/dev/null"
  bash -c "ls -U | shuf -n 1 >/dev/null"  0.30s user 0.12s system 138% cpu 0.306 total

So, we should just use "ls -U" and "shuf -n 1" to choose a random file.
Introduce _random_file() helper to do it properly.

Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

fstests: Verify dir permissions when creating a stub subvolume

btrfs supports creating nesting subvolumes however snapshots are not
recurive. When a snapshot is taken of a volume which contains a subvolume
the subvolume is replaced with a stub subvolume which has the same name and
uses inode number 2. This test validates that the stub volume copies
permissions of the original volume.

Signed-off-by: Lee Trager <lee@trager.us>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

btrfs/220: do not run async discard test on zoned device

The mount option "discard=async" is not meant to be used on the zoned mode.
Skip it from the test.

Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

common/rc: drop 'fsck -f' parameter from _repair_test_fs

The '-f' parameter is fsck.ext# specific, where it's documented to:
Force checking even if filesystem is marked clean

_repair_test_fs() is only called on _check_test_fs() failure, so
dropping the parameter should be possible without changing ext#
behaviour.
Doing so fixes _repair_test_fs() on exfat, where fsck.exfat doesn't
support '-f'.

Signed-off-by: David Disseldorp <ddiss@suse.de>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

generic/{175,297,298}: fix use of uninitialized var

The truncate command in those tests use an uninitialized variable i.
in kdevops, i must contain some leftover, so we get errors like:
/data/fstests-install/xfstests/tests/generic/298: line 45: /dev/loop12):
syntax error: operand expected (error token is "/dev/loop12)")

Apparently, noone including the author of the tests knows why this
truncate command is in the test, so remove the wrong truncate command.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

check: fix parsing expunge file with comments

commit 60054d51 ("check: fix excluded tests are only expunged in the
first iteration") change to use exclude_tests array instead of file.

The check if a test is in expunge file was using grep -q $TEST_ID FILE
so it was checking if the test was a non-exact match to one of the
lines, for a common example: "generic/001 # exclude this test" would be
a match to test generic/001.

The commit regressed this example, because the new code checks for exact
match of [ "generic/001" == "generic/001 " ]. Change the code to match a
regular expression to deal with this case and any other suffix correctly.

NOTE that the original code would have matched test generic/100 with lines
like "generic/1000" when we get to 4 digit seqnum, so the regular
expression does an exact match to the first word of the line.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

fsx: tidy options usage and format

1. Add missing options and wrap the cli example line.
2. Cleanup and also add missing "-K" operation for options description
part.

Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

t_ofd_locks: fix sem initialization sequence

The locker was waiting for sem_otime on sem0 to became non-zero after
incrementing sem0 himself. So sem_otime was never 0 at the time of
checking it, so the check was redundant/wrong.

This patch:
- moves the increment of sem1 to the lock-tester site
- lock-setter waits for that sem1 event, for which this patch replaces
the wait loop on sem_otime with GETVAL loop, adding a small sleep
- increment of sem0 to 2 moved past that sem1 event. That sem0 event
is currently not used/waited.

This guarantees that the lock-setter is working only after lock-getter
is fully initialized.

CC: fstests@vger.kernel.org
CC: Murphy Zhou <xzhou@redhat.com>
CC: Jeff Layton <jlayton@kernel.org>
CC: Zorro Lang <zlang@redhat.com>
Signed-off-by: Stas Sergeev <stsp2@yandex.ru>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Zorro Lang <zlang@kernel.org>

t_ofd_locks: fix stalled semaphore handling

Currently IPC_RMID was attempted on a semid returned after failed
semget() with flags=IPC_CREAT|IPC_EXCL. So nothing was actually removed.

This patch introduces the much more reliable scheme where the wrapper
script creates and removes semaphores, passing a sem key to the test
binary via new -K option.

This patch speeds up the test ~5 times by removing the sem-awaiting
loop in a lock-getter process. As the semaphore is now created before
the test process started, there is no need to wait for anything.

CC: fstests@vger.kernel.org
CC: Murphy Zhou <xzhou@redhat.com>
CC: Jeff Layton <jlayton@kernel.org>
CC: Zorro Lang <zlang@redhat.com>
Signed-off-by: Stas Sergeev <stsp2@yandex.ru>
Reviwed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Zorro Lang <zlang@kernel.org>

btrfs/213: fix failure due to misspelled function name

The test is calling _not_run but it should be _notrun, so fix that.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

xfs: skip fragmentation tests when alwayscow mode is enabled, part 2

If the always_cow debugging flag is enabled, all file writes turn into
copy writes. This dramatically ramps up fragmentation in the filesystem
(intentionally!) so there's no point in complaining about fragmentation.

I missed these two in the original commit because readahead for md5sum
would create large folios at the start of the file. This resulted in
the fdatatasync after the random writes issuing writeback for the whole
large folio, which reduced file fragmentation to the point where this
test started passing.

With Ritesh's patchset implementing sub-folio dirty tracking, this test
goes back to failing due to high fragmentation (as it did before large
folios) so we need to mask these off too.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

generic/642: fix SOAK_DURATION usage in generic/642

Misspelled variable name. Yay bash.

Fixes: 3e85dd4fe4 ("misc: add duration for long soak tests")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

fstests: add helper to canonicalize devices used to enable persistent disks

The filesystem configuration file does not allow you to use symlinks to
real devices given the existing sanity checks verify that the target end
device matches the source. Device mapper links work but not symlinks for
real drives do not.

Using a symlink is desirable if you want to enable persistent tests
across reboots. For example you may want to use /dev/disk/by-id/nvme-eui.*
so to ensure that the same drives are used even after reboot. This
is very useful if you are testing for example with a virtualized
environment and are using PCIe passthrough with other qemu NVMe drives
with one or many NVMe drives.

To enable support just add a helper to canonicalize devices prior to
running the tests.

This allows one test runner, kdevops, which I just extended with
support to use real NVMe drives it has support now to use nvme EUI
symlinks and fallbacks to nvme model + serial symlinks as not all
NVMe drives support EUIs. The drives it uses for the filesystem
configuration optionally is with NVMe eui symlinks so to allow
the same drives to be used over reboots.

For instance this works today with real nvme drives:

mkfs.xfs -f /dev/nvme0n1
mount /dev/nvme0n1 /mnt
TEST_DIR=/mnt TEST_DEV=/dev/nvme0n1 FSTYP=xfs ./check generic/110

FSTYP         -- xfs (debug)
PLATFORM      -- Linux/x86_64 flax-mtr01 6.5.0-rc3-djwx #rc3 SMP PREEMPT_DYNAMIC Wed Jul 26 14:26:48 PDT 2023

generic/110        2s
Ran: generic/110
Passed all 1 tests

But this does not:

TEST_DIR=/mnt TEST_DEV=/dev/disk/by-id/nvme-eui.0035385411904c1e FSTYP=xfs ./check generic/110
mount: /mnt: /dev/disk/by-id/nvme-eui.0035385411904c1e already mounted on /mnt.
common/rc: retrying test device mount with external set
mount: /mnt: /dev/disk/by-id/nvme-eui.0035385411904c1e already mounted on /mnt.
common/rc: could not mount /dev/disk/by-id/nvme-eui.0035385411904c1e on /mnt

umount /mnt
TEST_DIR=/mnt TEST_DEV=/dev/disk/by-id/nvme-eui.0035385411904c1e FSTYP=xfs ./check generic/110
TEST_DEV=/dev/disk/by-id/nvme-eui.0035385411904c1e is mounted but not on TEST_DIR=/mnt - aborting
Already mounted result:
/dev/disk/by-id/nvme-eui.0035385411904c1e /mnt

This fixes this. This allows the same real drives for a test to be
used over and over after reboots.

Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

check: generate gcov code coverage reports at the end of each section

Support collecting kernel code coverage information as reported in
debugfs. At the start of each section, we reset the gcov counters;
during the section wrapup, we'll collect the kernel gcov data.

If lcov is installed and the kernel source code is available, it will
also generate a nice html report. If a CLI web browser is available, it
will also format the html report into text for easy grepping.

This requires the test runner to set REPORT_GCOV=1 explicitly and gcov
to be enabled in the kernel.

Cc: tytso@mit.edu
Cc: kent.overstreet@linux.dev
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

btrfs/276: make test accurate regarding number of expected extents

btrfs/276 creates a 16G file with compression enabled in order to quickly
and efficiently create a file with many extents and have a fs tree with a
height of 3 (root node at level 2), so that it can test that fiemap is
correctly reporting extent sharedness when we have shared subtrees of the
fs tree due to a snapshot.

Compression results in extents with a maximum size of 128K and the test
is expecting only extents of 128K, which normally happens if the machine
has a large amount of RAM and writeback is not triggered before the xfs_io
command finishes. However if writeback is triggered in the meanwhile, due
to memory pressure for example, then we can get end up with some extents
that are smaller than 128K, therefore increasing the total number of
extents in the test file and make the test fail.

This seems to happen often on test machines with small amounts of RAM,
such as 4G, as reported by Qu in the following thread:

https://lore.kernel.org/linux-btrfs/20230801065529.50122-1-wqu@suse.com/

So to address this create a file with holes and direct IO to make sure we
always get a specific number of extents in the test file. To speedup the
test create 2000 64K extents, with holes in between them, so that it works
on a fs with any sector size, and then create a bunch of files with large
xattrs to quickly bump the fs tree height to 3 for any node size (4K to
64K). This also guarantees that the file extent items are spread over
multiples leaves, in order to exercise fiemap's correctness when reporting
shared extents due to shared subtrees.

Reported-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Tested-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

fstests: add smoketest group

Darrick suggests that fstests can provide a simple smoketest, by
running several generic filesystem smoke testing for five minutes
apiece (SOAK_DURATION="5m"). Since there are only five smoke tests,
this is effectively a 20min super-quick test.

With gcov enabled, running these tests yields about ~75% coverage for
iomap and ~60% for xfs; or ~50% for ext4 and ~75% for ext4; and ~45%
for btrfs. Coverage was about ~65% for the pagecache.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Zorro Lang <zlang@kernel.org>

xfs/122: adjust test for flexarray conversions in 6.5

Adjust the output of this test to handle the conversion of flexarray
declaration conversions in linux v6.5, commit a49bbce58ea9 ("xfs:
convert flex-array declarations in xfs attr leaf blocks")

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

generic: add a test for device removal without dirty data

Test the removal of the underlying device when the file system still
does not have dirty data.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

generic: add a test for device removal with dirty data

Test the removal of the underlying device when the file system still
has dirty data.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

btrfs: add a test case to make sure scrub can repair parity corruption

There is a kernel regression caused by commit 75b470332965 ("btrfs:
raid56: migrate recovery and scrub recovery path to use error_bitmap"),
which leads to scrub not repairing corrupted parity stripes.

So here we add a test case to verify the P/Q stripe scrub behavior by:

- Create a RAID5 or RAID6 btrfs with minimal amount of devices
  This means 2 devices for RAID5, and 3 devices for RAID6.
  This would result the parity stripe to be a mirror of the only data
  stripe.

  And since we have control of the content of data stripes, the content
  of the P stripe is also fixed.

- Create an 64K file
  The file would cover one data stripe.

- Corrupt the P stripe

- Scrub the fs
  If scrub is working, the P stripe would be repaired.

  Unfortunately scrub can not report any P/Q corruption, limited by its
  reporting structure.
  So we can not use the return value of scrub to determine if we
  repaired anything.

- Verify the content of the P stripe

- Use "btrfs check --check-data-csum" to double check

By above steps, we can verify if the P stripe is properly fixed.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

btrfs/294: reject zoned devices for now

The test case itself is utilizing RAID5/6, which is not yet supported on
zoned device.

In the future we would use raid-stripe-tree (RST) feature, but for now
just reject zoned devices completely.

And since we're here, also update the _fixed_by_kernel_commit lines, as
the proper fix is already merged upstream.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Naohiro Aota <naohiro.aota@wdc.com>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

fstests: install soak_duration.awk

Commit 3e85dd4fe423 ("misc: add duration for long soak tests") added a
helper executable, soak_duration.awk, is which used by the check
script if SOAK_DURATION is set. This script translates a
"human-friendly" time duration specifier, such as 4m or 2d into an
integer number of seconds. We need to make sure that this script is
installed or the check script will bomb out if SOAK_DURATION is set
(and if the fstests installation doesn't include a full set of fstests
source, but just those files installed by "make install").

Fixes: 3e85dd4fe423 ("misc: add duration for long soak tests")
Cc: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

btrfs: add a test case to verify that per-fs features directory gets updated

Although btrfs has a per-fs feature directory, it's not properly
refreshed after new features are enabled.

We had some attempts to do that properly, like commit 14e46e04958d
("btrfs: synchronize incompat feature bits with sysfs files").
But unfortunately that commit get later reverted as some call sites is
not safe to update sysfs files.

Now we have a new commit b7625f461da6 ("btrfs: sysfs: update fs features
directory asynchronously") to properly refresh that per-fs features
directory.

So it's time to add a test case for it. The test case itself is pretty
straightforward:

- Make a very basic 3 disks btrfs
  Only using the very basic profiles (DUP/SINGLE) so that even older
  mkfs.btrfs can support.

- Make sure per-fs features directory doesn't contain "raid1c34" file

- Balance the metadata to RAID1C3 profile

- Verify the per-fs features directory contains "raid1c34" feature file

Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
[ Update commit log. Remove commented code. Add _fixed_by_kernel_commit.
  Check mkfs status. Add sync. ]
Signed-off-by: Anand Jain <anand.jain@oracle.com>

btrfs: add a test case to check btrfs won't crash on certain corruption

The test case would reproduce the situation by creating an empty fs,
with SINGLE metadata profile, then corrupt the tree root manually.
Finally try mounting the corrupted fs, the mount should fail while our
kernel should not fail.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
[ Update commit log. Fix a line gt 80 chars. Use append to $seqres.full.
Fix comment ]
Signed-off-by: Anand Jain <anand.jain@oracle.com>

btrfs: add a test case to verify the write behavior of large RAID5 data chunks

There is a recent regression during v6.4 merge window, that a u32 left
shift overflow can cause problems with large data chunks (over 4G)
sized.

This is especially nasty for RAID56, which can lead to ASSERT() during
regular writes, or corrupt memory if CONFIG_BTRFS_ASSERT is not enabled.

This is the regression test case for it.

Unlike btrfs/292, btrfs doesn't support trim inside RAID56 chunks, thus
the workflow is simplified:

- Create a RAID5 or RAID6 data chunk during mkfs

- Fill the fs with 5G data and sync
For unpatched kernel, the sync would crash the kernel.

- Make sure everything is fine

Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

generic/558: avoid forkbombs on filesystems with many free inodes

Mikulas reported that this test became a forkbomb on his system when he
tested it with bcachefs.  Unlike XFS and ext4, which have large inodes
consuming hundreds of bytes, bcachefs has very tiny ones.  Therefore, it
reports a large number of free inodes on a freshly mounted 1GB fs (~15
million), which causes this test to try to create 15000 processes.

There's really no reason to do that -- all this test wanted to do was to
exhaust the number of inodes as quickly as possible using all available
CPUs, and then it ran xfs_repair to try to reproduce a bug.  Set the
number of subshells to 4x the CPU count and spread the work among them
instead of forking thousands of processes.

Reported-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Tested-by: Mikulas Patocka <mpatocka@redhat.com>
Reviewed-by: Bill O'Donnell <bodonnel@redhat.com>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

xfs: add a couple more tests for ascii-ci problems

Add some tests to make sure that userspace and the kernel actually
agree on how to do ascii case-insensitive directory lookups, and that
metadump can actually obfuscate such filesystems.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

common/rc: cleanup old .kmemleak files

I've spent a non-negligible amount of time looking into a kmemleak that
didn't exist in the code I was testing because there was an old .kmemleak
file in the results directory. I don't think this is an intended behaviour,
so I'm proposing to remove these files everytime we capture the result of a
new scan.

Signed-off-by: Luís Henriques <lhenriques@suse.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Zorro Lang <zlang@kernel.org>

overlay: Add test coverage for fs-verity support

This tests that the right xattrs are set during copy-up, and
that we properly fail on missing of erronous fs-verity digests
when validating.

We also ensure that verity=require fails if a metacopy has not
fs-verity, and doesn't do a meta-coopy-up if the base file lacks
verity.

Signed-off-by: Alexander Larsson <alexl@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

overlay: Add test for follow of lowerdata in data-only layers

Add test coverage for following metacopy from lower layer to
data-only lower layers.

Data-only lower layers are configured using the syntax:
lowerdir=<lowerdir1>:<lowerdir2>::<lowerdata1>::<lowerdata2>.

Test that lowerdata files can be followed only by absolute redirect
from lower layer.

Test that with two lowerdata dirs, we can lookup individual lowerdata
files in both, and that a shared file is resolved from the uppermost
lowerdata dir.

There is also test case for lazy-data lookups, where we remove the
lowerdata file and validate that we get metadata from the metacopy
file, but open fails.

Signed-off-by: Alexander Larsson <alexl@redhat.com>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

overlay/060: add test cases of follow to lowerdata

Add test coverage for following metacopy from lower layer to
lower data with absolute, relative and no redirect.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Alexander Larsson <alexl@redhat.com>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

overlay: add helper for mounting rdonly overlay

Allow passing empty upperdir to _overlay_mount_dirs().

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Alexander Larsson <alexl@redhat.com>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

report: remove xmlns specifier

By specifying "xmlns=https://git.kernel.org/.../xfstests-dev.git",
this causes XML complaint parsers, such as the one used by the python
junitparser library, to put all of the XML elements into a namespace,
which then causes junitparser to toss its cookies.

This can be worked-around in a test runner script via:

sed -i.orig -e 's/xmlns=\".*\"//' "$RESULT_BASE/result.xml"

but it's better not to include the xmlns line at all in the first
place, since this may cause other users of fstests who are using
the Python junitparser library a lot of headaches.

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Zorro Lang <zlang@kernel.org>

report: safely update the result.xml file

After every single test, we rewrite result.xml from scratch.  This
ensures that the XML file is always in a valid, parseable state, even
if the check script is killed or the machine crashes in the middle of
a test.

If the test is being run in a Cloud VM as a "spot" (Amazon, Azure, or
GCE) or "preemptible" (Oracle) instance, the VM can be halted whenever
the Cloud provider needs the capacity for customers who are willing to
pay full price.  ("Spot" instances can be 60% to 90% cheaper ---
allowing the frugal kernel developer to get up to 10 times more
testing for the same amount of money.  :-)

Since a "spot" VM can get terminated at any time, it is possible for
the check script to be killed while it is in the middle of rewriting
the result.xml file.  If the result.xml file is only partially
written, information regarding the tests run before VM termination
will be lost.

To address this race, write the new result.xml file as result.xml.new,
and only rename it to result.xml after the XML file is fully written
out.

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Zorro Lang <zlang@kernel.org>

xfs: test growfs of the realtime device

Create a new test to make sure that growfs on the realtime device works
without corrupting anything.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Andrey Albershteyn <aalbersh@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

xfs/041: force create files on the data device

Since we're testing growfs of the data device, we should create the
files there, even if the mkfs configuration enables rtinherit on the
root dir.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Andrey Albershteyn <aalbersh@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

xfs/439: amend test to work with new log geometry validation

An upcoming patch moves more log validation checks to the superblock
verifier, so update this test as needed.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

xfs/569: skip post-test fsck run

This test examines the behavior of mkfs.xfs with specific filesystem
configuration files by formatting the scratch device directly with those
exact parameters.  IOWs, it doesn't include external log devices or
realtime devices.  If external devices are set up, the post-test fsck
run fails because the filesystem doesnt' use the (allegedly) configured
external devices.  Fix that by adding _require_scratch_nocheck.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Andrey Albershteyn <aalbersh@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

xfs/529: fix bogus failure when realtime is configured

If I have a realtime volume configured, this test will sometimes trip
over this:

XFS: Assertion failed: nmaps == 1, file: fs/xfs/xfs_dquot.c, line: 360
Call Trace:
xfs_dquot_disk_alloc+0x3dc/0x400 [xfs 97e1fa8953d397b1fb9732df4de7fa9070bda501]
xfs_qm_dqread+0xc9/0x190 [xfs 97e1fa8953d397b1fb9732df4de7fa9070bda501]
xfs_qm_dqget+0xa8/0x230 [xfs 97e1fa8953d397b1fb9732df4de7fa9070bda501]
xfs_qm_vop_dqalloc+0x160/0x600 [xfs 97e1fa8953d397b1fb9732df4de7fa9070bda501]
xfs_setattr_nonsize+0x318/0x520 [xfs 97e1fa8953d397b1fb9732df4de7fa9070bda501]
notify_change+0x30e/0x490
chown_common+0x13e/0x1f0
do_fchownat+0x8d/0xe0
__x64_sys_fchownat+0x1b/0x20
do_syscall_64+0x2b/0x80
entry_SYSCALL_64_after_hwframe+0x46/0xb0
RIP: 0033:0x7fa6985e2cae

The test injects the bmap_alloc_minlen_extent error, which refuses to
allocate file space unless it's exactly minlen long. However, a
precondition of this injection point is that the free space on the data
device must be sufficiently fragmented that there are small free
extents.

However, if realtime and rtinherit are enabled, the punch-alternating
call will operate on a realtime file, which only serves to write 0x55
patterns into the realtime bitmap. Hence the test preconditions are not
satisfied, so the test is not serving its purpose.

Fix it by disabling rtinherit=1 on the rootdir so that we actually
fragment the bnobt/cntbt as required.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Andrey Albershteyn <aalbersh@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

common/btrfs: handle dmdust as mounted device in _btrfs_buffered_read_on_mirror()

[BUG]
After commit ab41f0bddb73 ("common/btrfs: use _scratch_cycle_mount to
ensure all page caches are dropped"), the test case btrfs/143 can fail
like below:

btrfs/143 6s ... [failed, exit status 1]- output mismatch (see ~/xfstests/results//btrfs/143.out.bad)
    --- tests/btrfs/143.out 2020-06-10 19:29:03.818519162 +0100
    +++ ~/xfstests/results//btrfs/143.out.bad 2023-06-19 17:04:00.575033899 +0100
    @@ -1,37 +1,6 @@
     QA output created by 143
     wrote 131072/131072 bytes
     XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
    -XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
    -XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
    -XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
    -XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................

[CAUSE]
Test case btrfs/143 uses dm-dust device to emulate read errors, this
means we can not use _scratch_cycle_mount to cycle mount $SCRATCH_MNT.

As it would go mount $SCRATCH_DEV, not the dm-dust device to
$SCRATCH_MNT.
This prevents us to trigger read-repair (since no error would be hit)
thus fail the test.

[FIX]
Since we can mount whatever device at $SCRATCH_MNT, we can not use
_scratch_cycle_mount in this case.

Instead implement a small helper to grab the mounted device and its
mount options, and use the same device and mount options to cycle
$SCRATCH_MNT mount.

This would fix btrfs/143 and hopefully future test cases which use dm
devices.

Reported-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

btrfs: add test case to verify the behavior with large RAID0 data chunks

There is a recent regression during v6.4 merge window, that a u32 left
shift overflow can cause problems with large data chunks (over 4G
sized).

This is the regression test case for it.

The test case itself would:

- Create a RAID0 chunk with a single 6G data chunk
  This requires at least 6 devices from SCRATCH_DEV_POOL, and each
  should be larger than 2G.

- Fill the fs with 5G data

- Make sure everything is fine
  Including the data csums.

- Delete half of the data

- Do a fstrim
  This would lead to out-of-boundary trim if the kernel is not patched.

- Make sure everything is fine again
  If not patched, we may have corrupted data due to the bad fstrim
  above.

For now, this test case only checks the behavior for RAID0.
As for RAID10, we need 12 devices, which is out-of-reach for a lot of
test envionrments.

For RAID56, they would have a different test case, as they don't support
discard inside the RAID56 chunks.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

btrfs: test activating swapfile in the presence of snapshots

Test that if we have a subvolume with a non-active swap file, we can not
activate it if there are any snapshots. Also test that after all the
snapshots are removed, we will be able to activate the swapfile.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

generic/604: Fix for overlayfs

Since v6.3, I noticed that generic/604 does not run on overlayfs
because:

  generic/604 -- upper fs needs to support d_type

This is odd because the base fs I am using (xfs) does support d_type.

The reason is that for overlayfs, this sequence run by this test:

  _scratch_unmount &
  _scratch_mount

Translates to:

  umount $OVL_MNT; umount $BASE_MNT &
  mount $BASE_MNT ...; mount $OVL_MNT ...

Which can end up reordred as:

  umount $OVL_MNT;
  mount $BASE_MNT ...
                  umount $BASE_MNT &
                  mount $OVL_MNT ...

and overlayfs is trying to use a non-existing upper fs.

Use UMOUNT_PROG directly instead of the _scratch_unmount
helper, to avoid unmounting the base fs.

Incidently, the only thing that has changed in overlayfs in v6.3
is idmapped mounts support and the test in question was run without
idmapped mounts enabled, so the cahnge in behavior must be related
to some subtle timing change.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Reviewed-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Zorro Lang <zlang@kernel.org>

check: fix excluded tests are only expunged in the first iteration

If iterating more than once and excluding some tests, the
excluded tests are expunged in the first iteration, but run in
subsequent iterations. This is not expected.

The problem was caused by the temporary file saving the excluded
tests being deleted by `rm -f $tmp.*` in _wrapup() at the end of
the first iteration.

This commit saves the excluded tests into a variable instead of a
temporary file.

Signed-off-by: Yuezhang Mo <Yuezhang.Mo@foxmail.com>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

common/config: redirect modprobe helpinfo to stdout for busybox

Due to the busybox' modprobe writes help to stderr. We need to
redirect it to stdout, or it will end up in a test results.

Signed-off-by: Stas Sergeev <stsp2@yandex.ru>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

fstests: reduce runtime of check -n

kvm-xfstests invokes check -n twice to pre-process and generate the
tests-to-run list, which is then being passed as a long list of tests
for invkoing check in the command line.

check invokes dirname, basename and sed several times per test just
for doing basic string prefix/suffix trimming.
Use bash string pattern matching instead which is much faster.

Note that the following pattern matching expression change:
< test_dir=${test_dir#$SRC_DIR/*}
> t=${t#$SRC_DIR/}
does not change the meaning of the expression, because the
shortest match of "$SRC_DIR/*" that is being trimmed is "$SRC_DIR/"
and removing the tests/ prefix is what this code intended to do.

With check -n, there is no need to cleanup the results dir,
but check -n is doing that for every single listed test.
Move the cleanup of results dir to before actually running the test.

These improvements to check pre-test code cut down several minutes
from the time until tests actually start to run with kvm-xfstests.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

common/rc: Enable _test_mkfs to force a mkfs on a xfs filesystem

Calling _test_mkfs on an already initialized xfs FS will fail as the
initialization is not enforced by '-f' argument, unless it's included in
MKFS_OPTIONS.
So, adding 'RECREATE_TEST_DEV=true' to the config file end up being useless
for xfs filesystems.

So, adding the a specific xfs optiong in _test_mkfs using -f argument
makes RECREATE_TEST_DEV actually useful.

Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Zorro Lang <zlang@kernel.org>

common/rc: skip ceph-fuse when atime is required

Ceph won't maintain the atime, so just skip the tests when the atime
is required.

Fixes: https://tracker.ceph.com/issues/61551
Signed-off-by: Xiubo Li <xiubli@redhat.com>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

generic/020: add ceph-fuse support

For ceph fuse client the fs type will be "ceph-fuse".

Fixes: https://tracker.ceph.com/issues/61496
Signed-off-by: Xiubo Li <xiubli@redhat.com>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

generic/506: fix to call _scratch_enable_pquota()

Otherwise, this testcase will fail due to project quota feature is not
enabled in the image.

Signed-off-by: Chao Yu <chao@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

common/btrfs: use _scratch_cycle_mount to ensure all page caches are dropped

[BUG]
There is a chance that btrfs/266 would fail on aarch64 with 64K page
size. (No matter if it's 4K sector size or 64K sector size)

The failure indicates that one or more mirrors are not properly fixed.

[CAUSE]
I have added some trace events into the btrfs IO paths, including
__btrfs_submit_bio() and __end_bio_extent_readpage().

When the test failed, the trace looks like this:

112819.764977: __btrfs_submit_bio: r/i=5/257 fileoff=0 mirror=1 len=196608 pid=33663
                                    ^^^ Initial read on the full 192K file
112819.766604: __btrfs_submit_bio: r/i=5/257 fileoff=0 mirror=2 len=65536 pid=21806
    ^^^ Repair on the first 64K block
Which would success
112819.766760: __btrfs_submit_bio: r/i=5/257 fileoff=65536 mirror=2 len=65536 pid=21806
    ^^^ Repair on the second 64K block
Which would fail
112819.767625: __btrfs_submit_bio: r/i=5/257 fileoff=65536 mirror=3 len=65536 pid=21806
    ^^^ Repair on the third 64K block
Which would success
112819.770999: end_bio_extent_readpage: read finished, r/i=5/257 fileoff=0 len=196608 mirror=1 status=0
^^^ The repair succeeded, the
     full 192K read finished.

112819.864851: __btrfs_submit_bio: r/i=5/257 fileoff=0 mirror=3 len=196608 pid=33665
112819.874854: __btrfs_submit_bio: r/i=5/257 fileoff=0 mirror=1 len=65536 pid=31369
112819.875029: __btrfs_submit_bio: r/i=5/257 fileoff=131072 mirror=1 len=65536 pid=31369
112819.876926: end_bio_extent_readpage: read finished, r/i=5/257 fileoff=0 len=196608 mirror=3 status=0

But above read only happen for mirror 1 and mirror 3, mirror 2 is not
involved.
This means by somehow, the read on mirror 2 didn't happen, mostly
due to something wrong during the drop_caches call.

It may be an async operation or some hardware specific behavior.

On the other hand, for test cases doing the same operation but utilizing
direct IO, aka btrfs/267, it never fails as we never populate the page
cache thus would always read from the disk.

[WORKAROUND]
The root cause is in the "echo 3 > drop_caches", which I'm still
debugging.

But at the same time, we can workaround the problem by forcing a
cycle mount of scratch device, inside _btrfs_buffered_read_on_mirror().

By this we can ensure all page caches are dropped no matter if
drop_caches is working correctly.

With this patch, I no longer hit the failure on aarch64 with 64K page
size anymore, while before this the test case would always fail during a
-I 10 run.

[zlang: remove the duplicated drop_caches line]

Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Christoph Hellwig <hch@infradead.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

btrfs/106: avoid hard coded output to handle different page sizes

[BUG]
Test case btrfs/106 is known to fail if the system has a page size other
than 4K.

This test case can fail like this:

    btrfs/106 5s ... - output mismatch (see ~/xfstests-dev/results//btrfs/106.out.bad)
    --- tests/btrfs/106.out     2022-11-24 19:53:53.140469437 +0800
    +++ ~/xfstests-dev/results//btrfs/106.out.bad      2023-06-02 16:12:57.014273249 +0800
    @@ -5,19 +5,19 @@
     File contents before unmount:
     0 aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
     *
    -40
    +1000
     File contents after remount:
     0 aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
    ...
    (Run 'diff -u ~/xfstests-dev/tests/btrfs/106.out /home/adam/xfstests-dev/results//btrfs/106.out.bad'  to see the entire diff)

This is particularly problematic for systems like Aarch64 or PPC64 which
supports 64K page size.

[CAUSE]
The test case is using page size to calculate the amount of data to
write, thus it doesn't support any page sizes other than 4K.

[FIX]
Instead of using the golden output, go with md5sum and compare them
before and after the remount.

The new md5sum would only go into $seqres.full for debugging, not into
golden output to avoid false alerts on different pages sizes.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

btrfs/122: fix nodesize option in mfks.btrfs

btrf/122 is failing on a system with 64k page size:

     QA output created by 122
    +ERROR: illegal nodesize 16384 (smaller than 65536)
    +mount: /mnt/scratch: wrong fs type, bad option, bad superblock on /dev/vdb2, missing codepage or helper program, or other error.
    +mount /dev/vdb2 /mnt/scratch failed
    +(see /xfstests-dev/results//btrfs/122.full for details)

Mkfs.btrfs sets the default node size to 16K when the sector size is less
than 16K, and it matches the sector size when it's greater than 16K.
So, there's no need to explicitly set it.

Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Tested-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

common/xfs: compress online repair rebuild output by default

Force-repairing the filesystem after a test can fill up /tmp with quite
a lot of logging message. We don't have a better place to stash that
output in case the scrub fails and we need to analyze it later, so
compress it with gzip and only decompress it later.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Andrey Albershteyn <aalbersh@redhat.com>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

xfs/503: don't rebuild the fs metadata when testing metadump

This test exercises metadump with the standard populate image. There's
no need to test rebuilding the entire fs every step of the way since
we're just going to metadump over it.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Andrey Albershteyn <aalbersh@redhat.com>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

fuzzy: disallow post-test online rebuilds when testing online fsck

If we're testing the online fsck code or running fuzz tests of the
filesystem, don't let the post-test checks rebuild the filesystem
metadata, because this is redundant with the test and will disturb
the metadata if the tools fail.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Andrey Albershteyn <aalbersh@redhat.com>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

xfs/155: improve logging in this test

If this test fails after a certain number of writes, we should state
the exact number of writes so that we can coordinate with 155.full.
Instead, we state the pre-randomization number, which isn't all that
helpful.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Bill O'Donnell <bodonnel@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

xfs/155: discard stderr when checking for NEEDSREPAIR

This test deliberate crashes xfs_repair midway through writing metadata
to check that NEEDSREPAIR is always triggered by filesystem writes.
However, the subsequent scan for the NEEDSREPAIR feature bit prints
verifier errors to stderr.

On a filesystem with metadata directories, this leads to the test
failing with this recorded in the golden output:

+Metadata CRC error detected at 0x55c0a2dd0d38, xfs_dir3_block block 0xc0/0x1000
+dir block owner 0x82 doesnt match block 0xbb8cd37e44eb3623

This isn't specific to metadata directories -- any repair crash could
leave a metadata structure in a weird state such that starting xfs_db
will spray verifier errors. For _check_scratch_xfs_features here, we
don't care if the filesystem is corrupt; we /only/ care that the
superblock feature bit is set. Route all that noise to devnull.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Andrey Albershteyn <aalbersh@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

xfs/108: allow slightly higher block usage

With pmem and fsdax enabled, I occasionally see this test fail on XFS:

Mode: (0600/-rw-------) Uid: (1) Gid: (2)
Disk quotas for User #1 (1)
Filesystem Blocks Quota Limit Warn/Time Mounted on
-SCRATCH_DEV 48M 0 0 00 [------] SCRATCH_MNT
+SCRATCH_DEV 48.0M 0 0 00 [------] SCRATCH_MNT
Disk quotas for User #1 (1)
Filesystem Files Quota Limit Warn/Time Mounted on
SCRATCH_DEV 3 0 0 00 [------] SCRATCH_MNT

The cause of this failure is fragmentation in the file mappings that
results in a block mapping structure that no longer fits in the inode.
Hence the block usage is 49160K instead of the 49152K that was written.
Use some fugly sed duct tape to make this test accomodate this
possiblity.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Bill O'Donnell <bodonnel@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

xfs: test shipped config files work properly with mkfs.xfs

Sanity check the shipped mkfs.xfs config files by using
them to format the scratch device.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Zorro Lang <zlang@kernel.org>

btrfs: add a test case to verify the scrub error reports

There is a regression in recent v6.4 cycle where a scrub rewrite changed
how we report errors, especially repairable errors.

Before the rewrite, we report the initial errors hit, and the amount of
repairable errors.
While after the rewrite, we no longer report the initial errors, but
only the number of repairable errors.

This behavior change is a regression, thus needs a test case to prevent
such problem from happening again.

The test case itself would:

- Create a btrfs using DUP data profile and 4K sector size

- Create a file with one 128K extent

- Corrupt the first mirror of that 128K extent

- Scrub and checks the detailed report
Both corrected errors and csum errors should be 32.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

btrfs: add a test case to verify read-only scrub

There is a regression on btrfs read-only scrub behavior.

The commit e02ee89baa66 ("btrfs: scrub: switch scrub_simple_mirror() to
scrub_stripe infrastructure") makes btrfs scrub to ignore the read-only
flag completely, causing scrub to always fix the corruption.

This test case would create an fs with repairable corruptions, then run
a read-only scrub, and finally to make sure the corruption is not
repaired.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

generic: Add mmap + DIO write test

This is the same as generic/647, but with an additional test that writes
a memory-mapped page onto itself using direct I/O.

For direct I/O writes, the kernel will invalidate the page cache before
and after carrying out the write. This puts filesystems that fault in
pages on demand and then carry out the write with page faults disabled
into a position in which they will not be able to make any progress.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Zorro Lang <zlang@kernel.org>

fuzzy: skip online scrub and health checks if not supported

Commit a27e6e6f4c18 introduced xfs health checking on no-repair fuzz,
which in turn requires scrub to be run before that. The health checks
are done only if scrub returns with an error (which is expected as an
indication that fuzzed metadata errors were picked up), but the code
does not discern between xfs_scrub returning an error because of
uncorrected metadata vs failing because the kernel does not support
scrub at all.

This causes all tests that do fuzzing with no-repair strategy to fail on
kernels compiled without online scrub support (CONFIG_XFS_ONLINE_SCRUB).

Skip scrub and health checks altogether, if the kernel does not support
it, since the tests are still valuable.

Fixes: a27e6e6f4c18 ("common: check xfs health after doing an online scrub")
Signed-off-by: Anthony Iliopoulos <ailiop@suse.com>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Zorro Lang <zlang@kernel.org>

btrfs/213: avoid occasional failure due to already finished balance

btrfs/213 writes data, in 1M extents, for 4 seconds into a file, then
triggers a balance and then after 2 seconds it tries to cancel the
balance operation. More often than not, this works because the balance
is still running after 2 seconds. However it also fails sporadically
because balance has finished in less than 2 seconds, which is plausible
since data and metadata are cached or other factors such as virtualized
environment. When that's the case, it fails like this:

  $ ./check btrfs/213
  FSTYP         -- btrfs
  PLATFORM      -- Linux/x86_64 debian0 6.4.0-rc1-btrfs-next-131+ #1 SMP PREEMPT_DYNAMIC Thu May 11 11:26:19 WEST 2023
  MKFS_OPTIONS  -- /dev/sdc
  MOUNT_OPTIONS -- /dev/sdc /home/fdmanana/btrfs-tests/scratch_1

  btrfs/213 51s ... - output mismatch (see /home/fdmanana/git/hub/xfstests/results//btrfs/213.out.bad)
      --- tests/btrfs/213.out 2020-06-10 19:29:03.822519250 +0100
      +++ /home/fdmanana/git/hub/xfstests/results//btrfs/213.out.bad 2023-05-17 15:39:32.653727223 +0100
      @@ -1,2 +1,3 @@
       QA output created by 213
      +ERROR: balance cancel on '/home/fdmanana/btrfs-tests/scratch_1' failed: Not in progress
       Silence is golden
      ...
      (Run 'diff -u /home/fdmanana/git/hub/xfstests/tests/btrfs/213.out /home/fdmanana/git/hub/xfstests/results//btrfs/213.out.bad'  to see the entire diff)
  Ran: btrfs/213
  Failures: btrfs/213
  Failed 1 of 1 tests

To make it much less likely that balance has already finished before we
try to cancel it, unmount and mount again the filesystem before starting
balance, to clear cached metadata and data, and also double the time we
spend writing 1M data extents. Also make the test not run with an
informative message if we detect that balance finished before we could
cancel it.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

btrfs/287: add git commit hash for the kernel fix

The respective fix landed on kernel 6.4-rc2, commit:

0cad8f14d70c "btrfs: fix backref walking not returning all inode refs"

So update the test to include the commit hash.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

btrfs/213: add _fixed_by_kernel_commit tag and remove from dangerous group

Add a _fixed_by_kernel_commit to identify the commit the fixed the issue
the test is trying to reproduce, which was:

1dae7e0e58b4 "btrfs: reloc: clear DEAD_RELOC_TREE bit for orphan roots to prevent runaway balance"

introduced in kernel 5.8-rc1. Also remove it from the dangerous group, as
the fix is from 2020 and it was backported to stable releases.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

generic/708: fix commit subject and add its git hash

The test refers to a patch that ended up not committed to Linus' tree, as
the fix evolved through several patchset versions. The up to date fix
landed on kernel 6.4-rc1 and is the following:

b73a6fd1b1ef "btrfs: split partial dio bios before submit"

So updated the test to point to that.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

generic: Add a test for xattr ctime updates

The NFS client wasn't updating ctime after a setxattr request. This is a
test written while fixing the bug.

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

common/{filter,quota}, xfs/1{06,52}: fix grep pattern

Newer(3.9) grep is complaining about these unnecessary
backslashes before # and -, and breaking the golden output.

Signed-off-by: Murphy Zhou <jencce.kernel@gmail.com>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

fstests: fix compilation error in splice2pipe on old (<C99) compilers

Compilation fails on system with compiler which uses older C dialect
(e.g. RHEL7; gcc with gnu90) with:

splice2pipe.c: In function 'prepare_pipe':
splice2pipe.c:54:2: error: 'for' loop initial declarations are only allowed in C99 mode
for (unsigned r = pipe_size; r > 0;) {

Fix it by declaring 'r' outside of the loop.

Signed-off-by: Andrey Albershteyn <aalbersh@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Zorro Lang <zlang@kernel.org>

generic/{094,225}: drop the file allocation unit requirements

Drop these two test precondition requirements now that we've fixed
fiemap-tester to handle unwritten preallocations that are mapped to
unwritten parts of files and/or mapped beyond EOF.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

fiemap: FIEMAP_EXTENT_LAST denotes the last record in the recordset

Remove this check because FIEMAP_EXTENT_LAST denotes the last space
mapping record in the recordset of space mappings attached to the file.
That last mapping might actually map space beyond EOF, in the case of
(a) speculative post-eof preallocations, (b) stripe-aligned allocations
on XFS, or (c) realtime files in XFS.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

fiemap-tester: holes can be backed by unwritten extents

Filesystem behavior is pretty open-ended for sparse ranges (i.e. holes)
of a file that have not yet been written to. That space can remain
unmapped, it can be mapped to written space that has been zeroed, or it
can be mapped to unwritten extents.

This program trips over that last condition on XFS. If the file is
located on a data device with a raid stripe geometry or on a realtime
device with a realtime extent size larger than 1 filesystem block, it's
possible for unwritten areas to be backed by unwritten preallocations or
unwritten rt blocks, respectively.

Fix the test to skip unwritten extents here.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

xfs/{243,245,272,274}: ignore raid alignment flags in bmap output

This test doesn't care about the RAID alignment status of the mappings
that it finds; it only cares about shared and unwritten. Ignore the
mapping stripe alignment flags in the bmapx output. This fixes this
test when the fs has su=128k,sw=4.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

generic/724,xfs/791: adjust test preconditions for post-EOF stripe zeroing

I recently introduced a new fstests config with explicitly specified
stripe geometry of 128k stripe units and a stripe width of 4.  This
broke both of these tests because I hadn't counted on a few things:

1) The write to $SCRATCH_MNT/b at 768k would a 128k delalloc extent
2) This delalloc extent would extend beyond EOF
3) Increasing the file size from 832k to 1m would cause iomap to zero
   the pagecache for the parts of the delalloc extent beyond EOF
4) The newly dirtied posteof delalloc areas would get written to disk
   with a real space allocation

Under these circumstances, FIEXCHRANGE with SKIP_FILE1_HOLES sees a
written extent containing zeroes in file B between 832k and 1m.  File
A has a written extent containing 'X' in the same range, so it exchanges
the two.  When RAID geometry is disabled, the area between 832k and 1m
is usually a hole, so FIEXCHRANGE does nothing.  This causes the md5sum
of the two files to be different, and the test fails.

Fix the test by truncating B to 1m before writing anything to it.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

xfs/262: remove dangerous labels

This test starts with a consistent filesystem and should end that way
too. In other words, it isn't dangerous, so drop that from the tags.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

fsx: fix indenting of columns in bad buffers report

When file corruption is detected, make the columns of the report line
up correctly even in the diff output.  Although the .out.bad file
contains this (with spaces to demonstrate unequivocally what happens
when tabs are formatted as 8-column indent):

OFFSET  GOOD    BAD     RANGE
0x2c000 0x0000  0xd6c1  0x00000

diffing the good and bad golden output yields poorly formatted output:

+OFFSET GOOD    BAD     RANGE
+0x2c000        0x0000  0xd6c1  0x00000

Replace the tabs with columns indented with printf width specifiers so
that the test output gets this:

OFFSET      GOOD    BAD     RANGE
0x2c000     0x0000  0xd6c1  0x0

...which always lines up the columns regardless of the user's tab
display settings or diff inserting plus signs.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

btrfs/228: sync filesystem after creating subvolume

Test case btrfs/228 creates a subvolume and then calls the dump-tree
command from btrfs-progs. The tree dumping accesses the device directly
and therefore can only see committed metadata - this used to work because
subvolume creation used to commit the transaction that was used to create
the subvolume, however it is no longer the case after a recent patch that
currently is only on the btrfs integration branch "misc-next". That patch
has the following subject:

"btrfs: don't commit transaction for every subvol create"

So explicitly sync the filesystem before calling the dump-tree command,
commenting why we do it. This way the test works before and after that
patch, for any kernel release.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

btrfs/254: correct subject of the relevant kernel patch and add git commit

The top comment of test btrfs/254 mentions a kernel patch with a subject
of:

"btrfs: harden identification of the stale device"

but that is actually not correct, as the subject was slightly changed when
the patch was picked to:

"btrfs: harden identification of a stale device"

So fix that by removing the comment and use instead a call to
_fixed_by_kernel_commit, which also allows us to specify git commit id.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

groups: add logical_resolve group for btrfs

Add a 'logical_resolve' group to identify tests that use the btrfs
logical-resolve command, which exercises btrfs' logical to ino ioctl.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

btrfs: add a test case for the logical to ino ioctl

Add a test case to exercise the logical to ino ioctl, including the v2
version (which adds the ignore offset option). This is motivated by the
fact that we have no test cases giving full coverage to that ioctl, only
two test cases partially exercise it (btrfs/004 and btrfs/299) and
absolutely no coverage for the v2 ioctl. This resulted in a recent
regression fixed by the patch mentioned in the _fixed_by_kernel_commit
tag of the introduced test case.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

common/btrfs: add helper to get the bytenr for a file extent item

In upcoming changes there will be the need to find out the logical disk
address (bytenr) that a particular file extent item points to. This is
already implemented as local functions in the test btrfs/299, which is
a bit limited but works fine for that test. Some important or subtle
details why it works for this test:

1) It dumps all trees of the filesystem;

2) It relies on fsync'ing a file and then finding the desired file
   extent item in the log tree from the dump;

3) There's a single subvolume, so it always finds the correct file extent
   item. In case there were multiple subvolumes, it could pick the wrong
   file extent item in case we have inodes with the same number on
   multiple subvolumes (inode numbers are unique only within a subvolume,
   they are not unique across an entire filesystem).

So add a helper to get the bytenr associated to a file extent item to
common/btrfs and use it at btrfs/299 and the upcoming changes.

This helper will dump only the tree of the default subvolume, will sync
the filesystem to commit any open transaction and works only in case the
filesystem is using the scratch device. This is explicitly documented.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

misc: add duration for recovery loop tests

Make it so that we can run recovery loop tests for an exact number of
seconds.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Andrey Albershteyn <aalbersh@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

misc: add duration for long soak tests

Make it so that test runners can schedule long soak stress test programs
for an exact number of seconds by setting the SOAK_DURATION config
variable. Change the definition of the 'soak' test to specify that
these tests can be controlled via SOAK_DURATION.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Andrey Albershteyn <aalbersh@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

readme: document TIME/LOAD_FACTOR

Document these two variables so that we have /some/ reference for what
they're supposed to do.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

generic/476: reclassify this test as a long running soak stress test

This test is a long(ish) running stress test that uses fsstress, so
alter its group membership as follows:

long_rw: because this can read and write to the fs for a long period of
time

stress: because this test employs fsstress

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

fstests: Doc changes for afs

Documentation changes for afs.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: linux-afs@lists.infradead.org
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

generic/696: AFS doesn't support the "noacl" command line option

AFS doesn't support the "noacl" command line option. ACLs are mandatory
and interpreted by the server.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: linux-afs@lists.infradead.org
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

generic/531: Check for O_TMPFILE

Make generic/531 check that the filesystem under test supports O_TMPFILE
before attempting to test it.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: linux-afs@lists.infradead.org
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

generic/123, generic/128, afs: Allow for an fs that does its own perm management

The AFS filesystem has its own distributed permission management system
that's based on a per-cell user and group database used in conjunction with
ACLs.  The user is determined by the authentication token acquired from the
kaserver or Kerberos, not by the local fsuid/fsgid.  For the most part, the
uid, gid and mask on a file are ignored.

The generic/123 and generic/128 tests check that the UNIX permission bits do
what would normally be expected of them - but this fails on AFS.  Using "su"
to change the user is not effective on AFS.  Instead, "keyctl session" would
need to be used and an alternative authentication token would need to be
obtained.

Provide a "_require_unix_perm_checking" clause so that these tests can be
suppressed in cases such as AFS.

Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

generic/317, afs: Allow for a filesystem not to honour the local uid/gid

Each AFS cell has it's own set of user IDs that is uses internally, in its
ACL system and in its protection management protocol. The user ID used by
the fileserver is selected from the set belonging to the fileserver's cell
according to the authentication token associated with an RPC operation -
and this is set as a file's user ID when it is created.

This means that tests that expect to set a UID and see the same UID still
set afterwards will fail.

Add a "_require_use_local_uidgid" clause to indicate that a test expects
internal UID/GID information to be seen in the stat output and should be
skipped if AFS's case.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: linux-afs@lists.infradead.org
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

generic/314, afs: Allow for a filesystem that doesn't honour SGID inheritance

The AFS filesystem doesn't do any special handling for the SUID, SGID and
SVTX bits and doesn't perform any sort of propagation. Further, only a
user with cell admin rights can set non-0777 bits.

Handle this by adding a "_require_sgid_inheritance" clause and labelling
the test with it, thereby skipping for filesystems that don't support it.

Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
cc: linux-afs@lists.infradead.org
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

fstests: add AFS support

Add support for the AFS filesystem.  AFS is a network filesystem and there
are a number of features it doesn't support.

- No mkfs.  (Kind of.  An AFS volume server can be asked to create a new
   volume, but that's probably best left to AFS-specific test suites.
   Further, a volume would need to be destroyed before another of the same
   name could be created; it's not simply a matter of overwriting the old
   one as it is on a blockdev with a block-based filesystem.)

- No fsck.  (Kind of - the server can be asked to salvage a volume, but it
   may involve taking the server offline).

- No richacls.  AFS has its own ACL system.

- No atimes.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: linux-afs@lists.infradead.org
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

src: Don't include <sys/mount.h> and <linux/mount.h> together

Newer glibc such as glibc 2.36 also defines 'struct mount_attr'
in addition to <linux/mount.h>.

Usually we should use glibc header instead of kernel header.
But now mount.h is a special case because both new glibc header and
kernel header all define "struct mount_attr'. They also define MS*
macro.

Since we have some syscall wrapper in vfs/missing.h, we can use
<linux.mount.h> directly instead of <sys/mount.h> for
detached_mounts_propagation.c.

For utils.c, it doesn't use the macro or function in <sys/mount.h>,
so remove it directly.

In fact, newer glibc(2.37-1)[1] has sloved conflict problem between
<sys/mount.h> and <linux/mount.h>. In the future(maybe ten years), we
can remove this kernel header and use glibc header.

[1]https://sourceware.org/git/?p=glibc.git;a=commit;h=774058d72942249f71d74e7f2b639f77184160a6

Acked-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Yang Xu <xuyang2018.jy@fujitsu.com>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Tested-by: Ziyang Zhang <ZiyangZhang@linux.alibaba.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

common/rc: skip ceph when atime is required

Ceph won't maintain the atime, so just skip the tests when the atime
is required.

URL: https://tracker.ceph.com/issues/53844
Signed-off-by: Xiubo Li <xiubli@redhat.com>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

check: _check_filesystems for errors even if test failed

Previously, we would only run _check_filesystems to ensure that a test
that appeared to pass did not have any filesystem corruption. However,
in _check_filesystems, we also repair any errors found in the filesystem.
Let's do this even if we already know the test failed so that subsequent
tests aren't affected.

Signed-off-by: Leah Rumancik <leah.rumancik@gmail.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Zorro Lang <zlang@kernel.org>

check: try to fix the test device if it gets corrupted

If the test device gets corrupted all subsequent tests will fail. To
prevent this from causing all subsequent tests to be useless, try
repair the file system on TEST_DEV if possible. We don't need to do
this with the scratch device since that file system gets recreated
each time anyway.

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Leah Rumancik <leah.rumancik@gmail.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Zorro Lang <zlang@kernel.org>

xfs/517: add missing freeze command

This test is supposed to race fsstress, fsmap, and freezing for a while,
but when we converted it to use _scratch_xfs_stress_scrub, the freeze
loop fell off by accident. Add it back.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Andrey Albershteyn <aalbersh@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>