www.infradead.org Git - users/hch/xfstests-dev.git/log

xfs/076: fix broken mkfs filtering

The test does not do what it says on the packet - the mkfs output is
not actually passed to the mkfs filter, so it doesn't know what
inode size mkfs actually used. Hence CHUNK_SIZE ends up being
calculated as 0, and that means it enters an endless loop because
offset never decreases.

Fix it by adding the missing line continuation.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Zorro lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

xfs/629: single extent files should be within tolerance

The test passes if we have between 2 and 40 extents (despite what
the comment says!), with the target being 20. There is absolutely no
reason for considering a single extent file a failure - that
indicates the filesystem completely defeated the fragmentation
behaviour the test was trying to cause. Hence expand the range of
"test pass" tolerance to 1-41 extents.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Zorro lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

vfstests: some tests require the testdir to be shared

This ensures that these tests will run successfully when the
parallel check infrastructure makes all the scratch and test
mounts private.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Zorro lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

fstests: clean up termination of various tests

Accumulated minor fixes to improve reliability of the termination
of various tests when interrupted.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Zorro lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

fstests: clean up a couple of dm-flakey tests

Just little things I've found that should be cleaned up.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Zorro lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

fstests: don't use directory stacks

Using bash directory stacking (pushd, popd, etc) seems to be
somewhat unreliable. I've been seeing occasional random failures
from both pushd and popd commands that cause the test to fail, and
there does not appear to be any reason for the failures occurring.

Rather than wasting time chasing ghosts, just get rid of the
directory stacking altogether.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Zorro lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

generic/085: general cleanup for reliability and debugging

This test was quite unreliable during development of the parallel
check runner. It redirects all errors to /dev/null, so there was no
way to debug it when it failed.

Use common mount/unmount helpers, redirect errors to $seqres.full,
make sure the cleanup code is always run at test exit and only
attempt to kill processes if they are still running during cleanup.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Zorro lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

filters: add a filter that accepts EIO instead of other errors

Running a dm-flakey or dm-error test that loads a table that returns
EIO to all IO and then running a command that is expected to fail
with a specific error is racy.

If there is memory pressure at the same time that the table is
loaded, cached inodes can be turfed from memory and the command then
needs to read the inode it is about to act on from disk again. This
results in the inode read getting EIO and failing (e.g. xfs_io will
return a stat() error) rather than having the desired operation
fail.

This results in spurious test failures that look like this:

generic/331       - output mismatch (see /mnt/xfs/runner-41/results-2024-11-20-10:57:31/xfs/generic/331.out.bad)
    --- tests/generic/331.out   2022-12-21 15:53:25.487044098 +1100
    +++ /mnt/xfs/runner-41/results-2024-11-20-10:57:31/xfs/generic/331.out.bad  2024-11-20 11:02:12.123572607 +1100
    @@ -5,7 +5,8 @@
     1886e67cf8783e89ce6ddc5bb09a3944  SCRATCH_MNT/test-331/file1
     1886e67cf8783e89ce6ddc5bb09a3944  SCRATCH_MNT/test-331/file2
     CoW and unmount
    -fdatasync: Input/output error
    +/mnt/xfs/runner-41/scratch/test-331/file2: Input/output error
    +stat: Input/output error
     Compare files
    ...

Add a new "flakey EIO filter" that will catch -any- EIO error from
the command and change it to the error we expected to see.
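
A minimal sketch of such a filter, assuming a sed-based rewrite. The
name filter_any_eio is hypothetical and is not the helper added by
this commit:

```shell
# Hypothetical sketch: rewrite any "<something>: Input/output error" line
# into the single error message the test expects, then collapse repeats so
# that two EIO lines (e.g. open + stat) become one.
filter_any_eio()
{
    local expected="$1"    # e.g. "fdatasync: Input/output error"

    # '#' is the sed delimiter, so the '/' in "Input/output" needs no escaping.
    sed -e "s#^.*: Input/output error\$#${expected}#" | uniq
}
```

Piping the failing output above through
filter_any_eio "fdatasync: Input/output error" would leave only the
line the golden output expects.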

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Zorro lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

filter: handle mount errors from CONFIG_BLK_DEV_WRITE_MOUNTED=y

Kernels post 6.x may have CONFIG_BLK_DEV_WRITE_MOUNTED=y which
prevents mount from opening the block device on a mounted
filesystem. This results in an error such as:

mount: <dev>: Can't open blockdev

which is not the error that callers of _filter_error_mount() are
looking for. It is, however, a direct result of the test trying
to mount an already mounted filesystem, so it is reflecting the same
error case. Hence this mismatch in errors should not fail the test.

Catch this mount error and convert it to the expected
"already mounted" error for the tests that exercise this behaviour.

There is also a minor test change here to push mount failure
information to $seqres.full in the cases where mount errors occur.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Zorro lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

generic/310: cleanup killing background processes

Use the trick we used with fsstress of copying the binary to a test
specific name so that we can simply use pkill to reliably kill the
background processes this test runs. Also use SIGPIPE to stop bash
from emitting "Killed" errors.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Zorro lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

fstests: scale some tests for high CPU count sanity

Several tests use lots of processes to stress the filesystem. Many
of them haven't really considered what this means for running the
test on high CPU machines (e.g. >32p) and the potential contention
and performance issues this might trigger.

Some of these tests simply need to increase the size of the journal.
Some need to run on filesystems with high inherent concurrency (e.g.
larger AG count). Some need more efficient/faster file creation. And
so on.

This commit is a collection of those sorts of changes to improve
runtimes on high CPU count machines.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Zorro lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

fstests: stop using /tmp directly

Tests should be using $tmp, not /tmp. Using /tmp directly causes
problems when multiple tests all use /tmp/foo as a temporary test
state file and then step on each other.

Note that there are some tests that use /tmp to store "test stop"
files for background processes. Those that have proven to be
unreliable at stopping tests when interrupted by ctrl-c are also
updated to track and kill background processes in the cleanup
function.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Zorro lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

dmesg: reduce noise from other tests

dmesg records everything from every test concurrently running, so
noise from other tests can cause multiple other tests to fail
because they detect something from another test. Update the filter
behaviour to minimise this crosstalk problem.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Zorro lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

quota: system project quota files need to be shared

Tests that treat them as exclusively owned end up tripping over
other tests that do the same. Fix this by using append and filter
techniques to update the files, then using different project quota
ids for each test.
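
As a rough sketch of the append and filter approach, assuming the
"id:path" line format of /etc/projects. The helper names are
hypothetical and are not the ones used by the tests:

```shell
# Hypothetical helpers. $1 is the shared file (e.g. /etc/projects),
# $2 the project id owned by this test, $3 the directory for that project.
add_project_entry()
{
    local file="$1" id="$2" dir="$3"

    # Append rather than overwrite, so entries from other tests survive.
    echo "${id}:${dir}" >> "$file"
}

remove_project_entry()
{
    local file="$1" id="$2"

    # Filter out only the lines belonging to our own project id.
    sed -i -e "/^${id}:/d" "$file"
}
```

Because each test uses a different project id, removing its own
entries cannot disturb the entries of a concurrently running test.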

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Zorro lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

generic/127: reduce runtime

...
generic/127 684
...

This takes a long time to run because it runs 6 individual
invocations of fsx sequentially. Make them run concurrently
as they can operate on separate files.

...
generic/127 168
...
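
The general pattern is sketched below, with a hypothetical
run_concurrently helper standing in for the real test code and
"$worker" standing in for one fsx invocation:

```shell
# Hypothetical sketch: run one worker per file concurrently and report
# failure if any worker fails.
run_concurrently()
{
    local worker="$1"
    shift
    local pids="" pid file status=0

    for file in "$@"; do
        "$worker" "$file" &
        pids="$pids $!"
    done

    # Wait on each PID individually so every exit status is checked.
    for pid in $pids; do
        wait "$pid" || status=1
    done
    return $status
}
```

This only works because each worker operates on its own file, as the
fsx invocations in this test do.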

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Zorro lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

fstests: remove uses of killall where possible

There are many unnecessary uses of killall and stale checks for its
existence. Parallel check execution means killall is considered
harmful, so get rid of these unnecessary uses.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Zorro lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

xfs/177: remove unused slab object count location checks

Stale code; we count XFS inodes through the sysfs stats code now
so remove it.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Zorro lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

xfs/176: fix broken setup code

The test does not pass the mkfs output through the mkfs filter, so
the inode size is not set up correctly. Hence it calculates the
CHUNK_SIZE as 0, and it ends up getting stuck in an endless loop
throwing ENOSPC errors because the offset never changes.

While there, use 'echo -n' rather than 'touch' to create zero length
files much faster.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Zorro lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

xfs/442: rescale load so it's not exponential

....
xfs/442 491
....

xfs/442 takes a long time to run because it scales the load by the
number of processes twice: it runs one worker per process, and it
also scales the number of operations each worker runs by the number
of processes. Doubling the number of processes therefore quadruples
the runtime.

Reduce it to scale linearly by fixing the number of ops it runs per
process.
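
The arithmetic can be sketched as follows. The function names and the
numbers in the usage note are illustrative only and are not taken
from xfs/442:

```shell
# Hypothetical helpers: $1 is the process count, $2 the base op count.

# Before: each process runs (base * nprocs) ops, so the total is quadratic.
total_ops_before() { echo $(( $1 * ($2 * $1) )); }

# After: each process runs a fixed number of ops, so the total is linear.
total_ops_after() { echo $(( $1 * $2 )); }
```

With a base of 100 operations, going from 4 to 8 processes takes the
old total from 1600 to 6400, while the new total goes from 400 to 800.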

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Zorro lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

fstests: use udevadm wait in preference to settle

When running lots of tests in parallel, there are lots of
filesystems and block devices changing state. This generates a lot
of udev events which means the udev event queue is rarely empty.
Unfortunately, an empty event queue is what udev settling waits
upon. Hence calling UDEV_SETTLE_PROG can mean waiting for a lot of
time for other tests to stop generating udev events.

For the majority of cases, what we care about is that udev has
performed device node addition or removal, not that there are no
udev events pending. Recent(-ish) systemd releases support 'udevadm
wait' to wait for a specific file to be created or unlinked rather
than waiting for the event that does that work to be completed.

Hence we don't have to wait for the udev event queue to empty,
just for the udev event that does the device node manipulation to
complete.

Introduce detection of 'udevadm wait' support and a _udev_wait()
wrapper function to use it if it is available. If it isn't, then use
the existing UDEV_SETTLE_PROG behaviour.
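
A sketch of what such a wrapper could look like. This is not the
_udev_wait() added by this commit: the function name is hypothetical,
and probing with "udevadm wait --help" is an assumption about how
support might be detected:

```shell
# Hypothetical sketch of a "wait for this device node" wrapper.
# Assumption: "udevadm wait --help" succeeds only where the "wait" verb
# exists. UDEV_SETTLE_PROG is assumed to default to "udevadm settle".
udev_wait_sketch()
{
    if udevadm wait --help > /dev/null 2>&1; then
        # Wait only for the named device node(s). Any options, such as
        # --removed, are passed straight through.
        udevadm wait "$@"
    else
        # Older udev: fall back to waiting for the event queue to drain.
        ${UDEV_SETTLE_PROG:-udevadm settle}
    fi
}
```

A caller would then use something like
udev_wait_sketch /dev/mapper/<dev name> after creating a device, or
add --removed after tearing one down.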

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Zorro lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

fstests: mark tests that are unreliable when run in parallel

Add a group named "unreliable_in_parallel" to mark tests that
do not give reliable results when multiple tests are run in
parallel. Generally this happens with tests that are reliant on
caching in some way, such as generating specific file layouts using
buffered IO or expecting inodes to be cached in memory. These are
perturbed by other tests running sync(), generating memory pressure,
dropping caches, etc.

Hence whether these tests pass or fail is wholly dependent on what
tests are running at the same time, and hence randomly fail when
nothing has actually gone wrong. Hence they are unreliable as
regression tests when running tests in parallel, so we add them to
the "unreliable_in_parallel" group and a parallel check can exclude
this group.

As tests are updated to be robust against external interference,
they can be removed from the unreliable_in_parallel group.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Zorro lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

fstests: xfs/227 is really slow

The slowest test to run on my test VMs is xfs/227:

...
xfs/227 826
...

It is doing nested iteration on created filesets that are explicitly
defined, so separate the inner loop filesets and run the outer loops
in parallel.

Also reduce the number of times we have to execute setfattr and
xfs_io to once per created file instead of once per xattr/extent
count per file.

The result is test runtime reduction of ~60%.

....
xfs/227 336
....

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Zorro lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

fstests: clean up loop device instantiation

Lots of tests do their own special thing with loop devices rather
than using _create_loop_device() and _destroy_loop_device(). This
often means they do not clean up after themselves properly,
leaving stale loop devices around that result in unmountable test or
scratch devices. This is common when tests are killed by user
interrupt.

Even the tests that do use _destroy_loop_device and try to clean up
often do it incorrectly, leading to spurious error messages.

Some tests try to use dynamic instantiation via "mount -o loop",
but then don't clean up in the correct order or hack around to find
the loop device that was instantiated because the test needs to know
the instantiated device name.

Clean this up by converting all the tests to use
_create_loop_device() and _destroy_loop_device(). In all the tests,
use the variable "loop_dev" for the device consistently. In
_destroy_loop_device(), test that a device name has been passed
so that we don't try to clean up the same device twice (e.g. once
before test exit and again from the _cleanup() function). When we
destroy a loop device, unset the variable used to hold the loop
device name so that we don't try to destroy it twice.
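
The guard described above can be sketched like this. These functions
are hypothetical stand-ins, not the fstests helpers themselves:

```shell
# Hypothetical sketch of a destroy helper that tolerates being called
# after the device has already been torn down.
destroy_loop_device_sketch()
{
    local dev="$1"

    # Nothing to do if the caller's variable was already cleared.
    [ -z "$dev" ] && return 0
    losetup -d "$dev"
}

cleanup_sketch()
{
    destroy_loop_device_sketch "${loop_dev:-}"
    unset loop_dev    # a second cleanup pass is now a no-op
}
```

Calling cleanup_sketch once before test exit and again from the exit
trap detaches the loop device only once.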

This results in much more reliable cleanup and clean exit from
fstests when killed by the user.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Zorro lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

fstests: clean up mount and unmount operations

The way tests run unmount is, at times, completely random.
Sometimes they call the correct _scratch_unmount function, sometimes
they open code it with a direct call to UMOUNT_PROG <dir>, sometimes
they run umount directly.

This makes it really hard to instrument unmount operations when
trying to work out why transient, unpredictable failures like
this occur randomly during a test run:

umount: /mnt/xfs/runner-17/test: target is busy.

Sometimes it happens on a test device mount, sometimes a scratch
device mount. Sometimes it happens to a test specific dm or loop
device mount. But without instrumenting every single unmount call in
every test, it's impossible to capture these failures easily.

Solve this problem by introducing the _unmount() wrapper. It is
simply a call to UMOUNT_PROG <dir>, but it provides a single point
where -every- unmount operation funnels through.

We already have a _mount wrapper for this reason. However, in trying
to work out why mounts were failing (because unmounts were failing),
I discovered that _mount() is used inconsistently as well.

Sort this all out by adding an _unmount() wrapper to go with
_mount() and use them everywhere consistently.
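
To illustrate why a single funnel point helps, here is a sketch of a
wrapper extended with some hypothetical logging. The function name
and the seqres_full variable are illustrative; the real _unmount() is
just the call to UMOUNT_PROG:

```shell
# Hypothetical sketch: every unmount goes through one function, so
# instrumentation only has to be added in one place.
unmount_sketch()
{
    local log="${seqres_full:-/dev/null}"    # hypothetical log file

    if ! ${UMOUNT_PROG:-umount} "$@" >> "$log" 2>&1; then
        # Record which unmount failed so transient errors can be traced.
        echo "unmount failed: $*" >> "$log"
        return 1
    fi
}
```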

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Zorro lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

fstests: use syncfs rather than sync

sync(1) is a system wide sync and is implemented by iterating all
the superblocks in the system. In most cases, fstests require just
the filesystem under test to be synced - we require syncfs(2)
semantics but what we use is sync(2) semantics.

The result of this is that when running many concurrent fstests at
the same time, we can have *hundreds* of concurrent sync operations
in progress (thanks fsstress!) and this causes excessive
interference with other tests that are running on other filesystems.

For example, some tests try to specifically control extent layout
via specific write and fsync patterns. All these global syncs
perturb them and cause them to spuriously fail.

A random snapshot of running concurrent tests shows just how many
tests are explicitly blocked in sync(1):

check-parallel───check───077───077─┬─cut
      │                                    ├─du
      │                                    └─tail
      ├─check-parallel───check───311───xfs_scrub───{xfs_scrub}
      ├─check-parallel───check───531───128*[t_open_tmpfiles]
      ├─check-parallel───check───227
      ├─check-parallel───check───388
      ├─check-parallel───check───070───fsstress───fsstress───{fsstress+
      ├─check-parallel───check───232───fsstress───7*[fsstress───{fsstr+
      ├─check-parallel───check───648───sleep
      ├─check-parallel───check───409───sync
      ├─check-parallel───check───683───sync
      ├─check-parallel───check───013─┬─013───sleep
      │                              └─fsstress───2*[fsstress───{fsstr+
      ├─check-parallel───check───684───sync
      ├─check-parallel───check───673───sync
      ├─check-parallel───check───118───dd
      ├─check-parallel───check───467───open_by_handle
      ├─check-parallel───check─┬─622
      │                        └─check
      ├─check-parallel───check───685───sync
      ├─check-parallel───check───049───fsstress───fsstress───{fsstress+
      ├─check-parallel───check───599
      ├─check-parallel───check───426───open_by_handle
      ├─check-parallel───check───057───umount
      ├─check-parallel───check───390───fsstress─┬─18*[fsstress───{fsst+
      │                                         └─fsstress
      ├─check-parallel───check───158───fsstress───fsstress───{fsstress+
      ├─check-parallel───check───017
      ├─check-parallel───check───032───fsstress───fsstress
      ├─check-parallel───check───076
      ├─check-parallel───check───477───open_by_handle
      ├─check-parallel───check───170───2*[170───170]
      ├─check-parallel───check───112
      ├─check-parallel───check───686───sync
      ├─4*[check-parallel───check───check───xfs_scrub───{xfs_scrub}]
      ├─check-parallel───check───387───xfs_io───{xfs_io}
      ├─check-parallel───check───615───615
      ├─check-parallel───check─┬─051
      │                        └─check───xfs_repair
      ├─check-parallel───check───049
      ├─check-parallel───check───247
      ├─check-parallel───check───674───sync
      ├─check-parallel───check───040
      ├─check-parallel───check───560───fsstress───fsstress───{fsstress+
      ├─check-parallel───check───030─┬─030─┬─030───xfs_repair
      │                              │     └─030───perl
      │                              ├─sed
      │                              └─uniq
      ├─check-parallel───check───055───055
      ├─2*[check-parallel───check───check]
      ├─check-parallel───check───042
      ├─check-parallel───check───204
      ├─check-parallel───check───271─┬─271───sed
      │                              └─md5sum
      ├─check-parallel───check───091─┬─fsx
      │                              └─tee
      ├─check-parallel───check───063───sleep
      ├─check-parallel───check───026
      ├─check-parallel───check───459───lvm
      ├─check-parallel───check───495
      ├─check-parallel───check───141───fsstress───4*[fsstress]
      ├─check-parallel───check───011─┬─fsstress─┬─fsstress───{fsstress+
      │                              │          └─fsstress
      │                              └─sleep
      ├─check-parallel───check───328───sync
      ├─check-parallel───check───507───507
      ├─check-parallel───check
      ├─check-parallel───check───687───sync
      ├─check-parallel───check───109───mkfs.xfs
      ├─check-parallel───check───324
      ├─check-parallel───check───114───aio-dio-eof-rac
      └─check-parallel───check───503───xfs_scrub───2*[{xfs_scrub}]

There are ~10 sync(1) calls blocked and at least half of the 50-odd
fsstress processes currently running are also going to be stuck in
sync(2) calls.

They are stuck because the superblock iteration has to wait for
mount, unmount, freeze, thaw and any other operation that locks a
superblock exclusively. When running dozens of tests concurrently,
there can be tens of superblocks locked exclusively every second,
each for a significant length of time.

Hence the use of sync has impact on both performance and test
behaviour and we need to minimise the amount of sync(1) and
sync(2) usage as much as possible.

Introduce _test_sync() and _scratch_sync() so we can implement
a syncfs mechanism with a fallback to sync(1) if it is not supported
without dirtying all the test code unnecessarily. Then convert
fsstress to use syncfs(2) in preference to sync(2).
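
A minimal sketch of the fallback idea, assuming a sync(1) that
supports -f (sync only the filesystem containing the given path).
The function name is hypothetical and this is not the implementation
of _test_sync() or _scratch_sync():

```shell
# Hypothetical sketch: prefer a per-filesystem sync and fall back to a
# system-wide sync only when that is not supported.
sync_fs_sketch()
{
    local dir="$1"

    sync -f "$dir" 2> /dev/null || sync
}
```

Test code would call the wrapper with the mount point it cares about,
so the fallback stays hidden inside a single function.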

This commit changes all the generic and XFS tests to use the new
sync functions. Other filesystem specific tests will eventually
need to be converted to avoid similar problems.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Zorro lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

fstests: fix DM device creation/removal vs udev races

When there is load on the system, newly created DM devices don't
seem to be created consistently. When a new device is created,
it is supposed to be created as /dev/dm-X, and then a udev rule
creates the symlink from /dev/mapper/<dev name> to /dev/dm-X.

Unfortunately, the dynamically created dm devices used by a lot of
tests (dmerror, dmflakey) are not being created with this device
node structure. This results in getting the wrong short device
name for the block device and hence we can't find the filesystem
sysfs attribute directory for the filesystem on that block device.

For example, with added debug to check what device name was being
passed around and resolved:

generic/489       - output mismatch (see /mnt/xfs/runner-10/results/xfs/generic/489.out.bad)
    --- tests/generic/489.out   2022-12-21 15:53:25.503043574 +1100
    +++ /mnt/xfs/runner-10/results/xfs/generic/489.out.bad      2024-10-24 10:27:29.767196340 +1100
    @@ -1,4 +1,10 @@
     QA output created by 489
    +./common/rc: line 4955: /sys/fs/xfs/flakey-test.489/error/fail_at_unmount: No such file or directory
    +dev: /dev/mapper/flakey-test.489
    +resolved dev: /dev/mapper/flakey-test.489
    +brw-rw----. 1 root disk 251, 5 Oct 24 10:27 /dev/mapper/flakey-test.489
    +./common/rc: line 4955: /sys/fs/xfs/flakey-test.489/error/metadata/EIO/max_retries: No such file or directory
    +./common/rc: line 4955: /sys/fs/xfs/flakey-test.489/error/metadata/EIO/retry_timeout_seconds: No such file or directory
    ...
    (Run 'diff -u /home/dave/src/xfstests-dev/tests/generic/489.out /mnt/xfs/runner-10/results/xfs/generic/489.out.bad'  to see the entire diff)

Here we see that the block device node is actually at
/dev/mapper/flakey-test.489, not a link to a /dev/dm-X device node.

This implies that the udev rule to create the /dev/dm-X node and
the symlink to it at /dev/mapper/flakey-test.489 has not run, and
something else created the device node.

That looks like a bug in _dmsetup_create(). It creates the new DM
device, then runs 'dmsetup mknodes', then waits for udev to settle.
This means the mknodes command - which makes sure the dm device
nodes exist - is racing with udev to create the device nodes. They
don't use the same rules to create nodes, so we end up with this
broken situation.

'dmsetup mknodes' is considered legacy functionality, intended for
systems that have no udev capability. For systems that have udev
enabled (i.e. all modern distros), mknodes should not be run because
it creates a different device node structure to what udev creates
and can race with udev as we see here.

Fix it by removing the 'dmsetup mknodes' as it is unnecessary to
create the correct device node layout the rest of the system is
expecting to see.

Additionally, _dmsetup_remove() calls 'dmsetup mknodes' and that can
also race with udev and cause issues. Hence we need to remove that
call from the remove operation as well.

Further, 'dmsetup remove' is also subject to races with udev, which
results in device removal failing. This problem is documented in the
dmsetup man page, which suggests the use of the "--retry" option. This
means dmsetup will retry several times over a few seconds before
failing the removal.

This reduces the remove failure rate substantially,
but it can still occasionally fail when the system is under heavy
load and udev processing is very slow. This is fixable, but requires
fstests udev infrastructure changes as it requires udevadm
functionality that is relatively new. Hence that will be done as
a separate fix.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Zorro lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

fstests: per-test dmdelay instances

We can't run two tests that use dmdelay at the same time because
the device name is the same. Hence they interfere with each other.
Give dmdelay devices their own per-test names to avoid this
problem.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Zorro lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

fstests: per-test dmdust instances

We can't run two tests that use dmdust at the same time because
the device name is the same. Hence they interfere with each other.
Give dmdust devices their own per-test names to avoid this
problem.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Zorro lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

fstests: per-test dmthin instances

We can't run two tests that use dmthin at the same time because
the device name is the same. Hence they interfere with each other.
Give dmthin devices their own per-test names to avoid this
problem.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Zorro lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

fstests: per-test dmhuge instances

We can't run two tests that use dmhuge at the same time because
the device name is the same. Hence they interfere with each other.
Give dmhuge devices their own per-test names to avoid this
problem.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Zorro lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

fstests: per-test dmerror instances

We can't run two tests that use dmerror at the same time because
the device name is the same. Hence they interfere with each other.
Give dmerror devices their own per-test names to avoid this
problem.

Note that we need a hack to pass the test sequence number through
to src/dmerror as used by generic/441 so that it can construct the
dmerror name correctly.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Zorro lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

fstests: per-test dmflakey instances

We can't run two tests that use dmflakey at the same time because
the device name is the same. Hence they interfere with each other.
Give dmflakey devices their own per-test names to avoid this
problem.

Also, drop_and_remount is about to fail the fs during unmount, so
ensure the filesystem is going to fail the IO during unmount rather
than retrying forever.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Zorro lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

fuzzy: don't use killall

Having test cleanup call 'killall xfs_io fsx xfs_scrub' results in a
system wide process kill, rather than just the processes the test is
running directly.

Make sure we only kill processes the fuzz test directly owns. We can
do this with 'pkill --parent $$ <process names>' to limit the search
for processes to kill to just the children of the current process.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Zorro lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

fstests: cleanup fsstress process management

Lots of tests run fsstress in the background and then have to kill
it and/or provide special cleanup functions to kill the background
fsstress processes. They typically use $KILLALL_PROG for this.

Use of killall is problematic for running multiple tests in parallel
in that one test can kill other tests' processes. However, because
fsstress itself forks and runs children, there are very few avenues
for shell scripts to ensure all the fsstress processes actually die.

With bash, it is especially nasty, because sending SIGTERM will
result in bash outputting error messages ("Killed: ...") that will
cause golden output mismatches and hence test failures. Hence we
also need to be able to tell the main fsstress process to die without
triggering these messages.

To avoid the process tracking problems, we change to use pkill
rather than killall (more options for process selection) and we
stop using the $here/ltp/fsstress binary. Instead, we copy the
$here/ltp/fsstress to $TEST_DIR/$seq.fsstress so that the test has
a unique fsstress binary name. This allows the pkill filter to
select just the fsstress processes the test has run. The fsstress
binary name is held in _FSSTRESS_NAME, and the program to run is
_FSSTRESS_PROG.

We also track the primary fsstress process ID, and store that in
_FSSTRESS_PID. We do this so that we have a PID to wait against so
that we don't return before the fsstress processes are dead. To this
end, we add a SIGPIPE handler to the primary process so that it
dying doesn't trigger bash 'killed' message output. We can
send 'pkill -PIPE $_FSSTRESS_NAME' to all the fsstress processes and
the primary process will then enter the "wait for children to die"
processing loop before it exits. In this way, we can wait for the
primary fsstress process and when it exits we know that all its
children have also finished and gone away. This makes killing
fsstress invocations reliable and noise free.

This is accomplished by the helpers added to common/rc:

_run_fsstress
_run_fsstress_bg
_wait_for_fsstress
_kill_fsstress

This also means that all fsstress invocations now obey
FSSTRESS_AVOID environment restrictions, many of which didn't.

We add a call to _kill_fsstress into the generic _cleanup() function.
This means that tests using fsstress don't need to add a special
local _cleanup function just to call _kill_fsstress() so that
background fsstress processes are killed when the user interrupts
the tests with ctrl-c.

Further, killall in the _cleanup() function is often used to attempt
to expedite killing of foreground execution fsstress processes. This
doesn't actually work because of the way bash processes interrupt
signals. That is, it waits for the currently executing process to
finish execution, then runs the trap function. Hence a foreground
fsstress won't ever be interrupted by ctrl-c. By implementing
_run_fsstress() as a background process and a wait call, the wait()
call is interrupted by the signal and the cleanup trap is run
immediately. Hence the fsstress processes are killed immediately and
the test exits cleanly almost immediately.

The result of all this is common, clean handling of fsstress
execution and termination. There are a few exceptions for special
cases, but the vast majority of tests that run fsstress use the
above four wrapper functions exclusively.
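
The core of the technique can be sketched in isolation as below.
This is not the code added to common/rc: "sleep" stands in for
fsstress, and the function and variable names are hypothetical:

```shell
# Sketch only: "sleep" stands in for fsstress, and every name here is
# hypothetical rather than one of the helpers added by this commit.

# Copy the binary to a per-test name and run it in the background.
# Linux truncates process names to 15 characters, so keep names short.
start_named_copy()
{
    local dir="$1" name="$2"
    shift 2

    cp "$(command -v sleep)" "$dir/$name"
    "$dir/$name" "$@" &
    named_copy_pid=$!
}

# Signal only processes with that exact name, then reap the one we
# started so we do not return before it is dead.
kill_named_copy()
{
    local name="$1"

    # SIGPIPE is used so that the death is not reported by bash.
    pkill -PIPE -x "$name" || true
    wait "$named_copy_pid" 2> /dev/null || true
    unset named_copy_pid
}
```

Because the process name is unique to the test, another test's copy
of the same binary is never matched by the pkill.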

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Zorro lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

xfs/448: get rid of assert-on-failure

The bug this test exercises has been fixed for quite some time,
but the test does not run on XFS debug kernels that have fatal
asserts enabled. There is no reason for this now that the test does
not assert fail on most kernels regularly tested, so kill the
check and enable the test to run on all XFS configs.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Zorro lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

xfstests: add a multithreaded mode to bstat

For benchmarking of bulkstat, add a multithreaded mode that spawns a
thread per AG and runs bulkstat on every AG in parallel. There is a
small amount of overlap between each AG because of the way the
interface works only on inode numbers, so some inodes are reported
twice. A real implementation of this sort of parallelism would be
greatly helped by adding an AG parameter to the bulkstat interface.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Zorro lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

btrfs/327: add a test case to verify inline extent data read

[BUG]
When developing sector size < page size handling for btrfs, I'm hitting
a data corruption, which is only possible with the following out-of-tree
patches:

  btrfs: allow inline data extents creation if sector size < page size
  btrfs: allow buffered write to skip full page if it's sector aligned

[CAUSE]
Thankfully no upstream kernels are affected; even if someone is
mounting a btrfs created on x86_64 with inlined data extents, they won't
hit the corruption.

The root cause is that when reading inline extents, we zero out the
whole remaining range until folio end.

This means such zeroing out can cover ranges that are dirtied but not yet
written back, thus leading to data corruption.

This needs all the following conditions to be met:

- Sector size < page size
  So no x86_64 is affected. The most common users should be Asahi Linux.
  But they are safe due to the next two conditions.

- Inline data extents are present
  For sector size < page size cases, we do not allow creating new inline
  data extents, only reading existing ones.

  But even if all the above conditions are met by using an x86_64-created
  btrfs with inlined data extents, the next point will still save us.

- Partial uptodate folios are allowed
  This requires the out-of-tree patch "btrfs: allow buffered write to skip
  full page if it's sector aligned", or buffered write will read out the
  whole folio before dirtying any range.

So end users are completely safe.

[TEST CASE]
The test case itself is pretty straightforward:

- Buffered write [0, 4k)
- Drop all page cache
- Buffered write [8k, 12k)
- Verify the file content

Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

fstests: fix blksize_t printf format warnings across architectures

Fix format string warnings when printing blksize_t values that vary
across architectures. The warning occurs because blksize_t is defined
differently between architectures: on aarch64 blksize_t is int, while
on x86-64 it is long int. Cast the values to long. This fixes warnings
such as the ones below.

seek_sanity_test.c:110:45: warning: format '%ld' expects argument of type
'long int', but argument 3 has type 'blksize_t' {aka 'int'}

attr_replace_test.c:70:22: warning: format '%ld' expects argument of type
'long int', but argument 3 has type '__blksize_t' {aka 'int'}

Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

generic/459: prevent collisions between test VMs backed by a shared disk pool

If you happen to be running fstests on a bunch of VMs and the VMs all
have access to a shared disk pool, then it's possible that two VMs could
be running generic/459 at exactly the same time.  In that case, it's a
VERY bad thing to have two nodes trying to create an LVM volume group
named "vg_459" because one node will succeed, after which the other node
will see the vg_459 volume group that it didn't create:

  A volume group called vg_459 already exists.
  Logical volume pool_459 already exists in Volume group vg_459.
  Logical Volume "lv_459" already exists in volume group "vg_459"

But then, because this is bash, we don't abort the test script and
continue executing.  If we're lucky this fails when /dev/vg_459/lv_459
disappears before mkfs can run:

  Error accessing specified device /dev/mapper/vg_459-lv_459: No such file or directory
  Usage: mkfs.xfs

But in the bad case both nodes write filesystems to the same device and
then they trample all over each other.  Fix this by adding the hostname
and pid to all the LVM names so that they won't collide.
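
A rough sketch of the naming scheme, for illustration only: lvm_names
is a hypothetical helper, not the code in the test, and "uname -n" is
assumed here as a portable way to obtain the hostname.

```shell
lvm_names() {
	# $1 is the test number, e.g. 459.
	seq=$1
	# Strip any domain part so that the names stay short.
	host=$(uname -n | cut -d. -f1)
	suffix="${seq}_${host}_$$"
	vgname="vg_$suffix"
	lvname="lv_$suffix"
	poolname="pool_$suffix"
}

lvm_names 459
echo "$vgname $lvname $poolname"
```

Two nodes with different hostnames, or two runs on one node with
different pids, then get distinct volume group names.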

Fixes: 461dad511f6b91 ("generic: Test filesystem lockup on full overprovisioned dm-thin")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Zorro Lang <zlang@kernel.org>

xfs/122: add tests for commitrange structures

Update this test to check the ioctl structure for XFS_IOC_COMMIT_RANGE,
which was added in 6.12. This will be the last ever addition to
xfs/122, because in 6.13 we moved the ondisk structure checks to libxfs
after which we'll be able to _notrun this test on newer codebases.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Zorro Lang <zlang@kernel.org>

generic/454: actually set attr value for llamapirate subtest

Ted reported that this test fails on his setup, and I noticed that I
forgot to actually set a value for the xattr. In theory filesystems
support zero-byte xattrs, but we might as well set and check the values
so that we can make sure nobody got confused.

The actual test failure comes from attr 2.4.47 refusing to set a
zero-length xattr, whereas 2.5 and newer will. That was changed in the
attr commit 0550d2bc989d39 ("Properly set and report empty attribute
values") prior to 2.4.48:

https://git.savannah.nongnu.org/cgit/attr.git/commit/?id=0550d2bc989d390eb25f7004ee0fae2dbc693a0d

Cc: fstests@vger.kernel.org # v2024.10.28
Fixes: 9c3762ceafd430 ("misc: amend unicode confusing name tests to check for hidden tag characters")
Reported-and-tested-by: tytso@mit.edu
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Zorro Lang <zlang@kernel.org>

generic/366: fix directio requirements checking

On a system with 4k-sector storage devices, this test fails with:

--- /tmp/fstests/tests/generic/366.out 2024-11-17 09:04:53.161104479 -0800
+++ /var/tmp/fstests/generic/366.out.bad 2024-11-20 21:02:30.948000000 -0800
@@ -1,2 +1,34 @@
QA output created by 366
+fio: io_u error on file /opt/file1: Invalid argument: read offset=15360, buflen=512
+fio: io_u error on file /opt/file1: Invalid argument: read offset=15360, buflen=512

The cause of this failure is that we cannot do 512byte directios to a
device with 4k LBAs. Update the precondition checking to exclude this
scenario.

Cc: fstests@vger.kernel.org # v2024.11.17
Fixes: 4c1629ae3a3a56 ("generic: new test case to verify if certain fio load will hang the filesystem")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Zorro Lang <zlang@kernel.org>

xfs/157: do not drop necessary mkfs options

To give the test option "-L oldlabel" to _scratch_mkfs_sized, xfs/157
does:

  MKFS_OPTIONS="-L oldlabel $MKFS_OPTIONS" _scratch_mkfs_sized $fs_size

but _scratch_mkfs_sized tries to preserve $fs_size when mkfs fails
because of incompatible $MKFS_OPTIONS, and retries without them, like this:

  ** mkfs failed with extra mkfs options added to "-L oldlabel -m rmapbt=1" by test 157 **
  ** attempting to mkfs using only test 157 options: -d size=524288000 -b size=4096 **

But "-L oldlabel" is necessary and must not be dropped. To avoid
that, pass "-L oldlabel" to _scratch_mkfs_sized through function
parameters, not through the global MKFS_OPTIONS.

Reviewed-by: Darrick J. Wong <djwong@kernel.org>
[djwong: fix more string quoting issues]
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Zorro Lang <zlang@kernel.org>

common/rc: _scratch_mkfs_sized supports extra arguments

To pass more arguments to _scratch_mkfs_sized, we generally do:

  MKFS_OPTIONS="-L oldlabel $MKFS_OPTIONS" _scratch_mkfs_sized $fs_size

to pass "-L oldlabel" to it. But if _scratch_mkfs_sized fails, it
drops the whole of MKFS_OPTIONS and tries to mkfs again, like
this:

  ** mkfs failed with extra mkfs options added to "-L oldlabel -m rmapbt=1" by test 157 **
  ** attempting to mkfs using only test 157 options: -d size=524288000 -b size=4096 **

But that's not the fault of "-L oldlabel". So to keep the mkfs
options ("-L oldlabel") that we need, it is better to let
_scratch_mkfs_sized accept extra arguments, rather than using the
global MKFS_OPTIONS.
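
The intended behaviour can be sketched as follows. This is not the
common/rc implementation: mkfs_sized and fake_mkfs are hypothetical
stand-ins, and fake_mkfs simply rejects "rmapbt=1" to simulate an
incompatible global option.

```shell
MKFS_OPTIONS="-m rmapbt=1"

fake_mkfs() {
	# Stand-in for a real mkfs: fail if the rejected option is present.
	case " $* " in
	*" rmapbt=1 "*) return 1 ;;
	esac
	echo "mkfs ok: $*"
}

mkfs_sized() {
	# $1 is the filesystem size; the remaining arguments are options
	# that the test itself requires.
	size=$1
	shift
	# First attempt: global options plus the test's own arguments.
	# MKFS_OPTIONS is deliberately unquoted so that it word-splits.
	fake_mkfs $MKFS_OPTIONS -d size="$size" "$@" && return 0
	# Retry without the global options, but keep the test's arguments.
	fake_mkfs -d size="$size" "$@"
}

mkfs_sized 524288000 -L oldlabel
```

Because the label travels in the function arguments, the retry prints
"mkfs ok: -d size=524288000 -L oldlabel" and the label survives.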

Reviewed-by: Darrick J. Wong <djwong@kernel.org>
[djwong: fix string quoting issues]
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Zorro Lang <zlang@kernel.org>

generic/251: don't copy the fsstress source code

Run fsstress for a short time to generate test data to replicate on the
scratch device so that we don't blow out the test runtimes on
unintentionally copying .git directories or large corefiles from the
developer's systems, etc.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Zorro Lang <zlang@kernel.org>

generic/251: constrain runtime via time/load/soak factors

On my test fleet, this test can run for well in excess of 20 minutes:

   613 generic/251
   616 generic/251
   624 generic/251
   630 generic/251
   634 generic/251
   652 generic/251
   675 generic/251
   749 generic/251
   777 generic/251
   808 generic/251
   832 generic/251
   946 generic/251
  1082 generic/251
  1221 generic/251
  1241 generic/251
  1254 generic/251
  1305 generic/251
  1366 generic/251
  1646 generic/251
  1936 generic/251
  1952 generic/251
  2358 generic/251
  4359 generic/251
  5325 generic/251
34046 generic/251

because it hardcodes 20 threads and 10 copies.  It's not great to have a
test that results in a significant fraction of the total test runtime.
Fix the looping and load on this test to use LOAD and TIME_FACTOR to
scale up its operations, along with the usual SOAK_DURATION override.
That brings the default runtime down to less than a minute.
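
A sketch of how such scaling might look. The base values are made up
for illustration and are not the numbers the test uses; keep_going is
a hypothetical helper. TIME_FACTOR, LOAD_FACTOR and SOAK_DURATION are
the usual fstests knobs.

```shell
: "${TIME_FACTOR:=1}"
: "${LOAD_FACTOR:=1}"

base_threads=4	# illustrative value only
base_loops=2	# illustrative value only

nr_threads=$((base_threads * LOAD_FACTOR))
nr_loops=$((base_loops * TIME_FACTOR))

# If SOAK_DURATION (in seconds) is set, run until the deadline instead
# of for a fixed number of loops.
if [ -n "${SOAK_DURATION:-}" ]; then
	deadline=$(( $(date +%s) + SOAK_DURATION ))
else
	deadline=""
fi

keep_going() {
	# $1 is the number of loops completed so far.
	if [ -n "$deadline" ]; then
		[ "$(date +%s)" -lt "$deadline" ]
	else
		[ "$1" -lt "$nr_loops" ]
	fi
}

i=0
while keep_going "$i"; do
	i=$((i + 1))	# one pass of the workload would run here
done
echo "threads=$nr_threads loops_run=$i"
```

With all three variables unset this prints "threads=4 loops_run=2";
larger factors scale the work up without editing the test.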

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Zorro Lang <zlang@kernel.org>

generic/251: use sentinel files to kill the fstrim loop

Apparently the subshell kill doesn't always take, and then the test runs
for hours and hours because nothing stops it. Instead, use a sentinel
file to detect when fstrim_loop should stop execing background fstrims.
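
The sentinel pattern, as a minimal sketch: trim_loop and stop_file are
hypothetical names, and "sleep 1" stands in for one background fstrim
pass.

```shell
work_dir=$(mktemp -d)
stop_file="$work_dir/stop"

trim_loop() {
	# Check for the sentinel before each pass rather than depending on
	# a signal being delivered to this subshell.
	while [ ! -e "$stop_file" ]; do
		sleep 1
	done
}

trim_loop &
loop_pid=$!

# ... the foreground workload would run here ...

touch "$stop_file"	# ask the loop to stop
loop_status=0
wait "$loop_pid" || loop_status=$?	# returns once the current pass ends
rm -rf "$work_dir"
echo "loop exited with status $loop_status"
```

The loop stops by itself at the next check, so termination does not
depend on whether a kill reaches the right process.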

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Zorro Lang <zlang@kernel.org>

xfs/009: allow logically contiguous preallocations

The new rtgroups feature implements a simplistic rotor to pick the
rtgroup for an initial allocation to a file.  This causes test failures
if the preallocations are spread across two rtgroups, which happens if
there are more subtests than rtgroups.

One way to fix this would be to reset the rotor so that each subtest starts
allocating from rtgroup 0, but the only way to do that is to cycle the
scratch mount, which is a bit gross.

Instead, report logically contiguous mappings as a single mapping even
if the physical space is not contiguous.  Unfortunately, there's not
enough context in the comments to know if the test actually was checking
for physical contiguity?  Or if this is just an exerciser of the old
preallocation calls, and it's fine as long as the file ranges are mapped
(or unmapped) as desired.

Messing with some awk is a lot cheaper than umount/mount cycling.
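
To illustrate the idea only (this is not the awk used in the test):
assume each input line is a hypothetical "start end" pair of inclusive
file offsets, sorted by start. Ranges that follow each other without a
gap are reported as one mapping, whatever their physical location.

```shell
merge_contiguous() {
	awk '
	NR == 1 { start = $1; end = $2; next }
	# The next range begins right after the current one: extend it.
	$1 == end + 1 { end = $2; next }
	# Otherwise emit the finished mapping and start a new one.
	{ print start, end; start = $1; end = $2 }
	END { if (NR > 0) print start, end }
	'
}

printf '%s\n' "0 7" "8 15" "16 31" "64 95" | merge_contiguous
```

The first three ranges collapse into "0 31" and the last one is
printed unchanged as "64 95".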

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Zorro Lang <zlang@kernel.org>

xfs/163: skip test if we can't shrink due to enospc issues

If this test fails due to insufficient space, skip this test. This can
happen if a realtime volume is enabled on the filesystem and we cannot
shrink due to the rtbitmap.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Zorro Lang <zlang@kernel.org>

generic/562: handle ENOSPC while cloning gracefully

This test creates a couple of patterned files on a tiny filesystem,
fragments the free space, clones one patterned file to the other, and
checks that the entire file was cloned.

However, this test doesn't work on a 64k fsblock filesystem because
we've used up all the free space reservation for the rmapbt, and that
causes the FICLONE to error out with ENOSPC partway through. Hence we
need to detect the ENOSPC and _notrun the test.

That said, it turns out that XFS has been silently dropping error codes
if we managed to make some progress cloning extents. That's ok if the
operation has REMAP_FILE_CAN_SHORTEN like copy_file_range does, but
FICLONE/FICLONERANGE do not permit partial results, so dropping the
error code is actually a bug.

Therefore, this testcase now becomes a regression test for the patch to
fix that.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Zorro Lang <zlang@kernel.org>

common/rc: capture dmesg when oom kills happen

Capture the dmesg output if the OOM killer is invoked during fstests.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Zorro Lang <zlang@kernel.org>

xfs/508: fix test for 64k blocksize

It turns out that icreate transactions will try to reserve quite a bit
of space on a 64k fsblock filesystem -- enough to handle the worst case
parent directory expansion, a new inode chunk, and these days a parent
pointer as well. This can work out to quite a bit of space:

  fsblock  reservation
  1k        172K
  4k        368K
  16k      1136K
  64k      3650K

Unfortunately, this test sets its block quota limits at 1-2MB, so we
can't even create a child file. Bump the limits up by 10x so that this
test will pass even if there's more metadata size creep in the future.

Fixes: f769a923f576df ("xfs: project quota ineritance flag test")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Zorro Lang <zlang@kernel.org>

xfs/113: fix failure to corrupt the entire directory

This test tries to corrupt the data blocks of a directory, but it
doesn't take into account the fact that __populate_check_xfs_dir can
remove enough entries to cause sparse holes in the directory.  If that
happens, this "file data block is unmapped" logic will cause the
corruption loop to exit early.  Then we can add to the directory, which
causes the test to fail.

Instead, create a list of mappable dir block offsets, and run 100
corruptions at a time to reduce the amount of time we spend initializing
xfs_db.  This fixes the regressions that I see with 32k/64k block sizes.

Cc: fstests@vger.kernel.org # v2022.05.01
Fixes: c8e6dbc8812653 ("xfs: test directory metadata corruption checking and repair")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Zorro Lang <zlang@kernel.org>

generic/757: convert to thinp

Convert this test to use dm-thinp so that discards always zero the data.
This prevents weird replay problems if the scratch device doesn't
guarantee that read after discard returns zeroes.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

generic/757: fix various bugs in this test

Fix this test so the check doesn't fail on XFS, and restrict runtime to
100 loops because otherwise this test takes many hours.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

btrfs/028: kill lingering processes when test is interrupted

If we interrupt the test after it spawned the fsstress and balance
processes (while it's sleeping for 30 seconds * $TIME_FACTOR), we don't
kill them and they stay around for a long time, making it impossible to
unmount the scratch filesystem (failing with -EBUSY).

Fix this by adding a _cleanup function that kills the processes and
waits for them to exit.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

generic: Addition of new tests for extsize hints

This commit adds new tests that check the behaviour of xfs/ext4
filesystems when an extsize hint is set on files with inode size 0,
on non-empty files with allocated and delalloc extents, and so on.
Although these tests are currently placed under tests/generic, they
only run on xfs; there is an ongoing patch series[1] to
enable extsize hints for ext4 as well.

[1] https://lore.kernel.org/linux-ext4/cover.1726034272.git.ojaswin@linux.ibm.com/

Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>

Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
Suggested-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
Signed-off-by: Nirjhar Roy <nirjhar@linux.ibm.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

common/rc: Add a new _require_scratch_extsize helper function

The _require_scratch_extsize helper function will be used in the
next patch to make the test run only on filesystems with
extsize support.

Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
Signed-off-by: Nirjhar Roy <nirjhar@linux.ibm.com>
Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

common/rc,xfs/207: Add a common helper function to check xflag bits

This patch defines a common helper function to test whether any of
the fsxattr xflags bits are set. We will use this helper in
an upcoming patch to check the extsize (e) flag.

Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Reviewed-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
Signed-off-by: Nirjhar Roy <nirjhar@linux.ibm.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

xfs/229: call on the test directory

xfs/229 operates on a directory that is forced to the data volume, but
it calls _require_fs_space on $TEST_DIR which might point to the RT
device when -d rtinherit is set.

Call _require_fs_space on $TDIR after it is created to check for the
space actually used by the test.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: "Hans Holmberg" <hans.holmberg@wdc.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

xfs/185: don't fail when rtfile is larger than rblocks

This test creates a 200MB rt volume on a file-backed loopdev.  However,
if the size of the loop file is not congruent with the rt extent size
(e.g.  28k) then the rt volume will not use all 200MB because we cannot
have partial rt extents.  Because of this rounding, we can end up with
an fsmap listing that covers fewer sectors than the bmap of the loop
file.
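
The rounding can be worked through with the numbers above (a 200MB
volume and a 28k rt extent size); the variable names are only for
illustration.

```shell
rt_bytes=$((200 * 1024 * 1024))	# requested rt volume size
rextsize=$((28 * 1024))		# rt extent size from the example

rextents=$((rt_bytes / rextsize))	# only whole rt extents count
usable_bytes=$((rextents * rextsize))
lost_bytes=$((rt_bytes - usable_bytes))

echo "rt extents:   $rextents"
echo "usable bytes: $usable_bytes of $rt_bytes"
echo "unused tail:  $lost_bytes bytes ($((lost_bytes / 512)) sectors)"
```

This gives 7314 whole rt extents, so the last 8192 bytes (16 sectors)
of the loop file cannot appear in the fsmap output.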

Fix the test to allow this case.

Cc: fstests@vger.kernel.org # v2022.05.01
Fixes: 410a2e3186a1e8 ("xfs: regresion test for fsmap problems with realtime")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

xfs/273: check thoroughness of the mappings

Enhance this test to make sure that there are no gaps in the fsmap
records, and (especially) that we report all the way to the end of
the device.
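
One way such a check could be sketched, under loud assumptions: the
input is a hypothetical, simplified list of "start end" records
(inclusive, sorted, non-overlapping) rather than real fsmap output,
and check_coverage is not a function from the test.

```shell
check_coverage() {
	# $1 is the device size, in the same units as the records on stdin.
	awk -v devsize="$1" '
	BEGIN { expect = 0; bad = 0 }
	{
		if ($1 != expect) {
			printf "hole or overlap at %d (expected %d)\n", $1, expect
			bad = 1
		}
		expect = $2 + 1
	}
	END {
		if (expect != devsize) {
			printf "records end at %d, device ends at %d\n", expect, devsize
			bad = 1
		}
		exit bad
	}
	'
}

printf '0 9\n10 19\n' | check_coverage 20 && echo "full coverage"
```

The function exits non-zero if any record does not start where the
previous one ended, or if the last record stops short of the device
size.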

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

btrfs/136: check for ext3 support

Test-case btrfs/136 requires ext3 support, so check for ext3 using
_require_extra_fs.

Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

common/casefold: Support for tmpfs casefold test

Test casefold support for tmpfs.

Signed-off-by: André Almeida <andrealmeid@igalia.com>
Reviewed-by: Gabriel Krisman Bertazi <gabriel@krisman.be>
Signed-off-by: Zorro Lang <zlang@kernel.org>

btrfs/321: make the filter to handle older btrfs-progs

[FALSE ALERT]
With much older distros like SLE12SP5, which is using btrfs-progs 4.5.3,
test case btrfs/321 fails like this:

btrfs/321       QA output created by 321
unable to locate the last csum tree leaf
(see /opt/xfstests/results//btrfs/321.full for details)
[failed, exit status 1]- output mismatch (see /opt/xfstests/results//btrfs/321.out.bad)
     --- tests/btrfs/321.out 2024-10-28 07:03:54.000000000 -0400
     +++ /opt/xfstests/results//btrfs/321.out.bad 2024-11-07 09:33:58.238442033 -0500
     @@ -1,2 +1,3 @@
      QA output created by 321
     -Silence is golden
     +unable to locate the last csum tree leaf
     +(see /opt/xfstests/results//btrfs/321.full for details)
     ...
     (Run diff -u /opt/xfstests/tests/btrfs/321.out /opt/xfstests/results//btrfs/321.out.bad  to see the entire diff)

[CAUSE]
The full output shows the regular csum tree as usual:

btrfs-progs v4.5.3+20160729
checksum tree key (CSUM_TREE ROOT_ITEM 0)
node 4247552 level 1 items 9 free 112 generation 7 owner 7
fs uuid 5623d533-ff79-4ddf-b9a1-7d359fa97c48
chunk uuid 0af5a7bd-d2d8-4146-ada8-444f2a2f5351
key (EXTENT_CSUM EXTENT_CSUM 20971520) block 4243456 (1036) gen 7
key (EXTENT_CSUM EXTENT_CSUM 25006080) block 4251648 (1038) gen 7
key (EXTENT_CSUM EXTENT_CSUM 29040640) block 4255744 (1039) gen 7
key (EXTENT_CSUM EXTENT_CSUM 33075200) block 4259840 (1040) gen 7
key (EXTENT_CSUM EXTENT_CSUM 37109760) block 4263936 (1041) gen 7
key (EXTENT_CSUM EXTENT_CSUM 41144320) block 4268032 (1042) gen 7
key (EXTENT_CSUM EXTENT_CSUM 45178880) block 4272128 (1043) gen 7
key (EXTENT_CSUM EXTENT_CSUM 49213440) block 4276224 (1044) gen 7
key (EXTENT_CSUM EXTENT_CSUM 53248000) block 4280320 (1045) gen 7
leaf 4243456 items 1 free space 30 generation 7 owner 7
fs uuid 5623d533-ff79-4ddf-b9a1-7d359fa97c48
chunk uuid 0af5a7bd-d2d8-4146-ada8-444f2a2f5351
item 0 key (EXTENT_CSUM EXTENT_CSUM 20971520) itemoff 55 itemsize 3940
extent csum item
[...]
leaf 4280320 items 1 free space 2722 generation 7 owner 7
fs uuid 5623d533-ff79-4ddf-b9a1-7d359fa97c48
chunk uuid 0af5a7bd-d2d8-4146-ada8-444f2a2f5351
item 0 key (EXTENT_CSUM EXTENT_CSUM 53248000) itemoff 2747 itemsize 1248
extent csum item
total bytes 25768755200
bytes used 34213888
uuid 5623d533-ff79-4ddf-b9a1-7d359fa97c48

But notice the header for each leaf: there are no flags for the leaf.
On newer btrfs-progs, the leaf header lines look like this:

leaf 5423104 items 1 free space 2918 generation 7 owner CSUM_TREE
leaf 5423104 flags 0x1(WRITTEN) backref revision 1

It's two lines, not the old one-line output.
The new behavior was introduced in btrfs-progs commit 9cc9c9ab3220
("btrfs-progs: print the eb flags for nodes as well"), included in the
v5.10 release.

So the test case doesn't handle the older output format and fails to
locate the target leaf.

[FIX]
Instead of relying on the leaf flags line, use the much older
"leaf <bytenr> items" line as the filter target, so we can support much
older distros.
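
As an illustration of matching on that line (a sketch, not the filter
in the test; last_leaf_bytenr is a hypothetical name), the bytenr of
the last leaf can be extracted from either output format like this:

```shell
last_leaf_bytenr() {
	# Match only the first header line of each leaf, which both old and
	# new btrfs-progs print, and keep the bytenr of the last one.
	grep -E '^[[:space:]]*leaf [0-9]+ items ' | tail -n 1 | awk '{ print $2 }'
}

old_format='leaf 4243456 items 1 free space 30 generation 7 owner 7
leaf 4280320 items 1 free space 2722 generation 7 owner 7'

new_format='leaf 5423104 items 1 free space 2918 generation 7 owner CSUM_TREE
leaf 5423104 flags 0x1(WRITTEN) backref revision 1'

printf '%s\n' "$old_format" | last_leaf_bytenr
printf '%s\n' "$new_format" | last_leaf_bytenr
```

The "leaf <bytenr> flags" line of the newer format is never matched,
so both dumps yield the bytenr from the "items" line.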

Reported-by: Long An <lan@suse.com>
Link: https://bugzilla.suse.com/show_bug.cgi?id=1233303
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

ext4/032: add a new testcase in online resize tests

Add a new testcase to the ext4 online resize test suite for the commit linked below.

Link: https://lore.kernel.org/linux-ext4/20240927133329.1015041-1-libaokun@huaweicloud.com
Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Reviewed-by: Baokun Li <libaokun1@huawei.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

btrfs: a new test case to verify mount behavior with background remounting

[BUG]
When there is a process in the background remounting a btrfs, switching
between RO/RW, and another process tries to mount another subvolume of
the same btrfs read-only, we can hit a race causing the RW mount to fail
with -EBUSY.

[CAUSE]
During the btrfs mount, to support mounting different subvolumes with
different RO/RW flags, we have a small hack during the mount:

  Retry with matching RO flags if the initial mount fails with -EBUSY.

The problem is, during that retry we do not hold any super block lock
(s_umount), which means there can be a remount process changing the RO
flags of the original fs super block.

If so, we can have an EBUSY error during retry.
And this time we treat any failure as an error, without any retry,
causing the above EBUSY mount failure.

[FIX]
The fix is already sent to the mailing list.
The fix is to allow btrfs to have different RO flags between the super
block and the mount point during mount, and if the RO flags mismatch,
reconfigure the fs to RW with s_umount held, so that there will be no race.

[TEST CASE]
The test case will create two processes:

- Remounting an existing subvolume mount point
  Switching between RO and RW

- Mounting another subvolume RW
  After a successful mount, unmount and retry.

This is enough to trigger the -EBUSY error in less than 5 seconds.
To be extra safe, the test case will run for 10 seconds at least, and
follow TIME_FACTOR for extra loads.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

btrfs: add a test for defrag of contiguous file extents

Test that defrag merges adjacent extents that are contiguous.
This exercises a regression fixed by a patchset for the kernel that is
comprised of the following patches:

btrfs: fix extent map merging not happening for adjacent extents
btrfs: fix defrag not merging contiguous extents due to merged extent maps

Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

xfs: online grow vs. log recovery stress test (realtime version)

This is fundamentally the same as the previous growfs vs. log
recovery test, with tweaks to support growing the XFS realtime
volume on such configurations. Changes include using the appropriate
mkfs params, growfs params, and enabling realtime inheritance on the
scratch fs.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

xfs: online grow vs. log recovery stress test

fstests includes decent functional tests for online growfs and
shrink, and decent stress tests for crash and log recovery, but no
combination of the two. This test combines bits from a typical
growfs stress test like xfs/104 with crash recovery cycles from a
test like generic/388. As a result, this reproduces at least a
couple recently fixed issues related to log recovery of online
growfs operations.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

btrfs/287: make the test work when compression is enabled

When running btrfs/287 with compression enabled (mount options), the test
fails because it expects to find 4M extents, however compression limits
the maximum size of extents to 128K, breaking the test's expectations.

Example:

  $ MOUNT_OPTIONS="-o compress" ./check btrfs/287
  FSTYP         -- btrfs
  PLATFORM      -- Linux/x86_64 debian0 6.12.0-rc4-btrfs-next-177+ #1 SMP PREEMPT_DYNAMIC Thu Oct 24 17:14:37 WEST 2024
  MKFS_OPTIONS  -- /dev/sdc
  MOUNT_OPTIONS -- -o compress /dev/sdc /home/fdmanana/btrfs-tests/scratch_1

  btrfs/287 2s ... - output mismatch (see /home/fdmanana/git/hub/xfstests/results//btrfs/287.out.bad)
      --- tests/btrfs/287.out 2024-10-19 18:21:30.451644840 +0100
      +++ /home/fdmanana/git/hub/xfstests/results//btrfs/287.out.bad 2024-10-29 16:31:20.926612583 +0000
      @@ -25,22 +25,14 @@
       resolve first extent with ignore offset option:
       inode 257 offset 16777216 root 5
       inode 257 offset 8388608 root 5
      -inode 257 offset 2097152 root 5
       resolve first extent +1M offset:
      -inode 257 offset 17825792 root 5
      -inode 257 offset 9437184 root 5
      ...
      (Run 'diff -u /home/fdmanana/git/hub/xfstests/tests/btrfs/287.out /home/fdmanana/git/hub/xfstests/results//btrfs/287.out.bad'  to see the entire diff)

  HINT: You _MAY_ be missing kernel fix:
        0cad8f14d70c btrfs: fix backref walking not returning all inode refs

  Ran: btrfs/287
  Failures: btrfs/287
  Failed 1 of 1 tests

Fix this by creating the two 4M extents with fallocate, so that the test
works regardless of compression being enabled or not.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

f2fs/007: add testcase to check consistency of compressed inode metadata

The metadata of a compressed inode should always be consistent after
file compression, reservation, release and decompression. Add
a testcase to check it.

Cc: Jaegeuk Kim <jaegeuk@kernel.org>
Cc: Qi Han <hanqi@vivo.com>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Zorro Lang <zlang@kernel.org>

f2fs/006: add testcase to check out-of-space case

This is a regression test to check whether f2fs handles dirty
data correctly when checkpoint is disabled. If lfs mode is on,
it will trigger OPU for all overwritten data, which costs
free segments, so f2fs must account overwritten data as OPU
data when calculating free space. Otherwise, it may run out
of free segments in f2fs' allocation function. If the kernel config
option CONFIG_F2FS_CHECK_FS is on, this causes a system panic;
otherwise, dd may encounter an I/O error.

Cc: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Chao Yu <chao@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

f2fs/005: add testcase to check checkpoint disabling functionality

This patch introduces a regression test to check whether f2fs handles
dirty inodes correctly when checkpoint is disabled in a corner case;
it may hang umount before the bug is fixed.

Cc: Qi Han <hanqi@vivo.com>
Signed-off-by: Chao Yu <chao@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

generic: new test case to verify if certain fio load will hang the filesystem

[BUG]
During the development to make btrfs pass generic/563 (which requires
btrfs to support partial folios), generic/095 caused hangs
during tests.

The call trace for the hanging process looks like this:

  __switch_to+0xf8/0x168
  __schedule+0x328/0x8a8
  schedule+0x54/0x140
  io_schedule+0x44/0x68
  folio_wait_bit_common+0x198/0x3f8
  __folio_lock+0x24/0x40
  extent_write_cache_pages+0x2e0/0x4c0 [btrfs]
  btrfs_writepages+0x94/0x158 [btrfs]
  do_writepages+0x74/0x190
  filemap_fdatawrite_wbc+0x88/0xc8
  __filemap_fdatawrite_range+0x6c/0xa8
  filemap_fdatawrite_range+0x1c/0x30
  btrfs_start_ordered_extent+0x264/0x2e0 [btrfs]
  btrfs_lock_and_flush_ordered_range+0x8c/0x160 [btrfs]
  __get_extent_map+0xa0/0x220 [btrfs]
  btrfs_do_readpage+0x1bc/0x5d8 [btrfs]
  btrfs_read_folio+0x50/0xa0 [btrfs]
  filemap_read_folio+0x54/0x110
  filemap_update_page+0x2e0/0x3b8
  filemap_get_pages+0x228/0x4d8
  filemap_read+0x11c/0x3b8
  btrfs_file_read_iter+0x74/0x90 [btrfs]
  new_sync_read+0xd0/0x1d0
  vfs_read+0x1a0/0x1f0

[CAUSE]
The root cause is a btrfs-specific behavior: during a folio read, we
can trigger writeback of the same folio, which will try to lock the same
folio already locked by the read process.

The fix is already sent to the mailing list:
https://lore.kernel.org/linux-btrfs/62bf73ada7be2888d45a787c2b6fd252103a5d25.1729725088.git.wqu@suse.com/

This problem can only happen if all the following conditions are met:

- The sector size of btrfs is smaller than page size
  To have partial uptodate folios.

- Btrfs won't read the full folio if buffered write is block aligned
  This is done by the not yet merged patch:
  https://lore.kernel.org/linux-btrfs/ac2639ec4e9ac176d33e95ef7ecf008fa6be5461.1727833878.git.wqu@suse.com/

[TEST CASE]
During the debugging of that generic/095 hang, I extracted a minimal
reproducer which is much smaller and faster, although it still requires
several runs to trigger a hang.

The test case will run the fio workload 32 times by default, which is
more than enough to trigger the hang.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

xfs: notrun if kernel xfs not supports ascii-ci feature

As the ascii-ci feature is deprecated, if Linux is built without
CONFIG_XFS_SUPPORT_ASCII_CI, mounting an xfs created with "-n version=ci"
will get EINVAL. So let's notrun if it's not supported by the kernel.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Zorro Lang <zlang@kernel.org>

btrfs: test remount with "compress" clears "compress-force"

Test that remounting with the "compress" mount option clears the
"compress-force" mount option previously specified.

This tests a regression introduced with kernel 6.8 and recently fixed by
the following kernel commit:

3510e684b8f6 ("btrfs: clear force-compress on remount when compress mount option is given")

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

generic: increase file size to match CoW delayed allocation for XFS 64k bs

generic/305,326,328 have been failing for 32k and 64k blocksizes.

We do the following in the test 305 and 326 (highlighting only the part
that is related to failure):

- create a 1M test-1/file1
- reflink test-1/file2 and test-1/file3 based on test-1/file1
- Overwrite first half of test-1/file2 to do a CoW operation
- Expect the size of the test-1 dir to be 3M

The test is failing for 32k and 64k blocksizes as the number of blocks
(direct + delayed) is higher than number of blocks allocated for
blocksizes < 32k in XFS, resulting in size of test-1 to be more than 3M.
Though generic/328 has a different IO pattern, the reason for failure is
the same.

This is the failure output:
    --- tests/generic/305.out   2024-06-05 11:52:27.430262812 +0000
    +++ /root/results//64k_4ks/generic/305.out.bad      2024-10-23 10:56:57.643986870 +0000
    @@ -11,7 +11,7 @@
     CoW one of the files
     root 0 0 0
     nobody 0 0 0
    -fsgqa 3072 0 0
    +fsgqa 4608 0 0
     Remount the FS to see if accounting changes
     root 0 0 0

In these tests, XFS is doing a delayed allocation of
XFS_DEFAULT_COWEXTSIZE_HINT(32). Increase the size of the file so that
the CoW write(sz/2) matches the maximum size of the delayed allocation
for the max blocksize of 64k. This will ensure that all parts of the
delayed extents are converted to real extents for all blocksizes.

Even though this is not the most complete solution to fix these tests,
the objective of these tests is to test quota, not the effect of delayed
allocations.
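
The numbers in the failure output are consistent with the following
back-of-the-envelope accounting. This is only a sketch of the arithmetic,
assuming that the 32-block CoW extent size hint is reserved in full and that
only the overwritten half of the file is converted to real extents. The
variable names are illustrative and are not taken from the tests, and the 4M
figure is what the arithmetic suggests, not a value quoted from the patch.

```shell
# Sizes in KiB. The 64 KiB block size, the 1M file and the 32-block hint
# come from the description above; the variable names are hypothetical.
blksz_kb=64
cowextsize_blocks=32
file_kb=1024                                   # 1M test-1/file1
expected_kb=$((3 * file_kb))                   # file1 + file2 + file3
cow_write_kb=$((file_kb / 2))                  # first half of file2
cow_resv_kb=$((cowextsize_blocks * blksz_kb))  # delayed CoW allocation
leftover_kb=$((cow_resv_kb - cow_write_kb))    # still delayed after the write
observed_kb=$((expected_kb + leftover_kb))
echo "expected=${expected_kb}K observed=${observed_kb}K"

# A file twice the size of the reservation makes the CoW write (sz/2)
# cover the whole delayed allocation, so nothing is left over.
new_file_kb=$((2 * cow_resv_kb))
new_leftover_kb=$((cow_resv_kb - new_file_kb / 2))
echo "new file size=${new_file_kb}K leftover=${new_leftover_kb}K"
```

Under these assumptions the sketch yields 3072K expected and 4608K observed,
matching the fsgqa lines in the diff above.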

Signed-off-by: Pankaj Raghav <p.raghav@samsung.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

generic/219: use filesystem blocksize while calculating the file size

generic/219 was failing for XFS with 32k and 64k blocksize. Even though
we do only 48k IO, XFS will allocate blocks rounded up to a multiple of
the blocksize.
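
As an illustration of the rounding, here is a minimal sketch. The helper name
round_up_to_block is hypothetical and is not part of fstests; it only shows
how a 48k write maps to allocated bytes for a few block sizes.

```shell
# Round a byte count up to the next multiple of the filesystem block size.
# Hypothetical helper, for illustration only.
round_up_to_block() {
    local size=$1
    local blksz=$2
    echo $(( (size + blksz - 1) / blksz * blksz ))
}

io_size=$((48 * 1024))
for blksz in 4096 32768 65536; do
    echo "blksz=$blksz allocated=$(round_up_to_block "$io_size" "$blksz")"
done
```

With 4k blocks the allocation stays at 48k, while with 32k or 64k blocks it
becomes 64k, which is why a hard-coded expected size no longer matches.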

Signed-off-by: Pankaj Raghav <p.raghav@samsung.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

f2fs/004: add missing _fixed_by_kernel_commit line

The bug related to this regression testcase has been fixed by commit
b2c160f4f3cf ("f2fs: atomic: fix to forbid dio in atomic_file"), let's
add missing _fixed_by_kernel_commit line for this testcase.

Cc: Jaegeuk Kim <jaegeuk@kernel.org>
Cc: Daeho Jeong <daehojeong@google.com>
Signed-off-by: Chao Yu <chao@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

fstests: btrfs/002: fix the OOM caused by too large block size

[BUG]
When running the test case btrfs/002, with 64K page size and 64K sector
size, and the VM doesn't have much memory (in my case 4G Vram), the test
case will trigger OOM and fail:

btrfs/002 4s ... [failed, exit status 1]- output mismatch (see /home/adam/xfstests-dev/results//btrfs/002.out.bad)
    --- tests/btrfs/002.out 2024-04-25 18:13:45.035555469 +0930
    +++ /home/adam/xfstests-dev/results//btrfs/002.out.bad 2024-10-12 17:19:48.785156223 +1030
    @@ -1,2 +1 @@
     QA output created by 002
    -Silence is golden
    ...

The OOM is triggered by the dd process, and a lot of dd processes are
using too much memory:

dd invoked oom-killer: gfp_mask=0x140dca(GFP_HIGHUSER_MOVABLE|__GFP_COMP|__GFP_ZERO), order=0, oom_score_adj=250
CPU: 0 UID: 0 PID: 185764 Comm: dd Not tainted 6.12.0-rc2-custom+ #76
Hardware name: QEMU KVM Virtual Machine, BIOS unknown 2/2/2022
Tasks state (memory values in pages):
[  pid  ]   uid  tgid total_vm      rss rss_anon rss_file rss_shmem pgtables_bytes swapents oom_score_adj name
[ 185665]     0 185665     8688     3840     3840        0         0   458752     4832           250 dd
[ 185672]     0 185672     8688     2432     2432        0         0   393216     5312           250 dd
[ 185680]     0 185680     8688     2016     2016        0         0   458752     4960           250 dd
[ 185686]     0 185686     8688     2080     2080        0         0   458752     3584           250 dd
[ 185693]     0 185693     8688     2144     2144        0         0   458752     4384           250 dd
[ 185700]     0 185700     8688     2176     2176        0         0   458752     3584           250 dd
[ 185707]     0 185707     8688     1792     1792        0         0   524288     3616           250 dd
[ 185714]     0 185714     8688     2304     2304        0         0   458752     3488           250 dd
[ 185721]     0 185721     8688     1920     1920        0         0   458752     2624           250 dd
[ 185728]     0 185728     8688     2272     2272        0         0   393216     2528           250 dd
[ 185735]     0 185735     8688     2048     2048        0         0   393216     3552           250 dd
[ 185742]     0 185742     8688     1984     1984        0         0   458752     2816           250 dd
[ 185751]     0 185751     8688     1600     1600        0         0   458752     2784           250 dd
[ 185756]     0 185756     8688     1120     1120        0         0   458752     2400           250 dd
[ 185764]     0 185764     8688     1504     1504        0         0   393216     2240           250 dd
[ 185772]     0 185772     8688     1504     1504        0         0   458752     1984           250 dd
[ 185777]     0 185777     8688     1280     1280        0         0   393216     2336           250 dd
[ 185784]     0 185784     8688     2144     2144        0         0   393216     2272           250 dd
[ 185791]     0 185791     8688     2176     2176        0         0   458752      576           250 dd
[ 185798]     0 185798     8688     1696     1696        0         0   458752     1536           250 dd
[ 185806]     0 185806     8688     1728     1728        0         0   393216      544           250 dd
[ 185815]     0 185815     8688     2240     2240        0         0   458752        0           250 dd
[ 185819]     0 185819     8688     1504     1504        0         0   458752      384           250 dd
[ 185826]     0 185826     8688     1536     1536        0         0   458752      160           250 dd
[ 185833]     0 185833     8688     2944     2944        0         0   458752       64           250 dd
[ 185838]     0 185838     8688     2400     2400        0         0   458752        0           250 dd
[ 185847]     0 185847     8688      864      864        0         0   458752        0           250 dd
[ 185853]     0 185853     8688     1088     1088        0         0   393216        0           250 dd
[ 185860]     0 185860     8688      416      416        0         0   393216        0           250 dd
[ 185867]     0 185867     8688      352      352        0         0   458752        0           250 dd

[CAUSE]
The test case workload _fill_blk() is going to fill the file to its block
boundary.

But the implementation is not taking larger blocks into consideration.

FSIZE=`stat -t $i | cut -d" " -f2`
BLKS=`stat -c "%B" $i`
NBLK=`stat -c "%b" $i`
FALLOC=$(($BLKS * $NBLK))
WS=$(($FALLOC - $FSIZE))

$FSIZE is the file size, $BLKS is the size of each reported block,
$NBLK is the number of blocks the file takes, thus $FALLOC is the
file size rounded up to the block boundary.

For 64K sector size, $BLKS is 512 and $NBLK is 128 (one 64K sector).
$FALLOC is the correct value of 64K (10K rounded up to 64K).

Then the problem comes to how the write is done:

_ddt of=$i oseek=$FSIZE obs=$WS count=1 status=noxfer 2>/dev/null &

Unfortunately the above command uses an output block size of 54K, and
needs to seek 10K * 54K bytes, resulting in a file size of 540M.

So far, although that is not the intended behavior, it is not yet causing
a problem.

But during _append_file(), we further enlarge the file by:

FSIZE=`stat -t $i | cut -d" " -f2`
dd if=$X of=$i seek=1 bs=$FSIZE obs=$FSIZE count=1 status=noxfer 2>/dev/null &

In the above case, since the previous file is 540M in size, the output
block size will also be 540M, taking a lot of memory.

Furthermore, since the workload is run in the background, we can have many
dd processes each taking up at least 540M, causing huge memory usage and
triggering OOM.
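
The arithmetic above can be replayed with plain shell variables. This is a
sketch only: the lower-case names mirror the variables quoted from the test,
and the 10K file size is the example value used in this description.

```shell
fsize=$((10 * 1024))        # $FSIZE: a 10K file
blks=512                    # $BLKS: stat -c "%B"
nblk=128                    # $NBLK: stat -c "%b" (one 64K sector)
falloc=$((blks * nblk))     # $FALLOC: 65536
ws=$((falloc - fsize))      # $WS: 55296 bytes (54K) of padding needed

# dd interprets oseek in units of obs, not in bytes, so the write lands at
# fsize * ws bytes instead of at fsize bytes.
seek_bytes=$((fsize * ws))
seek_mib=$((seek_bytes / 1024 / 1024))
echo "padding needed: $ws bytes"
echo "dd seeks to: $seek_bytes bytes (${seek_mib}M)"
```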

[FIX]
The original code is already not doing what it should do, so just get rid
of the cursed dd command usage inside _fill_blk(), and use pwrite from
xfs_io instead.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

misc: amend unicode confusing name tests to check for hidden tag characters

The Unicode consortium has twice defined (and later deprecated) special
"tag" codepoints. These tag codepoints are not supposed to be rendered
(i.e. they're invisible) but you can certainly encode them in
directories and labels to try to confuse users.

xfs_scrub already knows how to complain about these tag characters because
libicu can detect both their presence and their use in confusing name
attacks, so add this as an explicit regression test.

Link: https://embracethered.com/blog/posts/2024/hiding-and-finding-text-with-unicode-tags/
Link: https://arstechnica.com/security/2024/10/ai-chatbots-can-read-and-write-invisible-text-creating-an-ideal-covert-channel/
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

fstests: btrfs/330: enable the test case for both new and old APIs

[BUG]
If the mount tool is utilizing the new fd-based API
(e.g. util-linux 2.40.2 from Archlinux), btrfs' per-subvolume RO/RW mount
is broken again:

  # mount -o subvol=subv1,ro /dev/test/scratch1 /mnt/test
  # mount -o rw,subvol=subv2 /dev/test/scratch1  /mnt/scratch
  # mount | grep mnt
  /dev/mapper/test-scratch1 on /mnt/test type btrfs (ro,relatime,discard=async,space_cache=v2,subvolid=256,subvol=/subv1)
  /dev/mapper/test-scratch1 on /mnt/scratch type btrfs (ro,relatime,discard=async,space_cache=v2,subvolid=257,subvol=/subv2)
  # touch /mnt/scratch/foobar
  touch: cannot touch '/mnt/scratch/foobar': Read-only file system

[CAUSE]
Btrfs has an extra remount hack to handle the above case, which will
re-configure the super block to be RW on the first RW mount.

The initial promise was that the new fd-based API would not set the ro
flag, but only MOUNT_ATTR_RDONLY, so that btrfs would skip the remount hack
for new API based mount requests.

However that is not the case: the first RO subvolume mount will set the ro
flag at fsconfig(), and also set the MOUNT_ATTR_RDONLY attribute for the
mount point:

  # strace  mount -o subvol=subv1,ro /dev/test/scratch1 /mnt/test/
  ...
  fsconfig(3, FSCONFIG_SET_STRING, "source", "/dev/mapper/test-scratch1", 0) = 0
  fsconfig(3, FSCONFIG_SET_STRING, "subvol", "subv1", 0) = 0
  fsconfig(3, FSCONFIG_SET_FLAG, "ro", NULL, 0) = 0
  fsconfig(3, FSCONFIG_CMD_CREATE, NULL, NULL, 0) = 0
  fsmount(3, FSMOUNT_CLOEXEC, 0)          = 4
  mount_setattr(4, "", AT_EMPTY_PATH, {attr_set=MOUNT_ATTR_RDONLY, attr_clr=0, propagation=0 /* MS_??? */, userns_fd=0}, 32) = 0
  move_mount(4, "", AT_FDCWD, "/mnt/test", MOVE_MOUNT_F_EMPTY_PATH) = 0

This results in exactly the same behavior, no matter whether the new API
is used or not.

Furthermore we can even have corner cases like mounting the initial RO
subvolume using the old API, then mounting the RW subvolume using the new
API.

So even with the new API, there is no guarantee that the per-subvolume
RO/RW mount feature keeps working.
We have to do the reconfigure anyway.

[FIX]
The kernel fix is already submitted, but for the test case part, we
should enable btrfs/330 for all mount tools, no matter the API it
utilizes.

The only difference for the new API based mount is the new
_fixed_by_kernel_commit call, to show the proper fix.

Now it can properly detect the broken feature.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

btrfs/012: fix false alerts when SELinux is enabled

[FALSE FAILURE]
If SELinux is enabled, the test btrfs/012 will fail due to metadata
mismatch:

FSTYP         -- btrfs
PLATFORM      -- Linux/x86_64 localhost 6.4.0-150600.23.25-default #1 SMP PREEMPT_DYNAMIC Tue Oct  1 10:54:01 UTC 2024 (ea7c56d)
MKFS_OPTIONS  -- /dev/loop1
MOUNT_OPTIONS -- -o context=system_u:object_r:root_t:s0 /dev/loop1 /mnt/scratch

btrfs/012       - output mismatch (see /home/adam/xfstests-dev/results//btrfs/012.out.bad)
    --- tests/btrfs/012.out 2024-10-18 10:15:29.132894338 +1030
    +++ /home/adam/xfstests-dev/results//btrfs/012.out.bad 2024-10-18 10:25:51.834819708 +1030
    @@ -1,6 +1,1390 @@
     QA output created by 012
     Checking converted btrfs against the original one:
    -OK
    +metadata mismatch in /p0/d2/f4
    +metadata mismatch in /p0/d2/f5
    +metadata and data mismatch in /p0/d2/
    +metadata and data mismatch in /p0/
    ...

[CAUSE]
All the mismatches are in the metadata; to be more specific, in the
security xattrs.

Although btrfs-convert properly converts all xattrs including the
security ones, at mount time we will get new SELinux labels, causing the
mismatch between the converted and original fs.

[FIX]
Override SELINUX_MOUNT_OPTIONS so that we will not touch the security
xattrs, and that should fix the false alert.

Reported-by: Long An <lan@suse.com>
Link: https://bugzilla.suse.com/show_bug.cgi?id=1231524
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

xfs/161: adapt the test case for LBS filesystem

This test fails for >= 64k filesystem block size on a 4k PAGE_SIZE
system (see LBS efforts[1]). Adapt the blksz so that we create more than
one block for the testcase.

Cap the blksz to be at least 64k to retain the same behaviour as before
for smaller filesystem blocksizes.

[1] LBS effort: https://lore.kernel.org/lkml/20230915183848.1018717-1-kernel@pankajraghav.com/

Signed-off-by: Pankaj Raghav <p.raghav@samsung.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Zorro Lang <zlang@kernel.org>

common/xfs: _notrun tests that fail due to block size < sector size

It makes no sense to fail a test that failed to format a filesystem with
a block size smaller than the sector size since the test preconditions
are not valid.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

generic: add test for missing btrfs csums in log when doing async on subpage vol

Adds a test for a bug we encountered on Linux 6.4 on aarch64, where a
race could mean that csums weren't getting written to the log tree,
leading to corruption when it was replayed.

The patches to detect this log tree corruption are in btrfs-progs 6.11.

Signed-off-by: Mark Harmstone <maharmstone@fb.com>
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

btrfs: add test for cleaner thread under seed-sprout

We have a longstanding bug where creating a seed sprout fs with the
ro->rw transition done with

mount -o remount,rw $mnt

instead of

umount $mnt
mount $sprout_dev $mnt

results in an fs without BTRFS_FS_OPEN set, which fails to ever run the
critical btrfs cleaner thread.

This test reproduces that bug and detects it by creating and deleting a
subvolume, then triggering the cleaner thread. The expected behavior is
for the cleaner thread to delete the stale subvolume and for the list to
show no entries. Without the fix, we see a DELETED entry for the subvol.

Reviewed-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Boris Burkov <boris@bur.io>
Signed-off-by: Zorro Lang <zlang@kernel.org>

src/fiexchange.h: add the start-commit/commit-range ioctls

Add these two ioctls as well, since they're a part of the file content
exchange functionality.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

fsstress: add support for FALLOC_FL_UNSHARE_RANGE

Teach fsstress to try to unshare file blocks on filesystems, seeing how
the recent addition to fsx has uncovered a lot of bugs.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

common/config: fix RECREATE_TEST_DEV initialization

Do not allow the overwriting of the RECREATE_TEST_DEV variable. When
this variable is enabled, common/rc -> common/config will reset it
to false after the test device recreation process. This allows for
differentiation in mount options for SCRATCH and TEST.

Signed-off-by: Daniel Gomez <da.gomez@samsung.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

common/populate: fix bash syntax error in _fill_fs

In bash, one does not set a variable by prepending the dollar sign to
the variable name. Amazingly, this was copied verbatim from generic/256
in 2016 and hasn't been caught since its introduction in 2011. :(
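
For readers unfamiliar with the pitfall, here is a minimal sketch. The
variable name nr is chosen for illustration and is not claimed to be the one
used in _fill_fs.

```shell
nr=1

# Wrong: "$nr=2" is not an assignment. The shell expands it to the word
# "1=2" and tries to run that as a command, which fails.
if ! $nr=2 2>/dev/null; then
    echo "bad assignment failed, nr=$nr"
fi

# Right: no dollar sign on the left-hand side.
nr=2
echo "good assignment worked, nr=$nr"
```

Because the failed command only produces an error on stderr, such a line can
go unnoticed for a long time when stderr is discarded or not inspected.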

Cc: allison.henderson@oracle.com
Fixes: 815015e9ee ("generic: make 17[1-4] work well when btrfs compression is enabled")
Fixes: b55fb0807c ("xfstests: Add ENOSPC Hole Punch Test")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Zorro Lang <zlang@kernel.org>

btrfs/315: update filter to match mount cmd

Mount error info changed since util-linux v2.40
(91ea38e libmount: report failed syscall name).
So update _filter_mount_error() to match it.

Signed-off-by: An Long <lan@suse.com>
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

btrfs/322: add git commit ID

The corresponding btrfs kernel patch was merged into Linus' tree and
included in kernel 6.12-rc2, so update the test with the commit ID.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

btrfs: update some tests to be able to run with btrfs-progs v6.11

In btrfs-progs v6.11 the output of the "filesystem show" command changed
so that it no longer prints blank lines. This happened with commit
4331bfb011bd ("btrfs-progs: fi show: remove stray newline in filesystem
show").

We have some tests that expect the blank lines in their golden output,
and therefore they fail with btrfs-progs v6.11.

So update the filter _filter_btrfs_filesystem_show to remove blank lines
and change the golden output of the tests to not expect the blank lines,
making the tests work with btrfs-progs v6.11 and older versions.
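
Dropping blank lines in a filter can be sketched as below. The function
filter_blank_lines is a hypothetical stand-in, not the actual body of
_filter_btrfs_filesystem_show, and the sample input is made up for the
example.

```shell
# Delete empty and whitespace-only lines so that output from old and new
# btrfs-progs versions compares equal. Hypothetical helper.
filter_blank_lines() {
    sed -e '/^[[:space:]]*$/d'
}

printf 'Label: none  uuid: <UUID>\n\tTotal devices 1\n\n' | filter_blank_lines
```

Since the filter removes blank lines whether or not they are present, the
same golden output works for both older and newer tool versions.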

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Boris Burkov <boris@bur.io>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

btrfs/318: add _require_loop

btrfs/318 uses loopback devices, but was missing a call to _require_loop
to print the correct message if CONFIG_LOOP is not set.

Signed-off-by: Mark Harmstone <maharmstone@fb.com>
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

common/fail_make_request: fix error message

fail_make_request depends on the kernel option CONFIG_FAIL_MAKE_REQUEST
to function, not CONFIG_FAULT_INJECTION_DEBUG_FS.

Signed-off-by: Mark Harmstone <maharmstone@fb.com>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

generic/694: sync before sampling i_blocks

Without a sync there might still be temporary blocks in i_blocks like
indirect block reservations or additional blocks reserved for out of
place writes.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

xfs/157,xfs/547,xfs/548: switch to using _scratch_mkfs_sized

These test cases specify small -d sizes which combined with a rt dev of
unrestricted size and the rtrmap feature can cause mkfs to fail with
error:

mkfs.xfs: cannot handle expansion of realtime rmap btree; need <x> free
blocks, have <y>

This is because the -d size is not big enough to support the
metadata space allocation required for the rt groups.

Switch to use _scratch_mkfs_sized that sets up the -r size parameter
to avoid this. If -r size=x and -d size=x we will not risk running
out of space on the ddev as the metadata size is just a fraction of
the rt data size.

Signed-off-by: Hans Holmberg <hans.holmberg@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>

common: make rt_ops local in _try_scratch_mkfs_sized

If we call _try_scratch_mkfs_sized with $SCRATCH_RTDEV set followed by
a call with $SCRATCH_RTDEV cleared, rt_ops will have stale size
parameters that will cause mkfs.xfs to fail with:
"size specified for non-existent rt subvolume"

Make rt_ops local to fix this.
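
The effect of the missing "local" can be reproduced with a small sketch. The
two helpers below are hypothetical and only print an option string; they are
not the real _try_scratch_mkfs_sized, and the device name and size are
made-up example values.

```shell
# Leaky variant: rt_ops is a global, so a value set by one call survives
# into the next call even when no rt device is given.
build_opts_leaky() {
    if [ -n "$1" ]; then
        rt_ops="-r size=$2"
    fi
    echo "mkfs.xfs -d size=$2${rt_ops:+ $rt_ops}"
}

# Fixed variant: rt_ops is local and starts out empty on every call.
build_opts_fixed() {
    local rt_ops=""
    if [ -n "$1" ]; then
        rt_ops="-r size=$2"
    fi
    echo "mkfs.xfs -d size=$2${rt_ops:+ $rt_ops}"
}

build_opts_leaky /dev/rtdev 1g   # rt device set
build_opts_leaky "" 1g           # rt device cleared, stale -r size remains
build_opts_fixed /dev/rtdev 1g
build_opts_fixed "" 1g           # no -r size, as intended
```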

Signed-off-by: Hans Holmberg <hans.holmberg@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>