www.infradead.org Git - users/hch/xfstests-dev.git/commit
fstests: cleanup fsstress process management
author Dave Chinner <dchinner@redhat.com>
Tue, 26 Nov 2024 20:41:25 +0000 (07:41 +1100)
committer Zorro Lang <zlang@kernel.org>
Sun, 8 Dec 2024 13:59:15 +0000 (21:59 +0800)
commit 8973af00ec212fd7d863998762e1099871729e00
tree a361e24824299d36c6953279fe2388d4e073c690
parent 800681ebd77c5dfd105e62167fc1d624d5fa5bd2

Lots of tests run fsstress in the background and then have to kill
it and/or provide special cleanup functions to kill the background
fsstress processes. They typically use $KILLALL_PROG for this.

Use of killall is problematic for running multiple tests in parallel
in that one test can kill other tests' processes.  However, because
fsstress itself forks and runs children, there are very few avenues
for shell scripts to ensure all the fsstress processes actually die.

With bash, it is especially nasty, because sending SIGTERM will
result in bash outputting error messages ("Killed: ...") that will
cause golden output mismatches and hence test failures. Hence we
also need to be able to tell the main fsstress process to die without
triggering these messages.

To avoid the process tracking problems, we change to use pkill
rather than killall (more options for process selection) and we
stop using the $here/ltp/fsstress binary. Instead, we copy the
$here/ltp/fsstress to $TEST_DIR/$seq.fsstress so that the test has
a unique fsstress binary name. This allows the pkill filter to
select just the fsstress processes the test has run. The fsstress
binary name is held in _FSSTRESS_NAME, and the program to run is
_FSSTRESS_PROG.
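
The naming scheme above can be sketched roughly as follows. This is only
an illustration: the demo_* function names and their arguments are
hypothetical and are not the helpers added to common/rc. It assumes a
pkill that supports -x (exact process name match), as procps does.

```shell
#!/bin/bash
# Sketch only: names prefixed "demo_" are hypothetical, not the common/rc code.

# Copy a program to a per-test name so process-name filters can single it out.
#   $1 = source binary, $2 = scratch directory, $3 = test sequence id
# Prints the path of the uniquely named copy.
demo_unique_prog() {
	local src="$1" dir="$2" seq="$3"
	local name
	name="$seq.$(basename "$src")"

	cp -f "$src" "$dir/$name" || return 1
	echo "$dir/$name"
}

# Signal only the processes whose name matches the per-test copy.
# pkill matches against the process name, so copies made by other tests
# (for example 076.fsstress vs 083.fsstress) are left alone.
#   $1 = unique process name, $2 = signal name (defaults to PIPE)
demo_signal_by_name() {
	local name="$1" sig="${2:-PIPE}"

	pkill "-$sig" -x "$name"
}
```

With this shape, a test numbered 076 would run the copy named
076.fsstress and later call demo_signal_by_name 076.fsstress, which
cannot touch another test's processes because their names differ.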

We also track the primary fsstress process ID and store it in
_FSSTRESS_PID. This gives us a PID to wait against so that we don't
return before the fsstress processes are dead. To this end, we add a
SIGPIPE handler to the primary process so that its death doesn't
trigger bash 'killed' message output. We can send
'pkill -PIPE $_FSSTRESS_NAME' to all the fsstress processes, and the
primary process will then enter its "wait for children to die"
processing loop before it exits. In this way, we can wait for the
primary fsstress process, and when it exits we know that all its
children have also finished and gone away. This makes killing
fsstress invocations reliable and noise free.
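
A minimal sketch of the "signal by name, then wait on the primary PID"
sequence described above. DEMO_PID, DEMO_NAME and demo_kill_and_wait are
hypothetical stand-ins for illustration, not the actual common/rc code,
and the sketch assumes SIGPIPE has its default disposition in the
processes being signalled.

```shell
#!/bin/bash
# Sketch only: "demo_" names are hypothetical stand-ins for the helpers
# described in the commit message.
DEMO_PID=""      # PID of the primary background process
DEMO_NAME=""     # unique process name to hand to pkill

# Ask every process with the unique name to exit via SIGPIPE, then block
# until the primary process has gone. Returns the primary's wait status.
demo_kill_and_wait() {
	local ret=0

	[ -n "$DEMO_PID" ] || return 0          # nothing was started

	# "No process matched" is not an error for our purposes.
	pkill -PIPE -x "$DEMO_NAME" >/dev/null 2>&1 || :

	# Once the primary has exited, its children are expected to be gone too.
	wait "$DEMO_PID" 2>/dev/null || ret=$?
	DEMO_PID=""
	return $ret
}
```

Because the wait targets one known PID, the caller does not return
until the primary process has actually exited, rather than returning as
soon as the signal has been sent.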

This is accomplished by the helpers added to common/rc:

_run_fsstress
_run_fsstress_bg
_wait_for_fsstress
_kill_fsstress

This also means that all fsstress invocations now obey
FSSTRESS_AVOID environment restrictions, which many of them
previously did not.
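
As an illustration of why routing every invocation through one wrapper
enforces the restrictions, here is a small sketch. demo_run and
DEMO_AVOID are hypothetical names, and the option values shown are only
example contents for an FSSTRESS_AVOID-style variable.

```shell
#!/bin/bash
# Sketch only: demo_run is a hypothetical wrapper, not the common/rc helper.
# DEMO_AVOID plays the role of FSSTRESS_AVOID: extra options that every
# invocation must carry (example values only).
DEMO_AVOID="-f resvsp=0 -f unresvsp=0"

# Run "$1" with the avoid options placed ahead of the caller's arguments.
# DEMO_AVOID is deliberately unquoted so it splits into separate words.
demo_run() {
	local prog="$1"
	shift
	"$prog" $DEMO_AVOID "$@"
}
```

A caller then writes demo_run some_program -d /mnt/test -n 100 and
cannot forget the restrictions, since the wrapper is the only place the
program is launched from.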

We add a call to _kill_fsstress into the generic _cleanup() function.
This means that tests using fsstress don't need to add a special
local _cleanup function just to call _kill_fsstress() so that
background fsstress processes are killed when the user interrupts
the tests with ctrl-c.

Further, killall in the _cleanup() function is often used to attempt
to expedite killing of foreground fsstress processes. This doesn't
actually work because of the way bash processes interrupt signals:
it waits for the currently executing process to finish, then runs
the trap function. Hence a foreground fsstress won't ever be
interrupted by ctrl-c. By implementing _run_fsstress() as a
background process followed by a wait call, the wait call is
interrupted by the signal and the cleanup trap runs immediately.
Hence the fsstress processes are killed and the test exits cleanly
almost immediately.
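
The background-plus-wait pattern can be sketched as below. The demo_*
names are hypothetical and this is not the _run_fsstress()
implementation, only the general bash technique it is described as
using.

```shell
#!/bin/bash
# Sketch only: hypothetical names, showing the background-plus-wait pattern.
DEMO_CHILD=""

# Run a command "in the foreground" as a background job plus wait.
# A trapped signal interrupts the wait builtin right away, whereas a plain
# foreground command would have to finish before the trap could run.
demo_run_fg() {
	local ret=0

	"$@" &
	DEMO_CHILD=$!
	wait "$DEMO_CHILD" || ret=$?
	DEMO_CHILD=""
	return $ret
}

# Trap body: stop the child (if one is still tracked) and reap it.
demo_cleanup() {
	[ -n "$DEMO_CHILD" ] || return 0
	kill -PIPE "$DEMO_CHILD" 2>/dev/null || :
	wait "$DEMO_CHILD" 2>/dev/null || :
	DEMO_CHILD=""
}
```

A script using this would install something like
trap 'demo_cleanup; exit 1' INT TERM before calling demo_run_fg. The
caller still sees the command's exit status, so the wrapper behaves like
a foreground run apart from being interruptible.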

The result of all this is common, clean handling of fsstress
execution and termination. There are a few exceptions for special
cases, but the vast majority of tests that run fsstress use the
above four wrapper functions exclusively.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>
105 files changed:
common/fuzzy
common/preamble
common/rc
ltp/fsstress.c
tests/btrfs/004
tests/btrfs/007
tests/btrfs/012
tests/btrfs/028
tests/btrfs/049
tests/btrfs/057
tests/btrfs/060
tests/btrfs/061
tests/btrfs/062
tests/btrfs/063
tests/btrfs/064
tests/btrfs/065
tests/btrfs/066
tests/btrfs/067
tests/btrfs/068
tests/btrfs/069
tests/btrfs/070
tests/btrfs/071
tests/btrfs/072
tests/btrfs/073
tests/btrfs/074
tests/btrfs/078
tests/btrfs/100
tests/btrfs/101
tests/btrfs/136
tests/btrfs/192
tests/btrfs/195
tests/btrfs/212
tests/btrfs/232
tests/btrfs/252
tests/btrfs/261
tests/btrfs/284
tests/btrfs/286
tests/btrfs/320
tests/btrfs/332
tests/ext4/004
tests/ext4/057
tests/ext4/058
tests/ext4/307
tests/generic/013
tests/generic/019
tests/generic/051
tests/generic/055
tests/generic/068
tests/generic/070
tests/generic/076
tests/generic/076.out
tests/generic/083
tests/generic/083.out
tests/generic/117
tests/generic/232
tests/generic/232.out
tests/generic/269
tests/generic/270
tests/generic/388
tests/generic/390
tests/generic/409
tests/generic/410
tests/generic/411
tests/generic/461
tests/generic/475
tests/generic/476
tests/generic/482
tests/generic/547
tests/generic/560
tests/generic/561
tests/generic/579
tests/generic/585
tests/generic/589
tests/generic/642
tests/generic/648
tests/generic/650
tests/generic/750
tests/generic/753
tests/overlay/019
tests/overlay/021
tests/xfs/006
tests/xfs/011
tests/xfs/013
tests/xfs/017
tests/xfs/017.out
tests/xfs/032
tests/xfs/049
tests/xfs/051
tests/xfs/057
tests/xfs/077
tests/xfs/079
tests/xfs/104
tests/xfs/137
tests/xfs/141
tests/xfs/158
tests/xfs/167
tests/xfs/168
tests/xfs/264
tests/xfs/270
tests/xfs/297
tests/xfs/305
tests/xfs/442
tests/xfs/538
tests/xfs/609
tests/xfs/610