www.infradead.org Git - users/hch/xfsprogs.git/log
users/hch/xfsprogs.git
12 months ago xfs: realtime rmap btree transaction reservations
Darrick J. Wong [Wed, 3 Jul 2024 21:22:16 +0000 (14:22 -0700)]
xfs: realtime rmap btree transaction reservations

Make sure that there's enough log reservation to handle mapping
and unmapping realtime extents.  We have to reserve enough space
to handle a split in the rtrmapbt to add the record and a second
split in the regular rmapbt to record the rtrmapbt split.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
12 months ago xfs: define the on-disk realtime rmap btree format
Darrick J. Wong [Wed, 3 Jul 2024 21:22:16 +0000 (14:22 -0700)]
xfs: define the on-disk realtime rmap btree format

Start filling out the rtrmap btree implementation. Start with the
on-disk btree format; add everything needed to read, write and
manipulate rmap btree blocks. This prepares the way for connecting the
btree operations implementation.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
12 months ago xfs: introduce realtime rmap btree definitions
Darrick J. Wong [Wed, 3 Jul 2024 21:22:15 +0000 (14:22 -0700)]
xfs: introduce realtime rmap btree definitions

Add new realtime rmap btree definitions. The realtime rmap btree will
be rooted from a hidden inode, but has its own shape and therefore
needs to have most of its own separate types.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
12 months ago xfs: simplify the xfs_rmap_{alloc,free}_extent calling conventions
Darrick J. Wong [Wed, 3 Jul 2024 21:22:15 +0000 (14:22 -0700)]
xfs: simplify the xfs_rmap_{alloc,free}_extent calling conventions

Simplify the calling conventions by allowing callers to pass a fsbno
(xfs_fsblock_t) directly into these functions, since we're just going to
set it in a struct anyway.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
12 months ago xfs: allow inode-based btrees to reserve space in the data device
Darrick J. Wong [Wed, 29 May 2024 04:11:34 +0000 (21:11 -0700)]
xfs: allow inode-based btrees to reserve space in the data device

Create a new space reservation scheme so that btree metadata for the
realtime volume can reserve space in the data device to avoid space
underruns.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
12 months ago xfs: attach rtgroup objects to btree cursors
Darrick J. Wong [Wed, 3 Jul 2024 21:22:15 +0000 (14:22 -0700)]
xfs: attach rtgroup objects to btree cursors

Make it so that we can attach realtime group objects to btree cursors.
This will be crucial for enabling rmap btrees in realtime groups.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
12 months ago xfs: update btree keys correctly when _insrec splits an inode root block
Darrick J. Wong [Wed, 3 Jul 2024 21:22:15 +0000 (14:22 -0700)]
xfs: update btree keys correctly when _insrec splits an inode root block

In commit 2c813ad66a72, I partially fixed a bug wherein xfs_btree_insrec
would erroneously try to update the parent's key for a block that had
been split if we decided to insert the new record into the new block.
The solution was to detect this situation and update the in-core key
value that we pass up to the caller so that the caller will (eventually)
add the new block to the parent level of the tree with the correct key.

However, I missed a subtlety about the way inode-rooted btrees work.  If
the full block was a maximally sized inode root block, we'll solve that
fullness by moving the root block's records to a new block, resizing the
root block, and updating the root to point to the new block.  We don't
pass a pointer to the new block to the caller because that work has
already been done.  The new record will /always/ land in the new block,
so in this case we need to use xfs_btree_update_keys to update the keys.

This bug can theoretically manifest itself in the very rare case that we
split a bmbt root block and the new record lands in the very first slot
of the new block, though I've never managed to trigger it in practice.
However, it is very easy to reproduce by running generic/522 with the
realtime rmapbt patchset if rtinherit=1.

Fixes: 2c813ad66a72 ("xfs: support btrees with overlapping intervals for keys")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
12 months ago xfs: support storing records in the inode core root
Darrick J. Wong [Wed, 3 Jul 2024 21:22:14 +0000 (14:22 -0700)]
xfs: support storing records in the inode core root

Add the necessary flags and code so that we can support storing leaf
records in the inode root block of a btree.  This hasn't been necessary
before, but the realtime rmapbt will need to be able to do this.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
12 months ago xfs: hoist the node iroot update code out of xfs_btree_kill_iroot
Darrick J. Wong [Wed, 3 Jul 2024 21:22:14 +0000 (14:22 -0700)]
xfs: hoist the node iroot update code out of xfs_btree_kill_iroot

In preparation for allowing records in an inode btree root, hoist the
code that copies keyptrs from an existing node child into the root block
to a separate function.  Remove some unnecessary conditionals and clean
up a few function calls in the new function.  Note that this change
reorders the ->free_block call with respect to the change in bc_nlevels
to make it easier to support inode root leaf blocks in the next patch.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
12 months ago xfs: hoist the node iroot update code out of xfs_btree_new_iroot
Darrick J. Wong [Wed, 3 Jul 2024 21:22:14 +0000 (14:22 -0700)]
xfs: hoist the node iroot update code out of xfs_btree_new_iroot

In preparation for allowing records in an inode btree root, hoist the
code that copies keyptrs from an existing node root into a child block
to a separate function.  Note that the new function explicitly computes
the keys of the new child block and stores that in the root block; while
the bmap btree could rely on leaving the key alone, realtime rmap needs
to set the new high key.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
12 months ago xfs: support leaves in the incore btree root block in xfs_iroot_realloc
Darrick J. Wong [Wed, 3 Jul 2024 21:22:14 +0000 (14:22 -0700)]
xfs: support leaves in the incore btree root block in xfs_iroot_realloc

Add some logic to xfs_iroot_realloc so that we can handle leaf records
in the btree root block correctly.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
12 months ago xfs: generalize the btree root reallocation function
Darrick J. Wong [Wed, 3 Jul 2024 21:22:13 +0000 (14:22 -0700)]
xfs: generalize the btree root reallocation function

In preparation for storing realtime rmap btree roots in an inode fork,
make xfs_iroot_realloc take an ops structure that takes care of all the
btree-specific geometry pieces.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
12 months ago xfs: standardize the btree maxrecs function parameters
Darrick J. Wong [Wed, 3 Jul 2024 21:22:13 +0000 (14:22 -0700)]
xfs: standardize the btree maxrecs function parameters

Standardize the parameters in xfs_{alloc,bm,ino,rmap,refcount}bt_maxrecs
so that we have consistent calling conventions.  This doesn't affect the
kernel that much, but enables us to clean up userspace a bit.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
12 months ago xfs: rearrange xfs_iroot_realloc a bit
Darrick J. Wong [Wed, 3 Jul 2024 21:22:13 +0000 (14:22 -0700)]
xfs: rearrange xfs_iroot_realloc a bit

Rearrange the innards of xfs_iroot_realloc so that we can reduce
duplicated code prior to genericizing the function.  No functional
changes.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
12 months ago xfs: move the zero records logic into xfs_bmap_broot_space_calc
Darrick J. Wong [Wed, 3 Jul 2024 21:22:13 +0000 (14:22 -0700)]
xfs: move the zero records logic into xfs_bmap_broot_space_calc

The bmap btree cannot ever have zero records in an incore btree block.
If the number of records drops to zero, that means we're converting the
fork to extents format and are trying to remove the tree.  This logic
won't hold for the future realtime rmap btree, so move the logic into
the bmbt code.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
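The zero-records rule described above can be sketched in C as follows. This is only an illustration under stated assumptions: the function name, the header length, and the key and pointer sizes are hypothetical stand-ins, not the definitions from the XFS headers.

```c
#include <stddef.h>

/* Hypothetical sizes for illustration; not taken from the XFS headers. */
#define SKETCH_BLOCK_HDR_LEN	24u	/* assumed btree block header size */
#define SKETCH_KEY_LEN		8u	/* one 64-bit key */
#define SKETCH_PTR_LEN		8u	/* one 64-bit block pointer */

/*
 * Bytes needed for an incore bmbt-style root block that holds nrecs
 * key/pointer pairs.  A bmbt root with zero records means the fork is
 * being converted back to extents format, so no buffer is needed at all.
 * A btree that may legitimately hold an empty root would need a different
 * calculation, which is why this special case lives with the bmbt code.
 */
static size_t sketch_bmap_broot_space_calc(unsigned int nrecs)
{
	if (nrecs == 0)
		return 0;
	return SKETCH_BLOCK_HDR_LEN +
	       (size_t)nrecs * (SKETCH_KEY_LEN + SKETCH_PTR_LEN);
}
```

With these assumed sizes, three records need 24 + 3 * 16 = 72 bytes, while zero records need none.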
12 months ago xfs: hoist the code that moves the incore inode fork broot memory
Darrick J. Wong [Wed, 3 Jul 2024 21:22:13 +0000 (14:22 -0700)]
xfs: hoist the code that moves the incore inode fork broot memory

Whenever we change the size of the memory buffer holding an inode fork
btree root block, we have to copy the contents over.  Refactor all this
into a single function that handles both, in preparation for making
xfs_iroot_realloc more generic.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
12 months ago xfs: fix a sloppy memory handling bug in xfs_iroot_realloc
Darrick J. Wong [Wed, 3 Jul 2024 21:22:12 +0000 (14:22 -0700)]
xfs: fix a sloppy memory handling bug in xfs_iroot_realloc

While refactoring code, I noticed that when xfs_iroot_realloc tries to
shrink a bmbt root block, it allocates a smaller new block and then
copies "records" and pointers to the new block.  However, bmbt root
blocks cannot ever be leaves, which means that it's not technically
correct to copy records.  We /should/ be copying keys.

Note that this has never resulted in actual memory corruption because
sizeof(bmbt_rec) == (sizeof(bmbt_key) + sizeof(bmbt_ptr)).  However,
this will no longer be true when we start adding realtime rmap stuff,
so fix this now.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
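The size coincidence described above can be illustrated with a small C sketch. All type names here are hypothetical; the bmbt-like layouts are simplified stand-ins, and the rmap-like layouts are invented only to show a case where a record is not the same size as a key plus a pointer.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for bmbt ondisk types; layouts are assumptions. */
struct sketch_bmbt_rec { uint64_t l0, l1; };		/* 16 bytes */
struct sketch_bmbt_key { uint64_t br_startoff; };	/* 8 bytes */
typedef uint64_t sketch_ptr_t;				/* 8 bytes */

/* Invented rmap-like types where the coincidence does not hold. */
struct sketch_rmap_rec { uint32_t startblock, blockcount; uint64_t owner, offset; };
struct sketch_rmap_key { uint32_t startblock, pad; uint64_t owner, offset; };

/* For the bmbt-like types, copying "records" moves the right byte count. */
_Static_assert(sizeof(struct sketch_bmbt_rec) ==
	       sizeof(struct sketch_bmbt_key) + sizeof(sketch_ptr_t),
	       "bmbt-like record is exactly one key plus one pointer");

/* Bytes occupied by n entries in a leaf block (records only). */
static size_t sketch_leaf_bytes(size_t rec_len, unsigned int n)
{
	return (size_t)n * rec_len;
}

/* Bytes occupied by n entries in a node block (keys plus pointers). */
static size_t sketch_node_bytes(size_t key_len, size_t ptr_len, unsigned int n)
{
	return (size_t)n * (key_len + ptr_len);
}
```

For the bmbt-like types both helpers return the same byte count, which is why the mislabelled copy was harmless. For the invented rmap-like types they differ, so a node root must be sized and copied as keys and pointers.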
12 months ago xfs: refactor creation of bmap btree roots
Darrick J. Wong [Wed, 3 Jul 2024 21:22:12 +0000 (14:22 -0700)]
xfs: refactor creation of bmap btree roots

Now that we've created inode fork helpers to allocate and free btree
roots, create a new bmap btree helper to create a new bmbt root, and
refactor the extents <-> btree conversion functions to use our new
helpers.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
12 months ago xfs: refactor the allocation and freeing of incore inode fork btree roots
Darrick J. Wong [Wed, 3 Jul 2024 21:22:12 +0000 (14:22 -0700)]
xfs: refactor the allocation and freeing of incore inode fork btree roots

Refactor the code that allocates and frees the incore inode fork btree
roots.  This will help us disentangle some of the weird logic when we're
creating and tearing down inode-based btrees.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
12 months ago xfs: replace shouty XFS_BM{BT,DR} macros
Darrick J. Wong [Wed, 3 Jul 2024 21:22:12 +0000 (14:22 -0700)]
xfs: replace shouty XFS_BM{BT,DR} macros

Replace all the shouty bmap btree and bmap disk root macros with actual
functions, and fix a type handling error in the xattr code that the
macros previously didn't care about.

sed \
 -e 's/XFS_BMBT_BLOCK_LEN/xfs_bmbt_block_len/g' \
 -e 's/XFS_BMBT_REC_ADDR/xfs_bmbt_rec_addr/g' \
 -e 's/XFS_BMBT_KEY_ADDR/xfs_bmbt_key_addr/g' \
 -e 's/XFS_BMBT_PTR_ADDR/xfs_bmbt_ptr_addr/g' \
 -e 's/XFS_BMDR_REC_ADDR/xfs_bmdr_rec_addr/g' \
 -e 's/XFS_BMDR_KEY_ADDR/xfs_bmdr_key_addr/g' \
 -e 's/XFS_BMDR_PTR_ADDR/xfs_bmdr_ptr_addr/g' \
 -e 's/XFS_BMAP_BROOT_PTR_ADDR/xfs_bmap_broot_ptr_addr/g' \
 -e 's/XFS_BMAP_BROOT_SPACE_CALC/xfs_bmap_broot_space_calc/g' \
 -e 's/XFS_BMAP_BROOT_SPACE/xfs_bmap_broot_space/g' \
 -e 's/XFS_BMDR_SPACE_CALC/xfs_bmdr_space_calc/g' \
 -e 's/XFS_BMAP_BMDR_SPACE/xfs_bmap_bmdr_space/g' \
 -i $(git ls-files fs/xfs/*.[ch] fs/xfs/libxfs/*.[ch] fs/xfs/scrub/*.[ch])

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
12 months ago mkfs: format realtime groups
Darrick J. Wong [Wed, 3 Jul 2024 21:22:11 +0000 (14:22 -0700)]
mkfs: format realtime groups

Create filesystems with the realtime group feature enabled.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
12 months ago mkfs: add headers to realtime summary blocks
Darrick J. Wong [Wed, 3 Jul 2024 21:22:11 +0000 (14:22 -0700)]
mkfs: add headers to realtime summary blocks

When the rtgroups feature is enabled, format rtsummary blocks with the
appropriate block headers.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
12 months ago mkfs: add headers to realtime bitmap blocks
Darrick J. Wong [Wed, 3 Jul 2024 21:22:11 +0000 (14:22 -0700)]
mkfs: add headers to realtime bitmap blocks

When the rtgroups feature is enabled, format rtbitmap blocks with the
appropriate block headers.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
12 months ago xfs_scrub: use histograms to speed up phase 8 on the realtime volume
Darrick J. Wong [Wed, 3 Jul 2024 21:22:11 +0000 (14:22 -0700)]
xfs_scrub: use histograms to speed up phase 8 on the realtime volume

Use the same statistical methods that we use on the data volume to
compute the minimum threshold size for fstrims on the realtime volume.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
12 months ago xfs_scrub: trim realtime volumes too
Darrick J. Wong [Wed, 3 Jul 2024 21:22:11 +0000 (14:22 -0700)]
xfs_scrub: trim realtime volumes too

On the kernel side, the XFS realtime groups patchset added support for
FITRIM of the realtime volume.  This support doesn't actually require
there to be any realtime groups, so teach scrub to run through the whole
region.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
12 months ago xfs_scrub: call GETFSMAP for each rt group in parallel
Darrick J. Wong [Wed, 3 Jul 2024 21:22:10 +0000 (14:22 -0700)]
xfs_scrub: call GETFSMAP for each rt group in parallel

If realtime groups are enabled, we should take advantage of the sharding
to speed up the spacemap scans.  Do so by issuing per-rtgroup GETFSMAP
calls.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
12 months ago xfs_scrub: scrub realtime allocation group metadata
Darrick J. Wong [Wed, 3 Jul 2024 21:22:10 +0000 (14:22 -0700)]
xfs_scrub: scrub realtime allocation group metadata

Scan realtime group metadata as part of phase 2, just like we do for AG
metadata.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
12 months ago xfs_spaceman: report on realtime group health
Darrick J. Wong [Wed, 3 Jul 2024 21:22:10 +0000 (14:22 -0700)]
xfs_spaceman: report on realtime group health

Add the realtime group status to the health reporting done by
xfs_spaceman.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
12 months ago xfs_io: display rt group in verbose fsmap output
Darrick J. Wong [Wed, 3 Jul 2024 21:22:10 +0000 (14:22 -0700)]
xfs_io: display rt group in verbose fsmap output

Display the rt group number in the fsmap output, just like we do for
regular data files.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
12 months ago xfs_io: display rt group in verbose bmap output
Darrick J. Wong [Wed, 3 Jul 2024 21:22:10 +0000 (14:22 -0700)]
xfs_io: display rt group in verbose bmap output

Display the rt group number in the bmap -v output, just like we do for
regular data files.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
12 months ago xfs_io: add a command to display realtime group information
Darrick J. Wong [Wed, 3 Jul 2024 21:22:09 +0000 (14:22 -0700)]
xfs_io: add a command to display realtime group information

Add a new 'rginfo' command to xfs_io so that we can display realtime
group geometry.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
12 months ago xfs_io: add a command to display allocation group information
Darrick J. Wong [Wed, 3 Jul 2024 21:22:09 +0000 (14:22 -0700)]
xfs_io: add a command to display allocation group information

Add a new 'aginfo' command to xfs_io so that we can display allocation
group geometry.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
12 months ago xfs_io: support scrubbing rtgroup metadata
Darrick J. Wong [Wed, 3 Jul 2024 21:22:09 +0000 (14:22 -0700)]
xfs_io: support scrubbing rtgroup metadata

Support scrubbing all rtgroup metadata with a scrubv call.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
12 months ago xfs_mdrestore: restore rt group superblocks to realtime device
Darrick J. Wong [Wed, 3 Jul 2024 21:22:09 +0000 (14:22 -0700)]
xfs_mdrestore: restore rt group superblocks to realtime device

Support restoring realtime device metadata to the realtime device, if
the dumped filesystem had one.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
12 months ago xfs_db: dump rt summary blocks
Darrick J. Wong [Wed, 3 Jul 2024 21:22:09 +0000 (14:22 -0700)]
xfs_db: dump rt summary blocks

Now that rtsummary blocks have a header, make it so that xfs_db can
analyze the structure.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
12 months ago xfs_db: dump rt bitmap blocks
Darrick J. Wong [Wed, 3 Jul 2024 21:22:08 +0000 (14:22 -0700)]
xfs_db: dump rt bitmap blocks

Now that rtbitmap blocks have a header, make it so that xfs_db can
analyze the structure.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
12 months ago xfs_db: hoist bit scraping function
Darrick J. Wong [Wed, 3 Jul 2024 21:22:08 +0000 (14:22 -0700)]
xfs_db: hoist bit scraping function

Hoist the bit scraping code into a separate helper function so that we
can create a _le version later, and also enable some testing functions.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
12 months ago xfs_db: metadump realtime devices
Darrick J. Wong [Wed, 3 Jul 2024 21:22:08 +0000 (14:22 -0700)]
xfs_db: metadump realtime devices

Teach the metadump device to dump the filesystem metadata of a realtime
device to the metadump file.  Currently, this is limited to the realtime
superblock.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
12 months ago xfs_db: report rtgroups via version command
Darrick J. Wong [Wed, 3 Jul 2024 21:22:08 +0000 (14:22 -0700)]
xfs_db: report rtgroups via version command

Report the rtgroups feature in the version command output.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
12 months ago xfs_db: enable conversion of rt space units
Darrick J. Wong [Wed, 3 Jul 2024 21:22:07 +0000 (14:22 -0700)]
xfs_db: enable conversion of rt space units

Teach the xfs_db convert function about realtime extents, blocks, and
realtime group numbers.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
12 months ago xfs_db: implement check for rt superblocks
Darrick J. Wong [Wed, 3 Jul 2024 21:22:07 +0000 (14:22 -0700)]
xfs_db: implement check for rt superblocks

Implement the bare minimum needed to avoid xfs_check regressions when
realtime groups are enabled.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
12 months ago xfs_db: listify the definition of dbm_t
Darrick J. Wong [Wed, 3 Jul 2024 21:22:07 +0000 (14:22 -0700)]
xfs_db: listify the definition of dbm_t

Convert this enum definition to a list so that code adding elements to
the enum does not have to reflow the whole thing.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
12 months ago xfs_db: support changing the label and uuid of rt superblocks
Darrick J. Wong [Wed, 3 Jul 2024 21:22:07 +0000 (14:22 -0700)]
xfs_db: support changing the label and uuid of rt superblocks

Update the label and uuid commands to change the rt superblocks along
with the filesystem superblocks.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
12 months ago xfs_db: support dumping realtime superblocks
Darrick J. Wong [Wed, 3 Jul 2024 21:22:06 +0000 (14:22 -0700)]
xfs_db: support dumping realtime superblocks

Allow debugging of realtime superblocks, and add the relevant fields in
the fs superblock that point us at the existence and location of the rt
supers.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
12 months ago xfs_db: listify the definition of enum typnm
Darrick J. Wong [Wed, 3 Jul 2024 21:22:06 +0000 (14:22 -0700)]
xfs_db: listify the definition of enum typnm

Convert the enum definition into a list so that future patches adding
things to enum typnm don't have to reflow the entire thing.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
12 months ago xfs_repair: support adding rtgroups to a filesystem
Darrick J. Wong [Wed, 3 Jul 2024 21:22:06 +0000 (14:22 -0700)]
xfs_repair: support adding rtgroups to a filesystem

Allow users to add the rtgroups feature to a filesystem if the
filesystem does not already have a realtime volume attached.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
12 months ago xfs_repair: repair rtsummary block headers
Darrick J. Wong [Wed, 3 Jul 2024 21:22:06 +0000 (14:22 -0700)]
xfs_repair: repair rtsummary block headers

Check and repair the new block headers attached to rtsummary blocks.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
12 months ago xfs_repair: repair rtbitmap block headers
Darrick J. Wong [Wed, 3 Jul 2024 21:22:06 +0000 (14:22 -0700)]
xfs_repair: repair rtbitmap block headers

Check and repair the new block headers attached to rtbitmap blocks.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
12 months ago xfs_repair: improve rtbitmap discrepancy reporting
Darrick J. Wong [Wed, 3 Jul 2024 21:22:05 +0000 (14:22 -0700)]
xfs_repair: improve rtbitmap discrepancy reporting

Improve the reporting of discrepancies in the realtime bitmap and
summary files by creating a separate helper function that will pinpoint
the exact (word) locations of mismatches.  This will help developers to
diagnose problems with the rtgroups feature and users to figure out
exactly what's bad in a filesystem.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
12 months ago xfs_repair: support realtime groups
Darrick J. Wong [Wed, 3 Jul 2024 21:22:05 +0000 (14:22 -0700)]
xfs_repair: support realtime groups

Support the realtime group feature.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
12 months ago libxfs: implement some sanity checking for enormous rgcount
Darrick J. Wong [Wed, 3 Jul 2024 21:22:05 +0000 (14:22 -0700)]
libxfs: implement some sanity checking for enormous rgcount

Similar to what we do for suspiciously large sb_agcount values, if
someone tries to get libxfs to load a filesystem with a very large
realtime group count, let's do some basic checks of the rt device to
see if it's really that large.  If the read fails, only load the first
rtgroup and warn the user.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
12 months ago libfrog: report rt groups in output
Darrick J. Wong [Wed, 3 Jul 2024 21:22:05 +0000 (14:22 -0700)]
libfrog: report rt groups in output

Report realtime group geometry.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
12 months ago xfs: scrub each rtgroup's portion of the rtbitmap separately
Darrick J. Wong [Wed, 3 Jul 2024 21:22:04 +0000 (14:22 -0700)]
xfs: scrub each rtgroup's portion of the rtbitmap separately

Create a new scrub type code so that userspace can scrub each rtgroup's
portion of the rtbitmap file separately.  This reduces the long tail
latency that results from scanning the entire bitmap all at once, and
prepares us for future patchsets, wherein we'll need to be able to lock
a specific rtgroup so that we can rebuild that rtgroup's part of the
rtbitmap contents from the rtgroup's rmap btree.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
12 months ago xfs: scrub the realtime group superblock
Darrick J. Wong [Wed, 3 Jul 2024 21:22:04 +0000 (14:22 -0700)]
xfs: scrub the realtime group superblock

Enable scrubbing of realtime group superblocks.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
12 months ago xfs: use realtime EFI to free extents when rtgroups are enabled
Darrick J. Wong [Wed, 3 Jul 2024 21:22:04 +0000 (14:22 -0700)]
xfs: use realtime EFI to free extents when rtgroups are enabled

When rmap is enabled, XFS expects a certain order of operations, which
is: 1) remove the file mapping, 2) remove the reverse mapping, and then
3) free the blocks.  When reflink is enabled, XFS replaces (3) with a
deferred refcount decrement operation that can schedule freeing the
blocks if that was the last refcount.

For realtime files, xfs_bmap_del_extent_real tries to do 1 and 3 in the
same transaction, which will break both rmap and reflink unless we
switch it to use realtime EFIs.  Both rmap and reflink depend on the
rtgroups feature, so let's turn on EFIs for all rtgroups filesystems.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
12 months ago xfs: hold an active reference to an rtgroup while processing an EFI
Darrick J. Wong [Wed, 3 Jul 2024 21:22:04 +0000 (14:22 -0700)]
xfs: hold an active reference to an rtgroup while processing an EFI

While we're processing an EFI log item, maintain an active reference to
the rtgroup object so that it cannot go away underneath us.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
12 months ago xfs: use an incore rtgroup rotor for rtpick
Darrick J. Wong [Wed, 3 Jul 2024 21:22:04 +0000 (14:22 -0700)]
xfs: use an incore rtgroup rotor for rtpick

During the 6.7 merge window, Linus noticed that the realtime allocator
was doing some sketchy things trying to encode a u64 sequence counter
into the rtbitmap file's atime.  The sketchy casting of a struct pointer
to a u64 pointer has subtly broken several times over the past decade as
the codebase has transitioned to using the VFS i_atime field and that
field has changed in size and layout over time.

Since the goal of the rtpick code is to _suggest_ a starting place for
new rt file allocations, the repeated breakage has not resulted in
inconsistent metadata.  IOWs, it's a hint.

For rtgroups, we don't need this complex code to cut the rtextents space
into fractions.  Add an rtgroup rotor and use that for rtpick, similar
to AG rotoring on the data device.  The new rotor does not persist,
which reduces the logging overhead slightly.

Between this and the new restrictions on open-by-handle on metadir, it's
no longer possible for userspace to control the rtpick rotor.

Link: https://lore.kernel.org/linux-xfs/CAHk-=wj3oM3d-Hw2vvxys3KCZ9De+gBN7Gxr2jf96OTisL9udw@mail.gmail.com/
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
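The rotor idea described above can be sketched in C roughly as follows. This is a minimal illustration, not the kernel code: the structure, its field names, and the function name are hypothetical, and the sketch assumes rgcount is greater than zero.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical incore mount state; only the fields the sketch needs. */
struct sketch_mount {
	uint32_t	rgcount;	/* number of realtime groups, assumed > 0 */
	atomic_uint	rtgrotor;	/* incore only; never written to disk */
};

/*
 * Suggest the rtgroup where a new rt file allocation should start looking
 * for free space.  The result is only a hint: each call advances the
 * rotor by one and wraps around the group count, so successive new files
 * are spread across the groups in round-robin order.
 */
static uint32_t sketch_rtpick_group(struct sketch_mount *mp)
{
	return atomic_fetch_add(&mp->rtgrotor, 1) % mp->rgcount;
}
```

Because the counter lives only in memory, nothing has to be logged when it advances, and it restarts from zero on the next mount.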
12 months ago xfs: store rtgroup information with a bmap intent
Darrick J. Wong [Wed, 3 Jul 2024 21:22:03 +0000 (14:22 -0700)]
xfs: store rtgroup information with a bmap intent

Make the bmap intent items take an active reference to the rtgroup
containing the space that is being mapped or unmapped.  We will need
this functionality once we start enabling rmap and reflink on the rt
volume.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
12 months ago xfs: encode the rtsummary in big endian format
Darrick J. Wong [Wed, 3 Jul 2024 21:22:03 +0000 (14:22 -0700)]
xfs: encode the rtsummary in big endian format

Currently, the ondisk realtime summary file counters are accessed in
units of 32-bit words.  There's no endian translation of the contents of
this file, which means that Bad Things Happen(tm) if you go from
(say) x86 to powerpc.  Since we have a new feature flag, let's take the
opportunity to enforce an endianness on the file.  Encode the summary
information in big endian format, like most of the rest of the
filesystem.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
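Fixing the byte order of each 32-bit word can be sketched in C as below. The helper names are hypothetical and the sketch works on a plain byte buffer; the kernel uses its own endian conversion helpers instead.

```c
#include <stdint.h>

/* Store a 32-bit counter most-significant byte first (big endian). */
static void sketch_put_be32(uint8_t *buf, uint32_t v)
{
	buf[0] = (uint8_t)(v >> 24);
	buf[1] = (uint8_t)(v >> 16);
	buf[2] = (uint8_t)(v >> 8);
	buf[3] = (uint8_t)v;
}

/* Load a big-endian 32-bit counter, whatever the CPU's native order. */
static uint32_t sketch_get_be32(const uint8_t *buf)
{
	return ((uint32_t)buf[0] << 24) |
	       ((uint32_t)buf[1] << 16) |
	       ((uint32_t)buf[2] << 8) |
	       (uint32_t)buf[3];
}
```

Because both helpers address individual bytes, the same ondisk bytes decode to the same value on a little-endian host (such as x86) and on a big-endian one.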
12 months ago xfs: add block headers to realtime summary blocks
Darrick J. Wong [Wed, 3 Jul 2024 21:22:03 +0000 (14:22 -0700)]
xfs: add block headers to realtime summary blocks

Upgrade rtsummary blocks to have self describing metadata like most
every other thing in XFS.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
12 months ago xfs: encode the rtbitmap in big endian format
Darrick J. Wong [Wed, 3 Jul 2024 21:22:03 +0000 (14:22 -0700)]
xfs: encode the rtbitmap in big endian format

Currently, the ondisk realtime bitmap file is accessed in units of
32-bit words.  There's no endian translation of the contents of this
file, which means that Bad Things Happen(tm) if you go from (say)
x86 to powerpc.  Since we have a new feature flag, let's take the
opportunity to enforce an endianness on the file.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
12 months ago xfs: add block headers to realtime bitmap blocks
Darrick J. Wong [Wed, 3 Jul 2024 21:22:02 +0000 (14:22 -0700)]
xfs: add block headers to realtime bitmap blocks

Upgrade rtbitmap blocks to have self describing metadata like most every
other thing in XFS.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
12 months ago xfs: export the geometry of realtime groups to userspace
Darrick J. Wong [Wed, 3 Jul 2024 21:22:02 +0000 (14:22 -0700)]
xfs: export the geometry of realtime groups to userspace

Create an ioctl so that the kernel can report the status of realtime
groups to userspace.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
12 months ago xfs: define locking primitives for realtime groups
Darrick J. Wong [Wed, 3 Jul 2024 21:22:02 +0000 (14:22 -0700)]
xfs: define locking primitives for realtime groups

Define helper functions to lock all metadata inodes related to a
realtime group.  There's not much to look at now, but this will become
important when we add per-rtgroup metadata files and online fsck code
for them.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
12 months ago xfs: record rt group superblock errors in the health system
Darrick J. Wong [Wed, 3 Jul 2024 21:22:02 +0000 (14:22 -0700)]
xfs: record rt group superblock errors in the health system

Record the state of per-rtgroup metadata sickness in the rtgroup
structure for later reporting.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
12 months ago xfs: add frextents to the lazysbcounters when rtgroups enabled
Darrick J. Wong [Wed, 3 Jul 2024 21:22:01 +0000 (14:22 -0700)]
xfs: add frextents to the lazysbcounters when rtgroups enabled

Make the free rt extent count a part of the lazy sb counters when the
realtime groups feature is enabled.  This is possible because the patch
to recompute frextents from the rtbitmap during log recovery predates
the code adding rtgroup support, hence we know that the value will
always be correct during runtime.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
12 months ago xfs: add a helper to prevent bmap merges across rtgroup boundaries
Christoph Hellwig [Wed, 3 Jul 2024 21:22:01 +0000 (14:22 -0700)]
xfs: add a helper to prevent bmap merges across rtgroup boundaries

Except for the rt superblock, realtime groups do not store any metadata
at the start (or end) of the group.  There is nothing to prevent the
bmap code from merging allocations from multiple groups into a single
bmap record.  Add a helper to check for this case.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
[djwong: massage the commit message after pulling this into rtgroups]
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
12 months ago xfs: check that rtblock extents do not break rtsupers or rtgroups
Darrick J. Wong [Wed, 3 Jul 2024 21:22:01 +0000 (14:22 -0700)]
xfs: check that rtblock extents do not break rtsupers or rtgroups

Check that rt block pointers do not point to the realtime superblock and
that allocated rt space extents do not cross rtgroup boundaries.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
12 months ago xfs: create a separate helper to validate rt freespace extents
Darrick J. Wong [Wed, 3 Jul 2024 21:22:01 +0000 (14:22 -0700)]
xfs: create a separate helper to validate rt freespace extents

Realtime allocation groups are not like AGs on the data device which
have a superblock to ensure that a space extent cannot ever cross an AG
boundary.  To make sharding and fsck easier, we would like for allocated
space extents in the realtime volume never to cross an rtgroup boundary.

Therefore, we need two rtblock/rtextent predicates here.  In the next
patch, the xfs_verify_rtbext helper will check that the arguments do not
cross an rtgroup boundary.

However, the rtbitmap/summary files can describe free space extents that
/do/ cross rtgroup boundaries.  For them, create a xfs_verify_rt_freesp
helper that doesn't care about rtgroup boundaries.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
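The two predicates described above can be sketched in C as follows. The function names and parameters are hypothetical, and the sketch assumes equally sized groups addressed by a flat 64-bit rt block number; the real helpers take a mount structure and may differ in detail.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Free space check: the extent only has to be non-empty and fit inside
 * the realtime volume.  It may span several groups.
 */
static bool sketch_verify_rt_freesp(uint64_t start, uint64_t len,
				    uint64_t total_blocks)
{
	if (len == 0 || start >= total_blocks)
		return false;
	return len <= total_blocks - start;
}

/*
 * Allocated space check: in addition to fitting inside the volume, the
 * first and last block must fall in the same group.
 */
static bool sketch_verify_rtbext(uint64_t start, uint64_t len,
				 uint64_t total_blocks, uint64_t group_blocks)
{
	if (group_blocks == 0)
		return false;
	if (!sketch_verify_rt_freesp(start, len, total_blocks))
		return false;
	return start / group_blocks == (start + len - 1) / group_blocks;
}
```

For example, with groups of 100 blocks on a 300-block volume, an extent of 2 blocks starting at block 99 passes the free space check but fails the allocated space check, because it straddles the boundary between the first and second groups.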
12 months ago xfs: export realtime group geometry via XFS_FSOP_GEOM
Darrick J. Wong [Wed, 3 Jul 2024 21:22:01 +0000 (14:22 -0700)]
xfs: export realtime group geometry via XFS_FSOP_GEOM

Export the realtime geometry information so that userspace can query it.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
12 months ago xfs: grow the realtime section when realtime groups are enabled
Darrick J. Wong [Wed, 3 Jul 2024 21:22:00 +0000 (14:22 -0700)]
xfs: grow the realtime section when realtime groups are enabled

Enable growing the rt section when realtime groups are enabled.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
12 months ago xfs: update primary realtime super every time we update the primary fs super
Darrick J. Wong [Wed, 3 Jul 2024 21:22:00 +0000 (14:22 -0700)]
xfs: update primary realtime super every time we update the primary fs super

Every time we update parts of the primary filesystem superblock that are
echoed in the primary rt super, we should update that primary realtime
super.  Avoid an ondisk log format change by using ordered buffers to
write the primary rt super.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
12 months ago xfs: define the format of rt groups
Darrick J. Wong [Wed, 3 Jul 2024 21:22:00 +0000 (14:22 -0700)]
xfs: define the format of rt groups

Define the ondisk format of realtime group metadata, and a superblock
for realtime volumes.  rt supers are protected by a separate rocompat
bit so that we can leave them off if the rt device is zoned.

Add an xfs_sb_version_hasrtgroups helper so that xfs_repair knows how to zero
the tail of superblocks.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
12 months ago xfs: create incore realtime group structures
Darrick J. Wong [Wed, 3 Jul 2024 21:22:00 +0000 (14:22 -0700)]
xfs: create incore realtime group structures

Create an incore object that will contain information about a realtime
allocation group.  This will eventually enable us to shard the realtime
section in a similar manner to how we shard the data section.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
12 months ago xfs_logprint: report realtime EFIs
Darrick J. Wong [Wed, 3 Jul 2024 21:21:59 +0000 (14:21 -0700)]
xfs_logprint: report realtime EFIs

Decode the EFI format just enough to report if an EFI targets the
realtime device or not.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
12 months ago xfs: support error injection when freeing rt extents
Darrick J. Wong [Wed, 3 Jul 2024 21:21:59 +0000 (14:21 -0700)]
xfs: support error injection when freeing rt extents

A handful of fstests expect to be able to test what happens when extent
free intents fail to actually free the extent.  Now that we're
supporting EFIs for realtime extents, add to xfs_rtfree_extent the same
injection point that exists in the regular extent freeing code.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
12 months ago xfs: support logging EFIs for realtime extents
Darrick J. Wong [Wed, 3 Jul 2024 21:21:59 +0000 (14:21 -0700)]
xfs: support logging EFIs for realtime extents

Teach the EFI mechanism how to free realtime extents.  We're going to
need this to enforce proper ordering of operations when we enable
realtime rmap.

Declare a new log intent item type (XFS_LI_EFI_RT) and a separate defer
ops for rt extents.  This keeps the ondisk artifacts and processing code
completely separate between the rt and non-rt cases.  Hopefully this
will make it easier to debug filesystem problems.

Previous versions of this patch accomplished this by setting the high
bit in each rt EFI extent.  Reviewers found that approach to be less
transparent.

[Contains a bug fix and cleanups from hch]

Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
12 months ago xfs: simplify xfs_rtalloc_query_range
Christoph Hellwig [Wed, 3 Jul 2024 21:21:59 +0000 (14:21 -0700)]
xfs: simplify xfs_rtalloc_query_range

There isn't much of a good reason to pass the xfs_rtalloc_rec structures
that describe extents to xfs_rtalloc_query_range as we really just want
a lower and upper bound xfs_rtxnum_t.  Pass the rtxnum directly and
simplify the interface.
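
A range query that takes plain lower and upper bounds might look like
the sketch below.  Everything here is an assumption made for
illustration: rt_query_range, rt_query_fn, count_free and the bool
array standing in for the rt bitmap are hypothetical, and this is not
the real xfs_rtalloc_query_range signature.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback, invoked once per run of free rt extents. */
typedef int (*rt_query_fn)(uint64_t start, uint64_t len, void *priv);

/*
 * Walk the rt extents in [low, high] and report each free run.  The
 * bounds are plain extent numbers rather than record structures.
 */
static int rt_query_range(const bool *is_free, uint64_t nr_rtx,
			  uint64_t low, uint64_t high,
			  rt_query_fn fn, void *priv)
{
	uint64_t rtx, start = 0, len = 0;
	int error;

	assert(fn != NULL);
	if (low > high || low >= nr_rtx)
		return -1;
	if (high >= nr_rtx)
		high = nr_rtx - 1;	/* clamp to the end of the volume */

	for (rtx = low; rtx <= high; rtx++) {
		if (is_free[rtx]) {
			if (len == 0)
				start = rtx;
			len++;
			continue;
		}
		if (len) {
			/* An allocated extent ends the current free run. */
			error = fn(start, len, priv);
			if (error)
				return error;
			len = 0;
		}
	}
	/* Report a free run that extends to the upper bound. */
	return len ? fn(start, len, priv) : 0;
}

/* Example callback: add up the number of free rt extents seen. */
static int count_free(uint64_t start, uint64_t len, void *priv)
{
	(void)start;
	*(uint64_t *)priv += len;
	return 0;
}
```

A caller then passes two numbers instead of filling out two records
whose other fields would be ignored.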

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
12 months ago xfs: remove xfs_rtb_to_rtxrem
Christoph Hellwig [Wed, 3 Jul 2024 21:21:59 +0000 (14:21 -0700)]
xfs: remove xfs_rtb_to_rtxrem

Reduce the number of block number conversion helpers by removing
xfs_rtb_to_rtxrem.  Any recent compiler is smart enough to eliminate
the double divisions if using separate xfs_rtb_to_rtx and
xfs_rtb_to_rtxoff calls.
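
As a rough sketch of the two remaining conversions, assuming a plain
integer rt extent size (the names rtb_to_rtx and rtb_to_rtxoff and
their signatures are simplified stand-ins, not the real helpers):

```c
#include <assert.h>
#include <stdint.h>

/* rt block number -> rt extent number (the quotient). */
static inline uint64_t rtb_to_rtx(uint64_t rtbno, uint32_t rextsize)
{
	assert(rextsize != 0);
	return rtbno / rextsize;
}

/* rt block number -> block offset within its rt extent (the remainder). */
static inline uint32_t rtb_to_rtxoff(uint64_t rtbno, uint32_t rextsize)
{
	assert(rextsize != 0);
	return (uint32_t)(rtbno % rextsize);
}
```

When both are called back to back with the same arguments, an
optimizing compiler can usually derive the quotient and the remainder
from a single division, which is the point made above.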

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
12 months ago libxfs: remove duplicate rtalloc declarations in libxfs.h
Christoph Hellwig [Tue, 9 Jul 2024 05:46:15 +0000 (22:46 -0700)]
libxfs: remove duplicate rtalloc declarations in libxfs.h

These already come from xfs_rtbitmap.h.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
[djwong: libxfs, not xfs]
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
12 months ago xfs: remove XFS_ILOCK_RT*
Darrick J. Wong [Wed, 3 Jul 2024 21:21:58 +0000 (14:21 -0700)]
xfs: remove XFS_ILOCK_RT*

Now that we've centralized the realtime metadata locking routines, get
rid of the ILOCK subclasses since we now use explicit lockdep classes.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
12 months ago xfs_db: allow setting current address to log blocks
Darrick J. Wong [Wed, 3 Jul 2024 21:21:58 +0000 (14:21 -0700)]
xfs_db: allow setting current address to log blocks

Add commands so that users can target blocks on an external log device.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
12 months ago xfs_db: convert rtsummary geometry
Darrick J. Wong [Wed, 3 Jul 2024 21:21:58 +0000 (14:21 -0700)]
xfs_db: convert rtsummary geometry

Teach the rtconvert command to be able to convert realtime blocks and
extents to locations within the rt summary.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
12 months ago xfs_db: convert rtbitmap geometry
Darrick J. Wong [Wed, 3 Jul 2024 21:21:58 +0000 (14:21 -0700)]
xfs_db: convert rtbitmap geometry

Teach the rtconvert command to be able to convert realtime blocks and
extents to locations within the rt bitmap.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
12 months ago xfs_db: enable conversion of rt space units
Darrick J. Wong [Wed, 3 Jul 2024 21:21:58 +0000 (14:21 -0700)]
xfs_db: enable conversion of rt space units

Teach the xfs_db convert function about rt extents, rt block numbers,
and how to compute offsets within the rt bitmap and summary files.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
12 months ago xfs_db: access arbitrary realtime blocks and extents
Darrick J. Wong [Wed, 3 Jul 2024 21:21:57 +0000 (14:21 -0700)]
xfs_db: access arbitrary realtime blocks and extents

Add two commands to xfs_db so that we can point ourselves at any
arbitrary realtime block or extent.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
12 months ago xfs_db: access realtime file blocks
Darrick J. Wong [Wed, 3 Jul 2024 21:21:57 +0000 (14:21 -0700)]
xfs_db: access realtime file blocks

Now that we have the ability to point the io cursor at the realtime
device, let's make it so that the "dblock" command can walk the contents
of realtime files.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
12 months ago xfs_db: make the daddr command target the realtime device
Darrick J. Wong [Wed, 3 Jul 2024 21:21:57 +0000 (14:21 -0700)]
xfs_db: make the daddr command target the realtime device

Make it so that users can issue the command "daddr -r XXX" to select
disk block XXX on the realtime device.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
12 months ago xfs_db: report the realtime device when associated with each io cursor
Darrick J. Wong [Wed, 3 Jul 2024 21:21:57 +0000 (14:21 -0700)]
xfs_db: report the realtime device when associated with each io cursor

When db is reporting on an io cursor and the cursor points to the
realtime device, print that fact.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
12 months ago xfs_db: support passing the realtime device to the debugger
Darrick J. Wong [Wed, 3 Jul 2024 21:21:56 +0000 (14:21 -0700)]
xfs_db: support passing the realtime device to the debugger

Create a new -R flag so that sysadmins can pass the realtime device to
the xfs debugger.  Since we can now have superblocks on the rt device,
we need this to be able to inspect/dump/etc.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
12 months ago mkfs: add a utility to generate protofiles
Darrick J. Wong [Wed, 3 Jul 2024 21:21:56 +0000 (14:21 -0700)]
mkfs: add a utility to generate protofiles

Add a new utility to generate mkfs protofiles from a directory tree.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
12 months ago mkfs.xfs: enable metadata directories
Darrick J. Wong [Wed, 3 Jul 2024 21:21:56 +0000 (14:21 -0700)]
mkfs.xfs: enable metadata directories

Enable formatting filesystems with metadata directories.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
12 months ago xfs_repair: allow sysadmins to add metadata directories
Darrick J. Wong [Wed, 3 Jul 2024 21:21:56 +0000 (14:21 -0700)]
xfs_repair: allow sysadmins to add metadata directories

Allow the sysadmin to use xfs_repair to upgrade an existing filesystem
to support metadata directories.  This will be needed to upgrade
filesystems to support realtime rmap and reflink.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
12 months ago xfs_repair: do not count metadata directory files when doing quotacheck
Darrick J. Wong [Wed, 3 Jul 2024 21:21:56 +0000 (14:21 -0700)]
xfs_repair: do not count metadata directory files when doing quotacheck

Previously, we stated that files in the metadata directory tree are not
counted in the dquot information.  Fix the offline quotacheck code in
xfs_repair and xfs_check to reflect this.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
12 months ago xfs_repair: truncate and unmark orphaned metadata inodes
Darrick J. Wong [Wed, 3 Jul 2024 21:21:55 +0000 (14:21 -0700)]
xfs_repair: truncate and unmark orphaned metadata inodes

If an inode claims to be a metadata inode but wasn't linked in either
directory tree, remove the attr fork and reset the data fork if the
contents weren't regular extent mappings before moving the inode to the
lost+found.

We don't ifree the inode, because it's possible that the inode was not
actually a metadata inode but simply got corrupted due to bitflips or
something, and we'd rather let the sysadmin examine what's left of the
file instead of photorec'ing it.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
12 months ago xfs_repair: drop all the metadata directory files during pass 4
Darrick J. Wong [Wed, 3 Jul 2024 21:21:55 +0000 (14:21 -0700)]
xfs_repair: drop all the metadata directory files during pass 4

Drop the entire metadata directory tree during pass 4 so that we can
reinitialize the entire tree in phase 6.  The existing metadata files
(rtbitmap, rtsummary, quotas) will be reattached to the newly rebuilt
directory tree.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
12 months ago xfs_repair: metadata dirs are never plausible root dirs
Darrick J. Wong [Wed, 3 Jul 2024 21:21:55 +0000 (14:21 -0700)]
xfs_repair: metadata dirs are never plausible root dirs

Metadata directories are never candidates to be the root of the
user-accessible directory tree.  Update has_plausible_rootdir to ignore
them all, and to detect the case where the superblock incorrectly
thinks both trees have the same root.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
12 months ago xfs_repair: adjust keep_fsinos to handle metadata directories
Darrick J. Wong [Wed, 3 Jul 2024 21:21:55 +0000 (14:21 -0700)]
xfs_repair: adjust keep_fsinos to handle metadata directories

In keep_fsinos, mark the root of the metadata directory tree as inuse.
The realtime bitmap and summary files still come after the root
directories, so this is a fairly simple change to the loop test.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
12 months ago xfs_repair: mark space used by metadata files
Darrick J. Wong [Wed, 3 Jul 2024 21:21:54 +0000 (14:21 -0700)]
xfs_repair: mark space used by metadata files

Track space used by metadata files as a separate incore extent type.
This ensures that we can warn about cross-linked metadata files, even
though we are going to rebuild the entire metadata directory tree in the
end.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
12 months ago xfs_repair: pass private data pointer to scan_lbtree
Darrick J. Wong [Wed, 3 Jul 2024 21:21:54 +0000 (14:21 -0700)]
xfs_repair: pass private data pointer to scan_lbtree

Pass a private data pointer through scan_lbtree.  We'll use this
later when scanning the rtrmapbt to keep track of scan state.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>