www.infradead.org Git - users/hch/xfsprogs.git/log
11 months ago xfs: scrub each rtgroup's portion of the rtbitmap separately
Darrick J. Wong [Wed, 3 Jul 2024 21:22:04 +0000 (14:22 -0700)]
xfs: scrub each rtgroup's portion of the rtbitmap separately

Create a new scrub type code so that userspace can scrub each rtgroup's
portion of the rtbitmap file separately.  This reduces the long tail
latency that results from scanning the entire bitmap all at once, and
prepares us for future patchsets, wherein we'll need to be able to lock
a specific rtgroup so that we can rebuild that rtgroup's part of the
rtbitmap contents from the rtgroup's rmap btree.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
11 months ago xfs: scrub the realtime group superblock
Darrick J. Wong [Wed, 3 Jul 2024 21:22:04 +0000 (14:22 -0700)]
xfs: scrub the realtime group superblock

Enable scrubbing of realtime group superblocks.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
11 months ago xfs: use realtime EFI to free extents when rtgroups are enabled
Darrick J. Wong [Wed, 3 Jul 2024 21:22:04 +0000 (14:22 -0700)]
xfs: use realtime EFI to free extents when rtgroups are enabled

When rmap is enabled, XFS expects a certain order of operations, which
is: 1) remove the file mapping, 2) remove the reverse mapping, and then
3) free the blocks.  When reflink is enabled, XFS replaces (3) with a
deferred refcount decrement operation that can schedule freeing the
blocks if that was the last refcount.

For realtime files, xfs_bmap_del_extent_real tries to do 1 and 3 in the
same transaction, which will break both rmap and reflink unless we
switch it to use realtime EFIs.  Both rmap and reflink depend on the
rtgroups feature, so let's turn on EFIs for all rtgroups filesystems.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
11 months ago xfs: hold an active reference to an rtgroup while processing an EFI
Darrick J. Wong [Wed, 3 Jul 2024 21:22:04 +0000 (14:22 -0700)]
xfs: hold an active reference to an rtgroup while processing an EFI

While we're processing an EFI log item, maintain an active reference to
the rtgroup object so that it cannot go away underneath us.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
11 months ago xfs: use an incore rtgroup rotor for rtpick
Darrick J. Wong [Wed, 3 Jul 2024 21:22:04 +0000 (14:22 -0700)]
xfs: use an incore rtgroup rotor for rtpick

During the 6.7 merge window, Linus noticed that the realtime allocator
was doing some sketchy things trying to encode a u64 sequence counter
into the rtbitmap file's atime.  The sketchy casting of a struct pointer
to a u64 pointer has subtly broken several times over the past decade as
the codebase has transitioned to using the VFS i_atime field and that
field has changed in size and layout over time.

Since the goal of the rtpick code is to _suggest_ a starting place for
new rt file allocations, the repeated breakage has not resulted in
inconsistent metadata.  IOWs, it's a hint.

For rtgroups, we don't need this complex code to cut the rtextents space
into fractions.  Add an rtgroup rotor and use that for rtpick, similar
to AG rotoring on the data device.  The new rotor does not persist,
which reduces the logging overhead slightly.
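
The rotor idea above can be sketched roughly as follows. This is a minimal illustration only, not the code from this patch: the structure, its fields, and the function name are hypothetical, and the sketch ignores concurrency (a real implementation would need an atomic counter or a lock).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical in-memory state; names are illustrative, not the XFS ones. */
struct sketch_mount {
	uint32_t	rgcount;	/* number of realtime groups, must be > 0 */
	uint32_t	rotor;		/* next group to suggest; never written to disk */
};

/*
 * Suggest the group in which the next new rt file should start allocating,
 * then advance the rotor so that the following caller gets the next group.
 * The result is only a hint, so losing the rotor at unmount is harmless.
 */
static uint32_t sketch_pick_rtgroup(struct sketch_mount *mp)
{
	uint32_t	rgno = mp->rotor % mp->rgcount;

	mp->rotor = (rgno + 1) % mp->rgcount;
	return rgno;
}
```

Because the rotor lives only in memory, advancing it dirties no inode and needs no transaction, which is where the reduced logging overhead comes from.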

Between this and the new restrictions on open-by-handle on metadir, it's
no longer possible for userspace to control the rtpick rotor.

Link: https://lore.kernel.org/linux-xfs/CAHk-=wj3oM3d-Hw2vvxys3KCZ9De+gBN7Gxr2jf96OTisL9udw@mail.gmail.com/
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
11 months ago xfs: store rtgroup information with a bmap intent
Darrick J. Wong [Wed, 3 Jul 2024 21:22:03 +0000 (14:22 -0700)]
xfs: store rtgroup information with a bmap intent

Make the bmap intent items take an active reference to the rtgroup
containing the space that is being mapped or unmapped.  We will need
this functionality once we start enabling rmap and reflink on the rt
volume.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
11 months ago xfs: encode the rtsummary in big endian format
Darrick J. Wong [Wed, 3 Jul 2024 21:22:03 +0000 (14:22 -0700)]
xfs: encode the rtsummary in big endian format

Currently, the ondisk realtime summary file counters are accessed in
units of 32-bit words.  There's no endian translation of the contents of
this file, which means that Bad Things Happen(tm) if you go from
(say) x86 to powerpc.  Since we have a new feature flag, let's take the
opportunity to enforce an endianness on the file.  Encode the summary
information in big endian format, like most of the rest of the
filesystem.
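
As a rough illustration of what enforcing an endianness means for a 32-bit counter, the sketch below stores and loads a word in big endian byte order regardless of the host CPU. The helper names are hypothetical and this is not the libxfs code, which has its own endian conversion helpers.

```c
#include <assert.h>
#include <stdint.h>

/* Store a 32-bit counter into a disk buffer, most significant byte first. */
static void sketch_put_be32(uint8_t *p, uint32_t v)
{
	p[0] = (uint8_t)(v >> 24);
	p[1] = (uint8_t)(v >> 16);
	p[2] = (uint8_t)(v >> 8);
	p[3] = (uint8_t)v;
}

/* Load a 32-bit counter from a disk buffer written by sketch_put_be32(). */
static uint32_t sketch_get_be32(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}
```

Since both helpers work byte by byte, a buffer written on a little endian machine decodes to the same value on a big endian one.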

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
11 months ago xfs: add block headers to realtime summary blocks
Darrick J. Wong [Wed, 29 May 2024 04:11:20 +0000 (21:11 -0700)]
xfs: add block headers to realtime summary blocks

Upgrade rtsummary blocks to have self describing metadata like most
every other thing in XFS.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
11 months ago xfs: encode the rtbitmap in big endian format
Darrick J. Wong [Wed, 3 Jul 2024 21:22:03 +0000 (14:22 -0700)]
xfs: encode the rtbitmap in big endian format

Currently, the ondisk realtime bitmap file is accessed in units of
32-bit words.  There's no endian translation of the contents of this
file, which means that Bad Things Happen(tm) if you go from (say)
x86 to powerpc.  Since we have a new feature flag, let's take the
opportunity to enforce an endianness on the file.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
11 months ago xfs: add block headers to realtime bitmap blocks
Darrick J. Wong [Wed, 29 May 2024 04:11:19 +0000 (21:11 -0700)]
xfs: add block headers to realtime bitmap blocks

Upgrade rtbitmap blocks to have self describing metadata like most every
other thing in XFS.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
11 months ago xfs: export the geometry of realtime groups to userspace
Darrick J. Wong [Wed, 29 May 2024 04:11:18 +0000 (21:11 -0700)]
xfs: export the geometry of realtime groups to userspace

Create an ioctl so that the kernel can report the status of realtime
groups to userspace.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
11 months ago xfs: define locking primitives for realtime groups
Darrick J. Wong [Wed, 29 May 2024 04:11:18 +0000 (21:11 -0700)]
xfs: define locking primitives for realtime groups

Define helper functions to lock all metadata inodes related to a
realtime group.  There's not much to look at now, but this will become
important when we add per-rtgroup metadata files and online fsck code
for them.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
11 months ago xfs: record rt group superblock errors in the health system
Darrick J. Wong [Wed, 3 Jul 2024 21:22:02 +0000 (14:22 -0700)]
xfs: record rt group superblock errors in the health system

Record the state of per-rtgroup metadata sickness in the rtgroup
structure for later reporting.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
11 months ago xfs: add frextents to the lazysbcounters when rtgroups enabled
Darrick J. Wong [Wed, 3 Jul 2024 21:22:01 +0000 (14:22 -0700)]
xfs: add frextents to the lazysbcounters when rtgroups enabled

Make the free rt extent count a part of the lazy sb counters when the
realtime groups feature is enabled.  This is possible because the patch
to recompute frextents from the rtbitmap during log recovery predates
the code adding rtgroup support, hence we know that the value will
always be correct during runtime.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
11 months ago xfs: add a helper to prevent bmap merges across rtgroup boundaries
Christoph Hellwig [Wed, 3 Jul 2024 21:22:01 +0000 (14:22 -0700)]
xfs: add a helper to prevent bmap merges across rtgroup boundaries

Except for the rt superblock, realtime groups do not store any metadata
at the start (or end) of the group.  There is nothing to prevent the
bmap code from merging allocations from multiple groups into a single
bmap record.  Add a helper to check for this case.
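
A check of this kind might look like the sketch below. It assumes, purely for illustration, that rt blocks are numbered linearly and that every group holds the same number of blocks (rgsize > 0); the function name and parameters are hypothetical and do not reflect the helper added by this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Decide whether a mapping starting at right_start may be merged onto the
 * end of the mapping [left_start, left_start + left_len).  The two must be
 * physically contiguous and must start in the same group; contiguity alone
 * is not enough because nothing sits between neighbouring groups.
 */
static bool sketch_can_merge(uint64_t left_start, uint64_t left_len,
			     uint64_t right_start, uint64_t rgsize)
{
	if (left_start + left_len != right_start)
		return false;
	return left_start / rgsize == right_start / rgsize;
}
```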

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
[djwong: massage the commit message after pulling this into rtgroups]
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
11 months ago xfs: check that rtblock extents do not break rtsupers or rtgroups
Darrick J. Wong [Wed, 29 May 2024 04:11:16 +0000 (21:11 -0700)]
xfs: check that rtblock extents do not break rtsupers or rtgroups

Check that rt block pointers do not point to the realtime superblock and
that allocated rt space extents do not cross rtgroup boundaries.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
11 months ago xfs: create a separate helper to validate rt freespace extents
Darrick J. Wong [Wed, 3 Jul 2024 21:22:01 +0000 (14:22 -0700)]
xfs: create a separate helper to validate rt freespace extents

Realtime allocation groups are not like AGs on the data device which
have a superblock to ensure that a space extent cannot ever cross an AG
boundary.  To make sharding and fsck easier, we would like for allocated
space extents in the realtime volume never to cross an rtgroup boundary.

Therefore, we need two rtblock/rtextent predicates here.  In the next
patch, the xfs_verify_rtbext helper will check that the arguments do not
cross an rtgroup boundary.

However, the rtbitmap/summary files can describe free space extents that
/do/ cross rtgroup boundaries.  For them, create a xfs_verify_rt_freesp
helper that doesn't care about rtgroup boundaries.
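
The difference between the two predicates can be sketched as below. The names, the linear block numbering, and the fixed group size (rgsize > 0) are assumptions made for illustration; the real helpers take different arguments.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Free space check: the extent only has to fit inside the rt volume. */
static bool sketch_verify_freesp(uint64_t start, uint64_t len, uint64_t total)
{
	return len > 0 && start < total && len <= total - start;
}

/*
 * Allocated space check: the extent must fit inside the volume and its
 * first and last blocks must fall in the same group.
 */
static bool sketch_verify_alloc(uint64_t start, uint64_t len, uint64_t total,
				uint64_t rgsize)
{
	if (!sketch_verify_freesp(start, len, total))
		return false;
	return start / rgsize == (start + len - 1) / rgsize;
}
```

With a 300-block volume split into groups of 100 blocks, the extent starting at block 90 with length 20 passes the free space check but fails the allocated space check, because it straddles the boundary between the first two groups.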

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
11 months ago xfs: export realtime group geometry via XFS_FSOP_GEOM
Darrick J. Wong [Wed, 29 May 2024 04:11:16 +0000 (21:11 -0700)]
xfs: export realtime group geometry via XFS_FSOP_GEOM

Export the realtime geometry information so that userspace can query it.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
11 months ago xfs: grow the realtime section when realtime groups are enabled
Darrick J. Wong [Wed, 3 Jul 2024 21:22:00 +0000 (14:22 -0700)]
xfs: grow the realtime section when realtime groups are enabled

Enable growing the rt section when realtime groups are enabled.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
11 months ago xfs: update realtime super every time we update the primary fs super
Darrick J. Wong [Wed, 29 May 2024 04:11:13 +0000 (21:11 -0700)]
xfs: update realtime super every time we update the primary fs super

Every time we update parts of the primary filesystem superblock that are
echoed in the rt superblock, we must update the rt super.  Avoid
changing the log to support logging to the rt device by using ordered
buffers.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
11 months ago xfs: define the format of rt groups
Darrick J. Wong [Wed, 29 May 2024 04:11:12 +0000 (21:11 -0700)]
xfs: define the format of rt groups

Define the ondisk format of realtime group metadata, and a superblock
for realtime volumes.  rt supers are protected by a separate rocompat
bit so that we can leave them off if the rt device is zoned.

Add a xfs_sb_version_hasrtgroups so that xfs_repair knows how to zero
the tail of superblocks.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
11 months ago xfs: create incore realtime group structures
Darrick J. Wong [Wed, 29 May 2024 04:11:11 +0000 (21:11 -0700)]
xfs: create incore realtime group structures

Create an incore object that will contain information about a realtime
allocation group.  This will eventually enable us to shard the realtime
section in a similar manner to how we shard the data section.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
11 months ago xfs_logprint: report realtime EFIs
Darrick J. Wong [Wed, 3 Jul 2024 21:21:59 +0000 (14:21 -0700)]
xfs_logprint: report realtime EFIs

Decode the EFI format just enough to report if an EFI targets the
realtime device or not.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
11 months ago xfs: support error injection when freeing rt extents
Darrick J. Wong [Wed, 3 Jul 2024 21:21:59 +0000 (14:21 -0700)]
xfs: support error injection when freeing rt extents

A handful of fstests expect to be able to test what happens when extent
free intents fail to actually free the extent.  Now that we're
supporting EFIs for realtime extents, add to xfs_rtfree_extent the same
injection point that exists in the regular extent freeing code.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
11 months ago xfs: support logging EFIs for realtime extents
Darrick J. Wong [Wed, 3 Jul 2024 21:21:59 +0000 (14:21 -0700)]
xfs: support logging EFIs for realtime extents

Teach the EFI mechanism how to free realtime extents.  We're going to
need this to enforce proper ordering of operations when we enable
realtime rmap.

Declare a new log intent item type (XFS_LI_EFI_RT) and a separate defer
ops for rt extents.  This keeps the ondisk artifacts and processing code
completely separate between the rt and non-rt cases.  Hopefully this
will make it easier to debug filesystem problems.

Previous versions of this patch accomplished this by setting the high
bit in each rt EFI extent.  This was found to be less transparent by
reviewers.

[Contains a bug fix and cleanups from hch]

Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
11 months ago xfs: simplify xfs_rtalloc_query_range
Christoph Hellwig [Wed, 3 Jul 2024 21:21:59 +0000 (14:21 -0700)]
xfs: simplify xfs_rtalloc_query_range

There isn't much of a good reason to pass the xfs_rtalloc_rec structures
that describe extents to xfs_rtalloc_query_range as we really just want
a lower and upper bound xfs_rtxnum_t.  Pass the rtxnum directly and
simplify the interface.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
11 months ago xfs: remove xfs_rtb_to_rtxrem
Christoph Hellwig [Wed, 3 Jul 2024 21:21:59 +0000 (14:21 -0700)]
xfs: remove xfs_rtb_to_rtxrem

Reduce the number of block number conversion helpers by removing
xfs_rtb_to_rtxrem.  Any recent compiler is smart enough to eliminate
the double divisions if using separate xfs_rtb_to_rtx and
xfs_rtb_to_rtxoff calls.
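
To illustrate the point about separate calls, the sketch below computes the rt extent number and the offset within that extent with two independent helpers. The helper names and the plain integer parameters are hypothetical stand-ins; the real xfs_rtb_to_rtx and xfs_rtb_to_rtxoff have different signatures.

```c
#include <assert.h>
#include <stdint.h>

/* Which rt extent does this rt block belong to?  rextsize must be > 0. */
static uint64_t sketch_rtb_to_rtx(uint64_t rtbno, uint32_t rextsize)
{
	return rtbno / rextsize;
}

/* How far into that rt extent is this rt block? */
static uint32_t sketch_rtb_to_rtxoff(uint64_t rtbno, uint32_t rextsize)
{
	return (uint32_t)(rtbno % rextsize);
}
```

When both helpers are inlined with the same operands, an optimizing compiler can typically derive the quotient and the remainder from a single division, so a combined helper buys little.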

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
11 months ago libxfs: remove duplicate rtalloc declarations in libxfs.h
Christoph Hellwig [Tue, 9 Jul 2024 05:46:15 +0000 (22:46 -0700)]
libxfs: remove duplicate rtalloc declarations in libxfs.h

These already come from xfs_rtbitmap.h.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
[djwong: libxfs, not xfs]
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
11 months ago xfs: remove XFS_ILOCK_RT*
Darrick J. Wong [Wed, 29 May 2024 04:11:10 +0000 (21:11 -0700)]
xfs: remove XFS_ILOCK_RT*

Now that we've centralized the realtime metadata locking routines, get
rid of the ILOCK subclasses since we now use explicit lockdep classes.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
11 months ago xfs_repair: refactor generate_rtinfo
Christoph Hellwig [Tue, 30 Jul 2024 20:58:38 +0000 (13:58 -0700)]
xfs_repair: refactor generate_rtinfo

Move the allocation of the computed values into generate_rtinfo, and thus
make the variables holding them private in rt.c, and clean up a few
formatting nits.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
[djwong: move functions to fix build errors]
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
11 months ago xfs_repair: stop preallocating blocks in mk_rbmino and mk_rsumino
Christoph Hellwig [Tue, 30 Jul 2024 20:31:50 +0000 (13:31 -0700)]
xfs_repair: stop preallocating blocks in mk_rbmino and mk_rsumino

Now that repair is using libxfs_rtfile_initialize_blocks to write to the
rtbitmap and rtsummary inodes, space allocation is already taken care of
by that helper and there is no need to preallocate it.  Remove the code to
do so.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
11 months ago xfs_repair: use libxfs_rtfile_initialize_blocks
Christoph Hellwig [Tue, 30 Jul 2024 20:31:06 +0000 (13:31 -0700)]
xfs_repair: use libxfs_rtfile_initialize_blocks

Use libxfs_rtfile_initialize_blocks to write the re-computed rtbitmap
and rtsummary contents.  This removes duplicate code and prepares for
even more sharing once the rtgroup feature adds a metadata header to
the rtbitmap and rtsummary blocks.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
11 months ago mkfs: use xfs_rtfile_initialize_blocks
Christoph Hellwig [Tue, 30 Jul 2024 20:25:34 +0000 (13:25 -0700)]
mkfs: use xfs_rtfile_initialize_blocks

Use the new libxfs helper for initializing the rtbitmap/summary files
for rtgroup-enabled file systems.  Also skip the zeroing of the blocks
for rtgroup file systems as we'll overwrite every block instantly.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
11 months ago mkfs: remove a pointless rtfreesp_init forward declaration
Christoph Hellwig [Tue, 30 Jul 2024 20:24:14 +0000 (13:24 -0700)]
mkfs: remove a pointless rtfreesp_init forward declaration

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
11 months ago xfs_repair: use xfs_validate_rt_geometry
Christoph Hellwig [Tue, 30 Jul 2024 20:22:53 +0000 (13:22 -0700)]
xfs_repair: use xfs_validate_rt_geometry

Use shared libxfs code with the kernel instead of reimplementing it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
11 months ago xfs_repair: always call into phase5()
Christoph Hellwig [Tue, 30 Jul 2024 20:38:26 +0000 (13:38 -0700)]
xfs_repair: always call into phase5()

Make the phase5() function handle the rt file checking for the no-modify
case instead of special casing it in the main repair flow.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
11 months ago xfs: push transaction join out of xfs_rtbitmap_lock and xfs_rtgroup_lock
Christoph Hellwig [Tue, 30 Jul 2024 23:54:04 +0000 (16:54 -0700)]
xfs: push transaction join out of xfs_rtbitmap_lock and xfs_rtgroup_lock

To prepare for being able to join an already locked rtbitmap inode to a
transaction, split out separate helpers for joining the transaction from
the locking helpers.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
11 months ago xfs: factor out rtbitmap/summary initialization helpers
Christoph Hellwig [Tue, 30 Jul 2024 17:54:12 +0000 (10:54 -0700)]
xfs: factor out rtbitmap/summary initialization helpers

Add helpers to libxfs that can be shared by growfs and mkfs for
initializing the rtbitmap and summary, and, by passing the optional
data pointer, also by repair for rebuilding them.  This will become
even more useful when the rtgroups feature adds a metadata header
to each block, which means even more shared code.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
[djwong: minor documentation and data advance tweaks]
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
11 months ago xfs: add bounds checking to xfs_rt{bitmap,summary}_read_buf
Christoph Hellwig [Tue, 30 Jul 2024 18:17:10 +0000 (11:17 -0700)]
xfs: add bounds checking to xfs_rt{bitmap,summary}_read_buf

Add a corruption check for passing an invalid block number, which is a
lot easier to understand than the xfs_bmapi_read failure later on.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
11 months ago xfs: assert a valid limit in xfs_rtfind_forw
Christoph Hellwig [Tue, 30 Jul 2024 18:17:09 +0000 (11:17 -0700)]
xfs: assert a valid limit in xfs_rtfind_forw

Protect against developers passing stupid limits when refactoring the
RT code once again.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
11 months ago xfs: remove the limit argument to xfs_rtfind_back
Christoph Hellwig [Tue, 30 Jul 2024 18:17:09 +0000 (11:17 -0700)]
xfs: remove the limit argument to xfs_rtfind_back

All callers pass a 0 limit to xfs_rtfind_back, so remove the argument
and hard code it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
11 months ago xfs: factor out a xfs_validate_rt_geometry helper
Christoph Hellwig [Tue, 30 Jul 2024 18:17:09 +0000 (11:17 -0700)]
xfs: factor out a xfs_validate_rt_geometry helper

Split the RT geometry validation in the early mount code into a
helper that can be reused by repair (from which this code was
apparently originally stolen anyway).

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
[djwong: u64 return value for calc_rbmblocks]
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
11 months ago xfs: remove xfs_validate_rtextents
Christoph Hellwig [Tue, 30 Jul 2024 18:17:09 +0000 (11:17 -0700)]
xfs: remove xfs_validate_rtextents

Replace xfs_validate_rtextents with an open coded check for 0
rtextents.  The name for the function implies it does a lot more
than a zero check, which is more obvious when open coded.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
11 months ago xfs_db: allow setting current address to log blocks
Darrick J. Wong [Wed, 3 Jul 2024 21:21:58 +0000 (14:21 -0700)]
xfs_db: allow setting current address to log blocks

Add commands so that users can target blocks on an external log device.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
11 months ago xfs_db: convert rtsummary geometry
Darrick J. Wong [Wed, 3 Jul 2024 21:21:58 +0000 (14:21 -0700)]
xfs_db: convert rtsummary geometry

Teach the rtconvert command to be able to convert realtime blocks and
extents to locations within the rt summary.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
11 months ago xfs_db: convert rtbitmap geometry
Darrick J. Wong [Wed, 3 Jul 2024 21:21:58 +0000 (14:21 -0700)]
xfs_db: convert rtbitmap geometry

Teach the rtconvert command to be able to convert realtime blocks and
extents to locations within the rt bitmap.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
11 months ago xfs_db: enable conversion of rt space units
Darrick J. Wong [Wed, 3 Jul 2024 21:21:58 +0000 (14:21 -0700)]
xfs_db: enable conversion of rt space units

Teach the xfs_db convert function about rt extents, rt block numbers,
and how to compute offsets within the rt bitmap and summary files.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
11 months ago xfs_db: access arbitrary realtime blocks and extents
Darrick J. Wong [Wed, 3 Jul 2024 21:21:57 +0000 (14:21 -0700)]
xfs_db: access arbitrary realtime blocks and extents

Add two commands to xfs_db so that we can point ourselves at any
arbitrary realtime block or extent.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
11 months ago xfs_db: access realtime file blocks
Darrick J. Wong [Wed, 3 Jul 2024 21:21:57 +0000 (14:21 -0700)]
xfs_db: access realtime file blocks

Now that we have the ability to point the io cursor at the realtime
device, let's make it so that the "dblock" command can walk the contents
of realtime files.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
11 months ago xfs_db: make the daddr command target the realtime device
Darrick J. Wong [Wed, 3 Jul 2024 21:21:57 +0000 (14:21 -0700)]
xfs_db: make the daddr command target the realtime device

Make it so that users can issue the command "daddr -r XXX" to select
disk block XXX on the realtime device.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
11 months ago xfs_db: report the realtime device when associated with each io cursor
Darrick J. Wong [Wed, 3 Jul 2024 21:21:57 +0000 (14:21 -0700)]
xfs_db: report the realtime device when associated with each io cursor

When db is reporting on an io cursor and the cursor points to the
realtime device, print that fact.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
11 months ago xfs_db: support passing the realtime device to the debugger
Darrick J. Wong [Wed, 3 Jul 2024 21:21:56 +0000 (14:21 -0700)]
xfs_db: support passing the realtime device to the debugger

Create a new -R flag so that sysadmins can pass the realtime device to
the xfs debugger.  Since we can now have superblocks on the rt device,
we need this to be able to inspect/dump/etc.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
11 months ago mkfs: add a utility to generate protofiles
Darrick J. Wong [Wed, 3 Jul 2024 21:21:56 +0000 (14:21 -0700)]
mkfs: add a utility to generate protofiles

Add a new utility to generate mkfs protofiles from a directory tree.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
11 months ago mkfs.xfs: enable metadata directories
Darrick J. Wong [Wed, 3 Jul 2024 21:21:56 +0000 (14:21 -0700)]
mkfs.xfs: enable metadata directories

Enable formatting filesystems with metadata directories.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
11 months ago xfs_repair: allow sysadmins to add metadata directories
Darrick J. Wong [Wed, 3 Jul 2024 21:21:56 +0000 (14:21 -0700)]
xfs_repair: allow sysadmins to add metadata directories

Allow the sysadmin to use xfs_repair to upgrade an existing filesystem
to support metadata directories.  This will be needed to upgrade
filesystems to support realtime rmap and reflink.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
11 months ago xfs_repair: do not count metadata directory files when doing quotacheck
Darrick J. Wong [Wed, 3 Jul 2024 21:21:56 +0000 (14:21 -0700)]
xfs_repair: do not count metadata directory files when doing quotacheck

Previously, we stated that files in the metadata directory tree are not
counted in the dquot information.  Fix the offline quotacheck code in
xfs_repair and xfs_check to reflect this.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
11 months ago xfs_repair: truncate and unmark orphaned metadata inodes
Darrick J. Wong [Wed, 3 Jul 2024 21:21:55 +0000 (14:21 -0700)]
xfs_repair: truncate and unmark orphaned metadata inodes

If an inode claims to be a metadata inode but wasn't linked in either
directory tree, remove the attr fork and reset the data fork if the
contents weren't regular extent mappings before moving the inode to the
lost+found.

We don't ifree the inode, because it's possible that the inode was not
actually a metadata inode but simply got corrupted due to bitflips or
something, and we'd rather let the sysadmin examine what's left of the
file instead of photorec'ing it.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
11 months ago xfs_repair: drop all the metadata directory files during pass 4
Darrick J. Wong [Wed, 3 Jul 2024 21:21:55 +0000 (14:21 -0700)]
xfs_repair: drop all the metadata directory files during pass 4

Drop the entire metadata directory tree during pass 4 so that we can
reinitialize the entire tree in phase 6.  The existing metadata files
(rtbitmap, rtsummary, quotas) will be reattached to the newly rebuilt
directory tree.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
11 months ago xfs_repair: metadata dirs are never plausible root dirs
Darrick J. Wong [Wed, 3 Jul 2024 21:21:55 +0000 (14:21 -0700)]
xfs_repair: metadata dirs are never plausible root dirs

Metadata directories are never candidates to be the root of the
user-accessible directory tree.  Update has_plausible_rootdir to ignore
them all, and to detect the case where the superblock incorrectly
thinks both trees have the same root.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
11 months ago xfs_repair: adjust keep_fsinos to handle metadata directories
Darrick J. Wong [Wed, 3 Jul 2024 21:21:55 +0000 (14:21 -0700)]
xfs_repair: adjust keep_fsinos to handle metadata directories

In keep_fsinos, mark the root of the metadata directory tree as inuse.
The realtime bitmap and summary files still come after the root
directories, so this is a fairly simple change to the loop test.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
11 months ago xfs_repair: mark space used by metadata files
Darrick J. Wong [Wed, 3 Jul 2024 21:21:54 +0000 (14:21 -0700)]
xfs_repair: mark space used by metadata files

Track space used by metadata files as a separate incore extent type.
This ensures that we can warn about cross-linked metadata files, even
though we are going to rebuild the entire metadata directory tree in the
end.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
11 months ago xfs_repair: pass private data pointer to scan_lbtree
Darrick J. Wong [Wed, 3 Jul 2024 21:21:54 +0000 (14:21 -0700)]
xfs_repair: pass private data pointer to scan_lbtree

Pass a private data pointer through scan_lbtree.  We'll use this
later when scanning the rtrmapbt to keep track of scan state.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
11 months ago xfs_repair: update incore metadata state whenever we create new files
Darrick J. Wong [Wed, 3 Jul 2024 21:21:54 +0000 (14:21 -0700)]
xfs_repair: update incore metadata state whenever we create new files

Make sure that we update our incore metadata inode bookkeeping whenever
we create new metadata files.  There will be many more of these later.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
11 months ago xfs_repair: don't let metadata and regular files mix
Darrick J. Wong [Wed, 3 Jul 2024 21:21:54 +0000 (14:21 -0700)]
xfs_repair: don't let metadata and regular files mix

Track whether or not inodes thought they were metadata inodes.  We
cannot allow metadata inodes to appear in the regular directory tree,
and we cannot allow regular inodes to appear in the metadata directory
tree.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
11 months ago xfs_repair: rebuild the metadata directory
Darrick J. Wong [Wed, 3 Jul 2024 21:21:54 +0000 (14:21 -0700)]
xfs_repair: rebuild the metadata directory

Check the dirents in metadata directories for problems and repair them
if necessary.  Also make sure that the sb-rooted inodes (root, metadir
root, rt bitmap, rt summary) are always allocated in that order.

Note that xfs_repair will always rebuild the metadata directory tree
itself, so we only need to report problems, not fix them.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
11 months ago xfs_repair: check metadata inode flag
Darrick J. Wong [Wed, 3 Jul 2024 21:21:53 +0000 (14:21 -0700)]
xfs_repair: check metadata inode flag

Check whether or not the metadata inode flag is set appropriately.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
11 months ago xfs_repair: refactor grabbing realtime metadata inodes
Darrick J. Wong [Wed, 3 Jul 2024 21:21:53 +0000 (14:21 -0700)]
xfs_repair: refactor grabbing realtime metadata inodes

Create a helper function to grab a realtime metadata inode.  When
metadir arrives, the bitmap and summary inodes can float, so we'll
turn this function into a "load or allocate" function.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
11 months ago xfs_repair: refactor root directory initialization
Darrick J. Wong [Wed, 3 Jul 2024 21:21:53 +0000 (14:21 -0700)]
xfs_repair: refactor root directory initialization

Refactor root directory initialization into a separate function we can
call for both the root dir and the metadir.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
11 months ago xfs_repair: refactor marking of metadata inodes
Darrick J. Wong [Wed, 3 Jul 2024 21:21:53 +0000 (14:21 -0700)]
xfs_repair: refactor marking of metadata inodes

Refactor the mechanics of marking a metadata inode into a helper
function so that we don't have to open-code that for every single
metadata inode.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
11 months agoxfs_repair: refactor fixing dotdot
Darrick J. Wong [Wed, 3 Jul 2024 21:21:52 +0000 (14:21 -0700)]
xfs_repair: refactor fixing dotdot

Pull the code that fixes a directory's dot-dot entry into a separate
helper function so that we can call it on the rootdir and (later) the
metadir.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
11 months agoxfs_repair: dont check metadata directory dirent inumbers
Darrick J. Wong [Wed, 3 Jul 2024 21:21:52 +0000 (14:21 -0700)]
xfs_repair: dont check metadata directory dirent inumbers

Phase 6 always rebuilds the entire metadata directory tree, and repair
quietly ignores all the DIFLAG2_METADATA directory inodes that it finds.
As a result, none of the metadata directories are marked inuse in the
incore data, so the is_inode_free checks are not valid for anything we
find in a metadata directory.

Therefore, avoid checking is_inode_free when scanning metadata directory
dirents.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
11 months agoxfs_repair: preserve the metadirino field when zeroing supers
Darrick J. Wong [Wed, 3 Jul 2024 21:21:52 +0000 (14:21 -0700)]
xfs_repair: preserve the metadirino field when zeroing supers

The metadata directory root inumber is now the last field in the
superblock, so extend the zeroing code to know about that.
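
As a rough sketch of the idea, the zeroing code has to compute where the
valid superblock fields end and clear only what lies beyond that point.
The struct below is a heavily trimmed, hypothetical stand-in for the real
superblock, and the demo_* names are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, heavily trimmed stand-in for the on-disk superblock. */
struct demo_sb {
	uint32_t magic;
	uint64_t rootino;
	uint64_t older_last_field;	/* last field before metadir */
	uint64_t metadirino;		/* newest field, at the very end */
};

/* Number of leading bytes of the sector that hold valid fields. */
size_t demo_sb_valid_bytes(int has_metadir)
{
	if (has_metadir)
		return offsetof(struct demo_sb, metadirino) + sizeof(uint64_t);
	return offsetof(struct demo_sb, older_last_field) + sizeof(uint64_t);
}

/* Zero everything in the sector buffer past the valid fields. */
void demo_zero_sb_tail(unsigned char *sector, size_t sector_size,
		       int has_metadir)
{
	size_t keep = demo_sb_valid_bytes(has_metadir);

	assert(sector != NULL && keep <= sector_size);
	memset(sector + keep, 0, sector_size - keep);
}
```

If the zeroing code did not know about the new field, it would treat
metadirino as garbage past the end of the superblock and wipe it.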

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
11 months agoxfs_scrub: re-run metafile scrubbers during phase 5
Darrick J. Wong [Wed, 3 Jul 2024 21:21:52 +0000 (14:21 -0700)]
xfs_scrub: re-run metafile scrubbers during phase 5

For metadata files on a metadir filesystem, re-run the scrubbers during
phase 5 to ensure that the metadata files are still connected.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
11 months agoxfs_scrub: scan metadata directories during phase 3
Darrick J. Wong [Wed, 3 Jul 2024 21:21:52 +0000 (14:21 -0700)]
xfs_scrub: scan metadata directories during phase 3

Scan metadata directories for correctness during phase 3.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
11 months agoxfs_spaceman: report health of metadir inodes too
Darrick J. Wong [Wed, 3 Jul 2024 21:21:51 +0000 (14:21 -0700)]
xfs_spaceman: report health of metadir inodes too

If the filesystem has a metadata directory tree, we should include those
inodes in the health report.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
11 months agoxfs_io: support scrubbing metadata directory paths
Darrick J. Wong [Wed, 3 Jul 2024 21:21:51 +0000 (14:21 -0700)]
xfs_io: support scrubbing metadata directory paths

Support invoking the metadata directory path scrubber from xfs_io for
testing.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
11 months agoxfs_io: support the bulkstat metadata directory flag
Darrick J. Wong [Wed, 3 Jul 2024 21:21:51 +0000 (14:21 -0700)]
xfs_io: support the bulkstat metadata directory flag

Support the new XFS_BULK_IREQ_METADIR flag for bulkstat commands.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
11 months agoxfs_db: show the metadata root directory when dumping superblocks
Darrick J. Wong [Wed, 3 Jul 2024 21:21:51 +0000 (14:21 -0700)]
xfs_db: show the metadata root directory when dumping superblocks

Show the metadirino field when appropriate.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
11 months agoxfs_db: support metadata directories in the path command
Darrick J. Wong [Wed, 3 Jul 2024 21:21:50 +0000 (14:21 -0700)]
xfs_db: support metadata directories in the path command

Teach the path command to traverse the metadata directory tree by
passing a '\' as the first character in the path.
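
A minimal sketch of how a path walker might pick its starting directory
from the first character of the path. The demo_roots structure and
demo_path_start function are hypothetical; the actual xfs_db code may be
organized differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical holder for the two root inode numbers. */
struct demo_roots {
	uint64_t rootino;	/* regular directory tree root */
	uint64_t metadirino;	/* metadata directory tree root */
};

/*
 * Pick the starting directory for a path walk and advance *path past the
 * leading separator.  A leading '\' selects the metadata directory root;
 * anything else starts at the regular root directory.
 */
uint64_t demo_path_start(const struct demo_roots *roots, const char **path)
{
	assert(roots != NULL && path != NULL && *path != NULL);

	if ((*path)[0] == '\\') {
		(*path)++;
		return roots->metadirino;
	}
	if ((*path)[0] == '/')
		(*path)++;
	return roots->rootino;
}
```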

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
11 months agoxfs_db: don't obfuscate metadata directories and attributes
Darrick J. Wong [Wed, 3 Jul 2024 21:21:50 +0000 (14:21 -0700)]
xfs_db: don't obfuscate metadata directories and attributes

Don't obfuscate the directory and attribute names of metadata inodes.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
11 months agoxfs_db: report metadir support for version command
Darrick J. Wong [Wed, 3 Jul 2024 21:21:50 +0000 (14:21 -0700)]
xfs_db: report metadir support for version command

Report metadir support if we have it enabled.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
11 months agoxfs_db: disable xfs_check when metadir is enabled
Darrick J. Wong [Wed, 3 Jul 2024 21:21:50 +0000 (14:21 -0700)]
xfs_db: disable xfs_check when metadir is enabled

As of July 2024, xfs_repair can detect more types of corruptions than
xfs_check does.  I don't think it makes sense to maintain the xfs_check
code anymore, so let's just turn it off for any filesystem that has
metadata directory trees.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
11 months agoxfs_io: support scrubbing metadata directory paths
Darrick J. Wong [Wed, 3 Jul 2024 21:21:49 +0000 (14:21 -0700)]
xfs_io: support scrubbing metadata directory paths

Support invoking the metadata directory path scrubber from xfs_io for
testing.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
11 months agolibfrog: allow METADIR in xfrog_bulkstat_single5
Darrick J. Wong [Wed, 3 Jul 2024 21:21:49 +0000 (14:21 -0700)]
libfrog: allow METADIR in xfrog_bulkstat_single5

This is a valid flag for a single-file bulkstat, so add that to the
filter.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
11 months agolibfrog: report metadata directories in the geometry report
Darrick J. Wong [Wed, 3 Jul 2024 21:21:49 +0000 (14:21 -0700)]
libfrog: report metadata directories in the geometry report

Report the presence of a metadata directory tree in the geometry report.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
11 months agoxfs: enable metadata directory feature
Darrick J. Wong [Wed, 3 Jul 2024 21:21:49 +0000 (14:21 -0700)]
xfs: enable metadata directory feature

Enable the metadata directory feature.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
11 months agoxfs: check metadata directory file path connectivity
Darrick J. Wong [Wed, 3 Jul 2024 21:21:49 +0000 (14:21 -0700)]
xfs: check metadata directory file path connectivity

Create a new scrubber type that checks that well-known metadata
directory paths are connected to the metadata inode that the incore
structures think is in use.  In other words, check that "/quota/user" in
the metadata directory tree actually points to
mp->m_quotainfo->qi_uquotaip->i_ino.
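
The check can be pictured as "resolve the path, then compare inode
numbers".  The sketch below uses a flattened toy directory table rather
than a real directory walk; demo_dirent, demo_metadir_lookup and
demo_metapath_connected are hypothetical names, not kernel functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One entry of a flattened toy metadata directory tree. */
struct demo_dirent {
	const char *path;
	uint64_t ino;
};

/* Look up a path in the toy tree; 0 means "not found". */
uint64_t demo_metadir_lookup(const struct demo_dirent *ents, size_t n,
			     const char *path)
{
	assert(ents != NULL && path != NULL);

	for (size_t i = 0; i < n; i++)
		if (strcmp(ents[i].path, path) == 0)
			return ents[i].ino;
	return 0;
}

/*
 * A path is connected if it resolves to the same inode number that the
 * incore structures believe is in use for that piece of metadata.
 */
bool demo_metapath_connected(const struct demo_dirent *ents, size_t n,
			     const char *path, uint64_t incore_ino)
{
	return incore_ino != 0 &&
	       demo_metadir_lookup(ents, n, path) == incore_ino;
}
```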

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
11 months agoxfs: record health problems with the metadata directory
Darrick J. Wong [Wed, 29 May 2024 04:11:03 +0000 (21:11 -0700)]
xfs: record health problems with the metadata directory

Make a report to the health monitoring subsystem any time we encounter
something in the metadata directory tree that looks like corruption.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
11 months agoxfs: adjust xfs_bmap_add_attrfork for metadir
Darrick J. Wong [Wed, 29 May 2024 04:11:02 +0000 (21:11 -0700)]
xfs: adjust xfs_bmap_add_attrfork for metadir

Online repair might use xfs_bmap_add_attrfork to repair a file in the
metadata directory tree if (say) the metadata file lacks the correct
parent pointers.  In that case, it is not correct to check that the file
is dqattached -- metadata files must not have /any/ dquot attached at
all.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
11 months agoxfs: allow bulkstat to return metadata directories
Darrick J. Wong [Wed, 3 Jul 2024 21:21:48 +0000 (14:21 -0700)]
xfs: allow bulkstat to return metadata directories

Allow the V5 bulkstat ioctl to return information about metadata
directory files so that xfs_scrub can find and scrub them, since they
are otherwise ordinary directories.

(Metadata files of course require per-file scrub code and hence do not
need exposure.)

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
11 months agoxfs: advertise metadata directory feature
Darrick J. Wong [Wed, 3 Jul 2024 21:21:47 +0000 (14:21 -0700)]
xfs: advertise metadata directory feature

Advertise the existence of the metadata directory feature; this will be
used by scrub to decide if it needs to scan the metadir too.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
11 months agoxfs: disable the agi rotor for metadata inodes
Darrick J. Wong [Wed, 3 Jul 2024 21:21:47 +0000 (14:21 -0700)]
xfs: disable the agi rotor for metadata inodes

Ideally, we'd put all the metadata inodes in one place if we could, so
that the metadata all stay reasonably close together instead of
spreading out over the disk.  Furthermore, if the log is internal we'd
probably prefer to keep the metadata near the log.  Therefore, disable
AGI rotoring for metadata inode allocations.
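
One way to picture the change is below.  The sketch assumes that regular
directories take their starting AG from a round-robin rotor while all
other inodes start in their parent's AG, and that metadata inodes simply
skip the rotor.  The demo_* names and the exact placement policy are
assumptions, not the kernel implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, trimmed mount structure. */
struct demo_mount {
	uint32_t agcount;	/* number of allocation groups */
	uint32_t agirotor;	/* next AG to use for a new directory */
};

/*
 * Choose the AG in which to start looking for a free inode.  Regular
 * directories are spread out via the rotor; metadata inodes never touch
 * the rotor and stay near their parent so that they cluster together.
 */
uint32_t demo_pick_start_ag(struct demo_mount *mp, bool is_dir,
			    bool is_metadata, uint32_t parent_ag)
{
	assert(mp != NULL && mp->agcount > 0);

	if (is_dir && !is_metadata) {
		uint32_t ag = mp->agirotor % mp->agcount;

		mp->agirotor = (mp->agirotor + 1) % mp->agcount;
		return ag;
	}
	return parent_ag % mp->agcount;
}
```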

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
11 months agoxfs: read and write metadata inode directory tree
Darrick J. Wong [Wed, 29 May 2024 04:10:58 +0000 (21:10 -0700)]
xfs: read and write metadata inode directory tree

Plumb in the bits we need to load metadata inodes from a named entry in
a metadir directory, create (or hardlink) inodes into a metadir
directory, create metadir directories, and flag inodes as being metadata
files.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
11 months agoxfs: enforce metadata inode flag
Darrick J. Wong [Wed, 29 May 2024 04:10:57 +0000 (21:10 -0700)]
xfs: enforce metadata inode flag

Add checks for the metadata inode flag so that we don't ever leak
metadata inodes out to userspace, and we don't ever try to read a
regular inode as metadata.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
11 months agoxfs: load metadata directory root at mount time
Darrick J. Wong [Wed, 3 Jul 2024 21:21:46 +0000 (14:21 -0700)]
xfs: load metadata directory root at mount time

Load the metadata directory root inode into memory at mount time and
release it at unmount time.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
11 months agoxfs: define the on-disk format for the metadir feature
Darrick J. Wong [Wed, 29 May 2024 04:10:55 +0000 (21:10 -0700)]
xfs: define the on-disk format for the metadir feature

Define the on-disk layout and feature flags for the metadata inode
directory feature.  Add xfs_sb_version_hasmetadir for the benefit of
xfs_repair, which needs to know where the new end of the superblock
lies.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
11 months agoxfs: iget for metadata inodes
Darrick J. Wong [Wed, 29 May 2024 04:10:54 +0000 (21:10 -0700)]
xfs: iget for metadata inodes

Create an xfs_imeta_iget function for metadata inodes to ensure that when
we try to iget a metadata file, the inobt thinks a metadata inode is in
use and that the file type matches what we are expecting.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
11 months agoxfs: pass the icreate args object to xfs_dialloc
Darrick J. Wong [Wed, 3 Jul 2024 21:21:45 +0000 (14:21 -0700)]
xfs: pass the icreate args object to xfs_dialloc

Pass the xfs_icreate_args object to xfs_dialloc since we can extract the
relevant mode (really just the file type) and parent inumber from there.
This simplifies the calling convention in preparation for the next
patch.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
11 months agoxfs_repair: upgrade an existing filesystem to have parent pointers
Darrick J. Wong [Wed, 3 Jul 2024 21:21:45 +0000 (14:21 -0700)]
xfs_repair: upgrade an existing filesystem to have parent pointers

Upgrade an existing filesystem to have parent pointers.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
11 months agoxfs_repair: allow sysadmins to add reverse mapping indexes
Darrick J. Wong [Wed, 3 Jul 2024 21:21:45 +0000 (14:21 -0700)]
xfs_repair: allow sysadmins to add reverse mapping indexes

Allow the sysadmin to use xfs_repair to upgrade an existing filesystem
to support the reverse mapping btree index.  This is needed for online
fsck.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>