www.infradead.org Git - users/hch/xfs.git/log
8 months ago xfs: remove superfluous arguments to xfs_rtrefcountbt_init_cursor xfs-generic-group-rebase
Christoph Hellwig [Sat, 21 Sep 2024 14:08:12 +0000 (16:08 +0200)]
xfs: remove superfluous arguments to xfs_rtrefcountbt_init_cursor

The mount structure and inode can be derived from the rtg.

Signed-off-by: Christoph Hellwig <hch@lst.de>
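The idea behind this cleanup can be sketched in a few lines of standalone C. This is only an illustration under assumed names: `struct fs_mount`, `struct fs_inode`, `struct rt_group`, `struct refc_cursor`, and `rtrefcount_cursor_init` are hypothetical stand-ins, not the kernel's types or the real signature of xfs_rtrefcountbt_init_cursor. It shows why the extra arguments are redundant once the group object already points at its mount and its btree inode.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the mount, inode, and rt group objects. */
struct fs_mount { int id; };
struct fs_inode { unsigned long ino; };

struct rt_group {
	struct fs_mount *mp;           /* owning mount */
	struct fs_inode *refcount_ip;  /* inode holding this group's refcount btree */
};

struct refc_cursor {
	struct fs_mount *mp;
	struct fs_inode *ip;
	struct rt_group *rtg;
};

/*
 * Initialize a cursor from the group alone.  The mount and inode are
 * looked up through the group, so callers no longer pass them in.
 */
void rtrefcount_cursor_init(struct refc_cursor *cur, struct rt_group *rtg)
{
	assert(cur != NULL && rtg != NULL);
	assert(rtg->mp != NULL && rtg->refcount_ip != NULL);

	cur->rtg = rtg;
	cur->mp = rtg->mp;
	cur->ip = rtg->refcount_ip;
}
```

With this shape a caller cannot accidentally pass a mount or inode that disagrees with the group, which is one plausible motivation for dropping the arguments.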
8 months ago xfs: remove superfluous arguments to xfs_rtrmapbt_init_cursor
Christoph Hellwig [Sun, 22 Sep 2024 05:32:52 +0000 (07:32 +0200)]
xfs: remove superfluous arguments to xfs_rtrmapbt_init_cursor

The mount structure and inode can be derived from the rtg.

Signed-off-by: Christoph Hellwig <hch@lst.de>
8 months ago xfs: enable realtime reflink
Darrick J. Wong [Thu, 15 Aug 2024 18:49:50 +0000 (11:49 -0700)]
xfs: enable realtime reflink

Enable reflink for realtime devices, sort of.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
8 months ago xfs: fix CoW forks for realtime files
Darrick J. Wong [Sun, 22 Sep 2024 05:42:18 +0000 (07:42 +0200)]
xfs: fix CoW forks for realtime files

Port the copy on write fork repair to realtime files.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
8 months ago xfs: check for shared rt extents when rebuilding rt file's data fork
Darrick J. Wong [Thu, 15 Aug 2024 18:49:48 +0000 (11:49 -0700)]
xfs: check for shared rt extents when rebuilding rt file's data fork

When we're rebuilding the data fork of a realtime file, we need to
cross-reference each mapping with the rt refcount btree to ensure that
the reflink flag is set if there are any shared extents found.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
8 months ago xfs: repair inodes that have a refcount btree in the data fork
Darrick J. Wong [Thu, 15 Aug 2024 18:49:47 +0000 (11:49 -0700)]
xfs: repair inodes that have a refcount btree in the data fork

Plumb knowledge of refcount btrees into the inode core repair code.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
8 months ago xfs: online repair of the realtime refcount btree
Darrick J. Wong [Sat, 21 Sep 2024 05:47:38 +0000 (07:47 +0200)]
xfs: online repair of the realtime refcount btree

Port the data device's refcount btree repair code to the realtime
refcount btree.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
8 months ago xfs: capture realtime CoW staging extents when rebuilding rt rmapbt
Darrick J. Wong [Thu, 15 Aug 2024 18:49:46 +0000 (11:49 -0700)]
xfs: capture realtime CoW staging extents when rebuilding rt rmapbt

Walk the realtime refcount btree to find the CoW staging extents when
we're rebuilding the realtime rmap btree.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
8 months ago xfs: walk the rt reference count tree when rebuilding rmap
Darrick J. Wong [Thu, 15 Aug 2024 18:49:45 +0000 (11:49 -0700)]
xfs: walk the rt reference count tree when rebuilding rmap

When we're rebuilding the data device rmap, if we encounter a "refcount"
format fork, we have to walk the (realtime) refcount btree inode to
build the appropriate mappings.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
8 months ago xfs: check new rtbitmap records against rt refcount btree
Darrick J. Wong [Thu, 15 Aug 2024 18:49:44 +0000 (11:49 -0700)]
xfs: check new rtbitmap records against rt refcount btree

When we're rebuilding the realtime bitmap, check the proposed free
extents against the rt refcount btree to make sure we don't commit any
grievous errors.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
8 months ago xfs: don't flag quota rt block usage on rtreflink filesystems
Darrick J. Wong [Thu, 15 Aug 2024 18:49:43 +0000 (11:49 -0700)]
xfs: don't flag quota rt block usage on rtreflink filesystems

Quota space usage is allowed to exceed the size of the physical storage
when reflink is enabled.  Now that we have reflink for the realtime
volume, apply this same logic to the rtb repair logic.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
8 months ago xfs: scrub the metadir path of rt refcount btree files
Darrick J. Wong [Thu, 15 Aug 2024 18:49:42 +0000 (11:49 -0700)]
xfs: scrub the metadir path of rt refcount btree files

Add a new XFS_SCRUB_METAPATH subtype so that we can scrub the metadata
directory tree path to the refcount btree file for each rt group.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
8 months ago xfs: detect and repair misaligned rtinherit directory cowextsize hints
Darrick J. Wong [Thu, 15 Aug 2024 18:49:41 +0000 (11:49 -0700)]
xfs: detect and repair misaligned rtinherit directory cowextsize hints

If we encounter a directory that has been configured to pass on a CoW
extent size hint to a new realtime file and the hint isn't an integer
multiple of the rt extent size, we should flag the hint for
administrative review and/or turn it off because that is a
misconfiguration.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
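The alignment rule this commit describes (a CoW extent size hint must be an integer multiple of the rt extent size) can be sketched as a small predicate. This is a minimal illustration, not the kernel's validator: the function name and parameters are hypothetical, both values are assumed to be in filesystem blocks, and a hint of zero is assumed to mean "no hint".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: report whether a CoW extent size hint is usable
 * on a realtime volume whose allocation unit is rextsize_blocks.
 */
bool cowextsize_hint_aligned(uint32_t hint_blocks, uint32_t rextsize_blocks)
{
	assert(rextsize_blocks > 0);	/* an rt extent is at least one block */

	if (hint_blocks == 0)
		return true;		/* assumed: zero means no hint is set */
	return hint_blocks % rextsize_blocks == 0;
}
```

A checker built on this predicate would flag a directory hint of 6 blocks on a filesystem with a 4-block rt extent size, while a hint of 8 blocks would pass.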
8 months ago xfs: allow dquot rt block count to exceed rt blocks on reflink fs
Darrick J. Wong [Thu, 15 Aug 2024 18:49:40 +0000 (11:49 -0700)]
xfs: allow dquot rt block count to exceed rt blocks on reflink fs

Update the quota scrubber to allow dquots where the realtime block count
exceeds the block count of the rt volume if reflink is enabled.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
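The relaxed check can be sketched as follows. This is a simplified illustration with hypothetical names, not the scrubber's actual code. It assumes the only inputs are the dquot's rt block count, the size of the rt volume in blocks, and whether reflink is enabled, and it reduces the outcome to a plain yes or no.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: decide whether a dquot's realtime block count is
 * believable.  With reflink, several files can reference the same
 * physical blocks, so the summed usage may exceed the volume size.
 */
bool dquot_rtbcount_plausible(uint64_t dquot_rtb_count,
			      uint64_t rt_volume_blocks,
			      bool rt_reflink)
{
	if (rt_reflink)
		return true;	/* usage above the physical size is allowed */
	return dquot_rtb_count <= rt_volume_blocks;
}
```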
8 months ago xfs: check reference counts of gaps between rt refcount records
Darrick J. Wong [Thu, 15 Aug 2024 18:49:40 +0000 (11:49 -0700)]
xfs: check reference counts of gaps between rt refcount records

If there's a gap between records in the rt refcount btree, we ought to
cross-reference the gap with the rtrmap records to make sure that there
aren't any overlapping records for a region that doesn't have any shared
ownership.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
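The first half of that check, finding the gaps, can be sketched in standalone C. This is an illustration under stated assumptions rather than the scrub code itself: the struct and function names are hypothetical, the records are assumed to be sorted by start block and non-overlapping, and the space before the first record and after the last one is treated as a gap too. Cross-referencing each gap against the rmap records is left to the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical extent type used for both refcount records and gaps. */
struct blk_extent {
	uint64_t start;	/* first block */
	uint64_t len;	/* length in blocks */
};

/*
 * Walk sorted, non-overlapping refcount records covering part of
 * [0, total_blocks) and store the uncovered ranges in gaps[].
 * Returns the number of gaps found; at most max_gaps are stored.
 */
size_t find_refcount_gaps(const struct blk_extent *recs, size_t nr_recs,
			  uint64_t total_blocks,
			  struct blk_extent *gaps, size_t max_gaps)
{
	uint64_t next_unshared = 0;	/* first block not yet accounted for */
	size_t found = 0;
	size_t i;

	assert(recs != NULL || nr_recs == 0);

	for (i = 0; i < nr_recs; i++) {
		if (recs[i].start > next_unshared) {
			if (found < max_gaps) {
				gaps[found].start = next_unshared;
				gaps[found].len = recs[i].start - next_unshared;
			}
			found++;
		}
		next_unshared = recs[i].start + recs[i].len;
	}

	/* Trailing gap between the last record and the end of the group. */
	if (next_unshared < total_blocks) {
		if (found < max_gaps) {
			gaps[found].start = next_unshared;
			gaps[found].len = total_blocks - next_unshared;
		}
		found++;
	}
	return found;
}
```

For each gap returned, a checker would then confirm that no two reverse-mapping records overlap inside it, since a region without a refcount record should not have shared ownership.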
8 months ago xfs: allow overlapping rtrmapbt records for shared data extents
Darrick J. Wong [Thu, 15 Aug 2024 18:49:39 +0000 (11:49 -0700)]
xfs: allow overlapping rtrmapbt records for shared data extents

Allow overlapping realtime reverse mapping records if they both describe
shared data extents and the fs supports reflink on the realtime volume.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
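The rule stated above can be sketched as a pair of predicates. This is a minimal illustration with hypothetical names: `struct rt_rmap_rec` and its `shareable` field stand in for whatever the real record uses to say that it describes file data that may legitimately be shared.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified reverse-mapping record. */
struct rt_rmap_rec {
	uint64_t start;		/* first rt block */
	uint64_t len;		/* length in blocks */
	bool shareable;		/* describes file data that may be shared */
};

/* Two half-open ranges overlap if each starts before the other ends. */
bool rmap_recs_overlap(const struct rt_rmap_rec *a,
		       const struct rt_rmap_rec *b)
{
	assert(a != NULL && b != NULL);
	return a->start < b->start + b->len &&
	       b->start < a->start + a->len;
}

/*
 * Overlap is acceptable only when reflink is enabled on the rt volume
 * and both records describe shareable data extents.
 */
bool rmap_overlap_allowed(const struct rt_rmap_rec *a,
			  const struct rt_rmap_rec *b,
			  bool rt_reflink)
{
	if (!rmap_recs_overlap(a, b))
		return true;
	return rt_reflink && a->shareable && b->shareable;
}
```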
8 months ago xfs: cross-reference checks with the rt refcount btree
Darrick J. Wong [Sat, 21 Sep 2024 05:44:11 +0000 (07:44 +0200)]
xfs: cross-reference checks with the rt refcount btree

Use the realtime refcount btree to implement cross-reference checks in
other data structures.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
8 months ago xfs: scrub the realtime refcount btree
Darrick J. Wong [Thu, 15 Aug 2024 18:49:37 +0000 (11:49 -0700)]
xfs: scrub the realtime refcount btree

Add code to scrub realtime refcount btrees.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
8 months ago xfs: report realtime refcount btree corruption errors to the health system
Darrick J. Wong [Thu, 15 Aug 2024 18:49:36 +0000 (11:49 -0700)]
xfs: report realtime refcount btree corruption errors to the health system

Whenever we encounter corrupt realtime refcount btree blocks, we should
report that to the health monitoring system for later reporting.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
8 months ago xfs: check that the rtrefcount maxlevels doesn't increase when growing fs
Darrick J. Wong [Thu, 15 Aug 2024 18:49:35 +0000 (11:49 -0700)]
xfs: check that the rtrefcount maxlevels doesn't increase when growing fs

The size of filesystem transaction reservations depends on the maximum
height (maxlevels) of the realtime btrees.  Since we don't want a grow
operation to increase the reservation size enough that we'll fail the
minimum log size checks on the next mount, constrain growfs operations
if they would cause an increase in the rt refcount btree maxlevels.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
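The constraint can be sketched with a worst-case height calculation. This is an illustration under assumptions, not the kernel's code: all names are hypothetical, and the tree is assumed to be at its tallest when every block holds only the minimum number of records (`min_leaf_recs` per leaf, `min_node_recs` per interior block).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Round-up division that cannot overflow. */
uint64_t howmany_u64(uint64_t n, uint64_t d)
{
	return n / d + (n % d != 0);
}

/* Worst-case btree height for nr_records with minimally full blocks. */
unsigned int btree_max_levels(uint64_t min_leaf_recs,
			      uint64_t min_node_recs,
			      uint64_t nr_records)
{
	uint64_t level_blocks;
	unsigned int height = 1;

	/* An interior fanout below 2 would never converge to one root. */
	assert(min_leaf_recs >= 1 && min_node_recs >= 2);

	level_blocks = howmany_u64(nr_records, min_leaf_recs);
	while (level_blocks > 1) {
		level_blocks = howmany_u64(level_blocks, min_node_recs);
		height++;
	}
	return height;
}

/* A grow is acceptable only if it would not make the tree taller. */
bool grow_keeps_maxlevels(uint64_t min_leaf_recs, uint64_t min_node_recs,
			  uint64_t old_max_records, uint64_t new_max_records)
{
	return btree_max_levels(min_leaf_recs, min_node_recs, new_max_records) <=
	       btree_max_levels(min_leaf_recs, min_node_recs, old_max_records);
}
```

With a minimum of 10 records per block, growing the record capacity from 50 to 100 keeps the height at 2 and would be allowed, while growing from 100 to 101 raises it to 3 and would be refused.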
8 months ago xfs: enable extent size hints for CoW operations
Darrick J. Wong [Thu, 15 Aug 2024 18:49:34 +0000 (11:49 -0700)]
xfs: enable extent size hints for CoW operations

Wire up the copy-on-write extent size hint for realtime files, and
connect it to the rt allocator so that we avoid fragmentation on rt
filesystems.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
8 months ago xfs: apply rt extent alignment constraints to CoW extsize hint
Darrick J. Wong [Thu, 15 Aug 2024 18:49:33 +0000 (11:49 -0700)]
xfs: apply rt extent alignment constraints to CoW extsize hint

The copy-on-write extent size hint is subject to the same alignment
constraints as the regular extent size hint.  Since we're in the process
of adding reflink (and therefore CoW) to the realtime device, we must
apply the same scattered rextsize alignment validation strategies to
both hints to deal with the possibility of rextsize changing.

Therefore, fix the inode validator to perform rextsize alignment checks
on regular realtime files, and to remove misaligned directory hints.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
8 months ago xfs: fix xfs_get_extsz_hint behavior with realtime alwayscow files
Darrick J. Wong [Thu, 15 Aug 2024 18:49:33 +0000 (11:49 -0700)]
xfs: fix xfs_get_extsz_hint behavior with realtime alwayscow files

Currently, we (ab)use xfs_get_extsz_hint so that it always returns a
nonzero value for realtime files.  This apparently was done to disable
delayed allocation for realtime files.

However, once we enable realtime reflink, we can also turn on the
alwayscow flag to force CoW writes to realtime files.  In this case, the
logic will incorrectly send the write through the delalloc write path.

Fix this by adjusting the logic slightly.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
8 months ago xfs: recover CoW leftovers in the realtime volume
Darrick J. Wong [Sat, 21 Sep 2024 05:39:56 +0000 (07:39 +0200)]
xfs: recover CoW leftovers in the realtime volume

Scan the realtime refcount tree at mount time to get rid of leftover
CoW staging extents.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
8 months ago xfs: allow inodes to have the realtime and reflink flags
Darrick J. Wong [Thu, 15 Aug 2024 18:49:31 +0000 (11:49 -0700)]
xfs: allow inodes to have the realtime and reflink flags

Now that we can share blocks between realtime files, allow this
combination.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
8 months ago xfs: enable sharing of realtime file blocks
Darrick J. Wong [Thu, 15 Aug 2024 18:49:30 +0000 (11:49 -0700)]
xfs: enable sharing of realtime file blocks

Update the remapping routines to be able to handle realtime files.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
8 months ago xfs: enable CoW for realtime data
Darrick J. Wong [Thu, 15 Aug 2024 18:49:29 +0000 (11:49 -0700)]
xfs: enable CoW for realtime data

Update our write paths to support copy on write on the rt volume.  This
works in more or less the same way as it does on the data device, with
the major exception that we never do delalloc on the rt volume.

Because we consider unwritten CoW fork staging extents to be incore
quota reservation, we update xfs_quota_reserve_blkres to support this
case.  Though xfs doesn't allow rt and quota together, the change is
trivial and we shouldn't leave a logic bomb here.

While we're at it, add a missing xfs_mod_delalloc call when we remove
delalloc block reservation from the inode.  This is largely irrelevant
since realtime files do not use delalloc, but we want to avoid leaving
logic bombs.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
8 months ago xfs: refactor reflink quota updates
Darrick J. Wong [Thu, 15 Aug 2024 18:49:28 +0000 (11:49 -0700)]
xfs: refactor reflink quota updates

Hoist all quota updates for reflink into a helper function, since things
are about to become more complicated.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
8 months ago xfs: compute rtrmap btree max levels when reflink enabled
Darrick J. Wong [Thu, 15 Aug 2024 18:49:28 +0000 (11:49 -0700)]
xfs: compute rtrmap btree max levels when reflink enabled

Compute the maximum possible height of the realtime rmap btree when
reflink is enabled.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
8 months ago xfs: update rmap to allow cow staging extents in the rt rmap
Darrick J. Wong [Thu, 15 Aug 2024 18:49:27 +0000 (11:49 -0700)]
xfs: update rmap to allow cow staging extents in the rt rmap

Don't error out on CoW staging extent records when realtime reflink is
enabled.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
8 months ago xfs: create routine to allocate and initialize a realtime refcount btree inode
Darrick J. Wong [Thu, 15 Aug 2024 18:49:26 +0000 (11:49 -0700)]
xfs: create routine to allocate and initialize a realtime refcount btree inode

Create a library routine to allocate and initialize an empty realtime
refcountbt inode.  We'll use this for growfs, mkfs, and repair.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
8 months ago xfs: wire up realtime refcount btree cursors
Darrick J. Wong [Sun, 22 Sep 2024 08:03:55 +0000 (10:03 +0200)]
xfs: wire up realtime refcount btree cursors

Wire up realtime refcount btree cursors wherever they're needed
throughout the code base.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
8 months ago xfs: refactor xfs_reflink_find_shared
Darrick J. Wong [Sat, 21 Sep 2024 05:19:40 +0000 (07:19 +0200)]
xfs: refactor xfs_reflink_find_shared

Move lookup of the perag structure from the callers into the helpers,
and return the offset into the extent of the shared region instead of
the block number that needs post-processing.  This prepares the
callsites for the creation of an rt-specific variant in the next patch.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
[djwong: port to the middle of the rtreflink series for cleanliness]
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
8 months ago xfs: wire up a new inode fork type for the realtime refcount
Darrick J. Wong [Thu, 15 Aug 2024 18:49:23 +0000 (11:49 -0700)]
xfs: wire up a new inode fork type for the realtime refcount

Plumb in the pieces we need to embed the root of the realtime refcount
btree in an inode's data fork, complete with new fork type and
on-disk interpretation functions.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
8 months ago xfs: add metadata reservations for realtime refcount btree
Darrick J. Wong [Thu, 15 Aug 2024 18:49:23 +0000 (11:49 -0700)]
xfs: add metadata reservations for realtime refcount btree

Reserve some free blocks so that we will always have enough free blocks
in the data volume to handle expansion of the realtime refcount btree.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
8 months ago xfs: add realtime refcount btree inode to metadata directory
Darrick J. Wong [Thu, 15 Aug 2024 18:49:22 +0000 (11:49 -0700)]
xfs: add realtime refcount btree inode to metadata directory

Add a metadir path to select the realtime refcount btree inode and load
it at mount time.  The rtrefcountbt inode will have a unique extent format
code, which means that we also have to update the inode validation and
flush routines to look for it.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
8 months ago xfs: add realtime refcount btree block detection to log recovery
Darrick J. Wong [Thu, 15 Aug 2024 18:49:21 +0000 (11:49 -0700)]
xfs: add realtime refcount btree block detection to log recovery

Identify rt refcount btree blocks in the log correctly so that we can
validate them during log recovery.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
8 months ago xfs: support recovering refcount intent items targetting realtime extents
Darrick J. Wong [Sun, 22 Sep 2024 06:41:52 +0000 (08:41 +0200)]
xfs: support recovering refcount intent items targetting realtime extents

Now that we have reflink on the realtime device, refcount intent items
have to support remapping extents on the realtime volume.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
8 months ago xfs: add a realtime flag to the refcount update log redo items
Darrick J. Wong [Sun, 22 Sep 2024 06:26:50 +0000 (08:26 +0200)]
xfs: add a realtime flag to the refcount update log redo items

Extend the refcount update (CUI) log items with a new realtime flag that
indicates that the updates apply against the realtime refcountbt.  We'll
wire up the actual refcount code later.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
8 months ago xfs: prepare refcount functions to deal with rtrefcountbt
Darrick J. Wong [Sat, 21 Sep 2024 04:51:36 +0000 (06:51 +0200)]
xfs: prepare refcount functions to deal with rtrefcountbt

Prepare the high-level refcount functions to deal with the new realtime
refcountbt and its slightly different conventions.  Provide the ability
to talk to either refcountbt or rtrefcountbt formats from the same high
level code.

Note that we leave the _recover_cow_leftovers functions for a separate
patch so that we can convert it all at once.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
8 months ago xfs: add realtime refcount btree operations
Darrick J. Wong [Thu, 15 Aug 2024 18:49:17 +0000 (11:49 -0700)]
xfs: add realtime refcount btree operations

Implement the generic btree operations needed to manipulate rtrefcount
btree blocks. This is different from the regular refcountbt in that we
allocate space from the filesystem at large, and are neither constrained
to the free space nor any particular AG.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
8 months ago xfs: realtime refcount btree transaction reservations
Darrick J. Wong [Thu, 15 Aug 2024 18:49:17 +0000 (11:49 -0700)]
xfs: realtime refcount btree transaction reservations

Make sure that there's enough log reservation to handle mapping
and unmapping realtime extents.  We have to reserve enough space
to handle a split in the rtrefcountbt to add the record and a second
split in the regular refcountbt to record the rtrefcountbt split.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
8 months ago xfs: define the on-disk realtime refcount btree format
Darrick J. Wong [Thu, 15 Aug 2024 18:49:16 +0000 (11:49 -0700)]
xfs: define the on-disk realtime refcount btree format

Start filling out the rtrefcount btree implementation. Start with the
on-disk btree format; add everything needed to read, write and
manipulate refcount btree blocks. This prepares the way for connecting
the btree operations implementation.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
8 months ago xfs: namespace the maximum length/refcount symbols
Darrick J. Wong [Thu, 15 Aug 2024 18:49:15 +0000 (11:49 -0700)]
xfs: namespace the maximum length/refcount symbols

Actually namespace these variables properly, so that readers can tell
that this is an XFS symbol, and that it's for the refcount
functionality.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
8 months ago xfs: introduce realtime refcount btree definitions
Darrick J. Wong [Thu, 15 Aug 2024 18:49:14 +0000 (11:49 -0700)]
xfs: introduce realtime refcount btree definitions

Add new realtime refcount btree definitions. The realtime refcount btree
will be rooted from a hidden inode, but has its own shape and therefore
needs to have most of its own separate types.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
8 months ago xfs: prepare refcount btree cursor tracepoints for realtime
Darrick J. Wong [Sat, 21 Sep 2024 04:49:35 +0000 (06:49 +0200)]
xfs: prepare refcount btree cursor tracepoints for realtime

Rework the refcount btree cursor tracepoints in preparation to handle the
realtime refcount btree cursor.  Mostly this involves renaming the field to
"refcbno" and extracting the group number from the cursor when possible.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
8 months ago xfs: enable realtime rmap btree
Darrick J. Wong [Thu, 15 Aug 2024 18:49:12 +0000 (11:49 -0700)]
xfs: enable realtime rmap btree

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
8 months ago xfs: hook live realtime rmap operations during a repair operation
Darrick J. Wong [Sun, 22 Sep 2024 05:11:13 +0000 (07:11 +0200)]
xfs: hook live realtime rmap operations during a repair operation

Hook the regular realtime rmap code when an rtrmapbt repair operation is
running so that we can unlock the AGF buffer to scan the filesystem and
keep the in-memory btree up to date during the scan.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
8 months ago xfs: create a shadow rmap btree during realtime rmap repair
Darrick J. Wong [Sun, 22 Sep 2024 05:08:12 +0000 (07:08 +0200)]
xfs: create a shadow rmap btree during realtime rmap repair

Create an in-memory btree of rmap records instead of an array.  This
enables us to do live record collection instead of freezing the fs.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
8 months ago xfs: online repair of the realtime rmap btree
Darrick J. Wong [Thu, 15 Aug 2024 18:49:10 +0000 (11:49 -0700)]
xfs: online repair of the realtime rmap btree

Repair the realtime rmap btree while mounted.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
8 months ago xfs: support repairing metadata btrees rooted in metadir inodes
Darrick J. Wong [Sat, 21 Sep 2024 04:23:27 +0000 (06:23 +0200)]
xfs: support repairing metadata btrees rooted in metadir inodes

Adapt the repair code so that we can stage a new btree in the data fork
area of a metadir inode and reap the old blocks.  We already have nearly
all of the infrastructure; the only parts that were missing were the
metadata inode reservation handling.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
8 months ago xfs: online repair of realtime bitmaps for a realtime group
Darrick J. Wong [Thu, 15 Aug 2024 18:49:08 +0000 (11:49 -0700)]
xfs: online repair of realtime bitmaps for a realtime group

For a given rt group, regenerate the bitmap contents from the group's
realtime rmap btree.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
8 months ago xfs: repair rmap btree inodes
Darrick J. Wong [Thu, 15 Aug 2024 18:49:07 +0000 (11:49 -0700)]
xfs: repair rmap btree inodes

Teach the inode repair code how to deal with realtime rmap btree inodes
that won't load properly.  This is most likely moot since the filesystem
generally won't mount without the rtrmapbt inodes being usable, but
we'll add this for completeness.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
8 months ago xfs: repair inodes that have realtime extents
Darrick J. Wong [Sat, 21 Sep 2024 04:22:58 +0000 (06:22 +0200)]
xfs: repair inodes that have realtime extents

Plumb into the inode core repair code the ability to search for extents
on realtime devices.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
8 months ago xfs: online repair of realtime file bmaps
Darrick J. Wong [Sat, 21 Sep 2024 04:21:52 +0000 (06:21 +0200)]
xfs: online repair of realtime file bmaps

Repair the block mappings of realtime files.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
8 months ago xfs: walk the rt reverse mapping tree when rebuilding rmap
Darrick J. Wong [Thu, 15 Aug 2024 18:49:05 +0000 (11:49 -0700)]
xfs: walk the rt reverse mapping tree when rebuilding rmap

When we're rebuilding the data device rmap, if we encounter an "rmap"
format fork, we have to walk the (realtime) rmap btree inode to build
the appropriate mappings.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
8 months ago xfs: scrub the metadir path of rt rmap btree files
Darrick J. Wong [Thu, 15 Aug 2024 18:49:04 +0000 (11:49 -0700)]
xfs: scrub the metadir path of rt rmap btree files

Add a new XFS_SCRUB_METAPATH subtype so that we can scrub the metadata
directory tree path to the rmap btree file for each rt group.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
8 months ago xfs: scan rt rmap when we're doing an intense rmap check of bmbt mappings
Darrick J. Wong [Sat, 21 Sep 2024 04:20:01 +0000 (06:20 +0200)]
xfs: scan rt rmap when we're doing an intense rmap check of bmbt mappings

Teach the bmbt scrubber how to perform a comprehensive check that the
rmapbt does not contain /any/ mappings that are not described by bmbt
records when it's dealing with a realtime file.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
8 months ago xfs: cross-reference the realtime rmapbt
Darrick J. Wong [Sat, 21 Sep 2024 04:15:51 +0000 (06:15 +0200)]
xfs: cross-reference the realtime rmapbt

Teach the data fork and realtime bitmap scrubbers to cross-reference
information with the realtime rmap btree.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
8 months ago xfs: cross-reference realtime bitmap to realtime rmapbt scrubber
Darrick J. Wong [Thu, 15 Aug 2024 18:49:01 +0000 (11:49 -0700)]
xfs: cross-reference realtime bitmap to realtime rmapbt scrubber

When we're checking the realtime rmap btree entries, cross-reference
those entries with the realtime bitmap too.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
8 months ago xfs: scrub the realtime rmapbt
Darrick J. Wong [Thu, 15 Aug 2024 18:49:01 +0000 (11:49 -0700)]
xfs: scrub the realtime rmapbt

Check the realtime reverse mapping btree against the rtbitmap, and
modify the rtbitmap scrub to check against the rtrmapbt.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
8 months ago xfs: allow queued realtime intents to drain before scrubbing
Darrick J. Wong [Sat, 21 Sep 2024 04:09:42 +0000 (06:09 +0200)]
xfs: allow queued realtime intents to drain before scrubbing

When a writer thread executes a chain of log intent items for the
realtime volume, the ILOCKs taken during each step are for each rt
metadata file, not the entire rt volume itself.  Although scrub takes
all rt metadata ILOCKs, this isn't sufficient to guard against scrub
checking the rt volume while that writer thread is in the middle of
finishing a chain because there's no higher level locking primitive
guarding the realtime volume.

When there's a collision, cross-referencing between data structures
(e.g. rtrmapbt and rtrefcountbt) yields false corruption events; if
repair is running, this results in incorrect repairs, which is
catastrophic.

Fix this by adding to the mount structure the same drain that we use to
protect scrub against concurrent AG updates, but this time for the
realtime volume.

[Contains a few cleanups from hch]

Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
8 months ago xfs: fix scrub tracepoints when inode-rooted btrees are involved
Darrick J. Wong [Thu, 15 Aug 2024 18:48:59 +0000 (11:48 -0700)]
xfs: fix scrub tracepoints when inode-rooted btrees are involved

Fix minor mistakes in the scrub tracepoints that can manifest when
inode-rooted btrees are enabled.  The existing code worked fine for bmap
btrees, but we should tighten the code up to be less sloppy.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
8 months ago xfs: report realtime rmap btree corruption errors to the health system
Darrick J. Wong [Fri, 20 Sep 2024 17:56:11 +0000 (19:56 +0200)]
xfs: report realtime rmap btree corruption errors to the health system

Whenever we encounter corrupt realtime rmap btree blocks, we should
report that to the health monitoring system for later reporting.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
8 months ago xfs: check that the rtrmapbt maxlevels doesn't increase when growing fs
Darrick J. Wong [Thu, 15 Aug 2024 18:48:57 +0000 (11:48 -0700)]
xfs: check that the rtrmapbt maxlevels doesn't increase when growing fs

The size of filesystem transaction reservations depends on the maximum
height (maxlevels) of the realtime btrees.  Since we don't want a grow
operation to increase the reservation size enough that we'll fail the
minimum log size checks on the next mount, constrain growfs operations
if they would cause an increase in those maxlevels.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
8 months ago xfs: wire up getfsmap to the realtime reverse mapping btree
Darrick J. Wong [Thu, 15 Aug 2024 18:48:56 +0000 (11:48 -0700)]
xfs: wire up getfsmap to the realtime reverse mapping btree

Connect the getfsmap ioctl to the realtime rmapbt.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
8 months ago xfs: create routine to allocate and initialize a realtime rmap btree inode
Darrick J. Wong [Thu, 15 Aug 2024 18:48:55 +0000 (11:48 -0700)]
xfs: create routine to allocate and initialize a realtime rmap btree inode

Create a library routine to allocate and initialize an empty realtime
rmapbt inode.  We'll use this for mkfs and repair.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
8 months ago xfs: wire up rmap map and unmap to the realtime rmapbt
Darrick J. Wong [Sun, 22 Sep 2024 06:25:11 +0000 (08:25 +0200)]
xfs: wire up rmap map and unmap to the realtime rmapbt

Connect the map and unmap reverse-mapping operations to the realtime
rmapbt via the deferred operation callbacks.  This enables us to
perform rmap operations against the correct btree.

[Contains a minor bugfix from hch]

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
8 months ago xfs: allow inodes with zero extents but nonzero nblocks
Darrick J. Wong [Thu, 15 Aug 2024 18:48:54 +0000 (11:48 -0700)]
xfs: allow inodes with zero extents but nonzero nblocks

Metadata inodes that store btrees will have zero extents and a nonzero
nblocks.  Adjust the inode verifier so that this combination is not
flagged.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
8 months ago xfs: wire up a new inode fork type for the realtime rmap
Darrick J. Wong [Thu, 15 Aug 2024 18:48:53 +0000 (11:48 -0700)]
xfs: wire up a new inode fork type for the realtime rmap

Plumb in the pieces we need to embed the root of the realtime rmap
btree in an inode's data fork, complete with new fork type and
on-disk interpretation functions.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
8 months ago xfs: add metadata reservations for realtime rmap btrees
Darrick J. Wong [Thu, 15 Aug 2024 18:48:52 +0000 (11:48 -0700)]
xfs: add metadata reservations for realtime rmap btrees

Reserve some free blocks so that we will always have enough free blocks
in the data volume to handle expansion of the realtime rmap btree.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
8 months ago xfs: add realtime reverse map inode to metadata directory
Darrick J. Wong [Thu, 15 Aug 2024 18:48:51 +0000 (11:48 -0700)]
xfs: add realtime reverse map inode to metadata directory

Add a metadir path to select the realtime rmap btree inode and load
it at mount time.  The rtrmapbt inode will have a unique extent format
code, which means that we also have to update the inode validation and
flush routines to look for it.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
8 months ago xfs: add realtime rmap btree block detection to log recovery
Darrick J. Wong [Thu, 15 Aug 2024 18:48:50 +0000 (11:48 -0700)]
xfs: add realtime rmap btree block detection to log recovery

Identify rtrmapbt blocks in the log correctly so that we can
validate them during log recovery.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
8 months ago xfs: support recovering rmap intent items targetting realtime extents
Darrick J. Wong [Sun, 22 Sep 2024 06:23:40 +0000 (08:23 +0200)]
xfs: support recovering rmap intent items targetting realtime extents

Now that we have rmap on the realtime device, log recovery has to
support remapping extents on the realtime volume.  Make this work.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
8 months ago xfs: add a realtime flag to the rmap update log redo items
Darrick J. Wong [Fri, 20 Sep 2024 17:54:51 +0000 (19:54 +0200)]
xfs: add a realtime flag to the rmap update log redo items

Extend the rmap update (RUI) log items with a new realtime flag that
indicates that the updates apply against the realtime rmapbt.  We'll
wire up the actual rmap code later.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
8 months ago xfs: prepare rmap functions to deal with rtrmapbt
Darrick J. Wong [Fri, 20 Sep 2024 17:51:04 +0000 (19:51 +0200)]
xfs: prepare rmap functions to deal with rtrmapbt

Prepare the high-level rmap functions to deal with the new realtime
rmapbt and its slightly different conventions.  Provide the ability
to talk to either rmapbt or rtrmapbt formats from the same high
level code.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
8 months ago xfs: add realtime rmap btree operations
Darrick J. Wong [Thu, 15 Aug 2024 18:48:47 +0000 (11:48 -0700)]
xfs: add realtime rmap btree operations

Implement the generic btree operations needed to manipulate rtrmap
btree blocks. This is different from the regular rmapbt in that we
allocate space from the filesystem at large, and are neither
constrained to the free space nor any particular AG.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
8 months ago xfs: realtime rmap btree transaction reservations
Darrick J. Wong [Thu, 15 Aug 2024 18:48:46 +0000 (11:48 -0700)]
xfs: realtime rmap btree transaction reservations

Make sure that there's enough log reservation to handle mapping
and unmapping realtime extents.  We have to reserve enough space
to handle a split in the rtrmapbt to add the record and a second
split in the regular rmapbt to record the rtrmapbt split.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
8 months ago xfs: define the on-disk realtime rmap btree format
Darrick J. Wong [Fri, 20 Sep 2024 17:34:54 +0000 (19:34 +0200)]
xfs: define the on-disk realtime rmap btree format

Start filling out the rtrmap btree implementation. Start with the
on-disk btree format; add everything needed to read, write and
manipulate rmap btree blocks. This prepares the way for connecting the
btree operations implementation.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
8 months ago xfs: introduce realtime rmap btree definitions
Darrick J. Wong [Thu, 15 Aug 2024 18:48:44 +0000 (11:48 -0700)]
xfs: introduce realtime rmap btree definitions

Add new realtime rmap btree definitions. The realtime rmap btree will
be rooted from a hidden inode, but has its own shape and therefore
needs to have most of its own separate types.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
8 months ago xfs: simplify the xfs_rmap_{alloc,free}_extent calling conventions
Darrick J. Wong [Fri, 20 Sep 2024 17:34:28 +0000 (19:34 +0200)]
xfs: simplify the xfs_rmap_{alloc,free}_extent calling conventions

Simplify the calling conventions by allowing callers to pass a fsbno
(xfs_fsblock_t) directly into these functions, since we're just going to
set it in a struct anyway.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
8 months ago xfs: prepare rmap btree cursor tracepoints for realtime
Darrick J. Wong [Fri, 20 Sep 2024 17:33:27 +0000 (19:33 +0200)]
xfs: prepare rmap btree cursor tracepoints for realtime

Rework the rmap btree cursor tracepoints in preparation to handle the
realtime rmap btree cursor.  Mostly this involves renaming the field to
"rmapbno" and extracting the group number from the cursor when possible.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
8 months ago xfs: allow inode-based btrees to reserve space in the data device
Darrick J. Wong [Fri, 20 Sep 2024 17:30:46 +0000 (19:30 +0200)]
xfs: allow inode-based btrees to reserve space in the data device

Create a new space reservation scheme so that btree metadata for the
realtime volume can reserve space in the data device to avoid space
underruns.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
8 months ago xfs: update btree keys correctly when _insrec splits an inode root block
Darrick J. Wong [Thu, 15 Aug 2024 18:48:40 +0000 (11:48 -0700)]
xfs: update btree keys correctly when _insrec splits an inode root block

In commit 2c813ad66a72, I partially fixed a bug wherein xfs_btree_insrec
would erroneously try to update the parent's key for a block that had
been split if we decided to insert the new record into the new block.
The solution was to detect this situation and update the in-core key
value that we pass up to the caller so that the caller will (eventually)
add the new block to the parent level of the tree with the correct key.

However, I missed a subtlety about the way inode-rooted btrees work.  If
the full block was a maximally sized inode root block, we'll solve that
fullness by moving the root block's records to a new block, resizing the
root block, and updating the root to point to the new block.  We don't
pass a pointer to the new block to the caller because that work has
already been done.  The new record will /always/ land in the new block,
so in this case we need to use xfs_btree_update_keys to update the keys.

This bug can theoretically manifest itself in the very rare case that we
split a bmbt root block and the new record lands in the very first slot
of the new block, though I've never managed to trigger it in practice.
However, it is very easy to reproduce by running generic/522 with the
realtime rmapbt patchset if rtinherit=1.

Fixes: 2c813ad66a72 ("xfs: support btrees with overlapping intervals for keys")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
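
The key-update problem described above can be illustrated with a much simplified model. The sketch below is not the kernel code: it uses a single sorted block of integer records and a parent that caches the block's low key, and every name in it (demo_block, demo_parent, demo_insert_and_fix) is hypothetical. It only shows why a record landing in the first slot of a block forces the parent's key to be refreshed.

```c
#include <stddef.h>
#include <string.h>

#define DEMO_MAXRECS	8

/* Hypothetical leaf block: records kept in ascending order. */
struct demo_block {
	int	recs[DEMO_MAXRECS];
	int	numrecs;
};

/* Hypothetical parent entry: caches the low key of its child block. */
struct demo_parent {
	int			key;
	struct demo_block	*child;
};

/*
 * Insert rec in sorted order.  The caller must ensure the block has room.
 * Returns 1 if the record landed in slot 0, which changes the low key.
 */
int demo_insert(struct demo_block *b, int rec)
{
	int	slot = 0;

	while (slot < b->numrecs && b->recs[slot] < rec)
		slot++;
	memmove(&b->recs[slot + 1], &b->recs[slot],
		(size_t)(b->numrecs - slot) * sizeof(b->recs[0]));
	b->recs[slot] = rec;
	b->numrecs++;
	return slot == 0;
}

/* Insert into the child and refresh the parent's key when it went stale. */
void demo_insert_and_fix(struct demo_parent *p, int rec)
{
	if (demo_insert(p->child, rec))
		p->key = p->child->recs[0];
}
```

Skipping the refresh in demo_insert_and_fix leaves the parent pointing at a key that no longer matches the first record of the child, which is the simplified analogue of the stale key in the commit message.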
8 months ago xfs: support storing records in the inode core root
Darrick J. Wong [Thu, 15 Aug 2024 18:48:39 +0000 (11:48 -0700)]
xfs: support storing records in the inode core root

Add the necessary flags and code so that we can support storing leaf
records in the inode root block of a btree.  This hasn't been necessary
before, but the realtime rmapbt will need to be able to do this.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
8 months ago xfs: hoist the node iroot update code out of xfs_btree_kill_iroot
Darrick J. Wong [Thu, 15 Aug 2024 18:48:38 +0000 (11:48 -0700)]
xfs: hoist the node iroot update code out of xfs_btree_kill_iroot

In preparation for allowing records in an inode btree root, hoist the
code that copies keyptrs from an existing node child into the root block
to a separate function.  Remove some unnecessary conditionals and clean
up a few function calls in the new function.  Note that this change
reorders the ->free_block call with respect to the change in bc_nlevels
to make it easier to support inode root leaf blocks in the next patch.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
8 months ago xfs: hoist the node iroot update code out of xfs_btree_new_iroot
Darrick J. Wong [Thu, 15 Aug 2024 18:48:38 +0000 (11:48 -0700)]
xfs: hoist the node iroot update code out of xfs_btree_new_iroot

In preparation for allowing records in an inode btree root, hoist the
code that copies keyptrs from an existing node root into a child block
to a separate function.  Note that the new function explicitly computes
the keys of the new child block and stores that in the root block; while
the bmap btree could rely on leaving the key alone, realtime rmap needs
to set the new high key.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
8 months ago xfs: tidy up xfs_bmap_broot_realloc a bit
Darrick J. Wong [Fri, 30 Aug 2024 17:47:34 +0000 (10:47 -0700)]
xfs: tidy up xfs_bmap_broot_realloc a bit

Hoist out the code that migrates broot pointers during a resize
operation to avoid code duplication and streamline the caller.  Also
use the correct bmbt pointer type for the sizeof operation.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
8 months ago xfs: make xfs_iroot_realloc a bmap btree function
Darrick J. Wong [Fri, 30 Aug 2024 00:42:23 +0000 (17:42 -0700)]
xfs: make xfs_iroot_realloc a bmap btree function

Make the inode fork btree root reallocation function part of the btree
ops because it's now mostly bmbt-specific code.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
8 months ago xfs: make xfs_iroot_realloc take the new numrecs instead of deltas
Darrick J. Wong [Fri, 30 Aug 2024 03:08:34 +0000 (20:08 -0700)]
xfs: make xfs_iroot_realloc take the new numrecs instead of deltas

Change the calling signature of xfs_iroot_realloc to take the ifork and
the new number of records in the btree block, not a diff against the
current number.  This will make the callsites easier to understand.

Note that this function is misnamed because it is very specific to the
single type of inode-rooted btree supported.  This will be addressed in
a subsequent patch.

Return the new btree root to reduce the amount of code clutter.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
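
As a rough illustration of the calling convention described above, the following sketch resizes a root buffer to an absolute record count and returns the new root. It is a userspace model under stated assumptions, not the kernel function: struct demo_fork and demo_broot_realloc are hypothetical names, and plain realloc() stands in for the kernel allocator.

```c
#include <stdlib.h>

/* Hypothetical stand-in for an inode fork's in-memory btree root. */
struct demo_fork {
	int	*recs;		/* record storage, NULL when empty */
	int	numrecs;	/* number of records the buffer has room for */
};

/*
 * Resize the root to hold exactly new_numrecs records and return the
 * (possibly moved) buffer.  Passing 0 frees the root and returns NULL.
 */
int *demo_broot_realloc(struct demo_fork *ifp, int new_numrecs)
{
	int	*p;

	if (new_numrecs == 0) {
		free(ifp->recs);
		ifp->recs = NULL;
		ifp->numrecs = 0;
		return NULL;
	}

	p = realloc(ifp->recs, (size_t)new_numrecs * sizeof(*p));
	if (!p)
		return ifp->recs;	/* allocation failed, old root untouched */
	ifp->recs = p;
	ifp->numrecs = new_numrecs;
	return p;
}
```

With this shape a call site states the size it wants, for example demo_broot_realloc(ifp, 4), rather than computing a difference against the current count, and it can use the returned pointer directly.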
8 months ago xfs: refactor the inode fork memory allocation functions
Darrick J. Wong [Thu, 15 Aug 2024 18:48:30 +0000 (11:48 -0700)]
xfs: refactor the inode fork memory allocation functions

Hoist the code that allocates, frees, and reallocates if_broot into a
single xfs_iroot_krealloc function.  Eventually we're going to push
xfs_iroot_realloc into the btree ops structure to handle multiple
inode-rooted btrees, but first let's separate out the bits that should
stay in xfs_inode_fork.c.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
8 months ago xfs: tidy up xfs_iroot_realloc
Darrick J. Wong [Fri, 30 Aug 2024 00:17:48 +0000 (17:17 -0700)]
xfs: tidy up xfs_iroot_realloc

Tidy up this function a bit before we start refactoring the memory
handling and move the function to the bmbt code.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
8 months ago xfs: enable metadata directory feature
Darrick J. Wong [Wed, 7 Aug 2024 22:54:28 +0000 (15:54 -0700)]
xfs: enable metadata directory feature

Enable the metadata directory feature.  With this feature, all metadata
inodes are placed in the metadata directory, and the only inumbers in
the superblock are the roots of the two directory trees.

The RT device is now sharded into a number of rtgroups, where 0 rtgroups
mean that no RT extents are supported, and the traditional XFS stub RT
bitmap and summary inodes don't exist.  A single rtgroup gives roughly
identical behavior to the traditional RT setup, but now with checksummed
and self identifying free space metadata.

For quota, the quota options are read from the superblock unless
explicitly overridden via mount options.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
8 months ago xfs: update sb field checks when metadir is turned on
Darrick J. Wong [Thu, 22 Aug 2024 16:00:03 +0000 (09:00 -0700)]
xfs: update sb field checks when metadir is turned on

When metadir is enabled, we want to check the two new rtgroups fields,
and we don't want to check the old inumbers that are now in the metadir.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
8 months ago xfs: persist quota flags with metadir
Darrick J. Wong [Thu, 22 Aug 2024 16:00:02 +0000 (09:00 -0700)]
xfs: persist quota flags with metadir

It's annoying that one has to keep reminding XFS about what quota
options it should mount with, since the quota flags recording the
previous state are sitting right there in the primary superblock.  Even
more strangely, there exists a noquota option to disable quotas
completely, so it's odder still that providing no options is the same as
noquota.

Starting with metadir, let's change the behavior so that if the user
does not specify any quota-related mount options at all, the ondisk
quota flags will be used to bring up quota.  In other words, the
filesystem will mount in the same state and with the same functionality
as it had during the last mount.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
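
The mount-time decision described above can be sketched as a small pure function. This is a model of the stated behavior only: the flag bits, the function name demo_effective_qflags, and its parameters are all hypothetical and do not correspond to the real XFS quota flags or mount code.

```c
#include <stdbool.h>

/* Hypothetical flag bits; the real XFS quota flags differ. */
#define DEMO_QUOTA_USER		(1u << 0)
#define DEMO_QUOTA_GROUP	(1u << 1)
#define DEMO_QUOTA_PROJECT	(1u << 2)

/*
 * Pick the quota flags to bring up at mount time.
 *
 * quota_opts_given is true if the user passed any quota-related mount
 * option, including an explicit "noquota" (mount_qflags == 0).
 */
unsigned int demo_effective_qflags(bool metadir, bool quota_opts_given,
		unsigned int mount_qflags, unsigned int ondisk_qflags)
{
	/* With metadir and no quota options, reuse the last on-disk state. */
	if (metadir && !quota_opts_given)
		return ondisk_qflags;

	/* Otherwise the mount options decide; none at all means no quota. */
	return mount_qflags;
}
```

In this model an explicit noquota still wins on a metadir filesystem, because it counts as a quota-related option, while a filesystem without metadir keeps the old behavior where no options means no quota.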
8 months ago xfs: scrub quota file metapaths
Darrick J. Wong [Thu, 22 Aug 2024 16:00:01 +0000 (09:00 -0700)]
xfs: scrub quota file metapaths

Enable online fsck for quota file metadata directory paths.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
8 months ago xfs: use metadir for quota inodes
Darrick J. Wong [Thu, 22 Aug 2024 16:00:00 +0000 (09:00 -0700)]
xfs: use metadir for quota inodes

Store the quota inodes in the /quota metadata directory if metadir is
enabled.  This enables us to stop using the sb_[ugp]uotino fields in the
superblock.  From this point on, all metadata files will be children of
the metadata directory tree root.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
8 months ago xfs: refactor xfs_qm_destroy_quotainos
Darrick J. Wong [Thu, 22 Aug 2024 15:59:59 +0000 (08:59 -0700)]
xfs: refactor xfs_qm_destroy_quotainos

Reuse this function instead of open-coding the logic.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
8 months ago xfs: mask off the rtbitmap and summary inodes when metadir in use
Darrick J. Wong [Thu, 22 Aug 2024 15:59:58 +0000 (08:59 -0700)]
xfs: mask off the rtbitmap and summary inodes when metadir in use

Set the rtbitmap and summary file inumbers to NULLFSINO in the
superblock and make sure they're zeroed whenever we write the superblock
to disk, to mimic mkfs behavior.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
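
One way to picture the masking described above is a pair of conversion helpers between an on-disk and an in-memory superblock. The sketch assumes that the NULLFSINO value is what the in-memory copy holds and that zero is what reaches the disk. The struct and function names are hypothetical, endian conversion is left out, and DEMO_NULLFSINO is modeled as an all-ones inode number.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t demo_ino_t;

/* Modeled "no inode" value: all bits set. */
#define DEMO_NULLFSINO	((demo_ino_t)-1)

/* Hypothetical superblock subset: only the two rt inode numbers. */
struct demo_sb {
	demo_ino_t	rbmino;		/* rt bitmap inode */
	demo_ino_t	rsumino;	/* rt summary inode */
};

/* Load from disk: with metadir, hide whatever the fields contain. */
void demo_sb_from_disk(struct demo_sb *incore, const struct demo_sb *ondisk,
		bool metadir)
{
	*incore = *ondisk;
	if (metadir) {
		incore->rbmino = DEMO_NULLFSINO;
		incore->rsumino = DEMO_NULLFSINO;
	}
}

/* Write to disk: with metadir, always store zeroes in both fields. */
void demo_sb_to_disk(struct demo_sb *ondisk, const struct demo_sb *incore,
		bool metadir)
{
	*ondisk = *incore;
	if (metadir) {
		ondisk->rbmino = 0;
		ondisk->rsumino = 0;
	}
}
```

Masking on both paths means that code reading the in-memory fields never sees a stale inode number, and the on-disk image stays the same no matter what the in-memory fields held.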
8 months ago xfs: scrub metadir paths for rtgroup metadata
Darrick J. Wong [Thu, 15 Aug 2024 18:48:28 +0000 (11:48 -0700)]
xfs: scrub metadir paths for rtgroup metadata

Add the code we need to scan the metadata directory paths of rt group
metadata files.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>