www.infradead.org Git - users/hch/xfs.git/log
10 months agoxfs: store a generic xfs_group pointer in xfs_getfsmap_info xfs-generic-group
Christoph Hellwig [Thu, 19 Sep 2024 06:45:52 +0000 (08:45 +0200)]
xfs: store a generic xfs_group pointer in xfs_getfsmap_info

Replace the pag and rtg pointers with a generic group pointer.

Signed-off-by: Christoph Hellwig <hch@lst.de>
10 months agoxfs: add a generic group pointer to the btree cursor
Christoph Hellwig [Thu, 19 Sep 2024 06:43:56 +0000 (08:43 +0200)]
xfs: add a generic group pointer to the btree cursor

Replace the pag and rtg pointers in the type specific union with a
generic xfs_group pointer.

Signed-off-by: Christoph Hellwig <hch@lst.de>
10 months agoxfs: remove superfluous arguments to xfs_rtrefcountbt_init_cursor
Christoph Hellwig [Thu, 19 Sep 2024 06:36:52 +0000 (08:36 +0200)]
xfs: remove superfluous arguments to xfs_rtrefcountbt_init_cursor

The mount structure and inode can be derived from the rtg.

Signed-off-by: Christoph Hellwig <hch@lst.de>
10 months agoxfs: remove superfluous arguments to xfs_rtrmapbt_init_cursor
Christoph Hellwig [Thu, 5 Sep 2024 07:25:38 +0000 (10:25 +0300)]
xfs: remove superfluous arguments to xfs_rtrmapbt_init_cursor

The mount structure and inode can be derived from the rtg.

Signed-off-by: Christoph Hellwig <hch@lst.de>
10 months agoxfs: switch rtgroups to the new while based iterator
Christoph Hellwig [Thu, 19 Sep 2024 06:35:43 +0000 (08:35 +0200)]
xfs: switch rtgroups to the new while based iterator

Clean the code up by not needing an rgno iterator variable and making
it clear that a loop exit needs to drop the rtg reference.

Signed-off-by: Christoph Hellwig <hch@lst.de>
10 months agoxfs: simplify rtg rmap update hooks
Christoph Hellwig [Sun, 1 Sep 2024 13:44:06 +0000 (16:44 +0300)]
xfs: simplify rtg rmap update hooks

Use the generic group based code.

Signed-off-by: Christoph Hellwig <hch@lst.de>
10 months agoxfs: store a generic group structure in the intents
Christoph Hellwig [Sat, 14 Sep 2024 07:14:17 +0000 (09:14 +0200)]
xfs: store a generic group structure in the intents

Replace the pag and rtg pointers in the extent free, bmap, rmap and
refcount intent structures with a pointer to the generic group
structure, and merge the helpers that are now identical.

Note: there might be opportunity to push this further down and unify
even more code.

Signed-off-by: Christoph Hellwig <hch@lst.de>
10 months agoxfs: simplify rtg intent draining
Christoph Hellwig [Sun, 1 Sep 2024 13:06:59 +0000 (16:06 +0300)]
xfs: simplify rtg intent draining

Use the generic group based code.

Signed-off-by: Christoph Hellwig <hch@lst.de>
10 months agoxfs: simplify rtg health tracking
Christoph Hellwig [Sun, 1 Sep 2024 12:49:34 +0000 (15:49 +0300)]
xfs: simplify rtg health tracking

Use the generic group based code.

Signed-off-by: Christoph Hellwig <hch@lst.de>
10 months agoxfs: use the generic group structure for realtime groups
Christoph Hellwig [Thu, 19 Sep 2024 06:33:38 +0000 (08:33 +0200)]
xfs: use the generic group structure for realtime groups

Signed-off-by: Christoph Hellwig <hch@lst.de>
10 months agoxfs: pass the rtg to xfs_rgbno_to_rtb
Christoph Hellwig [Thu, 19 Sep 2024 06:32:26 +0000 (08:32 +0200)]
xfs: pass the rtg to xfs_rgbno_to_rtb

Convert from a rgbno to a rtbno based on the rtgroup structure instead
of passing a [mp,rgno] tuple.

Signed-off-by: Christoph Hellwig <hch@lst.de>
10 months agoxfs: remove the unused rtg_active_wq field in struct xfs_rtgroup
Christoph Hellwig [Sat, 31 Aug 2024 04:59:18 +0000 (07:59 +0300)]
xfs: remove the unused rtg_active_wq field in struct xfs_rtgroup

rtg_active_wq is only woken, but never waited for.

Signed-off-by: Christoph Hellwig <hch@lst.de>
10 months agoxfs: fix superfluous clearing of info->low in xfs_getfsmap_rtdev_rmapbt
Christoph Hellwig [Sun, 15 Sep 2024 04:51:48 +0000 (06:51 +0200)]
xfs: fix superfluous clearing of info->low in xfs_getfsmap_rtdev_rmapbt

The for_each_rtgroup helpers update the rgno passed in for each iteration,
and thus the "if (rtg->rtg_agno == start_rg)" check will always be true.

Add another variable for the loop iterator so that the field is only
cleared after the first iteration.

Signed-off-by: Christoph Hellwig <hch@lst.de>
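
The iterator problem described in the commit above can be reproduced outside the kernel. The following is a minimal userspace sketch, not the XFS code: for_each_index_from and both counting functions are hypothetical stand-ins, assuming only that the iteration macro advances the variable the caller passes in.

```c
/*
 * Hypothetical stand-in for a for_each_rtgroup-style macro: it advances
 * the caller's index variable on every iteration.
 */
#define for_each_index_from(idx, end) \
	for (; (idx) <= (end); (idx)++)

/*
 * Buggy shape: start_rg doubles as the loop iterator, so a comparison
 * against it is true on every pass.
 */
static int count_start_matches_buggy(unsigned int start_rg,
		unsigned int end_rg)
{
	int hits = 0;

	for_each_index_from(start_rg, end_rg) {
		unsigned int cur = start_rg;	/* group visited this pass */

		if (cur == start_rg)		/* always true */
			hits++;
	}
	return hits;
}

/* Fixed shape: a separate iterator leaves start_rg untouched. */
static int count_start_matches_fixed(unsigned int start_rg,
		unsigned int end_rg)
{
	unsigned int rgno = start_rg;
	int hits = 0;

	for_each_index_from(rgno, end_rg) {
		if (rgno == start_rg)		/* true for the first group only */
			hits++;
	}
	return hits;
}
```

With a range of four groups the first variant matches on every pass, while the second matches once, which is the behaviour the check was meant to have.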
10 months agoxfs: add a xfs_group_next_range helper
Christoph Hellwig [Thu, 5 Sep 2024 04:23:38 +0000 (07:23 +0300)]
xfs: add a xfs_group_next_range helper

Add a helper to iterate over all groups, which can be used
as a simple while loop:

	struct xfs_group	*xg = NULL;

	while ((xg = xfs_group_next_range(mp, xg, 0, MAX_GROUP))) {
		...
	}

This will be wrapped by the realtime group code first, and eventually
replace the for_each_rtgroup_from and for_each_rtgroup_range helpers.

Signed-off-by: Christoph Hellwig <hch@lst.de>
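
The while-loop pattern from the commit above can be modelled in userspace. This is a sketch under stated assumptions rather than the kernel helper: struct group, the fixed groups[] table, and group_next_range are hypothetical, and a plain integer stands in for the real reference counting.

```c
#include <stddef.h>

#define NR_GROUPS 4	/* hypothetical fixed group count for the sketch */

struct group {
	unsigned int index;
	int refcount;	/* stand-in for the real reference counting */
};

static struct group groups[NR_GROUPS] = {
	{ .index = 0 }, { .index = 1 }, { .index = 2 }, { .index = 3 },
};

/*
 * Drop the reference on prev (if any) and return the next group in
 * [start, end] with a reference held, or NULL when the range is done.
 */
static struct group *group_next_range(struct group *prev,
		unsigned int start, unsigned int end)
{
	unsigned int next = start;

	if (prev) {
		next = prev->index + 1;
		prev->refcount--;
	}
	if (next > end || next >= NR_GROUPS)
		return NULL;
	groups[next].refcount++;
	return &groups[next];
}

/* Visit every group in the range; returns how many were seen. */
static unsigned int count_groups(unsigned int start, unsigned int end)
{
	struct group *g = NULL;
	unsigned int seen = 0;

	while ((g = group_next_range(g, start, end)))
		seen++;
	return seen;
}
```

Because the helper releases the previous group itself, a loop that runs to completion holds no references afterwards. A caller that breaks out early still holds one and has to drop it explicitly.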
10 months agoxfs: convert busy extent tracking to the generic group structure
Christoph Hellwig [Sat, 14 Sep 2024 06:44:00 +0000 (08:44 +0200)]
xfs: convert busy extent tracking to the generic group structure

Split busy extent tracking from struct xfs_perag into its own private
structure, which can be pointed to by the generic group structure.

Note that this structure is now dynamically allocated instead of embedded
as the upcoming zone XFS code doesn't need it and will also have an
unusually high number of groups due to hardware constraints.  Dynamically
allocating the structure is a big memory saver for this case.

Signed-off-by: Christoph Hellwig <hch@lst.de>
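
The memory argument in the commit above comes down to carrying a pointer instead of an embedded structure. A minimal userspace sketch, assuming hypothetical struct busy_extents and struct group layouts whose field names and sizes are illustrative only:

```c
#include <stdlib.h>

/* Hypothetical private busy-extent state; sizes are illustrative only. */
struct busy_extents {
	unsigned long tree_root;
	unsigned int generation;
	char lock_and_waitqueue[64];
};

/* Generic group: carries only a pointer, NULL when tracking is unused. */
struct group {
	unsigned int index;
	struct busy_extents *busy;
};

/* Set up a group, allocating busy tracking only when it is needed. */
static int group_init(struct group *g, unsigned int index, int want_busy)
{
	g->index = index;
	g->busy = NULL;
	if (!want_busy)
		return 0;
	g->busy = calloc(1, sizeof(*g->busy));
	return g->busy ? 0 : -1;
}

/* Release the optional busy tracking state. */
static void group_destroy(struct group *g)
{
	free(g->busy);
	g->busy = NULL;
}
```

A group that never tracks busy extents then costs one pointer rather than the full structure, which adds up when the group count is very large.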
10 months agoxfs: convert extent busy tracing to the generic group structure
Christoph Hellwig [Thu, 19 Sep 2024 13:09:06 +0000 (15:09 +0200)]
xfs: convert extent busy tracing to the generic group structure

Prepare for tracking busy RT extents by passing the generic group
structure to the xfs_extent_busy_class tracepoints.

Signed-off-by: Christoph Hellwig <hch@lst.de>
10 months agoxfs: return the busy generation from xfs_extent_busy_list_empty
Christoph Hellwig [Mon, 2 Sep 2024 05:45:37 +0000 (08:45 +0300)]
xfs: return the busy generation from xfs_extent_busy_list_empty

This avoids having to poke into the internals of the busy tracking in
xrep_setup_ag_allocbt.

Signed-off-by: Christoph Hellwig <hch@lst.de>
10 months agoxfs: move the online repair rmap hooks to the generic group structure
Christoph Hellwig [Sun, 1 Sep 2024 12:45:33 +0000 (15:45 +0300)]
xfs: move the online repair rmap hooks to the generic group structure

Prepare for the upcoming realtime groups feature by moving the online
repair rmap hooks to the generic xfs_group structure.

Signed-off-by: Christoph Hellwig <hch@lst.de>
10 months agoxfs: move draining of deferred operations to the generic group structure
Christoph Hellwig [Thu, 19 Sep 2024 07:07:48 +0000 (09:07 +0200)]
xfs: move draining of deferred operations to the generic group structure

Prepare for supporting the upcoming realtime groups feature by moving the
deferred operation draining to the generic xfs_group structure.

Signed-off-by: Christoph Hellwig <hch@lst.de>
10 months agoxfs: mark xfs_perag_intent_{hold,rele} static
Christoph Hellwig [Sat, 14 Sep 2024 07:09:31 +0000 (09:09 +0200)]
xfs: mark xfs_perag_intent_{hold,rele} static

These two functions are only used inside of xfs_drain.c, so mark them
static.

Signed-off-by: Christoph Hellwig <hch@lst.de>
10 months agoxfs: move metadata health tracking to the generic group structure
Christoph Hellwig [Thu, 19 Sep 2024 13:06:33 +0000 (15:06 +0200)]
xfs: move metadata health tracking to the generic group structure

Prepare for also tracking the health status of the upcoming realtime
groups by moving the health tracking code to the generic xfs_group
structure.

Signed-off-by: Christoph Hellwig <hch@lst.de>
10 months agoxfs: factor out a generic xfs_group structure
Christoph Hellwig [Thu, 19 Sep 2024 06:28:29 +0000 (08:28 +0200)]
xfs: factor out a generic xfs_group structure

Split the lookup and refcount handling of struct xfs_perag into an
embedded xfs_group structure that can be reused for the upcoming
realtime groups.

It will be extended with more features later.

Note that the xg_type field will only need a single bit even with
realtime group support.  For now it fills a hole, but it might be
worth folding it into another field if we can use this space better.

Signed-off-by: Christoph Hellwig <hch@lst.de>
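
The embedding described in the commit above follows the usual C pattern of a shared base structure inside a type-specific one. The sketch below is a userspace illustration only: the struct layouts, the enum values, and the helper names are hypothetical, and the kernel would use container_of where this uses offsetof directly.

```c
#include <stddef.h>

enum group_type { GROUP_AG, GROUP_RTG };	/* hypothetical type tags */

/* Shared part: index, type and reference count. */
struct group {
	unsigned int index;
	enum group_type type;
	int refcount;
};

/* An allocation group embeds the generic structure. */
struct perag {
	struct group base;
	unsigned int free_blocks;	/* AG-specific state, illustrative */
};

/* Recover the containing perag from a pointer to its embedded group. */
static struct perag *to_perag(struct group *g)
{
	return (struct perag *)((char *)g - offsetof(struct perag, base));
}

/* Generic code works on struct group only. */
static struct group *group_hold(struct group *g)
{
	g->refcount++;
	return g;
}

static void group_put(struct group *g)
{
	g->refcount--;
}
```

Lookup and refcount code written against struct group can then be shared by any structure that embeds it, while type-specific code converts back with to_perag.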
10 months agoxfs: insert the pag structures into the xarray later
Christoph Hellwig [Sun, 1 Sep 2024 08:48:37 +0000 (11:48 +0300)]
xfs: insert the pag structures into the xarray later

Cleaning up is much easier if a structure can't be looked up yet, so only
insert the pag once it is fully set up.

Signed-off-by: Christoph Hellwig <hch@lst.de>
10 months agoxfs: split xfs_initialize_perag
Christoph Hellwig [Thu, 19 Sep 2024 13:16:07 +0000 (15:16 +0200)]
xfs: split xfs_initialize_perag

Factor out a xfs_perag_alloc helper that allocates a single perag
structure.

Signed-off-by: Christoph Hellwig <hch@lst.de>
10 months agoxfs: convert remaining tracepoints to pass pag structures
Christoph Hellwig [Sat, 31 Aug 2024 08:10:49 +0000 (11:10 +0300)]
xfs: convert remaining tracepoints to pass pag structures

Convert all tracepoints that take [mp,agno] tuples to take a pag argument
instead so that decoding only happens when tracepoints are enabled and to
clean up the callers.

Signed-off-by: Christoph Hellwig <hch@lst.de>
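
The benefit named in the commit above, decoding only when tracing is enabled, can be shown with a small userspace model. This is not the kernel tracepoint machinery: trace_enabled, trace_emit, trace_event and the struct fields are all hypothetical names chosen for the sketch.

```c
#include <stdio.h>

struct mount { int dev_id; };
struct perag { struct mount *mount; unsigned int agno; };

static int trace_enabled;	/* hypothetical global on/off switch */
static int decode_calls;	/* counts how often fields are decoded */
static char trace_line[64];	/* last formatted trace record */

/* Decoding (dev, agno) out of the perag happens only in here. */
static void trace_emit(const struct perag *pag, const char *event)
{
	decode_calls++;
	snprintf(trace_line, sizeof(trace_line), "%s: dev %d agno %u",
		 event, pag->mount->dev_id, pag->agno);
}

/* Call sites pass the object; nothing is decoded while tracing is off. */
#define trace_event(pag, event)				\
	do {						\
		if (trace_enabled)			\
			trace_emit((pag), (event));	\
	} while (0)
```

The call site shrinks to a single pointer argument, and the cost of pulling out the device and AG number is paid only on the enabled path.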
10 months agoxfs: pass the pag to the xrep_newbt_extent_class tracepoints
Christoph Hellwig [Sun, 1 Sep 2024 05:39:19 +0000 (08:39 +0300)]
xfs: pass the pag to the xrep_newbt_extent_class tracepoints

This requires moving a few of the callsites a little bit to ensure that
we already have the reference, but allows for the decoding to only happen
when tracing is actually enabled, and cleans up the callsites a bit.

Signed-off-by: Christoph Hellwig <hch@lst.de>
10 months agoxfs: pass the pag to the trace_xrep_calc_ag_resblks{,_btsize} trace points
Christoph Hellwig [Sun, 1 Sep 2024 05:34:33 +0000 (08:34 +0300)]
xfs: pass the pag to the trace_xrep_calc_ag_resblks{,_btsize} trace points

This requires holding the pag refcount a little longer, but allows for the
decoding to only happen when tracing is actually enabled, and cleans up the
callsites a bit.

Signed-off-by: Christoph Hellwig <hch@lst.de>
10 months agoxfs: pass objects to the xrep_ibt_walk_rmap tracepoint
Christoph Hellwig [Sun, 1 Sep 2024 05:26:13 +0000 (08:26 +0300)]
xfs: pass objects to the xrep_ibt_walk_rmap tracepoint

Pass the perag structure and the irec so that the decoding is only done
when tracing is actually enabled and the call sites look a lot neater,
and remove the pointless class indirection.

Signed-off-by: Christoph Hellwig <hch@lst.de>
10 months agoxfs: pass the iunlink item to the xfs_iunlink_update_dinode trace point
Christoph Hellwig [Sat, 31 Aug 2024 08:20:24 +0000 (11:20 +0300)]
xfs: pass the iunlink item to the xfs_iunlink_update_dinode trace point

So that decoding is only done when tracing is actually enabled and the
call site looks a lot neater.

Signed-off-by: Christoph Hellwig <hch@lst.de>
10 months agoxfs: pass objects to the xfs_irec_merge_{pre,post} tracepoints
Christoph Hellwig [Sat, 31 Aug 2024 07:39:59 +0000 (10:39 +0300)]
xfs: pass objects to the xfs_irec_merge_{pre,post} tracepoints

Pass the perag structure and the irec to these tracepoints so that the
decoding is only done when tracing is actually enabled and the call sites
look a lot neater.

Signed-off-by: Christoph Hellwig <hch@lst.de>
10 months agoxfs: pass a perag structure to the xfs_ag_resv_init_error tracepoint
Christoph Hellwig [Sat, 31 Aug 2024 07:37:07 +0000 (10:37 +0300)]
xfs: pass a perag structure to the xfs_ag_resv_init_error tracepoint

And remove the single instance class indirection for it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
10 months agoxfs: constify pag arguments to trace points
Christoph Hellwig [Thu, 19 Sep 2024 07:02:34 +0000 (09:02 +0200)]
xfs: constify pag arguments to trace points

Trace points never modify their arguments.  Mark all the pag objects
passed to trace points const.  The exception is the xfs_ag_resv_class, which
uses the xfs_perag_resv helper that can't be marked const due to
other users modifying the returned structure.

Signed-off-by: Christoph Hellwig <hch@lst.de>
10 months agoxfs: remove the unused xrep_bmap_walk_rmap trace event
Christoph Hellwig [Sat, 14 Sep 2024 06:51:29 +0000 (08:51 +0200)]
xfs: remove the unused xrep_bmap_walk_rmap trace event

Signed-off-by: Christoph Hellwig <hch@lst.de>
10 months agoxfs: remove the unused trace_xfs_iwalk_ag trace point
Christoph Hellwig [Sat, 31 Aug 2024 08:22:05 +0000 (11:22 +0300)]
xfs: remove the unused trace_xfs_iwalk_ag trace point

Signed-off-by: Christoph Hellwig <hch@lst.de>
10 months agoxfs: remove the mount field from struct xfs_busy_extents
Christoph Hellwig [Sat, 14 Sep 2024 06:31:53 +0000 (08:31 +0200)]
xfs: remove the mount field from struct xfs_busy_extents

The mount field is only passed to xfs_extent_busy_clear, which never uses
it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
10 months agoxfs: keep a reference to perag structure for busy extents
Christoph Hellwig [Sat, 31 Aug 2024 05:29:58 +0000 (08:29 +0300)]
xfs: keep a reference to perag structure for busy extents

Processing of busy extents requires the perag structure, so keep the
reference while they are in flight.

Signed-off-by: Christoph Hellwig <hch@lst.de>
10 months agoxfs: pass a perag structure to xfs_extent_busy_{search,reuse}
Christoph Hellwig [Sun, 1 Sep 2024 05:10:12 +0000 (08:10 +0300)]
xfs: pass a perag structure to xfs_extent_busy_{search,reuse}

Replace the [mp,agno] tuple with the perag structure, which will become
more useful later.

Signed-off-by: Christoph Hellwig <hch@lst.de>
10 months agoxfs: add a xfs_agino_to_ino helper
Christoph Hellwig [Sat, 31 Aug 2024 07:55:11 +0000 (10:55 +0300)]
xfs: add a xfs_agino_to_ino helper

Add a helper to convert an agino to an ino based on a pag structure.

This provides a simpler conversion and better type safety compared to the
existing code that passes the mount structure and the agno separately.

Signed-off-by: Christoph Hellwig <hch@lst.de>
10 months agoxfs: add xfs_agbno_to_fsb and xfs_agbno_to_daddr helpers
Christoph Hellwig [Thu, 19 Sep 2024 06:25:30 +0000 (08:25 +0200)]
xfs: add xfs_agbno_to_fsb and xfs_agbno_to_daddr helpers

Add helpers to convert an agbno to a daddr or fsbno based on a pag
structure.

This provides a simpler conversion and better type safety compared to the
existing code that passes the mount structure and the agno separately.

Signed-off-by: Christoph Hellwig <hch@lst.de>
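
The conversions named in the commit above can be sketched in userspace. The arithmetic below follows the usual XFS encoding (AG number in the high bits of a filesystem block number, and a linear block offset scaled to 512-byte sectors for a disk address), but the struct layouts, field names, and function names are simplified assumptions, not the kernel definitions.

```c
#include <stdint.h>

/* Hypothetical, simplified geometry; field names are illustrative. */
struct mount {
	unsigned int agblklog;	/* log2 of the AG size in blocks, rounded up */
	uint32_t agblocks;	/* blocks per AG */
	unsigned int blkbb_log;	/* log2 of 512-byte sectors per block */
};

struct perag {
	struct mount *mount;
	uint32_t agno;
};

/* Filesystem block number: AG number in the high bits, AG block below. */
static uint64_t agbno_to_fsb(const struct perag *pag, uint32_t agbno)
{
	return ((uint64_t)pag->agno << pag->mount->agblklog) | agbno;
}

/* Disk address in 512-byte sectors: linear block offset, scaled. */
static uint64_t agbno_to_daddr(const struct perag *pag, uint32_t agbno)
{
	uint64_t linear = (uint64_t)pag->agno * pag->mount->agblocks + agbno;

	return linear << pag->mount->blkbb_log;
}
```

Taking the perag instead of a separate mount and AG number means a caller cannot pair an AG number with the wrong mount, which is the type-safety point the commit makes.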
10 months agoxfs: remove the agno argument to xfs_free_ag_extent
Christoph Hellwig [Sat, 31 Aug 2024 08:18:02 +0000 (11:18 +0300)]
xfs: remove the agno argument to xfs_free_ag_extent

xfs_free_ag_extent already has a pointer to the pag structure through
the agf buffer.  Use that instead of passing the redundant argument,
and do the same for the tracepoint.

Signed-off-by: Christoph Hellwig <hch@lst.de>
10 months agoxfs: pass a pag to xfs_difree_inode_chunk
Christoph Hellwig [Sat, 31 Aug 2024 18:05:55 +0000 (21:05 +0300)]
xfs: pass a pag to xfs_difree_inode_chunk

We'll want to use more than just the agno field in a bit.

Signed-off-by: Christoph Hellwig <hch@lst.de>
10 months agoxfs: remove the unused pag_active_wq field in struct xfs_perag
Christoph Hellwig [Sun, 1 Sep 2024 14:29:54 +0000 (17:29 +0300)]
xfs: remove the unused pag_active_wq field in struct xfs_perag

pag_active_wq is only woken, but never waited for.

Signed-off-by: Christoph Hellwig <hch@lst.de>
10 months agoxfs: remove the unused pagb_count field in struct xfs_perag
Christoph Hellwig [Sun, 1 Sep 2024 14:29:37 +0000 (17:29 +0300)]
xfs: remove the unused pagb_count field in struct xfs_perag

Signed-off-by: Christoph Hellwig <hch@lst.de>
10 months agoxfs: fix superfluous clearing of info->low in __xfs_getfsmap_datadev
Christoph Hellwig [Sun, 15 Sep 2024 04:49:40 +0000 (06:49 +0200)]
xfs: fix superfluous clearing of info->low in __xfs_getfsmap_datadev

The for_each_perag helpers update the agno passed in for each iteration,
and thus the "if (pag->pag_agno == start_ag)" check will always be true.

Add another variable for the loop iterator so that the field is only
cleared after the first iteration.

Signed-off-by: Christoph Hellwig <hch@lst.de>
10 months agoxfs: don't use __GFP_RETRY_MAYFAIL in xfs_initialize_perag
Christoph Hellwig [Thu, 19 Sep 2024 13:15:10 +0000 (15:15 +0200)]
xfs: don't use __GFP_RETRY_MAYFAIL in xfs_initialize_perag

__GFP_RETRY_MAYFAIL increases the likelihood of allocations failing,
which isn't really helpful during log recovery.  Remove the flag and
stick to the default GFP_KERNEL policies.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
10 months agoxfs: merge the perag freeing helpers
Christoph Hellwig [Sun, 1 Sep 2024 08:09:32 +0000 (11:09 +0300)]
xfs: merge the perag freeing helpers

There is no good reason to have two different routines for freeing perag
structures for the unmount and error cases.  Add two arguments to specify
the range of AGs to free to xfs_free_perag, and use that to replace
xfs_free_unused_perag_range.

The additional RCU grace period for the error case is harmless, and the
extra check for the AG to actually exist is not required now that the
callers pass the exact known allocated range.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
10 months agoxfs: pass the exact range to initialize to xfs_initialize_perag
Christoph Hellwig [Sun, 8 Sep 2024 07:53:41 +0000 (10:53 +0300)]
xfs: pass the exact range to initialize to xfs_initialize_perag

Currently only the new agcount is passed to xfs_initialize_perag, which
requires lookups of existing AGs to skip them and complicates error
handling.  Also pass the previous agcount so that the range that
xfs_initialize_perag operates on is exactly defined.  That way the
extra lookups can be avoided, and error handling can clean up the
exact range from the old count to the last added perag structure.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
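
The exact-range idea in the commit above can be illustrated with a userspace sketch. Everything here is hypothetical: a plain array stands in for the lookup structure, and fail_at exists only to simulate an allocation failure so the unwind path can be exercised.

```c
#include <stdlib.h>

#define MAX_GROUPS 16	/* hypothetical table size */

struct perag { unsigned int agno; };

static struct perag *table[MAX_GROUPS];	/* stand-in for the lookup structure */

/* Free the groups in [first, end). */
static void free_range(unsigned int first, unsigned int end)
{
	for (unsigned int i = first; i < end; i++) {
		free(table[i]);
		table[i] = NULL;
	}
}

/*
 * Set up groups [old_count, new_count).  fail_at simulates an allocation
 * failure at that index (pass MAX_GROUPS for no failure).
 */
static int init_range(unsigned int old_count, unsigned int new_count,
		unsigned int fail_at)
{
	for (unsigned int i = old_count; i < new_count; i++) {
		struct perag *pag = NULL;

		if (i != fail_at)
			pag = calloc(1, sizeof(*pag));
		if (!pag) {
			/* Only [old_count, i) was added by this call. */
			free_range(old_count, i);
			return -1;
		}
		pag->agno = i;
		table[i] = pag;	/* publish only once fully set up */
	}
	return 0;
}
```

Since the caller states where the new range starts, the function never has to probe for groups that already exist, and a failure removes only what this call added.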
10 months agoxfs: use xas_for_each_marked in xfs_reclaim_inodes_count
Christoph Hellwig [Wed, 28 Aug 2024 04:58:02 +0000 (07:58 +0300)]
xfs: use xas_for_each_marked in xfs_reclaim_inodes_count

xfs_reclaim_inodes_count iterates over all AGs to sum up the reclaimable
inode counts.  There is no point in grabbing a reference to them or
unlocking the RCU critical section for each iteration, so switch to the
more efficient xas_for_each_marked iterator.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
10 months agoxfs: convert perag lookup to xarray
Christoph Hellwig [Wed, 21 Aug 2024 05:59:27 +0000 (07:59 +0200)]
xfs: convert perag lookup to xarray

Convert the perag lookup from the legacy radix tree to the xarray,
which allows for much nicer iteration and bulk lookup semantics.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
10 months agoxfs: simplify tagged perag iteration
Christoph Hellwig [Wed, 21 Aug 2024 05:31:46 +0000 (07:31 +0200)]
xfs: simplify tagged perag iteration

Pass the old perag structure to the tagged loop helpers so that they can
grab the old agno before releasing the reference.  This removes the need
to separately track the agno and the iterator macro, and thus also
obsoletes the for_each_perag_tag syntactic sugar.

Signed-off-by: Christoph Hellwig <hch@lst.de>
10 months agoxfs: move the tagged perag lookup helpers to xfs_icache.c
Christoph Hellwig [Wed, 21 Aug 2024 05:27:51 +0000 (07:27 +0200)]
xfs: move the tagged perag lookup helpers to xfs_icache.c

The tagged perag helpers are only used in xfs_icache.c in the kernel code
and not at all in xfsprogs.  Move them to xfs_icache.c in preparation for
switching to an xarray, for which I have no plan to implement the tagged
lookup functions for userspace.

Signed-off-by: Christoph Hellwig <hch@lst.de>
10 months agoxfs: use kfree_rcu_mightsleep to free the perag structures
Christoph Hellwig [Sat, 10 Aug 2024 06:00:44 +0000 (08:00 +0200)]
xfs: use kfree_rcu_mightsleep to free the perag structures

Using kfree_rcu_mightsleep is simpler and removes the need for an
rcu_head in the perag structure.

Signed-off-by: Christoph Hellwig <hch@lst.de>
10 months agoxfs: enable realtime reflink
Darrick J. Wong [Thu, 15 Aug 2024 18:49:50 +0000 (11:49 -0700)]
xfs: enable realtime reflink

Enable reflink for realtime devices, sort of.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
10 months agoxfs: fix CoW forks for realtime files
Darrick J. Wong [Thu, 15 Aug 2024 18:49:49 +0000 (11:49 -0700)]
xfs: fix CoW forks for realtime files

Port the copy on write fork repair to realtime files.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
10 months agoxfs: check for shared rt extents when rebuilding rt file's data fork
Darrick J. Wong [Thu, 15 Aug 2024 18:49:48 +0000 (11:49 -0700)]
xfs: check for shared rt extents when rebuilding rt file's data fork

When we're rebuilding the data fork of a realtime file, we need to
cross-reference each mapping with the rt refcount btree to ensure that
the reflink flag is set if there are any shared extents found.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
10 months agoxfs: repair inodes that have a refcount btree in the data fork
Darrick J. Wong [Thu, 15 Aug 2024 18:49:47 +0000 (11:49 -0700)]
xfs: repair inodes that have a refcount btree in the data fork

Plumb knowledge of refcount btrees into the inode core repair code.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
10 months agoxfs: online repair of the realtime refcount btree
Darrick J. Wong [Thu, 15 Aug 2024 18:49:47 +0000 (11:49 -0700)]
xfs: online repair of the realtime refcount btree

Port the data device's refcount btree repair code to the realtime
refcount btree.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
10 months agoxfs: capture realtime CoW staging extents when rebuilding rt rmapbt
Darrick J. Wong [Thu, 15 Aug 2024 18:49:46 +0000 (11:49 -0700)]
xfs: capture realtime CoW staging extents when rebuilding rt rmapbt

Walk the realtime refcount btree to find the CoW staging extents when
we're rebuilding the realtime rmap btree.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
10 months agoxfs: walk the rt reference count tree when rebuilding rmap
Darrick J. Wong [Thu, 15 Aug 2024 18:49:45 +0000 (11:49 -0700)]
xfs: walk the rt reference count tree when rebuilding rmap

When we're rebuilding the data device rmap, if we encounter a "refcount"
format fork, we have to walk the (realtime) refcount btree inode to
build the appropriate mappings.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
10 months agoxfs: check new rtbitmap records against rt refcount btree
Darrick J. Wong [Thu, 15 Aug 2024 18:49:44 +0000 (11:49 -0700)]
xfs: check new rtbitmap records against rt refcount btree

When we're rebuilding the realtime bitmap, check the proposed free
extents against the rt refcount btree to make sure we don't commit any
grievous errors.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
10 months agoxfs: don't flag quota rt block usage on rtreflink filesystems
Darrick J. Wong [Thu, 15 Aug 2024 18:49:43 +0000 (11:49 -0700)]
xfs: don't flag quota rt block usage on rtreflink filesystems

Quota space usage is allowed to exceed the size of the physical storage
when reflink is enabled.  Now that we have reflink for the realtime
volume, apply this same logic to the rtb repair logic.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
10 months agoxfs: scrub the metadir path of rt refcount btree files
Darrick J. Wong [Thu, 15 Aug 2024 18:49:42 +0000 (11:49 -0700)]
xfs: scrub the metadir path of rt refcount btree files

Add a new XFS_SCRUB_METAPATH subtype so that we can scrub the metadata
directory tree path to the refcount btree file for each rt group.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
10 months agoxfs: detect and repair misaligned rtinherit directory cowextsize hints
Darrick J. Wong [Thu, 15 Aug 2024 18:49:41 +0000 (11:49 -0700)]
xfs: detect and repair misaligned rtinherit directory cowextsize hints

If we encounter a directory that has been configured to pass on a CoW
extent size hint to a new realtime file and the hint isn't an integer
multiple of the rt extent size, we should flag the hint for
administrative review and/or turn it off because that is a
misconfiguration.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
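
The condition described in the commit above is a divisibility test. A minimal sketch, assuming both values are expressed in filesystem blocks and that a hint of zero means no hint is set; the function name is hypothetical:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * A hint of 0 means "no hint"; otherwise it must be a whole number of
 * rt extents.  An rt extent size of 0 or 1 block cannot be misaligned.
 */
static bool cowextsize_hint_aligned(uint32_t hint_blocks,
		uint32_t rextsize_blocks)
{
	if (hint_blocks == 0 || rextsize_blocks <= 1)
		return true;
	return hint_blocks % rextsize_blocks == 0;
}
```

A scrubber built on this check would flag a directory whose hint fails it, and a repair pass could clear the hint.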
10 months agoxfs: allow dquot rt block count to exceed rt blocks on reflink fs
Darrick J. Wong [Thu, 15 Aug 2024 18:49:40 +0000 (11:49 -0700)]
xfs: allow dquot rt block count to exceed rt blocks on reflink fs

Update the quota scrubber to allow dquots where the realtime block count
exceeds the block count of the rt volume if reflink is enabled.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
10 months agoxfs: check reference counts of gaps between rt refcount records
Darrick J. Wong [Thu, 15 Aug 2024 18:49:40 +0000 (11:49 -0700)]
xfs: check reference counts of gaps between rt refcount records

If there's a gap between records in the rt refcount btree, we ought to
cross-reference the gap with the rtrmap records to make sure that there
aren't any overlapping records for a region that doesn't have any shared
ownership.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
10 months agoxfs: allow overlapping rtrmapbt records for shared data extents
Darrick J. Wong [Thu, 15 Aug 2024 18:49:39 +0000 (11:49 -0700)]
xfs: allow overlapping rtrmapbt records for shared data extents

Allow overlapping realtime reverse mapping records if they both describe
shared data extents and the fs supports reflink on the realtime volume.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
10 months agoxfs: cross-reference checks with the rt refcount btree
Darrick J. Wong [Thu, 15 Aug 2024 18:49:38 +0000 (11:49 -0700)]
xfs: cross-reference checks with the rt refcount btree

Use the realtime refcount btree to implement cross-reference checks in
other data structures.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
10 months agoxfs: scrub the realtime refcount btree
Darrick J. Wong [Thu, 15 Aug 2024 18:49:37 +0000 (11:49 -0700)]
xfs: scrub the realtime refcount btree

Add code to scrub realtime refcount btrees.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
10 months agoxfs: report realtime refcount btree corruption errors to the health system
Darrick J. Wong [Thu, 15 Aug 2024 18:49:36 +0000 (11:49 -0700)]
xfs: report realtime refcount btree corruption errors to the health system

Whenever we encounter corrupt realtime refcount btree blocks, we should
report that to the health monitoring system for later reporting.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
10 months agoxfs: check that the rtrefcount maxlevels doesn't increase when growing fs
Darrick J. Wong [Thu, 15 Aug 2024 18:49:35 +0000 (11:49 -0700)]
xfs: check that the rtrefcount maxlevels doesn't increase when growing fs

The size of filesystem transaction reservations depends on the maximum
height (maxlevels) of the realtime btrees.  Since we don't want a grow
operation to increase the reservation size enough that we'll fail the
minimum log size checks on the next mount, constrain growfs operations
if they would cause an increase in the rt refcount btree maxlevels.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
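
The constraint in the commit above compares the worst-case btree height before and after the grow. The sketch below is a simplified userspace model, not the XFS computation: it assumes fixed per-block fanouts (leaf_recs of at least 1, node_ptrs of at least 2), both function names are hypothetical, and -1 stands in for whatever error code the kernel returns.

```c
#include <stdint.h>

/*
 * Height of a btree holding nr_records, given hypothetical per-block
 * fanouts.  Requires leaf_recs >= 1 and node_ptrs >= 2.
 */
static unsigned int btree_max_levels(uint64_t nr_records,
		unsigned int leaf_recs, unsigned int node_ptrs)
{
	uint64_t blocks = (nr_records + leaf_recs - 1) / leaf_recs;
	unsigned int levels = 1;

	while (blocks > 1) {
		blocks = (blocks + node_ptrs - 1) / node_ptrs;
		levels++;
	}
	return levels;
}

/* Refuse a grow that would make the tree taller than it can be today. */
static int check_grow(uint64_t old_records, uint64_t new_records,
		unsigned int leaf_recs, unsigned int node_ptrs)
{
	if (btree_max_levels(new_records, leaf_recs, node_ptrs) >
	    btree_max_levels(old_records, leaf_recs, node_ptrs))
		return -1;
	return 0;
}
```

With ten records per leaf and ten pointers per node, 100 records fit in two levels but 101 need three, so a grow from the first to the second size would be refused.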
10 months agoxfs: enable extent size hints for CoW operations
Darrick J. Wong [Thu, 15 Aug 2024 18:49:34 +0000 (11:49 -0700)]
xfs: enable extent size hints for CoW operations

Wire up the copy-on-write extent size hint for realtime files, and
connect it to the rt allocator so that we avoid fragmentation on rt
filesystems.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
10 months agoxfs: apply rt extent alignment constraints to CoW extsize hint
Darrick J. Wong [Thu, 15 Aug 2024 18:49:33 +0000 (11:49 -0700)]
xfs: apply rt extent alignment constraints to CoW extsize hint

The copy-on-write extent size hint is subject to the same alignment
constraints as the regular extent size hint.  Since we're in the process
of adding reflink (and therefore CoW) to the realtime device, we must
apply the same scattered rextsize alignment validation strategies to
both hints to deal with the possibility of rextsize changing.

Therefore, fix the inode validator to perform rextsize alignment checks
on regular realtime files, and to remove misaligned directory hints.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
10 months agoxfs: fix xfs_get_extsz_hint behavior with realtime alwayscow files
Darrick J. Wong [Thu, 15 Aug 2024 18:49:33 +0000 (11:49 -0700)]
xfs: fix xfs_get_extsz_hint behavior with realtime alwayscow files

Currently, we (ab)use xfs_get_extsz_hint so that it always returns a
nonzero value for realtime files.  This apparently was done to disable
delayed allocation for realtime files.

However, once we enable realtime reflink, we can also turn on the
alwayscow flag to force CoW writes to realtime files.  In this case, the
logic will incorrectly send the write through the delalloc write path.

Fix this by adjusting the logic slightly.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
10 months agoxfs: recover CoW leftovers in the realtime volume
Darrick J. Wong [Thu, 15 Aug 2024 18:49:32 +0000 (11:49 -0700)]
xfs: recover CoW leftovers in the realtime volume

Scan the realtime refcount tree at mount time to get rid of leftover
CoW staging extents.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
10 months agoxfs: allow inodes to have the realtime and reflink flags
Darrick J. Wong [Thu, 15 Aug 2024 18:49:31 +0000 (11:49 -0700)]
xfs: allow inodes to have the realtime and reflink flags

Now that we can share blocks between realtime files, allow this
combination.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
10 months agoxfs: enable sharing of realtime file blocks
Darrick J. Wong [Thu, 15 Aug 2024 18:49:30 +0000 (11:49 -0700)]
xfs: enable sharing of realtime file blocks

Update the remapping routines to be able to handle realtime files.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
10 months agoxfs: enable CoW for realtime data
Darrick J. Wong [Thu, 15 Aug 2024 18:49:29 +0000 (11:49 -0700)]
xfs: enable CoW for realtime data

Update our write paths to support copy on write on the rt volume.  This
works in more or less the same way as it does on the data device, with
the major exception that we never do delalloc on the rt volume.

Because we consider unwritten CoW fork staging extents to be incore
quota reservation, we update xfs_quota_reserve_blkres to support this
case.  Though xfs doesn't allow rt and quota together, the change is
trivial and we shouldn't leave a logic bomb here.

While we're at it, add a missing xfs_mod_delalloc call when we remove
delalloc block reservation from the inode.  This is largely irrelevant
since realtime files do not use delalloc, but we want to avoid leaving
logic bombs.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
10 months agoxfs: refactor reflink quota updates
Darrick J. Wong [Thu, 15 Aug 2024 18:49:28 +0000 (11:49 -0700)]
xfs: refactor reflink quota updates

Hoist all quota updates for reflink into a helper function, since things
are about to become more complicated.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
10 months agoxfs: compute rtrmap btree max levels when reflink enabled
Darrick J. Wong [Thu, 15 Aug 2024 18:49:28 +0000 (11:49 -0700)]
xfs: compute rtrmap btree max levels when reflink enabled

Compute the maximum possible height of the realtime rmap btree when
reflink is enabled.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
10 months agoxfs: update rmap to allow cow staging extents in the rt rmap
Darrick J. Wong [Thu, 15 Aug 2024 18:49:27 +0000 (11:49 -0700)]
xfs: update rmap to allow cow staging extents in the rt rmap

Don't error out on CoW staging extent records when realtime reflink is
enabled.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
10 months ago xfs: create routine to allocate and initialize a realtime refcount btree inode
Darrick J. Wong [Thu, 15 Aug 2024 18:49:26 +0000 (11:49 -0700)]
xfs: create routine to allocate and initialize a realtime refcount btree inode

Create a library routine to allocate and initialize an empty realtime
refcountbt inode.  We'll use this for growfs, mkfs, and repair.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
10 months ago xfs: wire up realtime refcount btree cursors
Darrick J. Wong [Thu, 15 Aug 2024 18:49:25 +0000 (11:49 -0700)]
xfs: wire up realtime refcount btree cursors

Wire up realtime refcount btree cursors wherever they're needed
throughout the code base.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
10 months ago xfs: refactor xfs_reflink_find_shared
Christoph Hellwig [Thu, 15 Aug 2024 18:49:24 +0000 (11:49 -0700)]
xfs: refactor xfs_reflink_find_shared

Move lookup of the perag structure from the callers into the helpers,
and return the offset into the extent of the shared region instead of
the block number that needs post-processing.  This prepares the
callsites for the creation of an rt-specific variant in the next patch.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
[djwong: port to the middle of the rtreflink series for cleanliness]
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
10 months ago xfs: wire up a new inode fork type for the realtime refcount
Darrick J. Wong [Thu, 15 Aug 2024 18:49:23 +0000 (11:49 -0700)]
xfs: wire up a new inode fork type for the realtime refcount

Plumb in the pieces we need to embed the root of the realtime refcount
btree in an inode's data fork, complete with new fork type and
on-disk interpretation functions.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
10 months ago xfs: add metadata reservations for realtime refcount btree
Darrick J. Wong [Thu, 15 Aug 2024 18:49:23 +0000 (11:49 -0700)]
xfs: add metadata reservations for realtime refcount btree

Reserve some free blocks so that we will always have enough free blocks
in the data volume to handle expansion of the realtime refcount btree.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
10 months ago xfs: add realtime refcount btree inode to metadata directory
Darrick J. Wong [Thu, 15 Aug 2024 18:49:22 +0000 (11:49 -0700)]
xfs: add realtime refcount btree inode to metadata directory

Add a metadir path to select the realtime refcount btree inode and load
it at mount time.  The rtrefcountbt inode will have a unique extent format
code, which means that we also have to update the inode validation and
flush routines to look for it.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
10 months ago xfs: add realtime refcount btree block detection to log recovery
Darrick J. Wong [Thu, 15 Aug 2024 18:49:21 +0000 (11:49 -0700)]
xfs: add realtime refcount btree block detection to log recovery

Identify rt refcount btree blocks in the log correctly so that we can
validate them during log recovery.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
10 months ago xfs: support recovering refcount intent items targeting realtime extents
Darrick J. Wong [Thu, 15 Aug 2024 18:49:20 +0000 (11:49 -0700)]
xfs: support recovering refcount intent items targeting realtime extents

Now that we have reflink on the realtime device, refcount intent items
have to support remapping extents on the realtime volume.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
10 months ago xfs: add a realtime flag to the refcount update log redo items
Darrick J. Wong [Thu, 15 Aug 2024 18:49:19 +0000 (11:49 -0700)]
xfs: add a realtime flag to the refcount update log redo items

Extend the refcount update (CUI) log items with a new realtime flag that
indicates that the updates apply against the realtime refcountbt.  We'll
wire up the actual refcount code later.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
10 months ago xfs: prepare refcount functions to deal with rtrefcountbt
Darrick J. Wong [Thu, 15 Aug 2024 18:49:18 +0000 (11:49 -0700)]
xfs: prepare refcount functions to deal with rtrefcountbt

Prepare the high-level refcount functions to deal with the new realtime
refcountbt and its slightly different conventions.  Provide the ability
to talk to either refcountbt or rtrefcountbt formats from the same high
level code.

Note that we leave the _recover_cow_leftovers functions for a separate
patch so that we can convert it all at once.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
10 months ago xfs: add realtime refcount btree operations
Darrick J. Wong [Thu, 15 Aug 2024 18:49:17 +0000 (11:49 -0700)]
xfs: add realtime refcount btree operations

Implement the generic btree operations needed to manipulate rtrefcount
btree blocks. This is different from the regular refcountbt in that we
allocate space from the filesystem at large, and are constrained neither
to the free space nor to any particular AG.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
10 months ago xfs: realtime refcount btree transaction reservations
Darrick J. Wong [Thu, 15 Aug 2024 18:49:17 +0000 (11:49 -0700)]
xfs: realtime refcount btree transaction reservations

Make sure that there's enough log reservation to handle mapping
and unmapping realtime extents.  We have to reserve enough space
to handle a split in the rtrefcountbt to add the record and a second
split in the regular refcountbt to record the rtrefcountbt split.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
10 months ago xfs: define the on-disk realtime refcount btree format
Darrick J. Wong [Thu, 15 Aug 2024 18:49:16 +0000 (11:49 -0700)]
xfs: define the on-disk realtime refcount btree format

Start filling out the rtrefcount btree implementation. Start with the
on-disk btree format; add everything needed to read, write and
manipulate refcount btree blocks. This prepares the way for connecting
the btree operations implementation.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
10 months ago xfs: namespace the maximum length/refcount symbols
Darrick J. Wong [Thu, 15 Aug 2024 18:49:15 +0000 (11:49 -0700)]
xfs: namespace the maximum length/refcount symbols

Actually namespace these variables properly, so that readers can tell
that this is an XFS symbol, and that it's for the refcount
functionality.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
10 months ago xfs: introduce realtime refcount btree definitions
Darrick J. Wong [Thu, 15 Aug 2024 18:49:14 +0000 (11:49 -0700)]
xfs: introduce realtime refcount btree definitions

Add new realtime refcount btree definitions. The realtime refcount btree
will be rooted from a hidden inode, but has its own shape and therefore
needs to have most of its own separate types.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
10 months ago xfs: prepare refcount btree cursor tracepoints for realtime
Darrick J. Wong [Thu, 15 Aug 2024 18:49:13 +0000 (11:49 -0700)]
xfs: prepare refcount btree cursor tracepoints for realtime

Rework the refcount btree cursor tracepoints in preparation to handle the
realtime refcount btree cursor.  Mostly this involves renaming the field to
"refcbno" and extracting the group number from the cursor when possible.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
10 months ago xfs: enable realtime rmap btree
Darrick J. Wong [Thu, 15 Aug 2024 18:49:12 +0000 (11:49 -0700)]
xfs: enable realtime rmap btree

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
10 months ago xfs: hook live realtime rmap operations during a repair operation
Darrick J. Wong [Thu, 15 Aug 2024 18:49:12 +0000 (11:49 -0700)]
xfs: hook live realtime rmap operations during a repair operation

Hook the regular realtime rmap code when an rtrmapbt repair operation is
running so that we can unlock the AGF buffer to scan the filesystem and
keep the in-memory btree up to date during the scan.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
10 months ago xfs: create a shadow rmap btree during realtime rmap repair
Darrick J. Wong [Thu, 15 Aug 2024 18:49:11 +0000 (11:49 -0700)]
xfs: create a shadow rmap btree during realtime rmap repair

Create an in-memory btree of rmap records instead of an array.  This
enables us to do live record collection instead of freezing the fs.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
10 months ago xfs: online repair of the realtime rmap btree
Darrick J. Wong [Thu, 15 Aug 2024 18:49:10 +0000 (11:49 -0700)]
xfs: online repair of the realtime rmap btree

Repair the realtime rmap btree while mounted.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>