Darrick J. Wong [Wed, 3 Jul 2024 21:21:56 +0000 (14:21 -0700)]
xfs_repair: allow sysadmins to add metadata directories
Allow the sysadmin to use xfs_repair to upgrade an existing filesystem
to support metadata directories. This will be needed to upgrade
filesystems to support realtime rmap and reflink.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Darrick J. Wong [Wed, 3 Jul 2024 21:21:56 +0000 (14:21 -0700)]
xfs_repair: do not count metadata directory files when doing quotacheck
Previously, we stated that files in the metadata directory tree are not
counted in the dquot information. Fix the offline quotacheck code in
xfs_repair and xfs_check to reflect this.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Darrick J. Wong [Wed, 3 Jul 2024 21:21:55 +0000 (14:21 -0700)]
xfs_repair: truncate and unmark orphaned metadata inodes
If an inode claims to be a metadata inode but wasn't linked in either
directory tree, remove the attr fork and reset the data fork if the
contents weren't regular extent mappings before moving the inode to the
lost+found.
We don't ifree the inode, because it's possible that the inode was not
actually a metadata inode but simply got corrupted due to bitflips or
something, and we'd rather let the sysadmin examine what's left of the
file instead of photorec'ing it.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Darrick J. Wong [Wed, 3 Jul 2024 21:21:55 +0000 (14:21 -0700)]
xfs_repair: drop all the metadata directory files during pass 4
Drop the entire metadata directory tree during pass 4 so that we can
reinitialize the entire tree in phase 6. The existing metadata files
(rtbitmap, rtsummary, quotas) will be reattached to the newly rebuilt
directory tree.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Darrick J. Wong [Wed, 3 Jul 2024 21:21:55 +0000 (14:21 -0700)]
xfs_repair: metadata dirs are never plausible root dirs
Metadata directories are never candidates to be the root of the
user-accessible directory tree. Update has_plausible_rootdir to ignore
them all, as well as detecting the case where the superblock incorrectly
thinks both trees have the same root.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Darrick J. Wong [Wed, 3 Jul 2024 21:21:55 +0000 (14:21 -0700)]
xfs_repair: adjust keep_fsinos to handle metadata directories
In keep_fsinos, mark the root of the metadata directory tree as inuse.
The realtime bitmap and summary files still come after the root
directories, so this is a fairly simple change to the loop test.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Darrick J. Wong [Wed, 3 Jul 2024 21:21:54 +0000 (14:21 -0700)]
xfs_repair: mark space used by metadata files
Track space used by metadata files as a separate incore extent type.
This ensures that we can warn about cross-linked metadata files, even
though we are going to rebuild the entire metadata directory tree in the
end.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Darrick J. Wong [Wed, 3 Jul 2024 21:21:54 +0000 (14:21 -0700)]
xfs_repair: don't let metadata and regular files mix
Track whether or not inodes thought they were metadata inodes. We
cannot allow metadata inodes to appear in the regular directory tree,
and we cannot allow regular inodes to appear in the metadata directory
tree.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Darrick J. Wong [Wed, 3 Jul 2024 21:21:54 +0000 (14:21 -0700)]
xfs_repair: rebuild the metadata directory
Check the dirents in metadata directories for problems and repair them
if necessary. Also make sure that the sb-rooted inodes (root, metadir
root, rt bitmap, rt summary) are always allocated in that order.
Note that xfs_repair will always rebuild the metadata directory tree
itself, so we only need to report problems, not fix them.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Create a helper function to grab a realtime metadata inode. When
metadir arrives, the bitmap and summary inodes can float, so we'll
turn this function into a "load or allocate" function.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Darrick J. Wong [Wed, 3 Jul 2024 21:21:52 +0000 (14:21 -0700)]
xfs_repair: dont check metadata directory dirent inumbers
Phase 6 always rebuilds the entire metadata directory tree, and repair
quietly ignores all the DIFLAG2_METADATA directory inodes that it finds.
As a result, none of the metadata directories are marked inuse in the
incore data. Therefore, the is_inode_free checks are not valid for
anything we find in a metadata directory.
Therefore, avoid checking is_inode_free when scanning metadata directory
dirents.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Create a new scrubber type that checks that well known metadata
directory paths are connected to the metadata inode that the incore
structures think is in use. IOWs, check that "/quota/user" in the
metadata directory tree actually points to
mp->m_quotainfo->qi_uquotaip->i_ino.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Darrick J. Wong [Wed, 3 Jul 2024 21:21:48 +0000 (14:21 -0700)]
xfs: adjust xfs_bmap_add_attrfork for metadir
Online repair might use the xfs_bmap_add_attrfork to repair a file in
the metadata directory tree if (say) the metadata file lacks the correct
parent pointers. In that case, it is not correct to check that the file
is dqattached -- metadata files must be not have /any/ dquot attached at
all.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Darrick J. Wong [Wed, 29 May 2024 04:11:01 +0000 (21:11 -0700)]
xfs: enable creation of dynamically allocated metadir path structures
Add a few helper functions so that it's possible to allocate
xfs_imeta_path objects dynamically, along with dynamically allocated
path components. Eventually we're going to want to support paths of the
form "/realtime/$rtgroup.rmap", and this is necessary for that.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Darrick J. Wong [Wed, 3 Jul 2024 21:21:48 +0000 (14:21 -0700)]
xfs: allow bulkstat to return metadata directories
Allow the V5 bulkstat ioctl to return information about metadata
directory files so that xfs_scrub can find and scrub them, since they
are otherwise ordinary directories.
(Metadata files of course require per-file scrub code and hence do not
need exposure.)
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Darrick J. Wong [Wed, 3 Jul 2024 21:21:47 +0000 (14:21 -0700)]
xfs: disable the agi rotor for metadata inodes
Ideally, we'd put all the metadata inodes in one place if we could, so
that the metadata all stay reasonably close together instead of
spreading out over the disk. Furthermore, if the log is internal we'd
probably prefer to keep the metadata near the log. Therefore, disable
AGI rotoring for metadata inode allocations.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Darrick J. Wong [Wed, 29 May 2024 04:10:59 +0000 (21:10 -0700)]
xfs: ensure metadata directory paths exist before creating files
Since xfs_imeta_create can create new metadata files arbitrarily deep in
the metadata directory tree, add a helper function that can ensure that
all directories in a path exist.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Darrick J. Wong [Wed, 3 Jul 2024 21:21:46 +0000 (14:21 -0700)]
xfs: enforce metadata inode flag
Add checks for the metadata inode flag so that we don't ever leak
metadata inodes out to userspace, and we don't ever try to read a
regular inode as metadata.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Darrick J. Wong [Wed, 3 Jul 2024 21:21:46 +0000 (14:21 -0700)]
xfs: define the on-disk format for the metadir feature
Define the on-disk layout and feature flags for the metadata inode
directory feature. Add a xfs_sb_version_hasmetadir for benefit of
xfs_repair, which needs to know where the new end of the superblock
lies.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Darrick J. Wong [Wed, 3 Jul 2024 21:21:46 +0000 (14:21 -0700)]
xfs: iget for metadata inodes
Create a xfs_imeta_iget function for metadata inodes to ensure that when
we try to iget a metadata file, the inobt thinks a metadata inode is in
use and that the file type matches what we are expecting.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Darrick J. Wong [Wed, 3 Jul 2024 21:21:45 +0000 (14:21 -0700)]
xfs: pass the icreate args object to xfs_dialloc
Pass the xfs_icreate_args object to xfs_dialloc since we can extract the
relevant mode (really just the file type) and parent inumber from there.
This simplifies the calling convention in preparation for the next
patch.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Darrick J. Wong [Wed, 3 Jul 2024 21:21:43 +0000 (14:21 -0700)]
xfs: introduce new file range commit ioctls
This patch introduces two more new ioctls to manage atomic updates to
file contents -- XFS_IOC_START_COMMIT and XFS_IOC_COMMIT_RANGE. The
commit mechanism here is exactly the same as what XFS_IOC_EXCHANGE_RANGE
does, but with the additional requirement that file2 cannot have changed
since some sampling point. The start-commit ioctl performs the sampling
of file attributes.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Darrick J. Wong [Wed, 3 Jul 2024 21:21:44 +0000 (14:21 -0700)]
mkfs: clean up the rtinit() function
Clean up some of the warts in this function, like the inconsistent use
of @i for @error, missing comments, and make this more visually pleasing
by adding some whitespace between major sections. Some things are left
untouched for the next patch.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Darrick J. Wong [Wed, 3 Jul 2024 21:21:43 +0000 (14:21 -0700)]
xfs: move xfs_refcount_update_defer_add to xfs_refcount_item.c
Move the code that adds the incore xfs_refcount_update_item deferred
work data to a transaction live with the CUI log item code. This means
that the refcount code no longer has to know about the inner workings of
the CUI log items.
As a consequence, we can get rid of the _{get,put}_group helpers.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Darrick J. Wong [Wed, 3 Jul 2024 21:21:42 +0000 (14:21 -0700)]
xfs: don't bother calling xfs_refcount_finish_one_cleanup in xfs_refcount_finish_one
In xfs_refcount_finish_one we know the cursor is non-zero when calling
xfs_refcount_finish_one_cleanup and we pass a 0 error variable. This
means xfs_refcount_finish_one_cleanup is just doing a
xfs_btree_del_cursor.
Open code that and move xfs_refcount_finish_one_cleanup to
fs/xfs/xfs_refcount_item.c.
Inspired-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Darrick J. Wong [Wed, 3 Jul 2024 21:21:42 +0000 (14:21 -0700)]
xfs: add a ci_entry helper
Add a helper to translate from the item list head to the
refcount_intent_item structure and use it so shorten assignments and
avoid the need for extra local variables.
Inspired-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Darrick J. Wong [Wed, 3 Jul 2024 21:21:41 +0000 (14:21 -0700)]
xfs: prepare refcount btree tracepoints for widening
Prepare the rest of refcount btree tracepoints for use with realtime
reflink by making them take the btree cursor object as a parameter.
This will save us a lot of trouble later on.
Remove the xfs_refcount_recover_extent tracepoint since it's already
covered by other refcount tracepoints.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Darrick J. Wong [Wed, 3 Jul 2024 21:21:41 +0000 (14:21 -0700)]
xfs: create specialized classes for refcount tracepoints
The only user of the "ag" tracepoint event classes is the refcount
btree, so rename them to make that obvious and make them take the btree
cursor to simplify the arguments. This will save us a lot of trouble
later on.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Darrick J. Wong [Wed, 3 Jul 2024 21:21:41 +0000 (14:21 -0700)]
xfs: move xfs_rmap_update_defer_add to xfs_rmap_item.c
Move the code that adds the incore xfs_rmap_update_item deferred work
data to a transaction live with the RUI log item code. This means that
the rmap code no longer has to know about the inner workings of the RUI
log items.
As a consequence, we can get rid of the _{get,put}_group helpers.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Christoph Hellwig [Wed, 3 Jul 2024 21:21:41 +0000 (14:21 -0700)]
xfs: simplify usage of the rcur local variable in xfs_rmap_finish_one
Only update rcur when we know the final *pcur value.
Signed-off-by: Christoph Hellwig <hch@lst.de>
[djwong: don't leave the caller with a dangling ref] Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Christoph Hellwig [Wed, 3 Jul 2024 21:21:40 +0000 (14:21 -0700)]
xfs: don't bother calling xfs_rmap_finish_one_cleanup in xfs_rmap_finish_one
In xfs_rmap_finish_one we known the cursor is non-zero when calling
xfs_rmap_finish_one_cleanup and we pass a 0 error variable. This means
xfs_rmap_finish_one_cleanup is just doing a xfs_btree_del_cursor.
Open code that and move xfs_rmap_finish_one_cleanup to
fs/xfs/xfs_rmap_item.c.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org>
[djwong: minor porting changes] Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Christoph Hellwig [Wed, 3 Jul 2024 21:21:40 +0000 (14:21 -0700)]
xfs: add a ri_entry helper
Add a helper to translate from the item list head to the
rmap_intent_item structure and use it so shorten assignments
and avoid the need for extra local variables.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Darrick J. Wong [Wed, 3 Jul 2024 21:21:39 +0000 (14:21 -0700)]
xfs: prepare rmap btree tracepoints for widening
Prepare the rmap btree tracepoints for use with realtime rmap btrees by
making them take the btree cursor object as a parameter. This will save
us a lot of trouble later on.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Darrick J. Wong [Wed, 3 Jul 2024 21:21:39 +0000 (14:21 -0700)]
xfs: give rmap btree cursor error tracepoints their own class
Create a new tracepoint class for btree-related errors, then convert all
the rmap tracepoints to use it. Also fix the one tracepoint that was
abusing the old class by making it a separate tracepoint.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Darrick J. Wong [Wed, 3 Jul 2024 21:21:39 +0000 (14:21 -0700)]
xfs: move xfs_extent_free_defer_add to xfs_extfree_item.c
Move the code that adds the incore xfs_extent_free_item deferred work
data to a transaction live with the EFI log item code. This means that
the allocator code no longer has to know about the inner workings of the
EFI log items.
As a consequence, we can get rid of the _{get,put}_group helpers.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Christoph Hellwig [Wed, 3 Jul 2024 21:21:39 +0000 (14:21 -0700)]
xfs: remove duplicate asserts in xfs_defer_extent_free
The bno/len verification is already done by the calls to
xfs_verify_rtbext / xfs_verify_fsbext, and reporting a corruption error
seem like the better handling than tripping an assert anyway.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Christoph Hellwig [Wed, 3 Jul 2024 21:21:38 +0000 (14:21 -0700)]
xfs: add a xefi_entry helper
Add a helper to translate from the item list head to the
xfs_extent_free_item structure and use it so shorten assignments
and avoid the need for extra local variables.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Christoph Hellwig [Wed, 3 Jul 2024 21:21:38 +0000 (14:21 -0700)]
xfs: pass the fsbno to xfs_perag_intent_get
All callers of xfs_perag_intent_get have a fsbno and need boilerplate
code to turn that into an agno. Just pass the fsbno to
xfs_perag_intent_get and look up the agno there.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Darrick J. Wong [Wed, 3 Jul 2024 21:21:37 +0000 (14:21 -0700)]
xfs_repair: use library functions to reset root/rbm/rsum inodes
Use the iroot reset function to reset root inodes instead of open-coding
the reset routine. While we're at it, fix a longstanding memory leak if
the inode being reset actually had an xattr fork full of mappings.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Darrick J. Wong [Wed, 3 Jul 2024 21:21:37 +0000 (14:21 -0700)]
xfs: don't use the incore struct xfs_sb for offsets into struct xfs_dsb
Currently, the XFS_SB_CRC_OFF macro uses the incore superblock struct
(xfs_sb) to compute the address of sb_crc within the ondisk superblock
struct (xfs_dsb). This is a landmine if we ever change the layout of
the incore superblock (as we're about to do), so redefine the macro
to use xfs_dsb to compute the layout of xfs_dsb.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Darrick J. Wong [Wed, 3 Jul 2024 21:21:37 +0000 (14:21 -0700)]
xfs: move dirent update hooks to xfs_dir2.c
Move the directory entry update hook code to xfs_dir2 so that it is
mostly consolidated with the higher level directory functions. Retain
the exports so that online fsck can still send notifications through the
hooks.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Darrick J. Wong [Wed, 3 Jul 2024 21:21:36 +0000 (14:21 -0700)]
xfs: create libxfs helper to rename two directory entries
Create a new libxfs function to rename two directory entries. The
upcoming metadata directory feature will need this to replace a metadata
inode directory entry.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Darrick J. Wong [Wed, 3 Jul 2024 21:21:36 +0000 (14:21 -0700)]
xfs: create libxfs helper to exchange two directory entries
Create a new libxfs function to exchange two directory entries.
The upcoming metadata directory feature will need this to replace a
metadata inode directory entry.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Darrick J. Wong [Wed, 3 Jul 2024 21:21:36 +0000 (14:21 -0700)]
xfs: create libxfs helper to remove an existing inode/name from a directory
Create a new libxfs function to remove a (name, inode) entry from a
directory. The upcoming metadata directory feature will need this to
create a metadata directory tree.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Darrick J. Wong [Wed, 3 Jul 2024 21:21:36 +0000 (14:21 -0700)]
xfs: create libxfs helper to link an existing inode into a directory
Create a new libxfs function to link an existing inode into a directory.
The upcoming metadata directory feature will need this to create a
metadata directory tree.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Darrick J. Wong [Wed, 3 Jul 2024 21:21:35 +0000 (14:21 -0700)]
xfs: create libxfs helper to link a new inode into a directory
Create a new libxfs function to link a newly created inode into a
directory. The upcoming metadata directory feature will need this to
create a metadata directory tree.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Darrick J. Wong [Wed, 3 Jul 2024 21:21:35 +0000 (14:21 -0700)]
xfs: separate the icreate logic around INIT_XATTRS
INIT_XATTRS is overloaded here -- it's set during the creat process when
we think that we're immediately going to set some ACL xattrs to save
time. However, it's also used by the parent pointers code to enable the
attr fork in preparation to receive ppptr xattrs. This results in
xfs_has_parent() branches scattered around the codebase to turn on
INIT_XATTRS.
Linkable files are created far more commonly than unlinkable temporary
files or directory tree roots, so we should centralize this logic in
xfs_inode_init. For the three callers that don't want parent pointers
(online repiar tempfiles, unlinkable tempfiles, rootdir creation) we
provide an UNLINKABLE flag to skip attr fork initialization.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>