Darrick J. Wong [Thu, 26 Sep 2019 17:41:35 +0000 (13:41 -0400)]
libfrog: share scrub headers
xfs_io and xfs_scrub have nearly identical structures to describe scrub
types. Combine them into a single header file.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Darrick J. Wong [Thu, 26 Sep 2019 17:41:35 +0000 (13:41 -0400)]
xfs_scrub: remove unnecessary fd parameter from file scrubbers
xfs_scrub's scrub ioctl wrapper functions always take a scrub_ctx and an
fd, but we always set the fd to ctx->mnt.fd. Remove the redundant
parameter.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Darrick J. Wong [Thu, 26 Sep 2019 17:41:34 +0000 (13:41 -0400)]
xfs_spaceman: report health problems
Use the fs and ag geometry ioctls to report health problems to users.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
xfs_io: copy_range don't truncate dst_file, and add smart length
1. copy_range should be a simple wrapper for copy_file_range(2)
and nothing else. and there's already -t option for truncate.
so here we remove the truncate action in copy_range.
see: https://patchwork.kernel.org/comment/22863587/#1
2. improve the default length value generation:
if -l option is omitted use the length that from src_offset to end
(src_file's size - src_offset) instead.
if src_offset is greater than file size, length is 0.
3. update manpage
4. rename var name from 'src, dst' to 'src_off, dst_off'
and have confirmed that this change will not affect xfstests.
Signed-off-by: Jianhong Yin <yin-jianhong@163.com>
[sandeen: fix long line] Reviewed-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Darrick J. Wong [Mon, 9 Sep 2019 19:37:08 +0000 (15:37 -0400)]
xfs_repair: add AG btree rmaps into the filesystem after syncing sb
In rmap_store_ag_btree_rec(), we try to reserve 16 blocks to handle
adding all the AG btree rmaps to the rmap record. Unfortunately, at
that point in phase5 we haven't yet reinitialied sb_fdblocks, so
reserving blocks can fail if repair reconstructed the primary sb from a
secondary sb. Even if the function succeeds, this still leads to
incorrect fdblocks because phase 5 resets sb_fdblocks after running the
rmap transactions.
To avoid all this, move the rmap_store_ag_btree_rec call to after the sb
has been reset. xfs/350 was helpful in finding cases where xfs_repair
errored out while repairing the filesystem.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Darrick J. Wong [Mon, 9 Sep 2019 19:37:08 +0000 (15:37 -0400)]
xfs_repair: reduce the amount of "clearing reflink flag" messages
Clearing the reflink flag on files that don't share blocks is an
optimization, not a repair, so it's not critical to log a message every
single time we clear a flag. Only log one message that we're clearing
these flags unless verbose mode is enabled.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Darrick J. Wong [Mon, 9 Sep 2019 19:37:08 +0000 (15:37 -0400)]
xfs_repair: use precomputed inode geometry values
Use the precomputed inode geometry values instead of open-coding them.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Darrick J. Wong [Mon, 9 Sep 2019 19:37:08 +0000 (15:37 -0400)]
xfs_db: use precomputed inode geometry values
Use the precomputed inode geometry values instead of open-coding them.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Darrick J. Wong [Mon, 9 Sep 2019 19:37:08 +0000 (15:37 -0400)]
xfs_db: add a function to compute btree geometry
Add a new command to xfs_db that uses a btree type and a record count
to report the best and worst case height and level size. This can be
used to estimate how much overhead a metadata index will add to a
filesystem.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Darrick J. Wong [Mon, 9 Sep 2019 19:37:08 +0000 (15:37 -0400)]
xfs_db: remove db/convert.h
db/convert.h conflicts with include/convert.h and since the former only
has one declaration in it anyway, just get rid of it. We'll need this
in the next patch to avoid an ugly include mess.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Darrick J. Wong [Mon, 9 Sep 2019 19:37:08 +0000 (15:37 -0400)]
man: document the new health reporting fields in various ioctls
Update the manpages to conver the new health reporting fields in the
fs geometry, ag geometry, and bulkstat ioctls.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
[sandeen: fix sick/checked description to reference inodes] Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Darrick J. Wong [Mon, 9 Sep 2019 19:37:08 +0000 (15:37 -0400)]
man: document the new allocation group geometry ioctl
Document the new ioctl to describe an allocation group's geometry.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Darrick J. Wong [Mon, 9 Sep 2019 19:37:08 +0000 (15:37 -0400)]
man: document new fs summary counter scrub command
Update the scrub ioctl documentation to include the new fs summary
counter scrubber.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Darrick J. Wong [Mon, 9 Sep 2019 19:37:08 +0000 (15:37 -0400)]
man: document the new v5 fs geometry ioctl structures
Amend the fs geometry ioctl documentation to cover the new v5 structure.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Darrick J. Wong [Mon, 9 Sep 2019 19:37:08 +0000 (15:37 -0400)]
xfs_spaceman: convert open-coded unit conversions to helpers
Create xfrog analogues of the libxfs byte/sector/block conversion
functions and convert spaceman to use them instead of open-coded
arithmatic we do now.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Darrick J. Wong [Mon, 9 Sep 2019 19:37:07 +0000 (15:37 -0400)]
xfs_spaceman: embed struct xfs_fd in struct fileio
Replace the open-coded fd and geometry fields of struct fileio with a
single xfs_fd, which will enable us to use it natively throughout
xfs_spaceman in upcoming patches.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Darrick J. Wong [Mon, 9 Sep 2019 19:37:07 +0000 (15:37 -0400)]
xfs_spaceman: remove unnecessary test in openfile()
xfs_spaceman always records fs_path information for an open file because
spaceman requires running on XFS and it always passes a non-null fs_path
to openfile. Therefore, openfile doesn't need the fs_path null check.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Darrick J. Wong [Mon, 9 Sep 2019 19:37:07 +0000 (15:37 -0400)]
xfs_spaceman: remove typedef usage
Kill off the struct typedefs inside spaceman.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Darrick J. Wong [Mon, 9 Sep 2019 19:37:07 +0000 (15:37 -0400)]
libfrog: move libfrog.h to libfrog/util.h
Move this header to libfrog since the code is there already.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Darrick J. Wong [Mon, 9 Sep 2019 19:37:07 +0000 (15:37 -0400)]
libfrog: move workqueue.h to libfrog/
Move this header to libfrog since the code is there already.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Darrick J. Wong [Mon, 9 Sep 2019 19:37:07 +0000 (15:37 -0400)]
libfrog: move path.h to libfrog/
Move this header to libfrog since the code is there already.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Darrick J. Wong [Mon, 9 Sep 2019 19:37:07 +0000 (15:37 -0400)]
libfrog: move crc32c.h to libfrog/
Move this header to libfrog since the code is there already.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Darrick J. Wong [Mon, 9 Sep 2019 19:37:06 +0000 (15:37 -0400)]
libfrog: move workqueue.h to libfrog/
Move this header to libfrog since the code is there already.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Darrick J. Wong [Mon, 9 Sep 2019 19:37:06 +0000 (15:37 -0400)]
libfrog: move radix-tree.h to libfrog/
Move this header to libfrog since the code is there already.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Darrick J. Wong [Mon, 9 Sep 2019 19:37:06 +0000 (15:37 -0400)]
libfrog: move ptvar.h to libfrog/
Move this header to libfrog since the code is there already.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Darrick J. Wong [Mon, 9 Sep 2019 19:37:06 +0000 (15:37 -0400)]
libfrog: move fsgeom.h to libfrog/
Move this header to libfrog since the code is there already.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Darrick J. Wong [Mon, 9 Sep 2019 19:37:06 +0000 (15:37 -0400)]
libfrog: move convert.h to libfrog/
Move this header to libfrog since the code is there already.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Darrick J. Wong [Mon, 9 Sep 2019 19:37:06 +0000 (15:37 -0400)]
libfrog: move bitmap.h to libfrog/
Move this header to libfrog since the code is there already.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Darrick J. Wong [Mon, 9 Sep 2019 19:37:06 +0000 (15:37 -0400)]
libfrog: move avl64.h to libfrog/
Move this header to libfrog since the code is there already.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Darrick J. Wong [Mon, 9 Sep 2019 19:37:05 +0000 (15:37 -0400)]
libxfs: move topology declarations into separate header
The topology functions live in libfrog now, which means their
declarations don't belong in libxcmd.h. Create new header file for
them.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Darrick J. Wong [Mon, 9 Sep 2019 19:37:05 +0000 (15:37 -0400)]
libfrog: refactor open-coded INUMBERS calls
Refactor all the INUMBERS ioctl callsites into helper functions.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Darrick J. Wong [Mon, 9 Sep 2019 19:37:05 +0000 (15:37 -0400)]
libfrog: create xfd_open function
Create a helper to open a file and initialize the xfd structure.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Darrick J. Wong [Mon, 9 Sep 2019 19:37:05 +0000 (15:37 -0400)]
libfrog: refactor open-coded bulkstat calls
Refactor the BULKSTAT_SINGLE and BULKSTAT ioctl callsites into helper
functions.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Allison Collins <allison.henderson@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Darrick J. Wong [Mon, 9 Sep 2019 19:37:05 +0000 (15:37 -0400)]
libfrog: create online fs geometry converters
Create helper functions to perform unit conversions against a runtime
filesystem, then remove the open-coded versions in scrub.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Allison Collins <allison.henderson@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Darrick J. Wong [Mon, 9 Sep 2019 19:37:05 +0000 (15:37 -0400)]
libfrog: store more inode and block geometry in struct xfs_fd
Move the extra AG geometry fields out of scrub and into the libfrog code
so that we can consolidate the geoemtry code in one place.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Allison Collins <allison.henderson@oracle.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Darrick J. Wong [Mon, 9 Sep 2019 19:37:04 +0000 (15:37 -0400)]
libfrog: introduce xfs_fd to wrap an fd to a file on an xfs filesystem
Introduce a new "xfs_fd" context structure where we can store a file
descriptor and all the runtime fs context (geometry, which ioctls work,
etc.) that goes with it. We're going to create wrappers for the
bulkstat and inumbers ioctls in subsequent patches; and when we
introduce the v5 bulkstat/inumbers ioctls we'll need all that context to
downgrade gracefully on old kernels. Start the transition by adopting
xfs_fd natively in scrub.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Allison Collins <allison.henderson@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Darrick J. Wong [Mon, 9 Sep 2019 19:36:57 +0000 (15:36 -0400)]
libfrog: refactor online geometry queries
Refactor all the open-coded XFS_IOC_FSGEOMETRY queries into a single
helper that we can use to standardize behaviors across mixed xfslibs
versions. This is the prelude to introducing a new FSGEOMETRY version
in 5.2 and needing to fix the (relatively few) client programs.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Allison Collins <allison.henderson@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
[sandeen: remove 5.2.1 compat hack now that we have the wrapper] Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Darrick J. Wong [Fri, 6 Sep 2019 21:59:13 +0000 (17:59 -0400)]
xfsprogs: update spdx tags in LICENSES/
Update the GPL related SPDX tags in LICENSES to reflect the SPDX 3.0
tagging formats.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
The "obvious" cause is that the attr ifork is null despite the inode
claiming an attr fork having at least one extent, but it's not so
obvious why we ended up with an inode in that state.
Reported-by: Zorro Lang <zlang@redhat.com>
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=204031 Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Bill O'Donnell <billodo@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Continue our game of replacing ASSERTs for corrupt ondisk metadata with
EFSCORRUPTED returns.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Bill O'Donnell <billodo@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
When we're iterating all the attributes using the built-in xattr
iterator, we can use the seen_enough variable to pass error codes back
to the main scrub function instead of flattening them into 0/1. This
will be used in a more exciting fashion in upcoming patches.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Create a new bulk ireq flag that enables userspace to ask us for a
special inode number instead of interpreting @ino as a literal inode
number. This enables us to query the root inode easily.
The reason for adding the ability to query specifically the root
directory inode is that certain programs (xfsdump and xfsrestore) want
to confirm when they've been pointed to the root directory. The
userspace code assumes the root directory is always the first result
from calling bulkstat with lastino == 0, but this isn't true if the
(initial btree roots + initial AGFL + inode alignment padding) is itself
long enough to be allocated to new inodes if all of those blocks should
happen to be free at the same time. Rather than make userspace guess
at internal filesystem state, we provide a direct query.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Allison Collins <allison.henderson@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Add a new xfs_bulk_ireq flag to constrain the iteration to a single AG.
If the passed-in startino value is zero then we start with the first
inode in the AG that the user passes in; otherwise, we iterate only
within the same AG as the passed-in inode.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Allison Collins <allison.henderson@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Introduce a new "v5" inode group structure that fixes the alignment
and padding problems of the existing structure.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Introduce a new version of the in-core bulkstat structure that supports
our new v5 format features. This structure also fills the gaps in the
previous structure. We leave wiring up the ioctls for the next patch.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Allison Collins <allison.henderson@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Remove xfs_bstat_t, xfs_fsop_bulkreq_t, xfs_inogrp_t, and similarly
named compat typedefs.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Allison Collins <allison.henderson@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Create a new iterator function to simplify walking inodes in an XFS
filesystem. This new iterator will replace the existing open-coded
walking that goes on in various places.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Currently, xfs doesn't have generic error codes defined for "stop
iterating"; we just reuse the XFS_BTREE_QUERY_* return values. This
looks a little weird if we're not actually iterating a btree index.
Before we start adding more iterators, we should create general
XFS_ITER_{CONTINUE,ABORT} return values and define the XFS_BTREE_QUERY_*
ones from that.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Instead of a magic flag for xfs_trans_alloc, just ensure all callers
that can't relclaim through the file system use memalloc_nofs_save to
set the per-task nofs flag.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
There are many, many xfs header files which are included but
unneeded (or included twice) in the xfs code, so remove them.
nb: xfs_linux.h includes about 9 headers for everyone, so those
explicit includes get removed by this. I'm not sure what the
preference is, but if we wanted explicit includes everywhere,
a followup patch could remove those xfs_*.h includes from
xfs_linux.h and move them into the files that need them.
Or it could be left as-is.
Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
When we're writing out a fresh new AG, make sure that we don't list an
internal log as free and that we create the rmap for the region. growfs
never does this, but we will need it when we hook up mkfs.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Allison Collins <allison.henderson@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Refactor the code that populates the free space btrees of a new AG so
that we can avoid code duplication once things start getting
complicated.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Allison Collins <allison.henderson@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
xfs_alloc_ag_vextent_small() doesn't update the output parameters in
the event of an AGFL allocation. Instead, it updates the
xfs_alloc_arg structure directly to complete the allocation.
Update both args and the output params to provide consistent
behavior for future callers.
Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
The small allocation helper is implemented in a way that is fairly
tightly integrated to the existing allocation algorithms. It expects
a cntbt cursor beyond the end of the tree, attempts to locate the
last record in the tree and only attempts an AGFL allocation if the
cntbt is empty.
The upcoming generic algorithm doesn't rely on the cntbt processing
of this function. It will only call this function when the cntbt
doesn't have a big enough extent or is empty and thus AGFL
allocation is the only remaining option. Tweak
xfs_alloc_ag_vextent_small() to handle a NULL cntbt cursor and skip
the cntbt logic. This facilitates use by the existing allocation
code and new code that only requires an AGFL allocation attempt.
Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Move the small allocation helper further up in the file to avoid the
need for a function declaration. The remaining declarations will be
removed by followup patches. No functional changes.
Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
xfs_alloc_ag_vextent_small() is kind of a mess. Clean it up in
preparation for future changes. No functional changes.
Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
We need to derive the mount pointer from a buffer in a lot of place.
Add a direct pointer to short cut the pointer chasing.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
The inode geometry structure isn't related to ondisk format; it's
support for the mount structure. Move it to xfs_shared.h.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
There are several functions which take a flag argument that is
only ever passed as "0," so remove these arguments.
Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Bill O'Donnell <billodo@redhat.com> Reviewed-by: Allison Collins <allison.henderson@oracle.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
The flags value is always passed as 0 so remove the argument.
Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Finish converting all the old inode_cluster_size >> inopblog users to
inodes_per_cluster.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
inode_cluster_size is supposed to represent the size (in bytes) of an
inode cluster buffer. We avoid having to handle multiple clusters per
filesystem block on filesystems with large blocks by openly rounding
this value up to 1 FSB when necessary. However, we never reset
inode_cluster_size to reflect this new rounded value, which adds to the
potential for mistakes in calculating geometries.
Fix this by setting inode_cluster_size to reflect the rounded-up size if
needed, and special-case the few places in the sparse inodes code where
we actually need the smaller value to validate on-disk metadata.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Migrate all of the inode geometry setup code from xfs_mount.c into a
single libxfs function that we can share with xfsprogs.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Separate the inode geometry information into a distinct structure.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Eric Sandeen [Wed, 21 Aug 2019 15:41:03 +0000 (11:41 -0400)]
xfsprogs: fix geometry calls on older kernels for 5.2.1
I didn't think 5.2.0 through; the udpate of the geometry ioctl means
that the tools won't work on older kernels that don't support the
v5 ioctls, since I failed to merge Darrick's wrappers.
As a very quick one-off I'd like to merge this to just revert every
geometry call back to the original ioctl, so it keeps working on
older kernels and I'll release 5.2.1. This hack can go away when
Darrick's wrappers get merged.
Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
It turns out that the log can consume nearly all the space in an AG, and
when this happens this it's possible that there will be less free space
in the AG than the reservation would try to hide. On a debug kernel
this can trigger an ASSERT in xfs/250:
The log is permanently allocated, so we know we're never going to have
to expand the btrees to hold any records associated with the log space.
We therefore can treat the space as if it doesn't exist.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
There are several functions which have no opportunity to return
an error, and don't contain any ASSERTs which could be argued
to be better constructed as error cases. So, make them voids
to simplify the callers.
Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Teach online scrub how to check the filesystem summary counters. We use
the incore delalloc block counter along with the incore AG headers to
compute expected values for fdblocks, icount, and ifree, and then check
that the percpu counter is within a certain threshold of the expected
value. This is done to avoid having to freeze or otherwise lock the
filesystem, which means that we're only checking that the counters are
fairly close, not that they're exactly correct.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
During testing of xfs/141 on a V4 filesystem, I observed some
inconsistent behavior with regards to resources that are held (i.e.
remain locked) across a defer roll. The transaction roll always gives
the defer roll function a new transaction, even if committing the old
transaction fails. However, the defer roll function only rejoins the
held resources if the transaction commit succeedied. This means that
callers of defer roll have to figure out whether the held resources are
attached to the transaction being passed back.
Worse yet, if the defer roll was part of a defer finish call, we have a
third possibility: the defer finish could pass back a dirty transaction
with dirty held resources and an error code.
The only sane way to handle all of these scenarios is to require that
the code that held the resource either cancel the transaction before
unlocking and releasing the resources, or use functions that detach
resources from a transaction properly (e.g. xfs_trans_brelse) if they
need to drop the reference before committing or cancelling the
transaction.
In order to make this so, change the defer roll code to join held
resources to the new transaction unconditionally and fix all the bhold
callers to release the held buffers correctly.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Add a percpu counter to track the number of blocks directly reserved for
delayed allocations on the data device. This counter (in contrast to
i_delayed_blks) does not track allocated CoW staging extents or anything
going on with the realtime device. It will be used in the upcoming
summary counter scrub function to check the free block counts without
having to freeze the filesystem or walk all the inodes to find the
delayed allocations.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Block allocation requires a permanent transaction for deferred AGFL
frees. Add an assert in the block allocation path to make explicit and
obvious to future callers the requirement of a transaction with a
permanent reservation.
Reported-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
[darrick: split this out from the previous patch per hch request] Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
The growdata transaction is used by growfs operations to increase
the data size of the filesystem. Part of this sequence involves
extending the size of the last preexisting AG in the fs, if
necessary. This is implemented by freeing the newly available
physical range to the AG.
tr_growdata is not a permanent transaction, however, and block
allocation transactions must be permanent to handle deferred frees
of AGFL blocks. If the grow operation extends an existing AG that
requires AGFL fixing, assert failures occur due to a populated dfops
list on a non-permanent transaction and the AGFL free does not
occur. This is reproduced (rarely) by xfs/104.
Change tr_growdata to a permanent transaction with a default log
count. This increases initial transaction reservation size, but
growfs is an infrequent and non-performance critical operation and
so should have minimal impact.
Reported-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
[darrick: add a comment to the assert] Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Use space in the bulkstat ioctl structure to report any problems
observed with the inode.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Use the AG geometry info ioctl to report health status too.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Use our newly expanded geometry structure to report the overall fs and
realtime health status.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Add a new ioctl to describe an allocation group's geometry.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Unfortunately, the V4 XFS_IOC_FSGEOMETRY structure is out of space so we
can't just add a new field to it. Hence we need to bump the definition
to V5 and and treat the V4 ioctl and structure similar to v1 to v3.
While doing this, clean up all the definitions associated with the
XFS_IOC_FSGEOMETRY ioctl.
Signed-Off-By: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
[darrick: forward port to 5.1, expand structure size to 256 bytes] Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
If we know the filesystem metadata isn't healthy during unmount, we want
to encourage the administrator to run xfs_repair right away. We can't
do this if BAD_SUMMARY will cause an unclean log unmount to force
summary recalculation, so turn it off if the fs is bad.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Replace the BAD_SUMMARY mount flag with calls to the equivalent health
tracking code.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Add the necessary in-core metadata fields to keep track of which parts
of the filesystem have been observed and which parts were observed to be
unhealthy, and print a warning at unmount time if we have unfixed
problems.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
The block allocation AG selection code has parameters that allow a
caller to perform multiple allocations from a single AG and
transaction (under certain conditions). The parameters specify the
total block allocation count required by the transaction and the AG
selection code selects and locks an AG that will be able to satisfy
the overall requirement. If the available block accounting
calculation turns out to be inaccurate and a subsequent allocation
call fails with -ENOSPC, the resulting transaction cancel leads to
filesystem shutdown because the transaction is dirty.
This exact problem can be reproduced with a highly parallel space
consumer and fsstress workload running long enough to a large
filesystem against -ENOSPC conditions. A bmbt block allocation
request made for inode extent to bmap format conversion after an
extent allocation is expected to be satisfied by the same AG and the
same transaction as the extent allocation. The bmbt block allocation
fails, however, because the block availability of the AG has changed
since the AG was selected (outside of the blocks used for the extent
itself).
The inconsistent block availability calculation is caused by the
deferred block freeing behavior of the AGFL. This immediately
removes extra blocks from the AGFL to free up AGFL slots, but rather
than immediately freeing such blocks as was done in the past, the
block free is deferred such that said blocks are not available for
allocation until the current transaction commits. The AG selection
logic currently considers all AGFL blocks as available and executes
shortly before any extra AGFL blocks are freed. This means the block
availability of the current AG can change before the first
allocation even occurs, but in practice a failure is more likely to
manifest via a subsequent allocation because extent allocation
usually has a contiguity requirement larger than a single block that
can't be satisfied from the AGFL.
In general, XFS prefers operational robustness to absolute
allocation efficiency. In other words, we prefer to return -ENOSPC
slightly earlier at the expense of not being able to allocate every
last block in an AG to avoid this kind of problem. As such, update
the AG block availability calculation to consider extra AGFL blocks
as unavailable since they are immediately removed following the
calculation and will not become available until the current
transaction commits.
Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
mkfs.xfs.8: Fix an inconsistency between the code and the man page.
The man page currently states that block and sector size units cannot
be used for other option values unless they are explicitly specified,
when in fact the default sizes will be used in that case.
Change the man page to clarify this.
Signed-off-by: Alvin Zheng <Alvin@linux.alibaba.com>
[sandeen: sector/block values do not need to be specified first] Reviewed-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Darrick J. Wong [Wed, 10 Jul 2019 21:16:36 +0000 (17:16 -0400)]
man: create a separate xfs shutdown ioctl manpage
Create a separate manual page for the xfs shutdown ioctl so we can
document how it works.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
[sandeen: minor edits] Reviewed-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Darrick J. Wong [Wed, 10 Jul 2019 21:16:36 +0000 (17:16 -0400)]
man: create a separate GETBMAPX/GETBMAPA/GETBMAP ioctl manpage
Create a separate manual page for the xfs BMAP ioctls so we can document
how they work.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
[sandeen: minor edits] Reviewed-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Darrick J. Wong [Wed, 10 Jul 2019 21:16:36 +0000 (17:16 -0400)]
man: create a separate RESBLKS ioctl manpage
Create a separate manual page for the xfs RESBLKS ioctls so we can
document how it works.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
[sandeen: minor edits] Reviewed-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Darrick J. Wong [Wed, 10 Jul 2019 21:16:36 +0000 (17:16 -0400)]
man: create a separate FSCOUNTS ioctl manpage
Create a separate manual page for the xfs FSCOUNTS ioctl so we can
document how it works.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Darrick J. Wong [Wed, 10 Jul 2019 21:16:36 +0000 (17:16 -0400)]
man: create a separate INUMBERS ioctl manpage
Create a separate manual page for the xfs INUMBERS ioctl so we can
document how it works.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Darrick J. Wong [Wed, 10 Jul 2019 21:16:36 +0000 (17:16 -0400)]
man: link to the SCRUB_METADATA ioctl manpage from xfsctl.3
Link to the scrub ioctl documentation from xfsctl.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>