Darrick J. Wong [Wed, 29 May 2024 04:13:05 +0000 (21:13 -0700)]
xfs: move xfs_refcount_update_defer_add to xfs_refcount_item.c
Move the code that adds the incore xfs_refcount_update_item deferred
work data to a transaction to live with the CUI log item code. This means
that the refcount code no longer has to know about the inner workings of
the CUI log items.
As a consequence, we can get rid of the _{get,put}_group helpers.
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
Darrick J. Wong [Wed, 29 May 2024 04:13:05 +0000 (21:13 -0700)]
xfs: simplify usage of the rcur local variable in xfs_refcount_finish_one
Only update rcur when we know the final *pcur value.
Inspired-by: Christoph Hellwig <hch@lst.de>
[djwong: don't leave the caller with a dangling ref] Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
Darrick J. Wong [Wed, 29 May 2024 04:13:04 +0000 (21:13 -0700)]
xfs: don't bother calling xfs_refcount_finish_one_cleanup in xfs_refcount_finish_one
In xfs_refcount_finish_one we know the cursor is non-zero when calling
xfs_refcount_finish_one_cleanup and we pass a 0 error variable. This
means xfs_refcount_finish_one_cleanup is just doing a
xfs_btree_del_cursor.
Open code that and move xfs_refcount_finish_one_cleanup to
fs/xfs/xfs_refcount_item.c.
Inspired-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
Darrick J. Wong [Wed, 29 May 2024 04:13:03 +0000 (21:13 -0700)]
xfs: add a ci_entry helper
Add a helper to translate from the item list head to the
refcount_intent_item structure and use it to shorten assignments and
avoid the need for extra local variables.
Inspired-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
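The list-head-to-item translation described above can be sketched in plain userspace C. This is only an illustration of the pattern, not the kernel code: struct demo_intent, demo_container_of and demo_entry are hypothetical names, and the list_head definition is a minimal stand-in for the kernel's.

```c
#include <stddef.h>

/* Minimal stand-in for the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

/* Hypothetical intent item; the field names are illustrative only. */
struct demo_intent {
	unsigned long long	startblock;
	unsigned int		blockcount;
	struct list_head	list;	/* embedded list linkage */
};

/* Userspace equivalent of container_of(): step back from member to parent. */
#define demo_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Translate an embedded list head back to its containing intent item. */
static inline struct demo_intent *demo_entry(const struct list_head *e)
{
	return demo_container_of(e, struct demo_intent, list);
}
```

With such a helper a caller can write demo_entry(item)->startblock directly instead of declaring a local variable just to hold the converted pointer.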
Darrick J. Wong [Wed, 29 May 2024 04:13:01 +0000 (21:13 -0700)]
xfs: pass btree cursors to refcount btree tracepoints
Prepare the rest of refcount btree tracepoints for use with realtime
reflink by making them take the btree cursor object as a parameter.
This will save us a lot of trouble later on.
Remove the xfs_refcount_recover_extent tracepoint since it's already
covered by other refcount tracepoints.
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
Darrick J. Wong [Wed, 29 May 2024 04:13:01 +0000 (21:13 -0700)]
xfs: create specialized classes for refcount tracepoints
The only user of the "ag" tracepoint event classes is the refcount
btree, so rename them to make that obvious and make them take the btree
cursor to simplify the arguments. This will save us a lot of trouble
later on.
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
Darrick J. Wong [Wed, 29 May 2024 04:11:45 +0000 (21:11 -0700)]
xfs: move xfs_rmap_update_defer_add to xfs_rmap_item.c
Move the code that adds the incore xfs_rmap_update_item deferred work
data to a transaction to live with the RUI log item code. This means
that the rmap code no longer has to know about the inner workings of the
RUI log items.
As a consequence, we can get rid of the _{get,put}_group helpers.
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
Christoph Hellwig [Wed, 29 May 2024 04:11:45 +0000 (21:11 -0700)]
xfs: simplify usage of the rcur local variable in xfs_rmap_finish_one
Only update rcur when we know the final *pcur value.
Signed-off-by: Christoph Hellwig <hch@lst.de>
[djwong: don't leave the caller with a dangling ref] Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Christoph Hellwig [Wed, 29 May 2024 04:11:44 +0000 (21:11 -0700)]
xfs: don't bother calling xfs_rmap_finish_one_cleanup in xfs_rmap_finish_one
In xfs_rmap_finish_one we know the cursor is non-zero when calling
xfs_rmap_finish_one_cleanup and we pass a 0 error variable. This means
xfs_rmap_finish_one_cleanup is just doing a xfs_btree_del_cursor.
Open code that and move xfs_rmap_finish_one_cleanup to
fs/xfs/xfs_rmap_item.c.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org>
[djwong: minor porting changes] Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Christoph Hellwig [Wed, 29 May 2024 04:11:43 +0000 (21:11 -0700)]
xfs: add a ri_entry helper
Add a helper to translate from the item list head to the
rmap_intent_item structure and use it to shorten assignments
and avoid the need for extra local variables.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Darrick J. Wong [Wed, 29 May 2024 04:11:41 +0000 (21:11 -0700)]
xfs: pass btree cursors to rmap btree tracepoints
Prepare the rmap btree tracepoints for use with realtime rmap btrees by
making them take the btree cursor object as a parameter. This will save
us a lot of trouble later on.
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
Darrick J. Wong [Wed, 29 May 2024 04:11:41 +0000 (21:11 -0700)]
xfs: give rmap btree cursor error tracepoints their own class
Create a new tracepoint class for btree-related errors, then convert all
the rmap tracepoints to use it. Also fix the one tracepoint that was
abusing the old class by making it a separate tracepoint.
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
Darrick J. Wong [Wed, 29 May 2024 04:11:39 +0000 (21:11 -0700)]
xfs: move xfs_extent_free_defer_add to xfs_extfree_item.c
Move the code that adds the incore xfs_extent_free_item deferred work
data to a transaction to live with the EFI log item code. This means
that the allocator code no longer has to know about the inner workings
of the EFI log items.
As a consequence, we can get rid of the _{get,put}_group helpers.
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
Christoph Hellwig [Wed, 29 May 2024 04:11:37 +0000 (21:11 -0700)]
xfs: remove duplicate asserts in xfs_defer_extent_free
The bno/len verification is already done by the calls to
xfs_verify_rtbext / xfs_verify_fsbext, and reporting a corruption error
seems like the better handling than tripping an assert anyway.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Christoph Hellwig [Wed, 29 May 2024 04:11:36 +0000 (21:11 -0700)]
xfs: add a xefi_entry helper
Add a helper to translate from the item list head to the
xfs_extent_free_item structure and use it to shorten assignments
and avoid the need for extra local variables.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Christoph Hellwig [Wed, 29 May 2024 04:11:35 +0000 (21:11 -0700)]
xfs: pass the fsbno to xfs_perag_intent_get
All callers of xfs_perag_intent_get have a fsbno and need boilerplate
code to turn that into an agno. Just pass the fsbno to
xfs_perag_intent_get and look up the agno there.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
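A rough sketch of the interface change, assuming the usual XFS encoding in which the AG number occupies the high bits of a filesystem block number. All demo_* names are hypothetical; a real lookup would also find and pin the per-AG structure.

```c
#include <stdint.h>

/* Hypothetical mount geometry: log2 of the number of blocks per AG. */
struct demo_mount {
	unsigned int	agblklog;
};

/* Assumed encoding: the AG number lives above the low agblklog bits. */
static inline uint32_t demo_fsb_to_agno(const struct demo_mount *mp,
					uint64_t fsbno)
{
	return (uint32_t)(fsbno >> mp->agblklog);
}

/*
 * The callee accepts the fsbno and derives the agno itself, so callers
 * no longer repeat the conversion boilerplate.
 */
static inline uint32_t demo_intent_get(const struct demo_mount *mp,
				       uint64_t fsbno)
{
	uint32_t agno = demo_fsb_to_agno(mp, fsbno);

	/* A real implementation would look up and reference the AG here. */
	return agno;
}
```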
Darrick J. Wong [Wed, 29 May 2024 04:10:51 +0000 (21:10 -0700)]
xfs: don't use the incore struct xfs_sb for offsets into struct xfs_dsb
Currently, the XFS_SB_CRC_OFF macro uses the incore superblock struct
(xfs_sb) to compute the address of sb_crc within the ondisk superblock
struct (xfs_dsb). This is a landmine if we ever change the layout of
the incore superblock (as we're about to do), so redefine the macro
to use xfs_dsb to compute the layout of xfs_dsb.
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
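To illustrate the hazard, here is a small sketch with two hypothetical structures standing in for the ondisk and incore superblocks (the real structures are much larger and the ondisk one uses big-endian fields). Once the incore layout gains a field, an offset computed from it no longer matches the ondisk layout.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical ondisk layout: this is what is actually checksummed. */
struct demo_dsb {
	uint32_t	magic;
	uint32_t	blocksize;
	uint32_t	crc;
};

/* Hypothetical incore layout that has grown an incore-only field. */
struct demo_sb {
	uint32_t	magic;
	uint32_t	blocksize;
	uint64_t	incore_only;
	uint32_t	crc;
};

/* Fragile: derives an ondisk offset from the incore structure. */
#define DEMO_SB_CRC_OFF_FRAGILE	offsetof(struct demo_sb, crc)

/* Robust: derives the ondisk offset from the ondisk structure. */
#define DEMO_SB_CRC_OFF		offsetof(struct demo_dsb, crc)
```

In this sketch the two macros evaluate to different offsets, which is the kind of silent mismatch the commit is guarding against.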
Darrick J. Wong [Wed, 29 May 2024 04:10:50 +0000 (21:10 -0700)]
xfs: move dirent update hooks to xfs_dir2.c
Move the directory entry update hook code to xfs_dir2 so that it is
mostly consolidated with the higher level directory functions. Retain
the exports so that online fsck can still send notifications through the
hooks.
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
Darrick J. Wong [Wed, 29 May 2024 04:10:50 +0000 (21:10 -0700)]
xfs: create libxfs helper to rename two directory entries
Create a new libxfs function to rename two directory entries. The
upcoming metadata directory feature will need this to replace a metadata
inode directory entry.
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
Darrick J. Wong [Wed, 29 May 2024 04:10:49 +0000 (21:10 -0700)]
xfs: create libxfs helper to exchange two directory entries
Create a new libxfs function to exchange two directory entries.
The upcoming metadata directory feature will need this to replace a
metadata inode directory entry.
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
Darrick J. Wong [Wed, 29 May 2024 04:10:49 +0000 (21:10 -0700)]
xfs: create libxfs helper to remove an existing inode/name from a directory
Create a new libxfs function to remove a (name, inode) entry from a
directory. The upcoming metadata directory feature will need this to
create a metadata directory tree.
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
Darrick J. Wong [Wed, 29 May 2024 04:10:48 +0000 (21:10 -0700)]
xfs: create libxfs helper to link an existing inode into a directory
Create a new libxfs function to link an existing inode into a directory.
The upcoming metadata directory feature will need this to create a
metadata directory tree.
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
Darrick J. Wong [Wed, 29 May 2024 04:10:47 +0000 (21:10 -0700)]
xfs: create libxfs helper to link a new inode into a directory
Create a new libxfs function to link a newly created inode into a
directory. The upcoming metadata directory feature will need this to
create a metadata directory tree.
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
Darrick J. Wong [Tue, 18 Jun 2024 19:56:54 +0000 (12:56 -0700)]
xfs: separate the icreate logic around INIT_XATTRS
INIT_XATTRS is overloaded here -- it's set during the creat process when
we think that we're immediately going to set some ACL xattrs to save
time. However, it's also used by the parent pointers code to enable the
attr fork in preparation to receive pptr xattrs. This results in
xfs_has_parent() branches scattered around the codebase to turn on
INIT_XATTRS.
Linkable files are created far more commonly than unlinkable temporary
files or directory tree roots, so we should centralize this logic in
xfs_inode_init. For the three callers that don't want parent pointers
(online repair tempfiles, unlinkable tempfiles, rootdir creation) we
provide an UNLINKABLE flag to skip attr fork initialization.
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
Darrick J. Wong [Wed, 29 May 2024 04:10:45 +0000 (21:10 -0700)]
xfs: wrap inode creation dqalloc calls
Create a helper that calls dqalloc to allocate and grab a reference to
dquots for the user, group, and project ids listed in an icreate
structure. This simplifies the creat-related dqalloc callsites
scattered around the code base.
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
Darrick J. Wong [Wed, 29 May 2024 04:10:45 +0000 (21:10 -0700)]
xfs: push xfs_icreate_args creation out of xfs_create*
Move the initialization of the xfs_icreate_args structure out of
xfs_create and xfs_create_tempfile into their callers so that we can set
the new inode's attributes in one place and pass that through instead of
open coding the collection of attributes all over the code.
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
Darrick J. Wong [Wed, 29 May 2024 04:10:44 +0000 (21:10 -0700)]
xfs: split new inode creation into two pieces
There are two parts to initializing a newly allocated inode: setting up
the incore structures, and initializing the new inode core based on the
parent inode and the current user's environment. The initialization
code is not specific to the kernel, so we would like to share that with
userspace by hoisting it to libxfs. Therefore, split xfs_icreate into
separate functions to prepare for the next few patches.
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
Darrick J. Wong [Wed, 29 May 2024 04:10:43 +0000 (21:10 -0700)]
xfs: implement atime updates in xfs_trans_ichgtime
Enable xfs_trans_ichgtime to change the inode access time so that we can
use this function to set inode times when allocating inodes instead of
open-coding it.
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
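A minimal sketch of a flag-driven timestamp update that also covers the access time. The flag names, their values and the demo_inode structure are assumptions made for illustration; they are not the kernel definitions.

```c
#include <stdint.h>

/* Hypothetical flag bits selecting which timestamps to update. */
#define DEMO_ICHGTIME_MOD	0x1u	/* modification time */
#define DEMO_ICHGTIME_CHG	0x2u	/* change time */
#define DEMO_ICHGTIME_CREATE	0x4u	/* creation time */
#define DEMO_ICHGTIME_ACCESS	0x8u	/* access time */

/* Hypothetical inode carrying timestamps as plain integers. */
struct demo_inode {
	int64_t	atime, mtime, ctime, crtime;
};

/* Update only the timestamps selected by flags. */
static void demo_ichgtime(struct demo_inode *ip, unsigned int flags,
			  int64_t now)
{
	if (flags & DEMO_ICHGTIME_MOD)
		ip->mtime = now;
	if (flags & DEMO_ICHGTIME_CHG)
		ip->ctime = now;
	if (flags & DEMO_ICHGTIME_ACCESS)
		ip->atime = now;
	if (flags & DEMO_ICHGTIME_CREATE)
		ip->crtime = now;
}
```

An inode allocator could then pass all four flags in one call instead of assigning each timestamp by hand.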
Darrick J. Wong [Wed, 29 May 2024 04:10:42 +0000 (21:10 -0700)]
xfs: pack icreate initialization parameters into a separate structure
Callers that want to create an inode currently pass all possible file
attribute values for the new inode into xfs_init_new_inode as ten
separate parameters. This causes two code maintenance issues: first, we
have large multi-line call sites which programmers must read carefully
to make sure they did not accidentally invert a value. Second, all
three file id parameters must be passed separately to the quota
functions; any discrepancy results in quota count errors.
Clean this up by creating a new icreate_args structure to hold all this
information, some helpers to initialize them properly, and make the
callers pass this structure through to the creation function, whose name
we shorten to xfs_icreate. This eliminates the issues, enables us to
keep the inode init code in sync with userspace via libxfs, and is
needed for future metadata directory tree management.
(A subsequent cleanup will also fix the quota alloc calls.)
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
Darrick J. Wong [Mon, 17 Jun 2024 18:39:54 +0000 (11:39 -0700)]
xfs: use consistent uid/gid when grabbing dquots for inodes
I noticed that callers of xfs_qm_vop_dqalloc use the following code to
compute the anticipated uid of the new file:
mapped_fsuid(idmap, &init_user_ns);
whereas the VFS uses a slightly different computation for actually
assigning i_uid:
mapped_fsuid(idmap, i_user_ns(inode));
Technically, these are not the same things. According to Christian
Brauner, the only time that inode->i_sb->s_user_ns != &init_user_ns is
when the filesystem was mounted in a new mount namespace by an
unprivileged user. XFS does not allow this, which is why we've never
seen bug reports about quotas being incorrect or the uid checks in
xfs_qm_vop_create_dqattach tripping debug assertions.
However, this /is/ a logic bomb, so let's make the code consistent.
This is inode log item recovery failing the dinode verifier after
replaying the contents of the inode log item into the ondisk inode.
Looking back into what the kernel was doing at the time of the fs
shutdown, a thread was in the middle of running a series of
transactions, each of which committed changes to the inode.
At some point in the middle of that chain, an invalid (at least
according to the verifier) change was committed. Had the filesystem not
shut down in the middle of the chain, a subsequent transaction would
have corrected the invalid state and nobody would have noticed. But
that's not what happened here. Instead, the invalid inode state was
committed to the ondisk log, so log recovery tripped over it.
The actual defect here was an overzealous inode verifier, which was
fixed in a separate patch. This patch adds some transaction precommit
functions for CONFIG_XFS_DEBUG=y mode so that we can detect these kinds
of transient errors at transaction commit time, where it's much easier
to find the root cause.
Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
Darrick J. Wong [Tue, 16 Jul 2024 21:48:03 +0000 (14:48 -0700)]
xfs: enable FITRIM on the realtime device
Implement FITRIM for the realtime device by pretending that it's
"space" immediately after the data device. We have to hold the
rtbitmap ILOCK while the discard operations are ongoing because there's
no busy extent tracking for the rt volume to prevent reallocations.
Cc: Konst Mayer <cdlscpmv@gmail.com> Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
xfs: Remove header files which are included more than once
The following warnings are reported, so remove the duplicated header
includes:
./fs/xfs/libxfs/xfs_trans_resv.c: xfs_da_format.h is included more than once.
./fs/xfs/scrub/quota_repair.c: xfs_format.h is included more than once.
./fs/xfs/xfs_handle.c: xfs_da_btree.h is included more than once.
./fs/xfs/xfs_qm_bhv.c: xfs_mount.h is included more than once.
./fs/xfs/xfs_trace.c: xfs_bmap.h is included more than once.
This is just a code cleanup; no logic is changed.
Signed-off-by: Wenchao Hao <haowenchao22@gmail.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
Christoph Hellwig [Tue, 16 Jul 2024 21:48:01 +0000 (14:48 -0700)]
xfs: fold xfs_ilock_for_write_fault into xfs_write_fault
Now that the page fault handler has been refactored, the only caller
of xfs_ilock_for_write_fault is simple enough and calls it
unconditionally. Fold the logic and expand the comments explaining it.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
Christoph Hellwig [Tue, 16 Jul 2024 21:48:00 +0000 (14:48 -0700)]
xfs: always take XFS_MMAPLOCK shared in xfs_dax_read_fault
After the previous refactoring, xfs_dax_fault is now never used for write
faults, so don't bother with the xfs_ilock_for_write_fault logic to
protect against writes when remapping is in progress.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
Christoph Hellwig [Tue, 16 Jul 2024 21:47:57 +0000 (14:47 -0700)]
xfs: move the dio write relocking out of xfs_ilock_for_iomap
About half of xfs_ilock_for_iomap deals with a special case for direct
I/O writes to COW files that need to take the ilock exclusively. Move
this code into the one caller that cares and simplify
xfs_ilock_for_iomap.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
lei lu [Tue, 16 Jul 2024 21:47:56 +0000 (14:47 -0700)]
xfs: don't walk off the end of a directory data block
This adds sanity checks for xfs_dir2_data_unused and xfs_dir2_data_entry
to make sure we don't stray beyond the valid memory region. Before patching,
the loop simply checks that the start offset of the dup and dep is within the
range. So in a crafted image, if the last entry is xfs_dir2_data_unused, we
can change dup->length to dup->length-1 and leave 1 byte of space. In the
next traversal, this space will be considered as dup or dep. We may
encounter an out-of-bounds read when accessing the fixed members.
In the patch, we make sure that the remaining bytes are large enough to hold
an unused entry before accessing xfs_dir2_data_unused and that
xfs_dir2_data_unused is XFS_DIR2_DATA_ALIGN byte aligned. We also make
sure that the remaining bytes are large enough to hold a dirent with a
single-byte name before accessing xfs_dir2_data_entry.
Signed-off-by: lei lu <llfamsec@gmail.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
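The checks described above follow a general pattern: confirm that the fixed members fit in the remaining bytes before reading them. The sketch below applies that pattern to a hypothetical record format (a 4-byte header of tag and length); it is not the XFS directory format, and all demo_* names are invented for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_ALIGN	8u	/* hypothetical alignment of unused records */
#define DEMO_FREE_TAG	0xffffu	/* hypothetical marker for an unused record */

/* Hypothetical fixed header shared by unused records and entries. */
struct demo_hdr {
	uint16_t	tag;	/* DEMO_FREE_TAG, or the name length */
	uint16_t	length;	/* total record size in bytes */
};

/* Walk every record, never reading a header that does not fully fit. */
static bool demo_block_is_sane(const uint8_t *buf, size_t size)
{
	size_t off = 0;

	while (off < size) {
		size_t remaining = size - off;
		struct demo_hdr hdr;

		/* Room for the fixed members must exist before we read them. */
		if (remaining < sizeof(hdr))
			return false;
		memcpy(&hdr, buf + off, sizeof(hdr));

		if (hdr.tag == DEMO_FREE_TAG) {
			/* Unused space must be non-empty and aligned. */
			if (hdr.length == 0 || hdr.length % DEMO_ALIGN != 0)
				return false;
		} else {
			/* An entry needs a header plus at least a 1-byte name. */
			if (hdr.length < sizeof(hdr) + 1)
				return false;
		}

		/* The record must not extend past the end of the block. */
		if (hdr.length > remaining)
			return false;
		off += hdr.length;
	}
	return true;
}
```

Checking only that off is inside the buffer, as the pre-patch loop did with the start offsets, would let a short tail of a byte or two be interpreted as a full header.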
lei lu [Tue, 16 Jul 2024 21:47:55 +0000 (14:47 -0700)]
xfs: add bounds checking to xlog_recover_process_data
The space occupied by the fixed members of xlog_op_header is not verified
in xlog_recover_process_data.
We can create a crafted image to trigger an out of bounds read by
following these steps:
1) Mount an image of xfs, and do some file operations to leave records
2) Before umounting, copy the image for subsequent steps to simulate
abnormal exit. Because umount will ensure that tail_blk and
head_blk are the same, which will result in the inability to enter
xlog_recover_process_data
3) Write a tool to parse and modify the copied image in step 2
4) Make the end of the xlog_op_header entries only 1 byte away from
xlog_rec_header->h_size
5) xlog_rec_header->h_num_logops++
6) Modify xlog_rec_header->h_crc
Fix:
Add a check to make sure there is sufficient space to access fixed members
of xlog_op_header.
Signed-off-by: lei lu <llfamsec@gmail.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
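The failure mode from the steps above (a stray byte left after the last op header, combined with an inflated op count) can be reproduced in a small model. This is a sketch under assumed names and a simplified header, not the log recovery code itself.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified operation header followed by len payload bytes. */
struct demo_op_header {
	uint32_t	tid;
	uint32_t	len;
};

/* Returns 0 on success, -1 if the record is corrupt. */
static int demo_process_ops(const uint8_t *dp, const uint8_t *end,
			    unsigned int num_ops)
{
	while (dp < end && num_ops) {
		struct demo_op_header oh;

		/* Make sure the fixed members fit before touching them. */
		if ((size_t)(end - dp) < sizeof(oh))
			return -1;
		memcpy(&oh, dp, sizeof(oh));
		dp += sizeof(oh);

		/* The payload must also lie within the record. */
		if (oh.len > (size_t)(end - dp))
			return -1;
		dp += oh.len;
		num_ops--;
	}
	return 0;
}
```

Without the first check, the loop condition dp < end alone would allow a read of a full header starting one byte before the end of the buffer.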
John Garry [Tue, 16 Jul 2024 21:47:54 +0000 (14:47 -0700)]
xfs: Fix xfs_prepare_shift() range for RT
The RT extent range must be considered in the xfs_flush_unmap_range() call
to stabilize the boundary.
This code change is originally from Dave Chinner.
Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: John Garry <john.g.garry@oracle.com> Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
John Garry [Tue, 16 Jul 2024 21:47:53 +0000 (14:47 -0700)]
xfs: Fix xfs_flush_unmap_range() range for RT
Currently xfs_flush_unmap_range() does unmap for a full RT extent range,
which we also want to ensure is clean and idle.
This code change is originally from Dave Chinner.
Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: John Garry <john.g.garry@oracle.com> Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
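As a rough illustration of widening a byte range to whole allocation units (such as an RT extent), the sketch below rounds the start down and the end up. The demo_* names are hypothetical, the unit is passed in by the caller, and len is assumed to be non-zero; how the kernel derives the rounding unit is not shown here.

```c
#include <stdint.h>

/* Byte range with an inclusive end offset. */
struct demo_range {
	uint64_t	start;
	uint64_t	end;
};

/* Widen [offset, offset + len) to whole multiples of unit (len > 0). */
static struct demo_range demo_flush_range(uint64_t offset, uint64_t len,
					  uint64_t unit)
{
	struct demo_range r;

	r.start = offset - (offset % unit);			/* round down */
	r.end = ((offset + len + unit - 1) / unit) * unit - 1;	/* round up */
	return r;
}
```

For a 64 KiB unit, a 1000-byte range starting at byte 70000 would be widened to cover bytes 65536 through 131071.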
Currently AGFL blocks can be filled from the following three sources:
- allocbt free blocks, as in xfs_allocbt_free_block();
- rmapbt free blocks, as in xfs_rmapbt_free_block();
- refilled from freespace btrees, as in xfs_alloc_fix_freelist().
Originally, allocbt free blocks would be marked as stale only when they
were put back in the general free space pool; as Dave mentioned on IRC, "we
don't stale AGF metadata btree blocks when they are returned to the
AGFL .. but once they get put back in the general free space pool, we
have to make sure the buffers are marked stale as the next user of
those blocks might be user data...."
However, after commit ca250b1b3d71 ("xfs: invalidate allocbt blocks
moved to the free list") and commit edfd9dd54921 ("xfs: move buffer
invalidation to xfs_btree_free_block"), even allocbt / bmapbt free
blocks will be invalidated immediately since they may fail to pass
V5 format validation on writeback even though writeback to free space would be
safe.
IOWs, IMHO there is currently no difference between free blocks in the
AGFL freespace pool and those in the general free space pool. So let's
avoid extra redundant AGFL buffer invalidation, since otherwise we're
currently facing unnecessary xfs_log_force() due to xfs_trans_binval()
again on buffers already marked as stale before as below:
xfs_log_force() will take tens of milliseconds with the AGF buffer locked.
This becomes an unnecessarily long latency, especially on our PMEM devices
with FSDAX enabled, and fsops like xfs_reflink_find_shared() running at the
same time get stuck on the same AGF lock. Removing the double
invalidation on the AGFL blocks does not make this issue go away, but
this patch fixes it for our workloads in practice, and by code analysis
it should also work.
Note that I'm not sure I need to remove another redundant one in
xfs_alloc_ag_vextent_small() since it's unrelated to our workloads.
fstests also pass with this patch.
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
Merge tag 'kbuild-fixes-v6.10-4' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild fixes from Masahiro Yamada:
- Make scripts/ld-version.sh robust against the latest LLD
- Fix warnings in rpm-pkg with device tree support
- Fix warnings in fortify tests with KASAN
* tag 'kbuild-fixes-v6.10-4' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
fortify: fix warnings in fortify tests with KASAN
kbuild: rpm-pkg: avoid the warnings with dtb's listed twice
kbuild: Make ld-version.sh more robust against version string changes
When a software KASAN mode is enabled, the fortify tests emit warnings
on some architectures.
For example, for ARCH=arm, the combination of CONFIG_FORTIFY_SOURCE=y
and CONFIG_KASAN=y produces the following warnings:
TEST lib/test_fortify/read_overflow-memchr.log
warning: unsafe memchr() usage lacked '__read_overflow' warning in lib/test_fortify/read_overflow-memchr.c
TEST lib/test_fortify/read_overflow-memchr_inv.log
warning: unsafe memchr_inv() usage lacked '__read_overflow' symbol in lib/test_fortify/read_overflow-memchr_inv.c
TEST lib/test_fortify/read_overflow-memcmp.log
warning: unsafe memcmp() usage lacked '__read_overflow' warning in lib/test_fortify/read_overflow-memcmp.c
TEST lib/test_fortify/read_overflow-memscan.log
warning: unsafe memscan() usage lacked '__read_overflow' symbol in lib/test_fortify/read_overflow-memscan.c
TEST lib/test_fortify/read_overflow2-memcmp.log
warning: unsafe memcmp() usage lacked '__read_overflow2' warning in lib/test_fortify/read_overflow2-memcmp.c
[ more and more similar warnings... ]
Commit 9c2d1328f88a ("kbuild: provide reasonable defaults for tool
coverage") removed KASAN flags from non-kernel objects by default.
It was an intended behavior because lib/test_fortify/*.c are unit
tests that are not linked to the kernel.
As it turns out, some architectures require -fsanitize=kernel-(hw)address
to define __SANITIZE_ADDRESS__ for the fortify tests.
Without __SANITIZE_ADDRESS__ defined, arch/arm/include/asm/string.h
defines __NO_FORTIFY, thus excluding <linux/fortify-string.h>.
This issue does not occur on x86 thanks to commit 4ec4190be4cf
("kasan, x86: don't rename memintrinsics in uninstrumented files"),
but there are still some architectures that define __NO_FORTIFY
in such a situation.
Set KASAN_SANITIZE=y explicitly to the fortify tests.
Fixes: 9c2d1328f88a ("kbuild: provide reasonable defaults for tool coverage") Reported-by: Arnd Bergmann <arnd@arndb.de> Closes: https://lore.kernel.org/all/0e8dee26-41cc-41ae-9493-10cd1a8e3268@app.fastmail.com/ Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
kbuild: rpm-pkg: avoid the warnings with dtb's listed twice
After 8d1001f7bdd0 (kbuild: rpm-pkg: fix build error with CONFIG_MODULES=n),
the following warning "warning: File listed twice: *.dtb" is appearing for
every dtb file that is included.
The reason is that the referenced commit already adds the folder
/lib/modules/%{KERNELRELEASE} in kernel.list file so the folder
/lib/modules/%{KERNELRELEASE}/dtb is no longer necessary, just remove it.
because the trailing comma is included in the patch level part of the
expression. While [1] has been partially reverted in [2] to avoid this
breakage (as it impacts the configuration stage and it is present in all
LTS branches), it would be good to make ld-version.sh more robust
against such miniscule changes like this one.
Use POSIX shell parameter expansion [3] to remove the largest suffix
after just numbers and periods, replacing of the current removal of
everything after a hyphen. ld-version.sh continues to work for a number
of distributions (Arch Linux, Debian, and Fedora) and the kernel.org
toolchains and no longer errors on a version of ld.lld with [1].
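The idea of keeping only the leading run of digits and periods can be modelled in C with strspn(). This is a sketch of the string logic only, under a hypothetical function name; the script itself uses shell parameter expansion, not C.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy the leading run of digits and periods from version into out,
 * dropping everything from the first other character onward.
 */
static void demo_strip_version_suffix(const char *version, char *out,
				      size_t outsz)
{
	size_t n = strspn(version, "0123456789.");

	if (outsz == 0)
		return;
	if (n >= outsz)
		n = outsz - 1;	/* truncate to fit the output buffer */
	memcpy(out, version, n);
	out[n] = '\0';
}
```

Unlike cutting at the first hyphen, this also copes with a trailing comma or any other unexpected character following the numeric version.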
Merge tag 'sched_urgent_for_v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fixes from Borislav Petkov:
- Fix a performance regression when measuring the CPU time of a thread
(clock_gettime(CLOCK_THREAD_CPUTIME_ID,...)) due to the addition of
PSI IRQ time accounting in the hotpath
- Fix a task_struct leak caused by a missing refcount decrement when
the task is enqueued before the timer, which is supposed to do that,
expires
- Revert an attempt to expedite detaching of movable tasks, as finding
those could become very costly. Turns out the original issue wasn't
even hit by anyone
* tag 'sched_urgent_for_v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched: Move psi_account_irqtime() out of update_rq_clock_task() hotpath
sched/deadline: Fix task_struct reference leak
Revert "sched/fair: Make sure to try to detach at least one movable task"
Merge tag 'x86_urgent_for_v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fix from Borislav Petkov:
- Make sure TF is cleared before calling other functions (BHI
mitigation in this case) in the SYSENTER compat handler, as
otherwise it will warn about being in single-step mode
* tag 'x86_urgent_for_v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/bhi: Avoid warning in #DB handler due to BHI mitigation
Merge tag 'i2c-for-6.10-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c fixes from Wolfram Sang:
"Fixes for the I2C testunit, the Renesas R-Car driver and some
MAINTAINERS corrections"
* tag 'i2c-for-6.10-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: testunit: avoid re-issued work after read message
i2c: rcar: ensure Gen3+ reset does not disturb local targets
i2c: mark HostNotify target address as used
i2c: testunit: correct Kconfig description
MAINTAINERS: VIRTIO I2C loses a maintainer, gains a reviewer
MAINTAINERS: delete entries for Thor Thayer
i2c: rcar: clear NO_RXDMA flag after resetting
i2c: rcar: bring hardware to known state when probing
Steve French [Tue, 9 Jul 2024 23:07:35 +0000 (18:07 -0500)]
cifs: fix setting SecurityFlags to true
If you try to set /proc/fs/cifs/SecurityFlags to 1 it
will set them to CIFSSEC_MUST_NTLMV2 which no longer is
relevant (the less secure ones like lanman have been removed
from cifs.ko) and is also missing some flags (like for
signing and encryption) and can even cause mount to fail,
so change this to set it to Kerberos in this case.
Also change the description of the SecurityFlags to remove mention
of flags which are no longer supported.
Cc: stable@vger.kernel.org Reviewed-by: Shyam Prasad N <sprasad@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
Wolfram Sang [Sat, 13 Jul 2024 08:50:55 +0000 (10:50 +0200)]
Merge tag 'i2c-host-fixes-6.10-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/andi.shyti/linux into i2c/for-current
This tag includes three fixes for the Renesas R-Car driver:
1. Ensures the device is in a known state after probing.
2. Allows clearing the NO_RXDMA flag after a reset.
3. Forces a reset before any transfer on Gen3+ platforms to
prevent disruption of the configuration during parallel
transfers.
Merge tag 'net-6.10-rc8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull more networking fixes from Jakub Kicinski:
"A quick follow up to yesterday's pull. We got a regressions report for
the bnxt patch as soon as it got to your tree. The ethtool fix is also
good to have, although it's an older regression.
Current release - regressions:
- eth: bnxt_en: fix crash in bnxt_get_max_rss_ctx_ring() on older HW
when user tries to decrease the ring count
Previous releases - regressions:
- ethtool: fix RSS setting, accept "no change" setting if the driver
doesn't support the new features
- eth: i40e: remove needless retries of NVM update, don't wait 20min
when we know the firmware update won't succeed"
* tag 'net-6.10-rc8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net:
bnxt_en: Fix crash in bnxt_get_max_rss_ctx_ring()
octeontx2-af: fix issue with IPv4 match for RSS
octeontx2-af: fix issue with IPv6 ext match for RSS
octeontx2-af: fix detection of IP layer
octeontx2-af: fix a issue with cpt_lf_alloc mailbox
octeontx2-af: replace cpt slot with lf id on reg write
i40e: fix: remove needless retries of NVM update
net: ethtool: Fix RSS setting
bp->rss_ctx_list is not initialized if the chip or firmware does not
support multiple RSS contexts. Fix it by adding a check in
bnxt_get_max_rss_ctx_ring() before proceeding to reference
bp->rss_ctx_list.
Merge tag 'for-6.10-rc7-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:
"Fix a regression in extent map shrinker behaviour.
In the past weeks we got reports from users that there are huge
latency spikes or freezes. This was bisected to newly added shrinker
of extent maps (it was added to fix a build up of the structures in
memory).
I'm assuming that the freezes would happen to many users after release
so I'd like to get it merged now so it's in 6.10. Although the diff
size is not small the changes are relatively straightforward, the
reporters verified the fixes and we did testing on our side.
The fixes:
- adjust behaviour under memory pressure and check lock or scheduling
conditions, bail out if needed
- synchronize tracking of the scanning progress so inode ranges are
not skipped or work duplicated
- do a delayed iput when scanning a root so evicting an inode does
not slow things down in case of lots of dirty data, also fix
lockdep warning, a deadlock could happen when writing the dirty
data would need to start a transaction"
* tag 'for-6.10-rc7-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: avoid races when tracking progress for extent map shrinking
btrfs: stop extent map shrinker if reschedule is needed
btrfs: use delayed iput during extent map shrinking
Merge tag 'pmdomain-v6.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm
Pull pmdomain fix from Ulf Hansson:
- qcom: Skip retention level for rpmhpd's
* tag 'pmdomain-v6.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm:
pmdomain: qcom: rpmhpd: Skip retention level for Power Domains
Merge tag 'mmc-v6.10-rc4-2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc
Pull MMC host fixes from Ulf Hansson:
- davinci_mmc: Prevent transmitted data size from exceeding sgm's
length
- sdhci: Fix max_seg_size for 64KiB PAGE_SIZE
* tag 'mmc-v6.10-rc4-2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
mmc: davinci_mmc: Prevent transmitted data size from exceeding sgm's length
mmc: sdhci: Fix max_seg_size for 64KiB PAGE_SIZE
Merge tag 'arm-fixes-6.10-3' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull ARM SoC fixes from Arnd Bergmann:
"Most of these changes are Qualcomm SoC specific and came in just after
I sent out the last set of fixes. This includes two regression fixes
for SoC drivers, a defconfig change to ensure the Lenovo X13s is
usable and 11 changes to DT files to fix regressions and minor
platform specific issues.
Tony and Chunyan step back from their respective maintainership roles
on the omap and unisoc platforms, and Christophe in turn takes over
maintaining some of the Freescale SoC drivers that he has been taking
care of in practice already.
Lastly, there are two trivial fixes for the davinci and sunxi
platforms"
* tag 'arm-fixes-6.10-3' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
MAINTAINERS: Update FREESCALE SOC DRIVERS and QUICC ENGINE LIBRARY
MAINTAINERS: Add more maintainers for omaps
ARM: davinci: Convert comma to semicolon
MAINTAINERS: Move myself from SPRD Maintainer to Reviewer
Revert "dt-bindings: cache: qcom,llcc: correct QDU1000 reg entries"
arm64: dts: qcom: qdu1000: Fix LLCC reg property
arm64: dts: qcom: sm6115: add iommu for sdhc_1
arm64: dts: qcom: x1e80100-crd: fix DAI used for headset recording
arm64: dts: qcom: x1e80100-crd: fix WCD audio codec TX port mapping
soc: qcom: pmic_glink: disable UCSI on sc8280xp
arm64: defconfig: enable Elan i2c-hid driver
arm64: dts: qcom: sc8280xp-crd: use external pull up for touch reset
arm64: dts: qcom: sc8280xp-x13s: fix touchscreen power on
arm64: dts: qcom: x1e80100: Fix PCIe 6a reg offsets and add MHI
arm64: dts: qcom: sa8775p: Correct IRQ number of EL2 non-secure physical timer
arm64: dts: allwinner: Fix PMIC interrupt number
arm64: dts: qcom: sc8280xp: Set status = "reserved" on PSHOLD
arm64: dts: qcom: x1e80100-*: Allocate some CMA buffers
arm64: dts: qcom: sc8180x: Fix LLCC reg property again
Merge tag 'char-misc-6.10-final' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char / misc driver fixes from Greg KH:
"Here are some small remaining driver fixes for 6.10-final that have
all been in linux-next for a while and resolve reported issues.
Included in here are:
- mei driver fixes (and a spelling fix at the end just to be clean)
- iio driver fixes for reported problems
- fastrpc bugfixes
- nvmem small fixes"
* tag 'char-misc-6.10-final' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
mei: vsc: Fix spelling error
mei: vsc: Enhance SPI transfer of IVSC ROM
mei: vsc: Utilize the appropriate byte order swap function
mei: vsc: Prevent timeout error with added delay post-firmware download
mei: vsc: Enhance IVSC chipset stability during warm reboot
nvmem: core: limit cell sysfs permissions to main attribute ones
nvmem: core: only change name to fram for current attribute
nvmem: meson-efuse: Fix return value of nvmem callbacks
nvmem: rmem: Fix return value of rmem_read()
misc: microchip: pci1xxxx: Fix return value of nvmem callbacks
hpet: Support 32-bit userspace
misc: fastrpc: Restrict untrusted app to attach to privileged PD
misc: fastrpc: Fix ownership reassignment of remote heap
misc: fastrpc: Fix memory leak in audio daemon attach operation
misc: fastrpc: Avoid updating PD type for capability request
misc: fastrpc: Copy the complete capability structure to user
misc: fastrpc: Fix DSP capabilities request
iio: light: apds9306: Fix error handing
iio: trigger: Fix condition for own trigger
Merge tag 'tty-6.10-final' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Pull tty / serial fixes from Greg KH:
"Here are some small serial driver fixes for 6.10-final. Included in
here are:
- qcom-geni fixes for a much much much discussed issue and everyone
now seems to be agreed that this is the proper way forward to
resolve the reported lockups
- imx serial driver bugfixes
- 8250_omap errata fix
- ma35d1 serial driver bugfix
All of these have been in linux-next for over a week with no reported
issues"
* tag 'tty-6.10-final' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
serial: qcom-geni: do not kill the machine on fifo underrun
serial: qcom-geni: fix hard lockup on buffer flush
serial: qcom-geni: fix soft lockup on sw flow control and suspend
serial: imx: ensure RTS signal is not left active after shutdown
tty: serial: ma35d1: Add a NULL check for of_node
serial: 8250_omap: Fix Errata i2310 with RX FIFO level check
serial: imx: only set receiver level if it is zero
Merge tag 'sound-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"The majority of changes here are small device-specific fixes for ASoC
SOF / Intel and usual HD-audio quirks.
The only significant high LOC is found in the Cirrus firmware driver,
but all those are for hardening against malicious firmware blobs, and
they look fine for taking as a last minute fix, too"
* tag 'sound-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: hda/realtek: Enable Mute LED on HP 250 G7
firmware: cs_dsp: Use strnlen() on name fields in V1 wmfw files
ALSA: hda/realtek: Limit mic boost on VAIO PRO PX
ALSA: hda: cs35l41: Fix swapped l/r audio channels for Lenovo ThinBook 13x Gen4
ASoC: SOF: Intel: hda-pcm: Limit the maximum number of periods by MAX_BDL_ENTRIES
ASoC: rt711-sdw: add missing readable registers
ASoC: SOF: Intel: hda: fix null deref on system suspend entry
ALSA: hda/realtek: add quirk for Clevo V5[46]0TU
firmware: cs_dsp: Prevent buffer overrun when processing V2 alg headers
firmware: cs_dsp: Validate payload length before processing block
firmware: cs_dsp: Return error if block header overflows file
firmware: cs_dsp: Fix overflow checking of wmfw header
Merge tag 'bcachefs-2024-07-12' of https://evilpiepirate.org/git/bcachefs
Pull more bcachefs fixes from Kent Overstreet:
- revert the SLAB_ACCOUNT patch, something crazy is going on in memcg
and someone forgot to test
- minor fixes: missing rcu_read_lock(), scheduling while atomic (in an
emergency shutdown path)
- two lockdep fixes; these could have gone earlier, but were left to
bake awhile
* tag 'bcachefs-2024-07-12' of https://evilpiepirate.org/git/bcachefs:
bcachefs: bch2_gc_btree() should not use btree_root_lock
bcachefs: Set PF_MEMALLOC_NOFS when trans->locked
bcachefs; Use trans_unlock_long() when waiting on allocator
Revert "bcachefs: Mark bch_inode_info as SLAB_ACCOUNT"
bcachefs: fix scheduling while atomic in break_cycle()
bcachefs: Fix RCU splat
Satheesh Paul [Wed, 10 Jul 2024 07:51:27 +0000 (13:21 +0530)]
octeontx2-af: fix issue with IPv4 match for RSS
While performing RSS based on IPv4, packets with
IPv4 options are not being considered. Add changes
to match both plain IPv4 and IPv4 with an option header.
Fixes: 41a7aa7b800d ("octeontx2-af: NIX Rx flowkey configuration for RSS") Signed-off-by: Satheesh Paul <psatheesh@marvell.com> Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
octeontx2-af: fix issue with IPv6 ext match for RSS
While performing RSS based on IPv6, the extension ltype
is not being considered. This is a problem for
fragmented packets or packets with an extension header.
Add changes to match the IPv6 ext header along with the IPv6
ltype.
Fixes: 41a7aa7b800d ("octeontx2-af: NIX Rx flowkey configuration for RSS") Signed-off-by: Kiran Kumar K <kirankumark@marvell.com> Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Michal Mazur [Wed, 10 Jul 2024 07:51:25 +0000 (13:21 +0530)]
octeontx2-af: fix detection of IP layer
Checksum and length checks are not enabled for IPv4 header with
options and IPv6 with extension headers.
To fix this a change in enum npc_kpu_lc_ltype is required which will
allow adjustment of LTYPE_MASK to detect all types of IP headers.
Fixes: 21e6699e5cd6 ("octeontx2-af: Add NPC KPU profile") Signed-off-by: Michal Mazur <mmazur2@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
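A minimal sketch of the idea behind adjusting LTYPE_MASK, assuming the layer types are numbered so that each IP variant differs from its plain form only in the lowest bit. The enum values, the `demo_` names and the mask values are hypothetical and are not taken from enum npc_kpu_lc_ltype.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layer-type numbering: each "with options/extensions"
 * variant sits directly after its plain counterpart, so the pair
 * shares every bit except bit 0. */
enum demo_lc_ltype {
	DEMO_LT_LC_IP      = 2,
	DEMO_LT_LC_IP_OPT  = 3,
	DEMO_LT_LC_IP6     = 4,
	DEMO_LT_LC_IP6_EXT = 5,
};

/* Masked comparison: only the bits set in mask take part in the match. */
static bool demo_ltype_match(uint8_t ltype, uint8_t value, uint8_t mask)
{
	return (ltype & mask) == (value & mask);
}
```

With a full mask (0xF) only the exact type matches. Dropping bit 0 from the mask (0xE) lets one rule cover both the plain header and its variant, while still keeping IPv4 and IPv6 apart.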
octeontx2-af: fix a issue with cpt_lf_alloc mailbox
This patch fixes CPT_LF_ALLOC mailbox error due to
incompatible mailbox message format. Specifically, it
corrects the `blkaddr` field type from `int` to `u8`.
Fixes: de2854c87c64 ("octeontx2-af: Mailbox changes for 98xx CPT block") Signed-off-by: Srujana Challa <schalla@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
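As an illustration of why a field's type matters for a shared message format, the sketch below compares two hypothetical layouts that differ only in the width of `blkaddr`. The structs and field names other than `blkaddr` are invented for the example and do not reproduce the real CPT_LF_ALLOC message.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical request layout with a wide blkaddr field. */
struct demo_req_int {
	uint16_t pf_func;
	uint16_t eng_grpmsk;
	int      blkaddr;
	uint8_t  flags;
};

/* The same hypothetical request with blkaddr narrowed to one byte. */
struct demo_req_u8 {
	uint16_t pf_func;
	uint16_t eng_grpmsk;
	uint8_t  blkaddr;
	uint8_t  flags;
};

/* Two sides agree on the format only if sizes and field offsets agree. */
static bool demo_layouts_match(void)
{
	return sizeof(struct demo_req_int) == sizeof(struct demo_req_u8) &&
	       offsetof(struct demo_req_int, flags) ==
	       offsetof(struct demo_req_u8, flags);
}
```

If sender and receiver are built against different definitions, every field after `blkaddr` is read from the wrong offset, which is one way a mailbox request can end up rejected.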
octeontx2-af: replace cpt slot with lf id on reg write
Replace slot id with global CPT lf id on reg read/write, as the
CPTPF/VF driver would send the slot number instead of the global
lf id in the reg offset. Also update the mailbox response
with the global lf's register offset.
Fixes: ae454086e3c2 ("octeontx2-af: add mailbox interface for CPT") Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
MAINTAINERS: Update FREESCALE SOC DRIVERS and QUICC ENGINE LIBRARY
FREESCALE SOC DRIVERS has been orphaned since
commit eaac25d026a1 ("MAINTAINERS: Drop Li Yang as their email address
stopped working")
QUICC ENGINE LIBRARY has Qiang Zhao as maintainer but he hasn't
responded for years and when Li Yang was still maintaining FREESCALE
SOC DRIVERS he was also handling QUICC ENGINE LIBRARY directly.
As a maintainer of LINUX FOR POWERPC EMBEDDED PPC8XX AND PPC83XX, I
also need FREESCALE SOC DRIVERS to be actively maintained, so add
myself as maintainer of FREESCALE SOC DRIVERS and QUICC ENGINE LIBRARY.
Tony Lindgren [Tue, 9 Jul 2024 13:59:29 +0000 (16:59 +0300)]
MAINTAINERS: Add more maintainers for omaps
There are many generations of omaps to maintain, and I will be only active
as a hobbyist with time permitting. Let's add more maintainers to ensure
continued Linux support.
TI is interested in maintaining the active SoCs such as am3, am4 and
dra7. And the hobbyists are interested in maintaining some of the older
devices, mainly based on omap3 and 4 SoCs.
Kevin and Roger have agreed to maintain the active TI parts. Both Kevin
and Roger have been working on the omap variants for a long time, and
have a good understanding of the hardware.
Aaro and Andreas have agreed to maintain the community devices. Both Aaro
and Andreas have long experience working with the earlier TI SoCs.
While at it, let's also change me to be a reviewer for the omap1, and
drop the link to my old omap web page.
Signed-off-by: Tony Lindgren <tony@atomide.com> Acked-by: Kevin Hilman <khilman@baylibre.com> Acked-by: Aaro Koskinen <aaro.koskinen@iki.fi> Acked-by: Roger Quadros <rogerq@kernel.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Wolfram Sang [Thu, 11 Jul 2024 12:08:19 +0000 (14:08 +0200)]
i2c: testunit: avoid re-issued work after read message
The to-be-fixed commit rightfully prevented the registers from being
cleared. However, the index must still be cleared. Otherwise a read
message will re-issue the last work. Fix it and add a comment describing
the situation.
Fixes: c422b6a63024 ("i2c: testunit: don't erase registers after STOP") Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Reviewed-by: Andi Shyti <andi.shyti@kernel.org> Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Aleksandr Loktionov [Wed, 10 Jul 2024 22:44:54 +0000 (15:44 -0700)]
i40e: fix: remove needless retries of NVM update
Remove wrong EIO to EAGAIN conversion and pass all errors as is.
After commit 230f3d53a547 ("i40e: remove i40e_status"), which should only
have replaced F/W specific error codes with generic Linux kernel ones, all
EIO errors suddenly started to be converted into EAGAIN. This leads
nvmupdate to retry until it times out, and it sometimes fails after more
than 20 minutes in the middle of an NVM update, so the NVM becomes corrupted.
The bug affects users only at the time when they try to update NVM, and
only F/W versions that generate errors during nvmupdate. For example, X710DA2
with 0x8000ECB7 F/W is affected, but there are probably more.
Command for reproduction is just NVM update:
./nvmupdate64
The problematic code silently converted EIO into EAGAIN, which forced
nvmupdate to ignore the EAGAIN error and retry the same operation until
timeout. That's why the NVM update takes 20+ minutes and still fails in the end.
Fixes: 230f3d53a547 ("i40e: remove i40e_status") Co-developed-by: Kelvin Kang <kelvin.kang@intel.com> Signed-off-by: Kelvin Kang <kelvin.kang@intel.com> Reviewed-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Tested-by: Tony Brelinski <tony.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20240710224455.188502-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
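The retry behaviour described above can be sketched roughly as follows. This is not the i40e or nvmupdate code: the `demo_` functions are hypothetical stand-ins, and the retry loop only assumes that the caller retries for as long as it sees EAGAIN.

```c
#include <errno.h>

/* Hypothetical firmware operation that keeps failing with a hard error. */
static int demo_nvm_op(void)
{
	return -EIO;
}

/* Behaviour described for the buggy code: EIO is turned into EAGAIN. */
static int demo_convert_old(int err)
{
	return err == -EIO ? -EAGAIN : err;
}

/* Behaviour after the fix: errors are passed through unchanged. */
static int demo_convert_new(int err)
{
	return err;
}

/* Hypothetical caller that retries while it sees EAGAIN.
 * Returns the number of attempts and stores the last error. */
static int demo_update(int (*convert)(int), int max_tries, int *final_err)
{
	int tries = 0;
	int err;

	do {
		err = convert(demo_nvm_op());
		tries++;
	} while (err == -EAGAIN && tries < max_tries);

	*final_err = err;
	return tries;
}
```

With the old conversion the loop only ends when the retry budget is exhausted. With the pass-through version the first EIO ends it immediately.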
When user submits a rxfh set command without touching XFRM_SYM_XOR,
rxfh.input_xfrm is set to RXH_XFRM_NO_CHANGE, which is equal to 0xff.
The check
	if (rxfh.input_xfrm & RXH_XFRM_SYM_XOR &&
	    !ops->cap_rss_sym_xor_supported)
		return -EOPNOTSUPP;
will always be true on devices that don't set cap_rss_sym_xor_supported
when input_xfrm was not set, since RXH_XFRM_NO_CHANGE = 0xff makes
rxfh.input_xfrm & RXH_XFRM_SYM_XOR non-zero. This results in failure
of any command that doesn't require any change of XFRM, e.g. RSS context
or hash function changes.
To avoid this breakage, test if rxfh.input_xfrm != RXH_XFRM_NO_CHANGE
before testing other conditions. Note that the problem only triggers
with XFRM-aware userspace; the old ethtool CLI continues to work.
Fixes: 0dd415d15505 ("net: ethtool: add a NO_CHANGE uAPI for new RXFH's input_xfrm") Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Reviewed-by: Ahmed Zaki <ahmed.zaki@intel.com> Link: https://patch.msgid.link/20240710225538.43368-1-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
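A small user-space sketch of the check before and after the change described in this commit message. RXH_XFRM_NO_CHANGE = 0xff comes from the message; the bit value used for RXH_XFRM_SYM_XOR, the `demo_ops` struct and the function names are assumptions made for illustration and are not the kernel's definitions.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed value for illustration; the real one lives in the ethtool uAPI. */
#define RXH_XFRM_SYM_XOR   (1 << 0)
/* Value stated in the commit message. */
#define RXH_XFRM_NO_CHANGE 0xff

/* Hypothetical stand-in for the driver's ethtool ops capability flag. */
struct demo_ops {
	bool cap_rss_sym_xor_supported;
};

/* Old check: NO_CHANGE (0xff) has the SYM_XOR bit set, so it is rejected. */
static int demo_check_old(uint8_t input_xfrm, const struct demo_ops *ops)
{
	if ((input_xfrm & RXH_XFRM_SYM_XOR) && !ops->cap_rss_sym_xor_supported)
		return -EOPNOTSUPP;
	return 0;
}

/* New check: test for NO_CHANGE first, then the other conditions. */
static int demo_check_new(uint8_t input_xfrm, const struct demo_ops *ops)
{
	if (input_xfrm != RXH_XFRM_NO_CHANGE &&
	    (input_xfrm & RXH_XFRM_SYM_XOR) &&
	    !ops->cap_rss_sym_xor_supported)
		return -EOPNOTSUPP;
	return 0;
}
```

On a device without symmetric-XOR support, the old check fails a request that leaves input_xfrm untouched, while the new one accepts it and still rejects an explicit RXH_XFRM_SYM_XOR request.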