netfs: Fix collection of results during pause when collection offloaded
author David Howells <dhowells@redhat.com>
Fri, 14 Mar 2025 16:41:56 +0000 (16:41 +0000)
committer Christian Brauner <brauner@kernel.org>
Wed, 19 Mar 2025 09:04:22 +0000 (10:04 +0100)
A netfs read request can run in one of two modes: for synchronous reads,
the app thread does the collection of results and for asynchronous reads,
this is offloaded to a worker thread.  This is controlled by the
NETFS_RREQ_OFFLOAD_COLLECTION flag.

Now, if a subrequest incurs an error, the NETFS_RREQ_PAUSE flag is set to
stop the issuing loop temporarily from issuing more subrequests until a
retry is successful or the request is abandoned.

When the issuing loop sees NETFS_RREQ_PAUSE, it jumps to
netfs_wait_for_pause() which will wait for the PAUSE flag to be cleared -
and whilst it is waiting, it will call out to the collector as more results
accrue.  But this is the wrong thing to do if OFFLOAD_COLLECTION is set as
we can then end up with both the app thread and the work item collecting
results simultaneously.

This manifests itself occasionally when running the generic/323 xfstest
against multichannel cifs as an oops that's a bit random but frequently
involving io_submit() (the test does lots of simultaneous async DIO reads).

Fix this by only doing the collection in netfs_wait_for_pause() if the
NETFS_RREQ_OFFLOAD_COLLECTION flag is not set.
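
The decision described above can be modelled in userspace as a pure
predicate.  This is only a sketch under stated assumptions: the flag
constants, the helper names and the plain bitmask tests are hypothetical
stand-ins for the kernel's test_bit() calls, and no locking, wait queue or
real netfs structure is modelled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel flag bits; values are illustrative. */
enum {
	RREQ_OFFLOAD_COLLECTION = 1u << 0, /* models NETFS_RREQ_OFFLOAD_COLLECTION */
};

enum {
	SREQ_IN_PROGRESS   = 1u << 0, /* models NETFS_SREQ_IN_PROGRESS */
	SREQ_MADE_PROGRESS = 1u << 1, /* models NETFS_SREQ_MADE_PROGRESS */
};

/* True if the front subrequest exists and has finished or made progress. */
static bool front_subreq_collectable(bool have_subreq, unsigned int sreq_flags)
{
	return have_subreq &&
	       (!(sreq_flags & SREQ_IN_PROGRESS) ||
		(sreq_flags & SREQ_MADE_PROGRESS));
}

/* Old behaviour: the waiting thread collects regardless of the offload flag. */
static bool waiter_collects_old(unsigned int rreq_flags, bool have_subreq,
				unsigned int sreq_flags)
{
	(void)rreq_flags; /* the offload flag was not consulted */
	return front_subreq_collectable(have_subreq, sreq_flags);
}

/* New behaviour: the waiting thread leaves collection to the work item. */
static bool waiter_collects_new(unsigned int rreq_flags, bool have_subreq,
				unsigned int sreq_flags)
{
	if (rreq_flags & RREQ_OFFLOAD_COLLECTION)
		return false;
	return front_subreq_collectable(have_subreq, sreq_flags);
}

/* Small self-check contrasting the two predicates for an offloaded request. */
static void model_self_check(void)
{
	/* A finished front subrequest on an offloaded request. */
	assert(waiter_collects_old(RREQ_OFFLOAD_COLLECTION, true, 0));
	assert(!waiter_collects_new(RREQ_OFFLOAD_COLLECTION, true, 0));
}
```

In this model the two predicates differ only when the offload flag is set,
which is the case in which the work item is also a collector.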

Fixes: e2d46f2ec332 ("netfs: Change the read result collector to only use one work item")
Reported-by: Steve French <stfrench@microsoft.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Link: https://lore.kernel.org/r/20250314164201.1993231-2-dhowells@redhat.com
Acked-by: "Paulo Alcantara (Red Hat)" <pc@manguebit.com>
cc: Paulo Alcantara <pc@manguebit.com>
cc: Jeff Layton <jlayton@kernel.org>
cc: linux-cifs@vger.kernel.org
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
fs/netfs/read_collect.c
index 636cc5a98ef576e6ec1d1ba64cccafc2ac3f3cf4..23c75755ad4ed9b49581765cb1b46bb056b25ee0 100644
@@ -682,14 +682,16 @@ void netfs_wait_for_pause(struct netfs_io_request *rreq)
                trace_netfs_rreq(rreq, netfs_rreq_trace_wait_queue);
                prepare_to_wait(&rreq->waitq, &myself, TASK_UNINTERRUPTIBLE);
 
-               subreq = list_first_entry_or_null(&stream->subrequests,
-                                                 struct netfs_io_subrequest, rreq_link);
-               if (subreq &&
-                   (!test_bit(NETFS_SREQ_IN_PROGRESS, &subreq->flags) ||
-                    test_bit(NETFS_SREQ_MADE_PROGRESS, &subreq->flags))) {
-                       __set_current_state(TASK_RUNNING);
-                       netfs_read_collection(rreq);
-                       continue;
+               if (!test_bit(NETFS_RREQ_OFFLOAD_COLLECTION, &rreq->flags)) {
+                       subreq = list_first_entry_or_null(&stream->subrequests,
+                                                         struct netfs_io_subrequest, rreq_link);
+                       if (subreq &&
+                           (!test_bit(NETFS_SREQ_IN_PROGRESS, &subreq->flags) ||
+                            test_bit(NETFS_SREQ_MADE_PROGRESS, &subreq->flags))) {
+                               __set_current_state(TASK_RUNNING);
+                               netfs_read_collection(rreq);
+                               continue;
+                       }
                }
 
                if (!test_bit(NETFS_RREQ_IN_PROGRESS, &rreq->flags) ||