fs: __generic_file_splice_read retry lookup on AOP_TRUNCATED_PAGE
Orabug: 26797298
During testing, I discovered that __generic_file_splice_read() returns
0 (EOF) when aops->readpage fails with AOP_TRUNCATED_PAGE on the first
page of a single/multi-page splice read operation. This EOF return code
causes the userspace test to (correctly) report a zero-length read error
when it was expecting otherwise.

The current strategy of returning a partial non-zero read when ->readpage
returns AOP_TRUNCATED_PAGE works only when the failed page is not the
first of the lot being processed.

This patch attempts to retry lookup and call ->readpage again on pages
that had previously failed with AOP_TRUNCATED_PAGE. With this patch, my
tests pass and I haven't noticed any unwanted side effects.

This version removes the thrice-retry loop and instead indefinitely
retries lookups on AOP_TRUNCATED_PAGE errors from ->readpage. This
behavior is now similar to do_generic_file_read().
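
The difference between the old and new behavior can be sketched in a
small userspace model. This is only an illustration under stated
assumptions, not the kernel code: struct mock_page, mock_readpage(),
splice_read_no_retry() and splice_read_retry() are hypothetical names,
and the real patch redoes the page cache lookup before calling
->readpage again rather than simply calling it in a loop.

```c
#include <stddef.h>

/* Same value as in enum positive_aop_returns (include/linux/fs.h). */
#define AOP_TRUNCATED_PAGE 0x80001

/* Hypothetical stand-in for a page; not the kernel's struct page. */
struct mock_page {
	int truncations_left;	/* times the mock ->readpage will still fail */
	size_t bytes;		/* data available once the read succeeds */
};

/* Hypothetical ->readpage: fails transiently, then succeeds. */
static int mock_readpage(struct mock_page *page)
{
	if (page->truncations_left > 0) {
		page->truncations_left--;
		return AOP_TRUNCATED_PAGE;
	}
	return 0;
}

/* Model of the old strategy: stop at the first failure, return what we have. */
static size_t splice_read_no_retry(struct mock_page *pages, size_t nr)
{
	size_t total = 0;

	for (size_t i = 0; i < nr; i++) {
		if (mock_readpage(&pages[i]) != 0)
			break;	/* a failure on page 0 yields 0, i.e. EOF */
		total += pages[i].bytes;
	}
	return total;
}

/* Model of the new strategy: retry for as long as AOP_TRUNCATED_PAGE is returned. */
static size_t splice_read_retry(struct mock_page *pages, size_t nr)
{
	size_t total = 0;

	for (size_t i = 0; i < nr; i++) {
		int error;

		do {
			/* the real code repeats the page lookup at this point */
			error = mock_readpage(&pages[i]);
		} while (error == AOP_TRUNCATED_PAGE);

		if (error)
			break;	/* any other error still ends the read early */
		total += pages[i].bytes;
	}
	return total;
}
```

In this model, a first page that fails twice with AOP_TRUNCATED_PAGE
makes splice_read_no_retry() return 0, while splice_read_retry() returns
the full length of all pages.
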
Signed-off-by: Abhi Das <adas@redhat.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Bob Peterson <rpeterso@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

AOP_TRUNCATED_PAGE is not used much in the kernel now, but ocfs2 uses it to
avoid deadlocks. Specifically, if ocfs2_readpage() fails to get the inode
cluster lock, it fails the read and returns AOP_TRUNCATED_PAGE in order to
avoid a deadlock on the page lock with the downconvert thread. It also uses
this return value to avoid livelock on the ip_alloc_sem semaphore. This is
done with the expectation that the VFS will check for this return value and
retry the read on the page, and do_generic_file_read() does exactly this.

However, in the splice read case, __generic_file_splice_read() fails the read
and returns a partial or zero-length read. This causes upper layers that use
splice read (such as nfs) to return EIO or other failures to userspace. Saar
ran into this issue while testing database workloads over knfs with ocfs2 as
the backend fs on the nfs server. This patch fixes the issue.

(cherry picked from commit 90330e689c32e5105265c461c54af6ecec3373fa)
Tested-by: Saar Maoz <saar.maoz@oracle.com>
Signed-off-by: Ashish Samant <ashish.samant@oracle.com>
Reviewed-by: Junxiao Bi <junxiao.bi@oracle.com>