Kees Cook [Sat, 18 Feb 2023 00:24:40 +0000 (16:24 -0800)]
smb3: Replace smb2pdu 1-element arrays with flex-arrays
The kernel is globally removing the ambiguous 0-length and 1-element
arrays in favor of flexible arrays, so that we can gain both compile-time
and run-time array bounds checking[1].
Replace the trailing 1-element array with a flexible array in the
following structures:
Replace the trailing 1-element array with a flexible array, but leave
the existing structure padding:
struct smb2_file_all_info
struct smb2_lock_req
Adjust all related size calculations to match the changes to sizeof().
No machine code output or .data section differences are produced after
these changes.
[1] For lots of details, see both:
https://docs.kernel.org/process/deprecated.html#zero-length-and-one-element-arrays
https://people.kernel.org/kees/bounded-flexible-arrays-in-c
Cc: Steve French <sfrench@samba.org> Cc: Paulo Alcantara <pc@cjr.nz> Cc: Ronnie Sahlberg <lsahlber@redhat.com> Cc: Shyam Prasad N <sprasad@microsoft.com> Cc: Tom Talpey <tom@talpey.com> Cc: Namjae Jeon <linkinjeon@kernel.org> Cc: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: linux-cifs@vger.kernel.org Cc: samba-technical@lists.samba.org Reviewed-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Steve French <stfrench@microsoft.com>
Paulo Alcantara [Mon, 20 Feb 2023 19:36:54 +0000 (16:36 -0300)]
cifs: get rid of dns resolve worker
We already upcall to resolve hostnames during reconnect by calling
reconn_set_ipaddr_from_hostname(), so there is no point in having a
worker to periodically call it.
Signed-off-by: Paulo Alcantara (SUSE) <pc@manguebit.com>
Reviewed-by <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>
Ronnie Sahlberg [Fri, 17 Feb 2023 03:35:01 +0000 (13:35 +1000)]
cifs: return a single-use cfid if we did not get a lease
If we did not get a lease we can still return a single use cfid to the caller.
The cfid will not have has_lease set and will thus not be shared with any
other concurrent users and will be freed immediately when the caller
drops the handle.
This avoids extra roundtrips for servers that do not support directory leases
where they would first fail to get a cfid with a lease and then fallback
to try a normal SMB2_open()
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Cc: stable@vger.kernel.org Reviewed-by: Bharath SM <bharathsm@microsoft.com> Reviewed-by: Paulo Alcantara (SUSE) <pc@manguebit.com> Signed-off-by: Steve French <stfrench@microsoft.com>
Ronnie Sahlberg [Fri, 17 Feb 2023 03:35:00 +0000 (13:35 +1000)]
cifs: Check the lease context if we actually got a lease
Some servers may return that we got a lease in rsp->OplockLevel
but then in the lease context contradict this and say we got no lease
at all. Thus we need to check the context if we have a lease.
Additionally, If we do not get a lease we need to make sure we close
the handle before we return an error to the caller.
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Cc: stable@vger.kernel.org Reviewed-by: Bharath SM <bharathsm@microsoft.com> Reviewed-by: Paulo Alcantara (SUSE) <pc@manguebit.com> Signed-off-by: Steve French <stfrench@microsoft.com>
Kees Cook [Wed, 15 Feb 2023 00:09:45 +0000 (16:09 -0800)]
cifs: Replace remaining 1-element arrays
The kernel is globally removing the ambiguous 0-length and 1-element
arrays in favor of flexible arrays, so that we can gain both compile-time
and run-time array bounds checking[1].
Replace the trailing 1-element array with a flexible array in the
following structures:
Replace the trailing 1-element array with a flexible array, but leave
the existing structure padding:
FILE_ALL_INFO
FILE_UNIX_INFO
Remove unused structures:
struct gea
struct gealist
Adjust all related size calculations to match the changes to sizeof().
No machine code output differences are produced after these changes.
[1] For lots of details, see both:
https://docs.kernel.org/process/deprecated.html#zero-length-and-one-element-arrays
https://people.kernel.org/kees/bounded-flexible-arrays-in-c
Cc: Steve French <sfrench@samba.org> Cc: Paulo Alcantara <pc@cjr.nz> Cc: Ronnie Sahlberg <lsahlber@redhat.com> Cc: Shyam Prasad N <sprasad@microsoft.com> Cc: linux-cifs@vger.kernel.org Cc: samba-technical@lists.samba.org Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Steve French <stfrench@microsoft.com>
Kees Cook [Wed, 15 Feb 2023 00:08:39 +0000 (16:08 -0800)]
cifs: Convert struct fealist away from 1-element array
The kernel is globally removing the ambiguous 0-length and 1-element
arrays in favor of flexible arrays, so that we can gain both compile-time
and run-time array bounds checking[1].
While struct fealist is defined as a "fake" flexible array (via a
1-element array), it is only used for examination of the first array
element. Walking the list is performed separately, so there is no reason
to treat the "list" member of struct fealist as anything other than a
single entry. Adjust the struct and code to match.
Additionally, struct fea uses the "name" member either as a dynamic
string, or is manually calculated from the start of the struct. Redefine
the member as a flexible array.
No machine code output differences are produced after these changes.
[1] For lots of details, see both:
https://docs.kernel.org/process/deprecated.html#zero-length-and-one-element-arrays
https://people.kernel.org/kees/bounded-flexible-arrays-in-c
Cc: Steve French <sfrench@samba.org> Cc: Paulo Alcantara <pc@cjr.nz> Cc: Ronnie Sahlberg <lsahlber@redhat.com> Cc: Shyam Prasad N <sprasad@microsoft.com> Cc: Tom Talpey <tom@talpey.com> Cc: linux-cifs@vger.kernel.org Cc: samba-technical@lists.samba.org Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Steve French <stfrench@microsoft.com>
Paulo Alcantara [Thu, 16 Feb 2023 18:33:22 +0000 (15:33 -0300)]
cifs: fix mount on old smb servers
The client was sending rfc1002 session request packet with a wrong
length field set, therefore failing to mount shares against old SMB
servers over port 139.
Fix this by calculating the correct length as specified in rfc1002.
Fixes: d7173623bf0b ("cifs: use ALIGN() and round_up() macros") Cc: stable@vger.kernel.org Signed-off-by: Paulo Alcantara (SUSE) <pc@manguebit.com> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>
Namjae Jeon [Wed, 8 Feb 2023 09:34:37 +0000 (18:34 +0900)]
cifs: remove unneeded 2bytes of padding from smb2 tree connect
Due to the 2bytes of padding from the smb2 tree connect request,
there is an unneeded difference between the rfc1002 length and the actual
frame length. In the case of windows client, it is sent by matching it
exactly.
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
Matthew Wilcox (Oracle) [Wed, 8 Feb 2023 14:56:11 +0000 (14:56 +0000)]
filemap: Remove lock_page_killable()
There are no more callers; remove this function before any more appear.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: David Howells <dhowells@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>
Matthew Wilcox (Oracle) [Wed, 8 Feb 2023 14:56:10 +0000 (14:56 +0000)]
cifs: Use a folio in cifs_page_mkwrite()
Avoids many calls to compound_head() and removes calls to various
compat functions.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: David Howells <dhowells@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>
Volker Lendecke [Wed, 11 Jan 2023 11:37:58 +0000 (12:37 +0100)]
cifs: Fix uninitialized memory read in smb3_qfs_tcon()
oparms was not fully initialized
Signed-off-by: Volker Lendecke <vl@samba.org> Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz> Cc: stable@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com>
Stefan Metzmacher [Wed, 1 Feb 2023 15:21:41 +0000 (16:21 +0100)]
cifs: don't try to use rdma offload on encrypted connections
The aim of using encryption on a connection is to keep
the data confidential, so we must not use plaintext rdma offload
for that data!
It seems that current windows servers and ksmbd would allow
this, but that's no reason to expose the users data in plaintext!
And servers hopefully reject this in future.
Note modern windows servers support signed or encrypted offload,
see MS-SMB2 2.2.3.1.6 SMB2_RDMA_TRANSFORM_CAPABILITIES, but we don't
support that yet.
Signed-off-by: Stefan Metzmacher <metze@samba.org> Cc: Steve French <smfrench@gmail.com> Cc: Tom Talpey <tom@talpey.com> Cc: Long Li <longli@microsoft.com> Cc: Namjae Jeon <linkinjeon@kernel.org> Cc: David Howells <dhowells@redhat.com> Cc: linux-cifs@vger.kernel.org Cc: stable@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com>
Stefan Metzmacher [Wed, 1 Feb 2023 15:21:40 +0000 (16:21 +0100)]
cifs: split out smb3_use_rdma_offload() helper
We should have the logic to decide if we want rdma offload
in a single spot in order to advance it in future.
Signed-off-by: Stefan Metzmacher <metze@samba.org> Cc: Steve French <smfrench@gmail.com> Cc: Tom Talpey <tom@talpey.com> Cc: Long Li <longli@microsoft.com> Cc: Namjae Jeon <linkinjeon@kernel.org> Cc: David Howells <dhowells@redhat.com> Cc: linux-cifs@vger.kernel.org Cc: stable@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com>
Stefan Metzmacher [Wed, 1 Feb 2023 15:21:39 +0000 (16:21 +0100)]
cifs: introduce cifs_io_parms in smb2_async_writev()
This will simplify the following changes and makes it easy to get
in passed in from the caller in future.
Signed-off-by: Stefan Metzmacher <metze@samba.org> Cc: Steve French <smfrench@gmail.com> Cc: Tom Talpey <tom@talpey.com> Cc: Long Li <longli@microsoft.com> Cc: Namjae Jeon <linkinjeon@kernel.org> Cc: David Howells <dhowells@redhat.com> Cc: linux-cifs@vger.kernel.org Cc: stable@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com>
Paulo Alcantara [Tue, 31 Jan 2023 16:22:07 +0000 (13:22 -0300)]
cifs: get rid of unneeded conditional in cifs_get_num_sgs()
Just have @skip set to 0 after first iterations of the two nested
loops.
Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz> Reviewed-by: David Howells <dhowells@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>
Paulo Alcantara [Mon, 30 Jan 2023 23:33:29 +0000 (20:33 -0300)]
cifs: prevent data race in smb2_reconnect()
Make sure to get an up-to-date TCP_Server_Info::nr_targets value prior
to waiting the server to be reconnected in smb2_reconnect(). It is
set in cifs_tcp_ses_needs_reconnect() and protected by
TCP_Server_Info::srv_lock.
Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz> Signed-off-by: Steve French <stfrench@microsoft.com>
Steve French [Tue, 31 Jan 2023 01:32:52 +0000 (19:32 -0600)]
cifs: fix indentation in make menuconfig options
The options that are displayed for the smb3.1.1/cifs client
in "make menuconfig" are confusing because some of them are
not indented making them not appear to be related to cifs.ko
Fix that by adding an if/endif (similar to what ceph and 9pm did)
if fs/cifs/Kconfig
Suggested-by: David Howells <dhowells@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>
Steve French [Tue, 31 Jan 2023 00:57:06 +0000 (18:57 -0600)]
cifs: update Kconfig description
There were various outdated or missing things in fs/cifs/Kconfig
e.g. mention of support for insecure NTLM which has been removed,
and lack of mention of some important features. This also shortens
it slightly, and fixes some confusing text (e.g. the SMB1 POSIX
extensions option).
Reviewed-by: Namjae Jeon <linkinjeon@kernel.org> Acked-by: Paulo Alcantara (SUSE) <pc@cjr.nz> Acked-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>
Andy Shevchenko [Fri, 20 Jan 2023 12:08:57 +0000 (14:08 +0200)]
cifs: Get rid of unneeded conditional in the smb2_get_aead_req()
In the smb2_get_aead_req() the skip variable is used only for
the very first iteration of the two nested loops, which means
it's basically in invariant to those loops. Hence, instead of
using conditional on each iteration, unconditionally assign
the 'skip' variable before the loops and at the end of the
inner loop.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz> Reviewed-by: David Howells <dhowells@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>
Gustavo A. R. Silva [Tue, 10 Jan 2023 01:39:00 +0000 (19:39 -0600)]
cifs: Replace zero-length arrays with flexible-array members
Zero-length arrays are deprecated[1] and we are moving towards
adopting C99 flexible-array members instead. So, replace zero-length
arrays in a couple of structures with flex-array members.
This helps with the ongoing efforts to tighten the FORTIFY_SOURCE
routines on memcpy() and help us make progress towards globally
enabling -fstrict-flex-arrays=3 [2].
Christophe JAILLET [Sat, 14 Jan 2023 08:58:15 +0000 (09:58 +0100)]
cifs: Use kstrtobool() instead of strtobool()
strtobool() is the same as kstrtobool().
However, the latter is more used within the kernel.
In order to remove strtobool() and slightly simplify kstrtox.h, switch to
the other function name.
While at it, include the corresponding header file (<linux/kstrtox.h>)
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Paulo Alcantara <pc@cjr.nz> Signed-off-by: Steve French <stfrench@microsoft.com>
Linus Torvalds [Sun, 19 Feb 2023 01:57:16 +0000 (17:57 -0800)]
Merge tag 'x86-urgent-2023-02-19' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fix from Thomas Gleixner:
"A single fix for x86.
Revert the recent change to the MTRR code which aimed to support
SEV-SNP guests on Hyper-V. It caused a regression on XEN Dom0 kernels.
The underlying issue of MTTR (mis)handling in the x86 code needs some
deeper investigation and is definitely not 6.2 material"
* tag 'x86-urgent-2023-02-19' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mtrr: Revert 90b926e68f50 ("x86/pat: Fix pat_x_mtrr_type() for MTRR disabled case")
Linus Torvalds [Sun, 19 Feb 2023 01:46:50 +0000 (17:46 -0800)]
Merge tag 'timers-urgent-2023-02-19' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer fix from Thomas Gleixner:
"A fix for a long standing issue in the alarmtimer code.
Posix-timers armed with a short interval with an ignored signal result
in an unpriviledged DoS. Due to the ignored signal the timer switches
into self rearm mode. This issue had been "fixed" before but a rework
of the alarmtimer code 5 years ago lost that workaround.
There is no real good solution for this issue, which is also worked
around in the core posix-timer code in the same way, but it certainly
moved way up on the ever growing todo list"
* tag 'timers-urgent-2023-02-19' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
alarmtimer: Prevent starvation by small intervals and SIG_IGN
Linus Torvalds [Sun, 19 Feb 2023 01:38:18 +0000 (17:38 -0800)]
Merge tag 'irq-urgent-2023-02-19' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq fix from Thomas Gleixner:
"A single build fix for the PCI/MSI infrastructure.
The addition of the new alloc/free interfaces in this cycle forgot to
add stub functions for pci_msix_alloc_irq_at() and pci_msix_free_irq()
for the CONFIG_PCI_MSI=n case"
* tag 'irq-urgent-2023-02-19' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
PCI/MSI: Provide missing stubs for CONFIG_PCI_MSI=n
Linus Torvalds [Sat, 18 Feb 2023 19:07:32 +0000 (11:07 -0800)]
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm/x86 fixes from Paolo Bonzini:
- zero all padding for KVM_GET_DEBUGREGS
- fix rST warning
- disable vPMU support on hybrid CPUs
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
kvm: initialize all of the kvm_debugregs structure before sending it to userspace
perf/x86: Refuse to export capabilities for hybrid PMUs
KVM: x86/pmu: Disable vPMU support on hybrid CPUs (host PMUs)
Documentation/hw-vuln: Fix rST warning
Linus Torvalds [Sat, 18 Feb 2023 18:10:49 +0000 (10:10 -0800)]
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 regression fix from Will Deacon:
"Apologies for the _extremely_ late pull request here, but we had a
'perf' (i.e. CPU PMU) regression on the Apple M1 reported on Wednesday
[1] which was introduced by bd2756811766 ("perf: Rewrite core context
handling") during the merge window.
Mark and I looked into this and noticed an additional problem caused
by the same patch, where the 'CHAIN' event (used to combine two
adjacent 32-bit counters into a single 64-bit counter) was not being
filtered correctly. Mark posted a series on Thursday [2] which
addresses both of these regressions and I queued it the same day.
The changes are small, self-contained and have been confirmed to fix
the original regression.
Summary:
- Fix 'perf' regression for non-standard CPU PMU hardware (i.e. Apple
M1)"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: perf: reject CHAIN events at creation time
arm_pmu: fix event CPU filtering
Linus Torvalds [Sat, 18 Feb 2023 17:56:58 +0000 (09:56 -0800)]
Merge tag 'block-6.2-2023-02-17' of git://git.kernel.dk/linux
Pull block fix from Jens Axboe:
"I guess this is what can happen when you prep things early for going
away, something else comes in last minute. This one fixes another
regression in 6.2 for NVMe, from this release, and hence we should
probably get it submitted for 6.2.
Still waiting for the original reporter (see bugzilla linked in the
commit) to test this, but Keith managed to setup and recreate the
issue and tested the patch that way"
* tag 'block-6.2-2023-02-17' of git://git.kernel.dk/linux:
nvme-pci: refresh visible attrs for cmb attributes
Linus Torvalds [Sat, 18 Feb 2023 01:51:40 +0000 (17:51 -0800)]
Merge tag 'mm-hotfixes-stable-2023-02-17-15-16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull misc fixes from Andrew Morton:
"Six hotfixes. Five are cc:stable: four for MM, one for nilfs2.
Also a MAINTAINERS update"
* tag 'mm-hotfixes-stable-2023-02-17-15-16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
nilfs2: fix underflow in second superblock position calculations
hugetlb: check for undefined shift on 32 bit architectures
mm/migrate: fix wrongly apply write bit after mkdirty on sparc64
MAINTAINERS: update FPU EMULATOR web page
mm/MADV_COLLAPSE: set EAGAIN on unexpected page refcount
mm/filemap: fix page end in filemap_get_read_batch
Ryusuke Konishi [Tue, 14 Feb 2023 22:40:43 +0000 (07:40 +0900)]
nilfs2: fix underflow in second superblock position calculations
Macro NILFS_SB2_OFFSET_BYTES, which computes the position of the second
superblock, underflows when the argument device size is less than 4096
bytes. Therefore, when using this macro, it is necessary to check in
advance that the device size is not less than a lower limit, or at least
that underflow does not occur.
The current nilfs2 implementation lacks this check, causing out-of-bound
block access when mounting devices smaller than 4096 bytes:
I/O error, dev loop0, sector 36028797018963960 op 0x0:(READ) flags 0x0
phys_seg 1 prio class 2
NILFS (loop0): unable to read secondary superblock (blocksize = 1024)
In addition, when trying to resize the filesystem to a size below 4096
bytes, this underflow occurs in nilfs_resize_fs(), passing a huge number
of segments to nilfs_sufile_resize(), corrupting parameters such as the
number of segments in superblocks. This causes excessive loop iterations
in nilfs_sufile_resize() during a subsequent resize ioctl, causing
semaphore ns_segctor_sem to block for a long time and hang the writer
thread:
Mike Kravetz [Thu, 16 Feb 2023 01:35:42 +0000 (17:35 -0800)]
hugetlb: check for undefined shift on 32 bit architectures
Users can specify the hugetlb page size in the mmap, shmget and
memfd_create system calls. This is done by using 6 bits within the flags
argument to encode the base-2 logarithm of the desired page size. The
routine hstate_sizelog() uses the log2 value to find the corresponding
hugetlb hstate structure. Converting the log2 value (page_size_log) to
potential hugetlb page size is the simple statement:
1UL << page_size_log
Because only 6 bits are used for page_size_log, the left shift can not be
greater than 63. This is fine on 64 bit architectures where a long is 64
bits. However, if a value greater than 31 is passed on a 32 bit
architecture (where long is 32 bits) the shift will result in undefined
behavior. This was generally not an issue as the result of the undefined
shift had to exactly match hugetlb page size to proceed.
Recent improvements in runtime checking have resulted in this undefined
behavior throwing errors such as reported below.
Fix by comparing page_size_log to BITS_PER_LONG before doing shift.
Link: https://lkml.kernel.org/r/20230216013542.138708-1-mike.kravetz@oracle.com Link: https://lore.kernel.org/lkml/CA+G9fYuei_Tr-vN9GS7SfFyU1y9hNysnf=PB7kT0=yv4MiPgVg@mail.gmail.com/ Fixes: 42d7395feb56 ("mm: support more pagesizes for MAP_HUGETLB/SHM_HUGETLB") Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com> Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org> Reviewed-by: Jesper Juhl <jesperjuhl76@gmail.com> Acked-by: Muchun Song <songmuchun@bytedance.com> Tested-by: Linux Kernel Functional Testing <lkft@linaro.org> Tested-by: Naresh Kamboju <naresh.kamboju@linaro.org> Cc: Anders Roxell <anders.roxell@linaro.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Sasha Levin <sashal@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Peter Xu [Thu, 16 Feb 2023 15:30:59 +0000 (10:30 -0500)]
mm/migrate: fix wrongly apply write bit after mkdirty on sparc64
Nick Bowler reported another sparc64 breakage after the young/dirty
persistent work for page migration (per "Link:" below). That's after a
similar report [2].
It turns out page migration was overlooked, and it wasn't failing before
because page migration was not enabled in the initial report test
environment.
David proposed another way [2] to fix this from sparc64 side, but that
patch didn't land somehow. Neither did I check whether there's any other
arch that has similar issues.
Let's fix it for now as simple as moving the write bit handling to be
after dirty, like what we did before.
Note: this is based on mm-unstable, because the breakage was since 6.1 and
we're at a very late stage of 6.2 (-rc8), so I assume for this specific
case we should target this at 6.3.
Linus Torvalds [Fri, 17 Feb 2023 22:53:37 +0000 (14:53 -0800)]
Merge tag 'powerpc-6.2-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fix from Michael Ellerman:
- Prevent fallthrough to hash TLB flush when using radix
Thanks to Benjamin Gray and Erhard Furtner.
* tag 'powerpc-6.2-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/64s: Prevent fallthrough to hash TLB flush when using radix
Linus Torvalds [Fri, 17 Feb 2023 22:44:31 +0000 (14:44 -0800)]
Merge tag 'sound-fix-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"A few last-minute fixes. The significant ones are two ASoC SOF
regression fixes while the rest are trivial HD-audio quirks.
All are small / one-liners and should be pretty safe to take"
* tag 'sound-fix-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ASoC: SOF: Intel: hda-dai: fix possible stream_tag leak
ALSA: hda/realtek: Enable mute/micmute LEDs and speaker support for HP Laptops
ALSA: hda/realtek: fix mute/micmute LEDs don't work for a HP platform.
ALSA: hda/realtek - fixed wrong gpio assigned
ALSA: hda: Fix codec device field initializan
ALSA: hda/conexant: add a new hda codec SN6180
ASoC: SOF: ops: refine parameters order in function snd_sof_dsp_update8
Linus Torvalds [Fri, 17 Feb 2023 21:53:09 +0000 (13:53 -0800)]
Merge tag 'ata-6.2-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata
Pull ata fixes from Damien Le Moal:
"Three small fixes for 6.2 final:
- Disable READ LOG DMA EXT for Samsung MZ7LH drives as these drives
choke on that command, from Patrick.
- Add Intel Tiger Lake UP{3,4} to the list of supported AHCI
controllers (this is not technically a bug fix, but it is trivial
enough that I add it here), from Simon.
- Fix code comments in the pata_octeon_cf driver as incorrect
formatting was causing warnings from kernel-doc, from Randy"
* tag 'ata-6.2-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata:
ata: pata_octeon_cf: drop kernel-doc notation
ata: ahci: Add Tiger Lake UP{3,4} AHCI controller
ata: libata-core: Disable READ LOG DMA EXT for Samsung MZ7LH
Linus Torvalds [Fri, 17 Feb 2023 21:48:54 +0000 (13:48 -0800)]
Merge tag 'mmc-v6.2-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc
Pull MMC fixes from Ulf Hansson:
"MMC core:
- Fix potential resource leaks in SDIO card detection error path
MMC host:
- jz4740: Decrease maximum clock rate to workaround bug on JZ4760(B)
- meson-gx: Fix SDIO support to get some WiFi modules to work again
- mmc_spi: Fix error handling in ->probe()"
* tag 'mmc-v6.2-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
mmc: jz4740: Work around bug on JZ4760(B)
mmc: mmc_spi: fix error handling in mmc_spi_probe()
mmc: sdio: fix possible resource leaks in some error paths
mmc: meson-gx: fix SDIO mode if cap_sdio_irq isn't set
Linus Torvalds [Fri, 17 Feb 2023 21:45:09 +0000 (13:45 -0800)]
Merge tag 'sched-urgent-2023-02-17' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fixes from Ingo Molnar:
- Fix user-after-free bug in call_usermodehelper_exec()
- Fix missing user_cpus_ptr update in __set_cpus_allowed_ptr_locked()
- Fix PSI use-after-free bug in ep_remove_wait_queue()
* tag 'sched-urgent-2023-02-17' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/psi: Fix use-after-free in ep_remove_wait_queue()
sched/core: Fix a missed update of user_cpus_ptr
freezer,umh: Fix call_usermode_helper_exec() vs SIGKILL
Unfortunately, it has come to our attention that there is still a bug
somewhere in the READ_PLUS code that can result in nfsroot systems on
ARM to crash during boot.
Let's do the right thing and revert this change so we don't break
people's nfsroot setups.
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Keith Busch [Thu, 16 Feb 2023 16:44:03 +0000 (08:44 -0800)]
nvme-pci: refresh visible attrs for cmb attributes
The sysfs group containing the cmb attributes is registered before the
driver knows if they need to be visible or not. Update the group when
cmb attributes are known to exist so the visibility setting is correct.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=217037 Fixes: 86adbf0cdb9ec65 ("nvme: simplify transport specific device attribute handling") Signed-off-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
Linus Torvalds [Fri, 17 Feb 2023 04:23:32 +0000 (20:23 -0800)]
Merge tag 'drm-fixes-2023-02-17' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
"Just a final collection of misc fixes, the biggest disables the
recently added dynamic debugging support, it has a regression that
needs some bigger fixes.
Otherwise a bunch of fixes across the board, vc4, amdgpu and vmwgfx
mostly, with some smaller i915 and ast fixes.
* tag 'drm-fixes-2023-02-17' of git://anongit.freedesktop.org/drm/drm:
drm/amd/display: Fail atomic_check early on normalize_zpos error
drm/amd/amdgpu: fix warning during suspend
drm/vmwgfx: Do not drop the reference to the handle too soon
drm/vmwgfx: Stop accessing buffer objects which failed init
drm/i915/gen11: Wa_1408615072/Wa_1407596294 should be on GT list
drm: Disable dynamic debug as broken
drm/ast: Fix start address computation
fbdev: Fix invalid page access after closing deferred I/O devices
drm/vc4: crtc: Increase setup cost in core clock calculation to handle extreme reduced blanking
drm/vc4: hdmi: Always enable GCP with AVMUTE cleared
drm/vc4: Fix YUV plane handling when planes are in different buffers
Zach O'Keefe [Wed, 25 Jan 2023 01:57:37 +0000 (17:57 -0800)]
mm/MADV_COLLAPSE: set EAGAIN on unexpected page refcount
During collapse, in a few places we check to see if a given small page has
any unaccounted references. If the refcount on the page doesn't match our
expectations, it must be there is an unknown user concurrently interested
in the page, and so it's not safe to move the contents elsewhere.
However, the unaccounted pins are likely an ephemeral state.
In this situation, MADV_COLLAPSE returns -EINVAL when it should return
-EAGAIN. This could cause userspace to conclude that the syscall
failed, when it in fact could succeed by retrying.
Qian Yingjin [Wed, 8 Feb 2023 02:24:00 +0000 (10:24 +0800)]
mm/filemap: fix page end in filemap_get_read_batch
I was running traces of the read code against an RAID storage system to
understand why read requests were being misaligned against the underlying
RAID strips. I found that the page end offset calculation in
filemap_get_read_batch() was off by one.
When a read is submitted with end offset 1048575, then it calculates the
end page for read of 256 when it should be 255. "last_index" is the index
of the page beyond the end of the read and it should be skipped when get a
batch of pages for read in @filemap_get_read_batch().
The below simple patch fixes the problem. This code was introduced in
kernel 5.12.
Link: https://lkml.kernel.org/r/20230208022400.28962-1-coolqyj@163.com Fixes: cbd59c48ae2b ("mm/filemap: use head pages in generic_file_buffered_read") Signed-off-by: Qian Yingjin <qian@ddn.com> Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Benjamin Gray [Fri, 17 Feb 2023 01:14:34 +0000 (12:14 +1100)]
powerpc/64s: Prevent fallthrough to hash TLB flush when using radix
In the fix reconnecting hash__tlb_flush() to tlb_flush() the
void return on radix__tlb_flush() was not restored and subsequently
falls through to the restored hash__tlb_flush().
Guard hash__tlb_flush() under an else to prevent this.
Fixes: 1665c027afb2 ("powerpc/64s: Reconnect tlb_flush() to hash__tlb_flush()") Reported-by: "Erhard F." <erhard_f@mailbox.org> Suggested-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Benjamin Gray <bgray@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230217011434.115554-1-bgray@linux.ibm.com
Dave Airlie [Thu, 16 Feb 2023 23:23:43 +0000 (09:23 +1000)]
Merge tag 'drm-misc-fixes-2023-02-16' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
Multiple fixes in vc4 to address issues with YUV planes, HDMI and CRTC;
an invalid page access fix for fbdev, mark dynamic debug as broken, a
double free and refcounting fix for vmwgfx.
Mark Rutland [Thu, 16 Feb 2023 14:12:39 +0000 (14:12 +0000)]
arm64: perf: reject CHAIN events at creation time
Currently it's possible for a user to open CHAIN events arbitrarily,
which we previously tried to rule out in commit:
ca2b497253ad01c8 ("arm64: perf: Reject stand-alone CHAIN events for PMUv3")
Which allowed the events to be opened, but prevented them from being
scheduled by by using an arm_pmu::filter_match hook to reject the
relevant events.
The CHAIN event filtering in the arm_pmu::filter_match hook was silently
removed in commit:
As a result, it's now possible for users to open CHAIN events, and for
these to be installed arbitrarily.
Fix this by rejecting CHAIN events at creation time. This avoids the
creation of events which will never count, and doesn't require using the
dynamic filtering.
Attempting to open a CHAIN event (0x1e) will now be rejected:
| # ./perf stat -e armv8_pmuv3/config=0x1e/ ls
| perf
|
| Performance counter stats for 'ls':
|
| <not supported> armv8_pmuv3/config=0x1e/
|
| 0.002197470 seconds time elapsed
|
| 0.000000000 seconds user
| 0.002294000 seconds sys
Other events (e.g. CPU_CYCLES / 0x11) will open as usual:
| # ./perf stat -e armv8_pmuv3/config=0x11/ ls
| perf
|
| Performance counter stats for 'ls':
|
| 2538761 armv8_pmuv3/config=0x11/
|
| 0.002227330 seconds time elapsed
|
| 0.002369000 seconds user
| 0.000000000 seconds sys
Fixes: bd2756811766 ("perf: Rewrite core context handling") Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20230216141240.3833272-3-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
That commit replaced the pmu::filter_match() callback with
pmu::filter(), whose return value has the opposite polarity, with true
implying events should be ignored rather than scheduled. While an
attempt was made to update the logic in armv8pmu_filter() and
armpmu_filter() accordingly, the return value remains inverted in a
couple of cases:
* If the arm_pmu does not have an arm_pmu::filter() callback,
armpmu_filter() will always return whether the CPU is supported rather
than whether the CPU is not supported.
As a result, the perf core will not schedule events on supported CPUs,
resulting in a loss of events. Additionally, the perf core will
attempt to schedule events on unsupported CPUs, but this will be
rejected by armpmu_add(), which may result in a loss of events from
other PMUs on those unsupported CPUs.
* If the arm_pmu does have an arm_pmu::filter() callback, and
armpmu_filter() is called on a CPU which is not supported by the
arm_pmu, armpmu_filter() will return false rather than true.
As a result, the perf core will attempt to schedule events on
unsupported CPUs, but this will be rejected by armpmu_add(), which may
result in a loss of events from other PMUs on those unsupported CPUs.
This means a loss of events can be seen with any arm_pmu driver, but
with the ARMv8 PMUv3 driver (which is the only arm_pmu driver with an
arm_pmu::filter() callback) the event loss will be more limited and may
go unnoticed, which is how this issue evaded testing so far.
Fix the CPU filtering by performing this consistently in
armpmu_filter(), and remove the redundant arm_pmu::filter() callback and
armv8pmu_filter() implementation.
Commit bd2756811766 also silently removed the CHAIN event filtering from
armv8pmu_filter(), which will be addressed by a separate patch without
using the filter callback.
Linus Torvalds [Thu, 16 Feb 2023 20:13:58 +0000 (12:13 -0800)]
Merge tag 'net-6.2-final' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Fixes from the main networking tree only, probably because all
sub-trees have backed off and haven't submitted their changes.
None of the fixes here are particularly scary and no outstanding
regressions. In an ideal world the "current release" sections would be
empty at this stage but that never happens.
Current release - regressions:
- fix unwanted sign extension in netdev_stats_to_stats64()
Current release - new code bugs:
- initialize net->notrefcnt_tracker earlier
- devlink: fix netdev notifier chain corruption
- nfp: make sure mbox accesses in IPsec code are atomic
- ice: fix check for weight and priority of a scheduling node
- mpls: fix stale pointer if allocation fails during device rename
- dccp/tcp: avoid negative sk_forward_alloc by ipv6_pinfo.pktoptions
- remove WARN_ON_ONCE(sk->sk_forward_alloc) from
sk_stream_kill_queues()
- af_key: fix heap information leak
- ipv6: fix socket connection with DSCP (correct interpretation of
the tclass field vs fib rule matching)
- tipc: fix kernel warning when sending SYN message
- vmxnet3: read RSS information from the correct descriptor (eop)"
* tag 'net-6.2-final' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (35 commits)
devlink: Fix netdev notifier chain corruption
igb: conditionalize I2C bit banging on external thermal sensor support
net: mpls: fix stale pointer if allocation fails during device rename
net/sched: tcindex: search key must be 16 bits
tipc: fix kernel warning when sending SYN message
igb: Fix PPS input and output using 3rd and 4th SDP
net: use a bounce buffer for copying skb->mark
ixgbe: add double of VLAN header when computing the max MTU
i40e: add double of VLAN header when computing the max MTU
ixgbe: allow to increase MTU to 3K with XDP enabled
net: stmmac: Restrict warning on disabling DMA store and fwd mode
net/sched: act_ctinfo: use percpu stats
net: stmmac: fix order of dwmac5 FlexPPS parametrization sequence
ice: fix lost multicast packets in promisc mode
ice: Fix check for weight and priority of a scheduling node
bnxt_en: Fix mqprio and XDP ring checking logic
net: Fix unwanted sign extension in netdev_stats_to_stats64()
net/usb: kalmia: Don't pass act_len in usb_bulk_msg error path
net: openvswitch: fix possible memory leak in ovs_meter_cmd_set()
af_key: Fix heap information leak
...
Linus Torvalds [Thu, 16 Feb 2023 20:05:33 +0000 (12:05 -0800)]
Merge tag 'block-6.2-2023-02-16' of git://git.kernel.dk/linux
Pull block fixes from Jens Axboe:
"Just a few NVMe fixes that should go into the 6.2 release, adding a
quirk and fixing two issues introduced in this release:
- NVMe fixes via Christoph:
- Always return an ERR_PTR from nvme_pci_alloc_dev (Irvin Cote)
- Add bogus ID quirk for ADATA SX6000PNP (Daniel Wagner)
- Set the DMA mask earlier (Christoph Hellwig)"
* tag 'block-6.2-2023-02-16' of git://git.kernel.dk/linux:
nvme-pci: always return an ERR_PTR from nvme_pci_alloc_dev
nvme-pci: set the DMA mask earlier
nvme-pci: add bogus ID quirk for ADATA SX6000PNP
Linus Torvalds [Thu, 16 Feb 2023 20:01:46 +0000 (12:01 -0800)]
Merge tag 'spi-v6.2-rc8-abi' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi
Pull spi fix from Mark Brown:
"One more last minute patch for v6.2 updating the parsing of the newly
added spi-cs-setup-delay-ns.
It's been pointed out that due to the way DT parsing works the change
in property size is ABI visible so let's not let a release go out
without it being fixed. The change got split from some earlier ABI
related fixes to the property since the first version sent had a build
error"
* tag 'spi-v6.2-rc8-abi' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
spi: Use a 32-bit DT property for spi-cs-setup-delay-ns
Christoph Hellwig [Thu, 16 Feb 2023 06:31:10 +0000 (07:31 +0100)]
stop mainaining UUID
The uuid code is very low maintainance now that the major overhaul
has completed, and doesn't need it's own tree. All the recent work
has been done by Andy who'd like to stay on as a reviewer without an
explicit tree.
Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Andy Shevchenko <andy@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Greg Kroah-Hartman [Tue, 14 Feb 2023 10:33:04 +0000 (11:33 +0100)]
kvm: initialize all of the kvm_debugregs structure before sending it to userspace
When calling the KVM_GET_DEBUGREGS ioctl, on some configurations, there
might be some unitialized portions of the kvm_debugregs structure that
could be copied to userspace. Prevent this as is done in the other kvm
ioctls, by setting the whole structure to 0 before copying anything into
it.
Bonus is that this reduces the lines of code as the explicit flag
setting and reserved space zeroing out can be removed.
Cc: Sean Christopherson <seanjc@google.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: <x86@kernel.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: stable <stable@kernel.org> Reported-by: Xingyuan Mo <hdthky0@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Message-Id: <20230214103304.3689213-1-gregkh@linuxfoundation.org> Tested-by: Xingyuan Mo <hdthky0@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Pierre-Louis Bossart [Thu, 16 Feb 2023 16:23:40 +0000 (18:23 +0200)]
ASoC: SOF: Intel: hda-dai: fix possible stream_tag leak
The HDaudio stream allocation is done first, and in a second step the
LOSIDV parameter is programmed for the multi-link used by a codec.
This leads to a possible stream_tag leak, e.g. if a DisplayAudio link
is not used. This would happen when a non-Intel graphics card is used
and userspace unconditionally uses the Intel Display Audio PCMs without
checking if they are connected to a receiver with jack controls.
We should first check that there is a valid multi-link entry to
configure before allocating a stream_tag. This change aligns the
dma_assign and dma_cleanup phases.
Complements: b0cd60f3e9f5 ("ALSA/ASoC: hda: clarify bus_get_link() and bus_link_get() helpers") Link: https://github.com/thesofproject/linux/issues/4151 Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> Reviewed-by: Rander Wang <rander.wang@intel.com> Reviewed-by: Bard Liao <yung-chuan.liao@linux.intel.com> Signed-off-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com> Link: https://lore.kernel.org/r/20230216162340.19480-1-peter.ujfalusi@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
Ido Schimmel [Wed, 15 Feb 2023 07:31:39 +0000 (09:31 +0200)]
devlink: Fix netdev notifier chain corruption
Cited commit changed devlink to register its netdev notifier block on
the global netdev notifier chain instead of on the per network namespace
one.
However, when changing the network namespace of the devlink instance,
devlink still tries to unregister its notifier block from the chain of
the old namespace and register it on the chain of the new namespace.
This results in corruption of the notifier chains, as the same notifier
block is registered on two different chains: The global one and the per
network namespace one. In turn, this causes other problems such as the
inability to dismantle namespaces due to netdev reference count issues.
Fix by preventing devlink from moving its notifier block between
namespaces.
Reproducer:
# echo "10 1" > /sys/bus/netdevsim/new_device
# ip netns add test123
# devlink dev reload netdevsim/netdevsim10 netns test123
# ip netns del test123
[ 71.935619] unregister_netdevice: waiting for lo to become free. Usage count = 2
[ 71.938348] leaked reference.
Fixes: 565b4824c39f ("devlink: change port event netdev notifier from per-net to global") Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Link: https://lore.kernel.org/r/20230215073139.1360108-1-idosch@nvidia.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
the SFP NICs no longer get link at all. Reverting commit a97f8783a937
or switching to the Intel out-of-tree driver both fix the problem.
Per the igb out-of-tree driver, I2C bit banging on i350 depends on
support for an external thermal sensor (ETS). However, commit a97f8783a937 added bit banging unconditionally. Additionally, the
out-of-tree driver always calls init_thermal_sensor_thresh on probe,
while our driver only calls init_thermal_sensor_thresh only in
igb_reset(), and only if an ETS is present, ignoring the internal
thermal sensor. The affected SFPs don't provide an ETS. Per Intel,
the behaviour is a result of i350 firmware requirements.
This patch fixes the problem by aligning the behaviour to the
out-of-tree driver:
- split igb_init_i2c() into two functions:
- igb_init_i2c() only performs the basic I2C initialization.
- igb_set_i2c_bb() makes sure that E1000_CTRL_I2C_ENA is set
and enables bit-banging.
- igb_probe() only calls igb_set_i2c_bb() if an ETS is present.
- igb_reset() aligns its behaviour to igb_probe(), i. e., call
igb_set_i2c_bb() if an ETS is present and call
init_thermal_sensor_thresh() unconditionally.
Jakub Kicinski [Thu, 16 Feb 2023 03:20:58 +0000 (19:20 -0800)]
Merge branch '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2023-02-14 (ixgbe, i40e)
This series contains updates to ixgbe and i40e drivers.
Jason Xing corrects comparison of frame sizes for setting MTU with XDP on
ixgbe and adjusts frame size to account for a second VLAN header on ixgbe
and i40e.
* '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
ixgbe: add double of VLAN header when computing the max MTU
i40e: add double of VLAN header when computing the max MTU
ixgbe: allow to increase MTU to 3K with XDP enabled
====================
Linus Torvalds [Wed, 15 Feb 2023 22:53:08 +0000 (14:53 -0800)]
Merge tag 'apparmor-v6.2-rc9' of git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor
Pull apparmor fix from John Johansen:
"Regression fix for getattr mediation of old policy"
* tag 'apparmor-v6.2-rc9' of git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor:
apparmor: Fix regression in compat permissions for getattr
Jens Axboe [Wed, 15 Feb 2023 20:47:27 +0000 (13:47 -0700)]
Merge tag 'nvme-6.2-2023-02-15' of git://git.infradead.org/nvme into block-6.2
Pull NVMe fixes from Christoph:
"nvme fixes for Linux 6.2
- always return an ERR_PTR from nvme_pci_alloc_dev (Irvin Cote)
- add bogus ID quirk for ADATA SX6000PNP (Daniel Wagner)
- set the DMA mask earlier (Christoph Hellwig)"
* tag 'nvme-6.2-2023-02-15' of git://git.infradead.org/nvme:
nvme-pci: always return an ERR_PTR from nvme_pci_alloc_dev
nvme-pci: set the DMA mask earlier
nvme-pci: add bogus ID quirk for ADATA SX6000PNP
Linus Torvalds [Wed, 15 Feb 2023 19:31:34 +0000 (11:31 -0800)]
Merge tag 'trace-v6.2-rc7-2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull tracing fixlet from Steven Rostedt:
"Make trace_define_field_ext() static.
Just after the fix to TASK_COMM_LEN not converted to its value in
trace_events was pulled, the kernel test robot reported that the
helper function trace_define_field_ext() added to that change was only
used in the file it was defined in but was not declared static.
Make it a local function"
* tag 'trace-v6.2-rc7-2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
tracing: Make trace_define_field_ext() static
John Johansen [Wed, 15 Feb 2023 04:21:17 +0000 (20:21 -0800)]
apparmor: Fix regression in compat permissions for getattr
This fixes a regression in mediation of getattr when old policy built
under an older ABI is loaded and mapped to internal permissions.
The regression does not occur for all getattr permission requests,
only appearing if state zero is the final state in the permission
lookup. This is because despite the first state (index 0) being
guaranteed to not have permissions in both newer and older permission
formats, it may have to carry permissions that were not mediated as
part of an older policy. These backward compat permissions are
mapped here to avoid special casing the mediation code paths.
Since the mapping code already takes into account backwards compat
permission from older formats it can be applied to state 0 to fix
the regression.
Fixes: 408d53e923bd ("apparmor: compute file permissions on profile load") Reported-by: Philip Meulengracht <the_meulengracht@hotmail.com> Signed-off-by: John Johansen <john.johansen@canonical.com>
Werner Sembach [Wed, 15 Feb 2023 14:39:41 +0000 (15:39 +0100)]
gpiolib: acpi: Add a ignore wakeup quirk for Clevo NH5xAx
The commit 1796f808e4bb ("HID: i2c-hid: acpi: Stop setting wakeup_capable")
changed the policy such that I2C touchpads may be able to wake up the
system by default if the system is configured as such.
However for some devices there is a bug, that is causing the touchpad to
instantly wake up the device again once it gets deactivated. The root cause
is still under investigation (see Link tag).
To workaround this problem for the time being, introduce a quirk for this
model that will prevent the wakeup capability for being set for GPIO 16.
Sean Christopherson [Wed, 8 Feb 2023 20:42:30 +0000 (20:42 +0000)]
perf/x86: Refuse to export capabilities for hybrid PMUs
Now that KVM disables vPMU support on hybrid CPUs, WARN and return zeros
if perf_get_x86_pmu_capability() is invoked on a hybrid CPU. The helper
doesn't provide an accurate accounting of the PMU capabilities for hybrid
CPUs and needs to be enhanced if KVM, or anything else outside of perf,
wants to act on the PMU capabilities.
Cc: stable@vger.kernel.org Cc: Andrew Cooper <Andrew.Cooper3@citrix.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Link: https://lore.kernel.org/all/20220818181530.2355034-1-kan.liang@linux.intel.com Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20230208204230.1360502-3-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Sean Christopherson [Wed, 8 Feb 2023 20:42:29 +0000 (20:42 +0000)]
KVM: x86/pmu: Disable vPMU support on hybrid CPUs (host PMUs)
Disable KVM support for virtualizing PMUs on hosts with hybrid PMUs until
KVM gains a sane way to enumeration the hybrid vPMU to userspace and/or
gains a mechanism to let userspace opt-in to the dangers of exposing a
hybrid vPMU to KVM guests. Virtualizing a hybrid PMU, or at least part of
a hybrid PMU, is possible, but it requires careful, deliberate
configuration from userspace.
E.g. to expose full functionality, vCPUs need to be pinned to pCPUs to
prevent migrating a vCPU between a big core and a little core, userspace
must enumerate a reasonable topology to the guest, and guest CPUID must be
curated per vCPU to enumerate accurate vPMU capabilities.
The last point is especially problematic, as KVM doesn't control which
pCPU it runs on when enumerating KVM's vPMU capabilities to userspace,
i.e. userspace can't rely on KVM_GET_SUPPORTED_CPUID in it's current form.
Alternatively, userspace could enable vPMU support by enumerating the
set of features that are common and coherent across all cores, e.g. by
filtering PMU events and restricting guest capabilities. But again, that
requires userspace to take action far beyond reflecting KVM's supported
feature set into the guest.
For now, simply disable vPMU support on hybrid CPUs to avoid inducing
seemingly random #GPs in guests, and punt support for hybrid CPUs to a
future enabling effort.
Reported-by: Jianfeng Gao <jianfeng.gao@intel.com> Cc: stable@vger.kernel.org Cc: Andrew Cooper <Andrew.Cooper3@citrix.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Link: https://lore.kernel.org/all/20220818181530.2355034-1-kan.liang@linux.intel.com Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20230208204230.1360502-2-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Munehisa Kamata [Tue, 14 Feb 2023 21:27:05 +0000 (13:27 -0800)]
sched/psi: Fix use-after-free in ep_remove_wait_queue()
If a non-root cgroup gets removed when there is a thread that registered
trigger and is polling on a pressure file within the cgroup, the polling
waitqueue gets freed in the following path:
The fundamental problem here is that cgroup_file_release() (and
consequently waitqueue's lifetime) is not tied to the file's real lifetime.
Using wake_up_pollfree() here might be less than ideal, but it is in line
with the comment at commit 42288cb44c4b ("wait: add wake_up_pollfree()")
since the waitqueue's lifetime is not tied to file's one and can be
considered as another special case. While this would be fixable by somehow
making cgroup_file_release() be tied to the fput(), it would require
sizable refactoring at cgroups or higher layer which might be more
justifiable if we identify more cases like this.
BUG: KASAN: use-after-free in _raw_spin_lock_irqsave+0x60/0xc0
Write of size 4 at addr ffff88810e625328 by task a.out/4404
Jakub Kicinski [Tue, 14 Feb 2023 06:53:55 +0000 (22:53 -0800)]
net: mpls: fix stale pointer if allocation fails during device rename
lianhui reports that when MPLS fails to register the sysctl table
under new location (during device rename) the old pointers won't
get overwritten and may be freed again (double free).
Handle this gracefully. The best option would be unregistering
the MPLS from the device completely on failure, but unfortunately
mpls_ifdown() can fail. So failing fully is also unreliable.
Another option is to register the new table first then only
remove old one if the new one succeeds. That requires more
code, changes order of notifications and two tables may be
visible at the same time.
sysctl point is not used in the rest of the code - set to NULL
on failures and skip unregister if already NULL.
Reported-by: lianhui tang <bluetlh@gmail.com> Fixes: 0fae3bf018d9 ("mpls: handle device renames for per-device sysctls") Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
It is because commit a41dad905e5a ("iov_iter: saner checks for attempt
to copy to/from iterator") has introduced sanity check for copying
from/to iov iterator. Lacking of copy direction from the iterator
viewpoint would lead to kernel stack trace like above.
This commit fixes this issue by initializing the iov iterator with
the correct copy direction when sending SYN or ACK without data.
Miroslav Lichvar [Mon, 13 Feb 2023 18:58:22 +0000 (10:58 -0800)]
igb: Fix PPS input and output using 3rd and 4th SDP
Fix handling of the tsync interrupt to compare the pin number with
IGB_N_SDP instead of IGB_N_EXTTS/IGB_N_PEROUT and fix the indexing to
the perout array.
Fixes: cf99c1dd7b77 ("igb: move PEROUT and EXTTS isr logic to separate functions") Reported-by: Matt Corallo <ntp-lists@mattcorallo.com> Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://lore.kernel.org/r/20230213185822.3960072-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Wed, 15 Feb 2023 04:41:23 +0000 (20:41 -0800)]
Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2023-02-13 (ice)
This series contains updates to ice driver only.
Michal fixes check of scheduling node weight and priority to be done
against desired value, not current value.
Jesse adds setting of all multicast when adding promiscuous mode to
resolve traffic being lost due to filter settings.
* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
ice: fix lost multicast packets in promisc mode
ice: Fix check for weight and priority of a scheduling node
====================
Eric Dumazet [Mon, 13 Feb 2023 16:00:59 +0000 (16:00 +0000)]
net: use a bounce buffer for copying skb->mark
syzbot found arm64 builds would crash in sock_recv_mark()
when CONFIG_HARDENED_USERCOPY=y
x86 and powerpc are not detecting the issue because
they define user_access_begin.
This will be handled in a different patch,
because a check_object_size() is missing.
Only data from skb->cb[] can be copied directly to/from user space,
as explained in commit 79a8a642bf05 ("net: Whitelist
the skbuff_head_cache "cb" field")
Zack Rusin [Sat, 11 Feb 2023 05:05:14 +0000 (00:05 -0500)]
drm/vmwgfx: Do not drop the reference to the handle too soon
v3: Fix vmw_user_bo_lookup which was also dropping the gem reference
before the kernel was done with buffer depending on userspace doing
the right thing. Same bug, different spot.
It is possible for userspace to predict the next buffer handle and
to destroy the buffer while it's still used by the kernel. Delay
dropping the internal reference on the buffers until kernel is done
with them.
Instead of immediately dropping the gem reference in vmw_user_bo_lookup
and vmw_gem_object_create_with_handle let the callers decide when they're
ready give the control back to userspace.
Also fixes the second usage of vmw_gem_object_create_with_handle in
vmwgfx_surface.c which wasn't grabbing an explicit reference
to the gem object which could have been destroyed by the userspace
on the owning surface at any point.
Zack Rusin [Wed, 8 Feb 2023 18:00:50 +0000 (13:00 -0500)]
drm/vmwgfx: Stop accessing buffer objects which failed init
ttm_bo_init_reserved on failure puts the buffer object back which
causes it to be deleted, but kfree was still being called on the same
buffer in vmw_bo_create leading to a double free.
After the double free the vmw_gem_object_create_with_handle was
setting the gem function objects before checking the return status
of vmw_bo_create leading to null pointer access.
Fix the entire path by relaying on ttm_bo_init_reserved to delete the
buffer objects on failure and making sure the return status is checked
before setting the gem function objects on the buffer object.
Matt Roper [Wed, 1 Feb 2023 22:28:29 +0000 (14:28 -0800)]
drm/i915/gen11: Wa_1408615072/Wa_1407596294 should be on GT list
The UNSLICE_UNIT_LEVEL_CLKGATE register programmed by this workaround
has 'BUS' style reset, indicating that it does not lose its value on
engine resets. Furthermore, this register is part of the GT forcewake
domain rather than the RENDER domain, so it should not be impacted by
RCS engine resets. As such, we should implement this on the GT
workaround list rather than an engine list.
Jason Xing [Thu, 9 Feb 2023 02:41:28 +0000 (10:41 +0800)]
ixgbe: add double of VLAN header when computing the max MTU
Include the second VLAN HLEN into account when computing the maximum
MTU size as other drivers do.
Fixes: fabf1bce103a ("ixgbe: Prevent unsupported configurations with XDP") Signed-off-by: Jason Xing <kernelxing@tencent.com> Reviewed-by: Alexander Duyck <alexanderduyck@fb.com> Tested-by: Chandan Kumar Rout <chandanx.rout@intel.com> (A Contingent Worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Jason Xing [Wed, 8 Feb 2023 02:43:33 +0000 (10:43 +0800)]
i40e: add double of VLAN header when computing the max MTU
Include the second VLAN HLEN into account when computing the maximum
MTU size as other drivers do.
Fixes: 0c8493d90b6b ("i40e: add XDP support for pass and drop actions") Signed-off-by: Jason Xing <kernelxing@tencent.com> Reviewed-by: Alexander Duyck <alexanderduyck@fb.com> Tested-by: Chandan Kumar Rout <chandanx.rout@intel.com> (A Contingent Worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Jason Xing [Wed, 8 Feb 2023 02:43:32 +0000 (10:43 +0800)]
ixgbe: allow to increase MTU to 3K with XDP enabled
Recently I encountered one case where I cannot increase the MTU size
directly from 1500 to a much bigger value with XDP enabled if the
server is equipped with IXGBE card, which happened on thousands of
servers in production environment. After applying the current patch,
we can set the maximum MTU size to 3K.
This patch follows the behavior of changing MTU as i40e/ice does.
References:
[1] commit 23b44513c3e6 ("ice: allow 3k MTU for XDP")
[2] commit 0c8493d90b6b ("i40e: add XDP support for pass and drop actions")
Fixes: fabf1bce103a ("ixgbe: Prevent unsupported configurations with XDP") Signed-off-by: Jason Xing <kernelxing@tencent.com> Reviewed-by: Alexander Duyck <alexanderduyck@fb.com> Tested-by: Chandan Kumar Rout <chandanx.rout@intel.com> (A Contingent Worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Linus Torvalds [Tue, 14 Feb 2023 17:17:01 +0000 (09:17 -0800)]
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
"Certain AMD processors are vulnerable to a cross-thread return address
predictions bug. When running in SMT mode and one of the sibling
threads transitions out of C0 state, the other thread gets access to
twice as many entries in the RSB, but unfortunately the predictions of
the now-halted logical processor are not purged. Therefore, the
executing processor could speculatively execute from locations that
the now-halted processor had trained the RSB on.
The Spectre v2 mitigations cover the Linux kernel, as it fills the RSB
when context switching to the idle thread. However, KVM allows a VMM
to prevent exiting guest mode when transitioning out of C0 using the
KVM_CAP_X86_DISABLE_EXITS capability can be used by a VMM to change
this behavior. To mitigate the cross-thread return address predictions
bug, a VMM must not be allowed to override the default behavior to
intercept C0 transitions.
These patches introduce a KVM module parameter that, if set, will
prevent the user from disabling the HLT, MWAIT and CSTATE exits"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
Documentation/hw-vuln: Add documentation for Cross-Thread Return Predictions
KVM: x86: Mitigate the cross-thread return address predictions bug
x86/speculation: Identify processors vulnerable to SMT RSB predictions