Reviewed-by: John Sobecki <john.sobecki@oracle.com> Signed-off-by: Hannes Reinecke <hare@suse.com> Acked-by: Johannes Thumshirn <jth@kernel.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Rajan Shanmugavelu <rajan.shanmugavelu@oracle.com> Signed-off-by: Brian Maly <brian.maly@oracle.com>
Hannes Reinecke [Fri, 30 Sep 2016 09:01:17 +0000 (11:01 +0200)]
scsi: libfc: Do not drop down to FLOGI for fc_rport_login()
When fc_rport_login() is called while the rport is not
in RPORT_ST_INIT, RPORT_ST_READY, or RPORT_ST_DELETE
login is already in progress and there's no need to
drop down to FLOGI; doing so will only confuse the
other side.
Reviewed-by: John Sobecki <john.sobecki@oracle.com> Signed-off-by: Hannes Reinecke <hare@suse.com> Acked-by: Johannes Thumshirn <jth@kernel.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Rajan Shanmugavelu <rajan.shanmugavelu@oracle.com> Signed-off-by: Brian Maly <brian.maly@oracle.com>
Chad Dupuis [Fri, 30 Sep 2016 09:01:16 +0000 (11:01 +0200)]
scsi: libfc: Do not take rdata->rp_mutex when processing a -FC_EX_CLOSED ELS response.
When an ELS response handler receives a -FC_EX_CLOSED, the rdata->rp_mutex is
already held which can lead to a deadlock condition like the following stack trace:
The other ELS handlers need to follow the FLOGI response handler and simply do
a kref_put against the fc_rport_priv struct and exit when receiving a
-FC_EX_CLOSED response.
Reviewed-by: John Sobecki <john.sobecki@oracle.com> Signed-off-by: Chad Dupuis <chad.dupuis@cavium.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Acked-by: Johannes Thumshirn <jth@kernel.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Rajan Shanmugavelu <rajan.shanmugavelu@oracle.com> Signed-off-by: Brian Maly <brian.maly@oracle.com>
Hannes Reinecke [Fri, 30 Sep 2016 09:01:15 +0000 (11:01 +0200)]
scsi: libfc: Fixup disc_mutex handling
The list of attached 'rdata' remote port structures is RCU
protected, so there is no need to take the 'disc_mutex' when
traversing it.
Rather we should be using rcu_read_lock() and kref_get_unless_zero()
to validate the entries.
We do, however, need to take the disc_mutex when deleting an entry;
otherwise we risk clashes with list_add.
Reviewed-by: John Sobecki <john.sobecki@oracle.com> Signed-off-by: Hannes Reinecke <hare@suse.com> Acked-by: Johannes Thumshirn <jth@kernel.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Rajan Shanmugavelu <rajan.shanmugavelu@oracle.com> Signed-off-by: Brian Maly <brian.maly@oracle.com>
Ajaykumar Hotchandani [Wed, 27 Feb 2019 22:53:57 +0000 (14:53 -0800)]
xve: arm ud tx cq to generate completion interrupts
IPoIB polls the UD send cq on every 16th post_send() request to reduce
the interrupt count, and it does not arm the UD send cq (16 is controlled by
the MAX_SEND_CQE variable).
XVE has followed the IPoIB methodology for handling the UD send cq;
however, it fails to poll the send cq after a certain number of iterations.
This makes freeing of resources related to work requests unreliable, since
completion arrival is not controlled. This caused a problem for live
migration: the initial UDP and ICMP skbs, which use UD work
requests, are not freed, and the xenwatch process gets stuck
waiting for these skbs to be freed.
This patch does the following:
- Arm the send cq at initialization. This will generate an interrupt for
the initial UD send requests.
- Once polling of the send cq is completed, arm the send cq again to generate
an interrupt whenever the next cqe arrives.
I'm going back to the interrupt mechanism, since the UD workload for xve is
extremely limited, and I don't expect to generate an interrupt flood here.
I also don't want to miss out on freeing skbs. For example, if only 10
post_send() calls are attempted for the UD QP and we then try to live
migrate that VM, we may miss completions if our logic is to poll the CQ at
every 16th post_send() iteration.
Signed-off-by: Ajaykumar Hotchandani <ajaykumar.hotchandani@oracle.com> Reviewed-by: Chien Yen <chien.yen@oracle.com> Signed-off-by: Brian Maly <brian.maly@oracle.com>
Alexei Starovoitov [Fri, 1 May 2015 03:14:07 +0000 (20:14 -0700)]
net: sched: run ingress qdisc without locks
TC classifiers/actions were converted to RCU by John in the series:
http://thread.gmane.org/gmane.linux.network/329739/focus=329739
and many follow on patches.
This is the last patch from that series that finally drops
ingress spin_lock.
Single cpu ingress+u32 performance goes from 22.9 Mpps to 24.5 Mpps.
In two cpu case when both cores are receiving traffic on the same
device and go into the same ingress+u32 the performance jumps
from 4.5 + 4.5 Mpps to 23.5 + 23.5 Mpps
Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 087c1a601ad7f851a2d31f5fa0e5e9dfc766df55)
Orabug: 29395374 Signed-off-by: Calum Mackay <calum.mackay@oracle.com> Reviewed-by: Laurence Rochfort <laurence.rochfort@oracle.com> Signed-off-by: Brian Maly <brian.maly@oracle.com>
The logic that polls for the firmware message response uses a shorter
sleep interval for the first few passes. But there was a typo so it
was using the wrong counter (larger counter) for these short sleep
passes. The result is a slightly shorter timeout period for these
firmware messages than intended. Fix it by using the proper counter.
Fixes: 9751e8e71487 ("bnxt_en: reduce timeout on initial HWRM calls") Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Brian Maly <brian.maly@oracle.com> Reviewed-by: Allen Pais <allen.pais@oracle.com>
The code waits up to 20 usec for the firmware response to complete
once we've seen the valid response header in the buffer. It turns
out that in some scenarios, this wait time is not long enough.
Extend it to 150 usec and use usleep_range() instead of udelay().
Fixes: 9751e8e71487 ("bnxt_en: reduce timeout on initial HWRM calls") Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Brian Maly <brian.maly@oracle.com> Reviewed-by: Allen Pais <allen.pais@oracle.com>
Syzbot caught an oops at unregister_shrinker() because the combination of
commit 1d3d4437eae1bb29 ("vmscan: per-node deferred work") and fault
injection made register_shrinker() fail and the caller of
register_shrinker() did not check for failure.
Since allowing register_shrinker() callers to call unregister_shrinker()
when register_shrinker() failed can simplify the error recovery path, this
patch makes unregister_shrinker() a no-op when register_shrinker() failed.
Also, reset shrinker->nr_deferred in case unregister_shrinker() is
called twice by mistake.
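The idea described above can be sketched in plain userspace C. This is only an illustration, not the kernel code: struct shrinker_sketch, register_shrinker_sketch() and unregister_shrinker_sketch() are hypothetical stand-ins, and the fail_alloc flag is an assumed way to mimic fault injection.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for the kernel's struct shrinker. */
struct shrinker_sketch {
    long *nr_deferred; /* NULL means "not (or no longer) registered" */
};

/* Returns 0 on success, -1 on failure; fail_alloc mimics fault injection. */
static int register_shrinker_sketch(struct shrinker_sketch *s, int fail_alloc)
{
    assert(s != NULL);
    s->nr_deferred = fail_alloc ? NULL : calloc(1, sizeof(*s->nr_deferred));
    return s->nr_deferred ? 0 : -1;
}

/* Returns 1 if state was torn down, 0 if the call was a no-op. */
static int unregister_shrinker_sketch(struct shrinker_sketch *s)
{
    assert(s != NULL);
    if (!s->nr_deferred)
        return 0;              /* registration failed or already undone */
    free(s->nr_deferred);
    s->nr_deferred = NULL;     /* reset so that a second call is harmless */
    return 1;
}
```

With this shape, an error path can call the unregister function unconditionally, whether or not registration succeeded.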
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Aliaksei Karaliou <akaraliou.dev@gmail.com> Reported-by: syzbot <syzkaller@googlegroups.com> Cc: Glauber Costa <glauber@scylladb.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 7880fc541566166d140954825fc83c826534e622)
Signed-off-by: John Sobecki <john.sobecki@oracle.com> Reviewed-by: Jack Vogel <jack.vogel@oracle.com> Reviewed-by: Joe Jin <joe.jin@oracle.com> Signed-off-by: Brian Maly <brian.maly@oracle.com>
David Howells [Wed, 24 Feb 2016 14:37:54 +0000 (14:37 +0000)]
X.509: Handle midnight alternative notation in GeneralizedTime
The ASN.1 GeneralizedTime object carries an ISO 8601 format date and time.
The time is permitted to show midnight as 00:00 or 24:00 (the latter being
equivalent of 00:00 of the following day).
The permitted value is checked in x509_decode_time() but the actual
handling is left to mktime64().
Without this patch, certain X.509 certificates will be rejected and could
lead to an unbootable kernel.
Note that with this patch we also permit any 24:mm:ss time and extend this
to UTCTime, which, whilst not strictly correct, doesn't permit much leeway in
fiddling date strings.
Reported-by: Rudolf Polzer <rpolzer@google.com> Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Arnd Bergmann <arnd@arndb.de>
cc: David Woodhouse <David.Woodhouse@intel.com>
cc: John Stultz <john.stultz@linaro.org>
Orabug: 29460344
CVE: CVE-2015-5327
(cherry picked from commit 7650cb80e4e90b0fae7854b6008a46d24360515f) Signed-off-by: Dan Duval <dan.duval@oracle.com> Reviewed-by: John Haxby <john.haxby@oracle.com> Signed-off-by: Brian Maly <brian.maly@oracle.com>
David Howells [Wed, 24 Feb 2016 14:37:53 +0000 (14:37 +0000)]
X.509: Support leap seconds
The format of ASN.1 GeneralizedTime seems to be specified by ISO 8601
[X.680 46.3] and this apparently supports leap seconds (ie. the seconds
field is 60). It's not entirely clear that ASN.1 expects it, but we can
relax the seconds check slightly for GeneralizedTime.
This results in us passing a time with sec as 60 to mktime64(), which
handles it as being a duplicate of the 0th second of the next minute.
We can't really do otherwise without giving the kernel much greater
knowledge of where all the leap seconds are. Unfortunately, this would
require changing the mapping of the kernel's current-time-in-seconds.
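A minimal sketch of why a relaxed check is enough, assuming the conversion is a plain positional sum (seconds_of_day_sketch() is a hypothetical helper, not the kernel's mktime64()): a seconds field of 60, or an hour field of 24, simply rolls over into the next minute or day.

```c
#include <assert.h>

/*
 * Hypothetical helper: validate the relaxed ranges (hour 00-24, sec 00-60)
 * and return the offset in seconds from the start of the day, or -1 if a
 * field is out of range.
 */
static long seconds_of_day_sketch(unsigned hour, unsigned min, unsigned sec)
{
    if (hour > 24 || min > 59 || sec > 60)
        return -1;
    /* Positional sum: sec == 60 lands on second 0 of the next minute. */
    return ((long)hour * 60 + min) * 60 + sec;
}

/* Usage: a leap second and 24:00:00 alias the following minute and day. */
static void seconds_of_day_demo(void)
{
    assert(seconds_of_day_sketch(12, 30, 60) == seconds_of_day_sketch(12, 31, 0));
    assert(seconds_of_day_sketch(24, 0, 0) == 86400);
}
```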
UTCTime, however, only supports a seconds value in the range 00-59, but for
the sake of simplicity allow this with UTCTime also.
Without this patch, certain X.509 certificates will be rejected,
potentially making a kernel unbootable.
Reported-by: Rudolf Polzer <rpolzer@google.com> Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Arnd Bergmann <arnd@arndb.de>
cc: David Woodhouse <David.Woodhouse@intel.com>
cc: John Stultz <john.stultz@linaro.org>
Orabug: 29460344
CVE: CVE-2015-5327
(cherry picked from commit da02559c9f864c8d62f524c1e0b64173711a16ab) Signed-off-by: Dan Duval <dan.duval@oracle.com> Reviewed-by: John Haxby <john.haxby@oracle.com> Signed-off-by: Brian Maly <brian.maly@oracle.com>
David Howells [Thu, 12 Nov 2015 09:36:40 +0000 (09:36 +0000)]
X.509: Fix the time validation [ver #2]
This fixes CVE-2015-5327. It affects kernels from 4.3-rc1 onwards.
Fix the X.509 time validation to use month number-1 when looking up the
number of days in that month. Also put the month number validation before
doing the lookup so as not to risk overrunning the array.
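The two points of the fix (index with month number-1, and validate the month before the lookup) can be sketched as follows. The table and function names are hypothetical, and the leap-year handling is an assumption added to make the example complete.

```c
#include <assert.h>
#include <stdbool.h>

/* Days per month, indexed 0..11 (January is index 0). */
static const unsigned char month_lengths_sketch[12] = {
    31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31
};

static bool is_leap_year_sketch(unsigned year)
{
    return (year % 4 == 0 && year % 100 != 0) || year % 400 == 0;
}

/* mon is the calendar month number, 1..12. */
static bool date_is_valid_sketch(unsigned year, unsigned mon, unsigned day)
{
    unsigned mon_len;

    /* Validate the month first so the lookup cannot overrun the array. */
    if (mon < 1 || mon > 12)
        return false;

    mon_len = month_lengths_sketch[mon - 1]; /* month number-1 as index */
    if (mon == 2 && is_leap_year_sketch(year))
        mon_len = 29;

    return day >= 1 && day <= mon_len;
}

/* Usage: an off-by-one index would wrongly accept 31 April. */
static void date_is_valid_demo(void)
{
    assert(!date_is_valid_sketch(2015, 4, 31));
    assert(!date_is_valid_sketch(2015, 13, 1));
}
```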
Signed-off-by: Brian Maly <brian.maly@oracle.com> Reviewed-by: John Donnelly <John.p.donnelly@oracle.com> Signed-off-by: Brian Maly <brian.maly@oracle.com>
is_broadcast_packet() expands to compare_ether_addr() which doesn't
exist since commit 7367d0b573d1 ("drivers/net: Convert uses of
compare_ether_addr to ether_addr_equal"). It turns out it's actually not
used.
Signed-off-by: Lubomir Rintel <lkundrak@v3.sk> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Brian Maly <brian.maly@oracle.com> Reviewed-by: John Donnelly <John.p.donnelly@oracle.com> Signed-off-by: Brian Maly <brian.maly@oracle.com>
the be2net implementation of .ndo_tunnel_{add,del}() changes the value of
NETIF_F_GSO_UDP_TUNNEL bit in 'features' and 'hw_features', but it forgets
to call netdev_features_change(). Moreover, ethtool setting for that bit
can potentially be reverted after a tunnel is added or removed.
GSO already does software segmentation when 'hw_enc_features' is 0, even
if VXLAN offload is turned on. In addition, commit 096de2f83ebc ("benet:
stricter vxlan offloading check in be_features_check") avoids hardware
segmentation of non-VXLAN tunneled packets, or VXLAN packets having wrong
destination port. So, it's safe to avoid flipping the above feature on
addition/deletion of VXLAN tunnels.
Fixes: 630f4b70567f ("be2net: Export tunnel offloads only when a VxLAN tunnel is created") Signed-off-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Brian Maly <brian.maly@oracle.com> Reviewed-by: John Donnelly <John.p.donnelly@oracle.com> Signed-off-by: Brian Maly <brian.maly@oracle.com>
DMA allocated memory is lost in be_cmd_get_profile_config() when we
call it with non-NULL port_res parameter.
Signed-off-by: Petr Oros <poros@redhat.com> Reviewed-by: Ivan Vecera <ivecera@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Brian Maly <brian.maly@oracle.com> Reviewed-by: John Donnelly <John.p.donnelly@oracle.com> Signed-off-by: Brian Maly <brian.maly@oracle.com>
Enabling only Skyhawk reduces the module size by ~20kb.
Use the new help style in Kconfig.
Reviewed-by: Ivan Vecera <ivecera@redhat.com> Signed-off-by: Petr Oros <poros@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Brian Maly <brian.maly@oracle.com> Reviewed-by: John Donnelly <John.p.donnelly@oracle.com> Signed-off-by: Brian Maly <brian.maly@oracle.com>
In preparation to enabling -Wimplicit-fallthrough, mark switch cases
where we are expecting to fall through.
Addresses-Coverity-ID: 114787 ("Missing break in switch") Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Brian Maly <brian.maly@oracle.com> Reviewed-by: John Donnelly <John.p.donnelly@oracle.com> Signed-off-by: Brian Maly <brian.maly@oracle.com>
Trivial fix to spelling mistake in dev_info message.
Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Brian Maly <brian.maly@oracle.com> Reviewed-by: John Donnelly <John.p.donnelly@oracle.com> Signed-off-by: Brian Maly <brian.maly@oracle.com>
Signed-off-by: Suresh Reddy <suresh.reddy@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Brian Maly <brian.maly@oracle.com> Reviewed-by: John Donnelly <John.p.donnelly@oracle.com> Signed-off-by: Brian Maly <brian.maly@oracle.com>
1) This patch gathers and prints the following info that can
   help in diagnosing the cause of a TX-timeout.
   a) TX queue and completion queue entries.
   b) SKB and TCP/UDP header details.
2) For Lancer NICs (TX-timeout recovery is not supported for
   BE3/Skyhawk-R NICs), it recovers from the TX timeout as follows:
   a) On a TX-timeout, driver sets the PHYSDEV_CONTROL_FW_RESET_MASK
      bit in the PHYSDEV_CONTROL register. Lancer firmware goes into
      an error state and indicates this back to the driver via a bit
      in a doorbell register.
   b) Driver detects this and calls be_err_recover(). DMA is disabled,
      all pending TX skbs are unmapped and freed (be_close()). All rings
      are destroyed (be_clear()).
   c) The driver waits for the FW to re-initialize and re-creates all
      rings along with other data structs (be_resume())
Signed-off-by: Suresh Reddy <suresh.reddy@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Brian Maly <brian.maly@oracle.com> Reviewed-by: John Donnelly <John.p.donnelly@oracle.com> Signed-off-by: Brian Maly <brian.maly@oracle.com>
The current position of the .rss_flags field in struct rss_info causes
the fields .rsstable and .rssqueue (both 128 bytes long) to cross
cache-line boundaries. Moving it to the end properly aligns all fields.
Signed-off-by: Ivan Vecera <cera@cera.cz> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Brian Maly <brian.maly@oracle.com> Reviewed-by: John Donnelly <John.p.donnelly@oracle.com> Signed-off-by: Brian Maly <brian.maly@oracle.com>
- Unionize two u8 fields where only one of them is used depending on NIC
chipset.
- Move recovery_supported field after that union
These changes eliminate a 7-byte hole in the struct and make it smaller
by 8 bytes.
Signed-off-by: Ivan Vecera <cera@cera.cz> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Brian Maly <brian.maly@oracle.com> Reviewed-by: John Donnelly <John.p.donnelly@oracle.com> Signed-off-by: Brian Maly <brian.maly@oracle.com>
Signed-off-by: Ivan Vecera <cera@cera.cz> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Brian Maly <brian.maly@oracle.com> Reviewed-by: John Donnelly <John.p.donnelly@oracle.com> Signed-off-by: Brian Maly <brian.maly@oracle.com>
Signed-off-by: Ivan Vecera <cera@cera.cz> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Brian Maly <brian.maly@oracle.com> Reviewed-by: John Donnelly <John.p.donnelly@oracle.com> Signed-off-by: Brian Maly <brian.maly@oracle.com>
Re-order fields in struct be_eq_obj to ensure that the .napi field begins
at the start of a cache line. Also, the .adapter field is moved to the first
cache line next to the .q field, and 3 fields (idx, msi_idx, spurious_intr)
and the 4-byte hole are moved to the 3rd cache line.
Signed-off-by: Ivan Vecera <cera@cera.cz> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Brian Maly <brian.maly@oracle.com> Reviewed-by: John Donnelly <John.p.donnelly@oracle.com> Signed-off-by: Brian Maly <brian.maly@oracle.com>
The commit fb6113e688e0 ("be2net: get rid of custom busy poll code")
replaced custom busy-poll code by the generic one but left several
macros and fields in struct be_eq_obj that are currently unused.
Remove this stuff.
Fixes: fb6113e688e0 ("be2net: get rid of custom busy poll code") Signed-off-by: Ivan Vecera <cera@cera.cz> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Brian Maly <brian.maly@oracle.com> Reviewed-by: John Donnelly <John.p.donnelly@oracle.com> Signed-off-by: Brian Maly <brian.maly@oracle.com>
The commit 2632bafd74ae ("be2net: fix adaptive interrupt coalescing")
introduced a separate struct be_aic_obj to hold AIC information but
unfortunately left the old stuff in be_eq_obj. So remove it.
Fixes: 2632bafd74ae ("be2net: fix adaptive interrupt coalescing") Signed-off-by: Ivan Vecera <cera@cera.cz> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Brian Maly <brian.maly@oracle.com> Reviewed-by: John Donnelly <John.p.donnelly@oracle.com> Signed-off-by: Brian Maly <brian.maly@oracle.com>
Check for 0xE00 (RECOVERABLE_ERR) along with ARMFW UE (0x0)
in be_detect_error() to determine whether the error is valid.
Fixes: 673c96e5a ("be2net: Fix UE detection logic for BE3") Signed-off-by: Suresh Reddy <suresh.reddy@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Brian Maly <brian.maly@oracle.com>
Martin K. Petersen [Thu, 28 Sep 2017 01:38:59 +0000 (21:38 -0400)]
scsi: sd: Do not override max_sectors_kb sysfs setting
A user may lower the max_sectors_kb setting in sysfs to accommodate
certain workloads. Previously we would always set the max I/O size to
either the block layer default or the optional preferred I/O size
reported by the device.
Keep the current heuristics for the initial setting of max_sectors_kb.
For subsequent invocations, only update the current queue limit if it
exceeds the capabilities of the hardware.
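The update rule described above might look roughly like the sketch below. It is an illustration under stated assumptions: revalidate_max_sectors_sketch() and its parameters are hypothetical names, and the real driver works on queue limit structures rather than bare integers.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * cur        - current max_sectors value (possibly lowered by the user)
 * hw_max     - hardware limit
 * heuristic  - value the initial heuristics would pick
 * first_scan - true on the first invocation for this device
 * Returns the max_sectors value to use after this invocation.
 */
static unsigned revalidate_max_sectors_sketch(unsigned cur, unsigned hw_max,
                                              unsigned heuristic,
                                              bool first_scan)
{
    /* Never exceed what the hardware can do. */
    unsigned rw_max = heuristic < hw_max ? heuristic : hw_max;

    /* Initial setting, or the current value is beyond the hardware. */
    if (first_scan || cur > hw_max)
        return rw_max;

    /* Otherwise keep whatever is set, e.g. a user's lower sysfs value. */
    return cur;
}

/* Usage: a user-lowered value survives a later invocation. */
static void revalidate_max_sectors_demo(void)
{
    assert(revalidate_max_sectors_sketch(512, 4096, 2560, false) == 512);
}
```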
Cc: <stable@vger.kernel.org> Reported-by: Don Brace <don.brace@microsemi.com> Reviewed-by: Martin Wilck <mwilck@suse.com> Tested-by: Don Brace <don.brace@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 77082ca503bed061f7fbda7cfd7c93beda967a41)
Signed-off-by: John Sobecki <john.sobecki@oracle.com> Tested-by: Dustin Samko <Dustin.Samko@gm.com> Reviewed-by: Ritika Srivastava <ritika.srivastava@oracle.com> Signed-off-by: Brian Maly <brian.maly@oracle.com>
There have been reports of oversize UDP packets being sent to the
driver to be transmitted, causing error conditions. The issue is
likely caused by the dst of the SKB switching between 'lo' with
64K MTU and the hardware device with a smaller MTU. Patches are
being proposed by Mahesh Bandewar <maheshb@google.com> to fix the
issue.
In the meantime, add a quick length check in the driver to prevent
the error. The driver uses the TX packet size as index to look up an
array to setup the TX BD. The array is large enough to support all MTU
sizes supported by the driver. The oversize TX packet causes the
driver to index beyond the array and put garbage values into the
TX BD. Add a simple check to prevent this.
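A bounds check of this kind can be sketched as below. The table contents, the 512-byte bucketing (length >> 9) and all names are assumptions made for illustration; the message only says that the packet size is used to index an array.

```c
#include <assert.h>
#include <stddef.h>

#define ARRAY_SIZE_SKETCH(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical length-hint table with one entry per 512-byte bucket. */
static const unsigned short len_hint_sketch[] = { 1, 1, 2, 2, 3, 3, 3, 3 };

/*
 * Looks up the hint for a TX packet of the given length.
 * Returns 0 and stores the hint, or -1 if the packet is oversized and the
 * lookup would index beyond the table.
 */
static int tx_len_hint_sketch(unsigned length, unsigned short *hint)
{
    unsigned idx = length >> 9;

    assert(hint != NULL);
    if (idx >= ARRAY_SIZE_SKETCH(len_hint_sketch))
        return -1;             /* reject instead of reading garbage */

    *hint = len_hint_sketch[idx];
    return 0;
}
```

A caller would drop the packet (and possibly log a rate-limited warning) when the function returns -1.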
Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 2b3c6885386020b1b9d92d45e8349637e27d1f66) Signed-off-by: Brian Maly <brian.maly@oracle.com>
x86/speculation: Read per-cpu value of x86_spec_ctrl_priv in x86_virt_spec_ctrl()
In x86_virt_spec_ctrl(), when IBRS is in use on the host, the baseline to
restore the host SPEC_CTRL must be taken from the privileged value which
has the IBRS bit set. In addition, it must be read from the per cpu variable
(x86_spec_ctrl_priv_cpu) that holds the SPEC_CTRL MSR for the current cpu.
Currently, this line:
hostval = this_cpu_read(x86_spec_ctrl_priv);
incorrectly uses the global x86_spec_ctrl_priv instead of the correct
per-cpu variable x86_spec_ctrl_priv_cpu, which assigns spurious values
to hostval.
Fix this issue by reading the correct per-cpu value instead of the
global.
Signed-off-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Brian Maly <brian.maly@oracle.com>
Alejandro Jimenez [Wed, 20 Mar 2019 15:00:49 +0000 (11:00 -0400)]
x86/speculation: Keep enhanced IBRS on when prctl is used for SSBD control
When using the prctl system call to enable/disable SSBD mitigation for a
specific thread, it is necessary to update the SPEC_CTRL MSR on the CPU
running it. The value used as the base for the msr that will be written
is x86_spec_ctrl_base, which does not have the IBRS bit set. The relevant
SSBD bits are OR'd to this value before it is written to the MSR, but
the IBRS bit will remain unset.
As a result, the thread that requested the SSBD protection will run without
IBRS enabled, and when it is context switched out, IBRS will not be turned
back on again. This is not a problem in processors that use basic IBRS since
the bit is constantly toggled on kernel entry, but with enhanced IBRS this
is not necessary and therefore the bit remains unset.
Fix it by adding a check to detect when enhanced IBRS is in use, and add
the bit to the msr value that will be used as the baseline.
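As a rough sketch of the fix, the value written to the MSR could be composed like this. The function name and its boolean parameters are hypothetical; the bit positions (IBRS is bit 0, SSBD is bit 2 of IA32_SPEC_CTRL) follow the architectural definition.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SPEC_CTRL_IBRS_SKETCH (1ULL << 0) /* IA32_SPEC_CTRL bit 0 */
#define SPEC_CTRL_SSBD_SKETCH (1ULL << 2) /* IA32_SPEC_CTRL bit 2 */

/*
 * base          - baseline MSR value, which lacks the IBRS bit
 * enhanced_ibrs - true when enhanced IBRS is in use
 * ssbd          - true when the thread requested SSBD
 */
static uint64_t ssbd_msr_value_sketch(uint64_t base, bool enhanced_ibrs,
                                      bool ssbd)
{
    uint64_t msr = base;

    /* With enhanced IBRS nothing else re-sets the bit, so keep it on. */
    if (enhanced_ibrs)
        msr |= SPEC_CTRL_IBRS_SKETCH;
    if (ssbd)
        msr |= SPEC_CTRL_SSBD_SKETCH;
    return msr;
}

/* Usage: requesting SSBD must not clear IBRS on enhanced IBRS systems. */
static void ssbd_msr_value_demo(void)
{
    uint64_t v = ssbd_msr_value_sketch(0, true, true);

    assert(v & SPEC_CTRL_IBRS_SKETCH);
    assert(v & SPEC_CTRL_SSBD_SKETCH);
}
```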
Signed-off-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Brian Maly <brian.maly@oracle.com>
Hui Peng [Wed, 12 Dec 2018 11:42:24 +0000 (12:42 +0100)]
USB: hso: Fix OOB memory access in hso_probe/hso_get_config_data
The function hso_probe reads if_num from the USB device (as an u8) and uses
it without a length check to index an array, resulting in an OOB memory read
in hso_probe or hso_get_config_data.
Add a length check for both locations and update hso_probe to bail on
error.
This issue has been assigned CVE-2018-19985.
Reported-by: Hui Peng <benquike@gmail.com> Reported-by: Mathias Payer <mathias.payer@nebelwelt.net> Signed-off-by: Hui Peng <benquike@gmail.com> Signed-off-by: Mathias Payer <mathias.payer@nebelwelt.net> Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 5146f95df782b0ac61abde36567e718692725c89)
swiotlb: save io_tlb_used to local variable before leaving critical section
When swiotlb is full, the kernel prints io_tlb_used. However, the
result might be inaccurate at that time because we have left the critical
section protected by the spinlock.
Therefore, we back up io_tlb_used into a local variable before leaving the
critical section.
Fixes: 83ca25948940 ("swiotlb: dump used and total slots when swiotlb buffer is full") Suggested-by: Håkon Bugge <haakon.bugge@oracle.com> Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Orabug: 29637525
(cherry picked from commit 53b29c336830db48ad3dc737f88b8c065b1f0851) Signed-off-by: Brian Maly <brian.maly@oracle.com>
Conflicts:
kernel/dma/swiotlb.c does not exist.
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com> Reviewed-By: Joe Jin <joe.jin@oracle.com> Signed-off-by: Brian Maly <brian.maly@oracle.com>
swiotlb: dump used and total slots when swiotlb buffer is full
So far the kernel only prints the requested size if the swiotlb buffer is full.
It is not possible to know whether it is simply out of buffer, or whether it is
because swiotlb cannot allocate a buffer of the requested size due to
fragmentation.
As 'io_tlb_used' is available since commit 71602fe6d4e9 ("swiotlb: add
debugfs to track swiotlb buffer usage"), both 'io_tlb_used' and
'io_tlb_nslabs' are printed when swiotlb buffer is full.
(cherry picked from commit 83ca259489409a1fe8a83dad83a82f32174d4f31) Signed-off-by: Brian Maly <brian.maly@oracle.com>
Conflicts:
kernel/dma/swiotlb.c does not exist.
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com> Reviewed-By: Joe Jin <joe.jin@oracle.com> Signed-off-by: Brian Maly <brian.maly@oracle.com>
x86/bugs, kvm: don't miss SSBD when IBRS is in use.
When IBRS is in use, we unconditionally need to write to MSR_IA32_SPEC_CTRL
(it acts as a barrier) but we were failing to take into account the SSBD
state from the thread info flags, potentially disabling SSBD on the host on
tasks that need it after a vmexit.
Signed-off-by: Quentin Casasnovas <quentin.casasnovas@oracle.com> Signed-off-by: Mihai Carabas <mihai.carabas@oracle.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Reviewed-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com> Signed-off-by: Brian Maly <brian.maly@oracle.com>
Shuning Zhang [Mon, 15 Apr 2019 15:01:37 +0000 (23:01 +0800)]
cifs: Fix use after free of a mid_q_entry
With protocol version 2.0 mounts we have seen crashes with corrupt mid
entries. Either the server->pending_mid_q list becomes corrupt with a
cyclic reference in one element or a mid object fetched by the
demultiplexer thread becomes overwritten during use.
Code review identified a race between the demultiplexer thread and the
request issuing thread. The demultiplexer thread seems to be written
with the assumption that it is the sole user of the mid object until
it calls the mid callback which either wakes the issuer task or
deletes the mid.
This assumption is not true because the issuer task can be woken up
earlier by a signal. If the demultiplexer thread has proceeded as far
as setting the mid_state to MID_RESPONSE_RECEIVED then the issuer
thread will happily end up calling cifs_delete_mid while the
demultiplexer thread still is using the mid object.
Inserting a delay in the cifs demultiplexer thread widens the race
window and makes reproduction of the race very easy:
     if (server->large_buf)
         buf = server->bigbuf;
+    usleep_range(500, 4000);
     server->lstrp = jiffies;
To resolve this I think the proper solution involves putting a
reference count on the mid object. This patch makes sure that the
demultiplexer thread holds a reference until it has finished
processing the transaction.
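The reference-counting idea can be sketched in userspace C as follows. This is not the cifs code: struct mid_sketch and its helpers are hypothetical, and C11 atomics stand in for whatever counting primitive the kernel patch uses.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for a mid entry. */
struct mid_sketch {
    atomic_int refcount;
    int state;
};

/* The creator (issuer) starts out holding one reference. */
static struct mid_sketch *mid_alloc_sketch(void)
{
    struct mid_sketch *m = malloc(sizeof(*m));

    if (m) {
        atomic_init(&m->refcount, 1);
        m->state = 0;
    }
    return m;
}

/* Taken by the demultiplexer when it looks the mid up. */
static void mid_get_sketch(struct mid_sketch *m)
{
    assert(m != NULL);
    atomic_fetch_add(&m->refcount, 1);
}

/* Drops one reference; returns 1 if this was the last one and m was freed. */
static int mid_put_sketch(struct mid_sketch *m)
{
    assert(m != NULL);
    if (atomic_fetch_sub(&m->refcount, 1) == 1) {
        free(m);
        return 1;
    }
    return 0;
}
```

With this scheme, an issuer that is woken early by a signal only drops its own reference; the object stays valid until the demultiplexer drops the reference it took.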
Cc: stable@vger.kernel.org Signed-off-by: Lars Persson <larper@axis.com> Acked-by: Paulo Alcantara <palcantara@suse.de> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
(cherry picked from commit 696e420bb2a6624478105651d5368d45b502b324)
Signed-off-by: Shuning Zhang <sunny.s.zhang@oracle.com> Reviewed-by: Darren Kenny <darren.kenny@oracle.com> Signed-off-by: Brian Maly <brian.maly@oracle.com>
Conflicts:
fs/cifs/connect.c
fs/cifs/smb2ops.c
fs/cifs/smb2transport.c
fs/cifs/transport.c
[
connect.c: the context has changed.
smb2ops.c: hdr->Command changed to shdr->Command in the third line.
smb2transport.c: the context has changed; the code is in the statement
block of an else, but this statement block has been moved outside.
transport.c: the context has changed; the code is in the statement
block of an else, but this statement block has been moved outside.
]
Linus Torvalds [Mon, 22 Aug 2016 23:41:46 +0000 (16:41 -0700)]
binfmt_elf: switch to new creds when switching to new mm
We used to delay switching to the new credentials until after we had
mapped the executable (and possibly the ELF interpreter). That was kind of
odd to begin with, since the new executable will actually then _run_
with the new creds, but whatever.
The bigger problem was that we also want to make sure that we turn off
perf events and tracing before we start mapping the new executable
state. So while this is a cleanup, it's also a fix for a possible
information leak.
Reported-by: Robert Święcki <robert@swiecki.net> Tested-by: Peter Zijlstra <peterz@infradead.org> Acked-by: David Howells <dhowells@redhat.com> Acked-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Andy Lutomirski <luto@amacapital.net> Acked-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Willy Tarreau <w@1wt.eu> Cc: Kees Cook <keescook@chromium.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit 9f834ec18defc369d73ccf9e87a2790bfa05bf46)
Signed-off-by: John Donnelly <John.P.Donnelly@oracle.com> Reviewed-by: Jack Vogel <jack.vogel@oracle.com> Signed-off-by: Brian Maly <brian.maly@oracle.com>
Boris Ostrovsky [Thu, 9 May 2019 15:04:38 +0000 (11:04 -0400)]
x86/microcode: Don't return error if microcode update is not needed
Commit 347b54683 ("x86/microcode: Synchronize late microcode loading")
incorrectly returns -EINVAL error on all request_microcode_fw() failures
in reload_store(). In fact, when update is not needed or if there is no
microcode to load we don't need to treat this as an error.
Konrad Rzeszutek Wilk [Fri, 3 May 2019 02:01:38 +0000 (22:01 -0400)]
x86/mds: Add empty commit for CVE-2019-11091
The fixes for MDS also cover this CVE, which is described as:
"Microarchitectural Data Sampling Uncacheable Memory (MDSUM): Uncacheable
memory on some microprocessors utilizing speculative execution may allow
an authenticated user to potentially enable information disclosure
via a side channel with local access"
Boris Ostrovsky [Wed, 8 May 2019 18:50:39 +0000 (14:50 -0400)]
x86/microcode: Add loader version file in debugfs
We want to be able to find out whether the late microcode loader is using a
"safe" method where the system is in stop_machine() --- i.e. all cores
are pinned in the kernel with interrupts disabled. This is especially
important for core siblings --- if one thread is loading microcode while
the other is executing instructions that are being patched then bad
things may happen, including MCEs.
Presence of this file indicates that we are all good. We will also
provide a version value of "1".
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Borislav Petkov [Wed, 14 Mar 2018 18:36:15 +0000 (19:36 +0100)]
x86/microcode: Fix CPU synchronization routine
Emanuel reported an issue with a hang during microcode update because my
dumb idea to use one atomic synchronization variable for both rendezvous
- before and after update - was simply bollocks:
microcode: microcode_reload_late: late_cpus: 4
microcode: __reload_late: cpu 2 entered
microcode: __reload_late: cpu 1 entered
microcode: __reload_late: cpu 3 entered
microcode: __reload_late: cpu 0 entered
microcode: __reload_late: cpu 1 left
microcode: Timeout while waiting for CPUs rendezvous, remaining: 1
CPU1 above would finish, leave and the others will still spin waiting for
it to join.
So do two synchronization atomics instead, which makes the code a lot more
straightforward.
Also, since the update is serialized and it also takes quite some time per
microcode engine, increase the exit timeout by the number of CPUs on the
system.
That's ok because the moment all CPUs are done, that timeout will be cut
short.
Furthermore, panic when some of the CPUs time out when returning from a
microcode update: we can't allow a system with not all cores updated.
Also, as an optimization, do not do the exit sync if microcode wasn't
updated.
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
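The two-counter rendezvous described in the commit above can be modelled in user space. This is only a sketch under stated assumptions: Python threads stand in for CPUs, and the names Rendezvous and reload_late are hypothetical, not the kernel's. It shows one counter for entry and a separate one for exit, with the exit timeout scaled by the number of CPUs.

```python
import threading


class Rendezvous:
    """One synchronization counter: every CPU bumps it once, then waits."""

    def __init__(self, ncpus):
        self.ncpus = ncpus
        self.count = 0
        self.cond = threading.Condition()

    def wait(self, timeout):
        # Returns True once all CPUs have arrived, False on timeout.
        with self.cond:
            self.count += 1
            self.cond.notify_all()
            return self.cond.wait_for(lambda: self.count >= self.ncpus, timeout)


def reload_late(ncpus, enter_timeout=1.0, per_cpu_exit_timeout=1.0):
    enter = Rendezvous(ncpus)       # first counter: CPUs entering
    leave = Rendezvous(ncpus)       # second counter: CPUs leaving
    update_lock = threading.Lock()  # the update itself is serialized
    results = [None] * ncpus

    def cpu(n):
        if not enter.wait(enter_timeout):
            results[n] = "enter-timeout"
            return
        with update_lock:
            pass  # placeholder for the per-CPU microcode update
        # The exit timeout grows with the CPU count; it is cut short
        # as soon as the last CPU arrives.
        ok = leave.wait(per_cpu_exit_timeout * ncpus)
        results[n] = "ok" if ok else "exit-timeout"

    threads = [threading.Thread(target=cpu, args=(n,)) for n in range(ncpus)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Because a CPU that has finished only touches the exit counter, it can no longer disturb the count the other CPUs are waiting on at entry.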
Borislav Petkov [Wed, 8 May 2019 15:10:41 +0000 (11:10 -0400)]
x86/microcode: Synchronize late microcode loading
Original idea by Ashok, completely rewritten by Borislav.
Before you read any further: the early loading method is still the
preferred one and you should always do that. The following patch is
improving the late loading mechanism for long running jobs and cloud use
cases.
Gather all cores and serialize the microcode update on them by doing it
one-by-one to make the late update process as reliable as possible and
avoid potential issues caused by the microcode update.
[ Borislav: Rewrite completely. ]
Co-developed-by: Borislav Petkov <bp@suse.de> Signed-off-by: Ashok Raj <ashok.raj@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Tom Lendacky <thomas.lendacky@amd.com> Tested-by: Ashok Raj <ashok.raj@intel.com> Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com> Cc: Arjan Van De Ven <arjan.van.de.ven@intel.com> Link: https://lkml.kernel.org/r/20180228102846.13447-8-bp@alien8.de
(cherry picked from commit a5321aec6412b20b5ad15db2d6b916c05349dbff)
Conflicts --- quite a few. Notable ones:
* We don't have microcode cache and so call request_microcode_fw() for
each CPU
* No need to get/put_online_cpus() --- they are part of stop_machine()
* No stop_machine_cpuslocked() in uek4 but uek4's version of
stop_machine() prevents CPU hotplug.
* uek4 has fewer result codes for microcode operations and thus error
handling is slightly different.
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Configure x86 runtime CPU speculation bug mitigations in accordance with
the 'mitigations=' cmdline option. This affects Meltdown, Spectre v2,
Speculative Store Bypass, and L1TF.
The default behavior is unchanged.
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Jiri Kosina <jkosina@suse.cz> (on x86) Reviewed-by: Jiri Kosina <jkosina@suse.cz> Cc: Borislav Petkov <bp@alien8.de> Cc: "H . Peter Anvin" <hpa@zytor.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Jiri Kosina <jikos@kernel.org> Cc: Waiman Long <longman@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Jon Masters <jcm@redhat.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: linuxppc-dev@lists.ozlabs.org Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: linux-s390@vger.kernel.org Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-arch@vger.kernel.org Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Tyler Hicks <tyhicks@canonical.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Steven Price <steven.price@arm.com> Cc: Phil Auld <pauld@redhat.com> Link: https://lkml.kernel.org/r/6616d0ae169308516cfdf5216bedd169f8a8291b.1555085500.git.jpoimboe@redhat.com
(cherry picked from commit aaa95f2f1112dd4ec31ae13c4cf877dc7c7fcbc8)
Signed-off-by: Mihai Carabas <mihai.carabas@oracle.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Conflicts:
Documentation/admin-guide/kernel-parameters.txt
arch/x86/kernel/cpu/bugs.c
arch/x86/mm/pti.c
Documentation/admin-guide/kernel-parameters.txt: different location
arch/x86/kernel/cpu/bugs.c: different name (bugs_64.c). Also we have different logic in nospectre_v2.
arch/x86/mm/pti.c: different location for mitigation arch/x86/mm/kaiser.c.
Keeping track of the number of mitigations for all the CPU speculation
bugs has become overwhelming for many users. It's getting more and more
complicated to decide which mitigations are needed for a given
architecture. Complicating matters is the fact that each arch tends to
have its own custom way to mitigate the same vulnerability.
Most users fall into a few basic categories:
a) they want all mitigations off;
b) they want all reasonable mitigations on, with SMT enabled even if
it's vulnerable; or
c) they want all reasonable mitigations on, with SMT disabled if
vulnerable.
Define a set of curated, arch-independent options, each of which is an
aggregation of existing options:
- mitigations=off: Disable all mitigations.
- mitigations=auto: [default] Enable all the default mitigations, but
leave SMT enabled, even if it's vulnerable.
- mitigations=auto,nosmt: Enable all the default mitigations, disabling
SMT if needed by a mitigation.
Currently, these options are placeholders which don't actually do
anything. They will be fleshed out in upcoming patches.
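As an illustration of the three curated options, the following sketch parses a kernel command line string. It is a user-space model, not the kernel parser; CpuMitigations and parse_mitigations are names chosen here, and keeping the previous value on an unknown setting is an assumption of this sketch.

```python
from enum import Enum


class CpuMitigations(Enum):
    OFF = "off"
    AUTO = "auto"
    AUTO_NOSMT = "auto,nosmt"


def parse_mitigations(cmdline):
    """Return the mitigation mode selected by 'mitigations=' on cmdline."""
    selected = CpuMitigations.AUTO  # [default]
    for token in cmdline.split():
        if token.startswith("mitigations="):
            value = token[len("mitigations="):]
            try:
                selected = CpuMitigations(value)
            except ValueError:
                pass  # assumption: an unknown value leaves the mode unchanged
    return selected
```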
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Jiri Kosina <jkosina@suse.cz> (on x86) Reviewed-by: Jiri Kosina <jkosina@suse.cz> Cc: Borislav Petkov <bp@alien8.de> Cc: "H . Peter Anvin" <hpa@zytor.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Jiri Kosina <jikos@kernel.org> Cc: Waiman Long <longman@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Jon Masters <jcm@redhat.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: linuxppc-dev@lists.ozlabs.org Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: linux-s390@vger.kernel.org Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-arch@vger.kernel.org Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Tyler Hicks <tyhicks@canonical.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Steven Price <steven.price@arm.com> Cc: Phil Auld <pauld@redhat.com> Link: https://lkml.kernel.org/r/b07a8ef9b7c5055c3a4637c87d07c296d5016fe0.1555085500.git.jpoimboe@redhat.com
(cherry picked from commit 6cbbaa933b325234d6ffc93836d6b7c06dea7a56)
Signed-off-by: Mihai Carabas <mihai.carabas@oracle.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Conflicts:
Documentation/admin-guide/kernel-parameters.txt
Different location: Documentation/kernel-parameters.txt
Signed-off-by: Mihai Carabas <mihai.carabas@oracle.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Conflicts:
arch/x86/kernel/cpu/bugs.c
bugs.c vs bugs_64.c in UEK4
Signed-off-by: Mihai Carabas <mihai.carabas@oracle.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Conflicts:
arch/x86/kernel/cpu/bugs.c
bugs.c vs bugs_64.c on UEK4
Mihai Carabas [Fri, 12 Apr 2019 11:20:40 +0000 (14:20 +0300)]
x86/speculation/mds: update mds_mitigation to reflect debugfs configuration
If we enable mds_user_clear, we set mds_mitigation to MDS_MITIGATION_FULL or
MDS_MITIGATION_VMWERV. When we disable mds_user_clear, we set mds_mitigation to
MDS_MITIGATION_IDLE if mds_idle_clear, otherwise MDS_MITIGATION_OFF. If we
enable mds_idle_clear, we set mds_mitigation to MDS_MITIGATION_IDLE only if
mds_user_clear is disabled. When we disable mds_idle_clear, we set
mds_mitigation to MDS_MITIGATION_OFF if mds_user_clear is disabled.
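The rules above reduce to a small function of the two debugfs switches. The sketch below is a model, not the kernel code. Choosing between FULL and VMWERV by whether MD_CLEAR is exposed is an assumption here, and has_md_clear is a hypothetical parameter.

```python
def mds_mitigation(user_clear, idle_clear, has_md_clear=True):
    """Derive the reported mitigation from the two debugfs switches."""
    if user_clear:
        # Assumption: VMWERV is the mode used when MD_CLEAR is not exposed.
        return "MDS_MITIGATION_FULL" if has_md_clear else "MDS_MITIGATION_VMWERV"
    if idle_clear:
        return "MDS_MITIGATION_IDLE"
    return "MDS_MITIGATION_OFF"
```

Enabling mds_idle_clear while mds_user_clear is set leaves the result unchanged, which matches the "only if mds_user_clear is disabled" condition in the text.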
Mihai Carabas [Mon, 8 Apr 2019 10:48:09 +0000 (13:48 +0300)]
x86/speculation/mds: fix microcode late loading
In the microcode late loading case we have to:
- clear the CPU bugs related to MDS to be re-evaluated
- add proper evaluation of the MDS state and enable mitigation if necessary.
If the user has enforced off or idle mitigation, we keep it. Also if the
microcode fixes the MDS bug, mitigation will be turned off.
Signed-off-by: Mihai Carabas <mihai.carabas@oracle.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Conflicts:
arch/x86/kernel/cpu/bugs.c
bugs.c vs bugs_64.c: we do not have arch_smt_update. Squash the check in mds_select_mitigation.
Signed-off-by: Mihai Carabas <mihai.carabas@oracle.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Conflicts:
Documentation/admin-guide/kernel-parameters.txt
arch/x86/kernel/cpu/bugs.c
bugs.c vs bugs_64.c: different boot command line parsing code
Documentation/admin-guide/kernel-parameters.txt vs Documentation/kernel-parameters.txt
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Jon Masters <jcm@redhat.com>
(cherry picked from commit 3b5d5994ef174daf5f77ba3f101cd879cf0c5fd6)
Move L1TF to a separate directory so the MDS stuff can be added at the
side. Otherwise all the hardware vulnerabilities have their own top level
entry. Should have done that right away.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Jon Masters <jcm@redhat.com>
(cherry picked from commit 19049ee5b2543696dd3cb164a4c4e566984f9615)
In virtualized environments it can happen that the host has the microcode
update which utilizes the VERW instruction to clear CPU buffers, but the
hypervisor is not yet updated to expose the X86_FEATURE_MD_CLEAR CPUID bit
to guests.
Introduce an internal mitigation mode VMWERV which enables the invocation
of the CPU buffer clearing even if X86_FEATURE_MD_CLEAR is not set. If the
system has no updated microcode this results in a pointless execution of
the VERW instruction wasting a few CPU cycles. If the microcode is updated,
but not exposed to a guest then the CPU buffers will be cleared.
That said: Virtual Machines Will Eventually Receive Vaccine
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Borislav Petkov <bp@suse.de> Reviewed-by: Jon Masters <jcm@redhat.com> Tested-by: Jon Masters <jcm@redhat.com>
(cherry picked from commit a2227b3f734fad5ace7c99103d6a7bc020c193fd)
Signed-off-by: Kanth Ghatraju <kanth.ghatraju@oracle.com> Reviewed-by: Mihai Carabas <mihai.carabas@oracle.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Conflicts:
The changes to arch/x86/kernel/cpu/bugs.c instead need to be made to
arch/x86/kernel/cpu/bugs_64.c.
Add the sysfs reporting file for MDS. It exposes the vulnerability and
mitigation state similar to the existing files for the other speculative
hardware vulnerabilities.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Borislav Petkov <bp@suse.de> Reviewed-by: Jon Masters <jcm@redhat.com> Tested-by: Jon Masters <jcm@redhat.com>
(cherry picked from commit db366061fff1f76407cb5d1b0975fcc381400cc3)
Signed-off-by: Kanth Ghatraju <kanth.ghatraju@oracle.com> Reviewed-by: Mihai Carabas <mihai.carabas@oracle.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Conflicts:
The changes to arch/x86/kernel/cpu/bugs.c instead need to be made to
arch/x86/kernel/cpu/bugs_64.c.
X86_HYPER_NATIVE doesn't exist so just leave that change out.
sched_smt_active() does not exist, instead use cpu_smt_control.
hypervisor_is_type replaced with cpu_has_hypervisor
Now that the mitigations are in place, add a command line parameter to
control the mitigation, a mitigation selector function and a SMT update
mechanism.
This is the minimal straightforward initial implementation which just
provides an always on/off mode. The command line parameter is:
mds=[full|off]
This is consistent with the existing mitigations for other speculative
hardware vulnerabilities.
The idle invocation is dynamically updated according to the SMT state of
the system similar to the dynamic update of the STIBP mitigation. The idle
mitigation is limited to CPUs which are only affected by MSBDS and not any
other variant, because the other variants cannot be mitigated on SMT
enabled systems.
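The dynamic idle update described above amounts to a two-input decision. This is a sketch only; idle_clear_enabled and its parameters are hypothetical names, not kernel symbols.

```python
def idle_clear_enabled(msbds_only, smt_active):
    """Decide whether CPU buffers should be cleared on idle entry."""
    if not msbds_only:
        # Other MDS variants cannot be mitigated with SMT enabled,
        # so clearing on idle would not help.
        return False
    # Store buffer repartitioning only matters while SMT is active.
    return smt_active
```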
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Borislav Petkov <bp@suse.de> Reviewed-by: Jon Masters <jcm@redhat.com> Tested-by: Jon Masters <jcm@redhat.com>
(cherry picked from commit 4cad86e4abd472f637038e0ad70a70d0d7333f83)
Signed-off-by: Kanth Ghatraju <kanth.ghatraju@oracle.com> Reviewed-by: Mihai Carabas <mihai.carabas@oracle.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Conflicts:
The changes to arch/x86/kernel/cpu/bugs.c instead need to be made to
arch/x86/kernel/cpu/bugs_64.c.
Add a static key which controls the invocation of the CPU buffer clear
mechanism on idle entry. This is independent of other MDS mitigations
because the idle entry invocation to mitigate the potential leakage due to
store buffer repartitioning is only necessary on SMT systems.
Add the actual invocations to the different halt/mwait variants which
covers all usage sites. mwaitx is not patched as it's not available on
Intel CPUs.
The buffer clear is only invoked before entering the C-State to prevent
stale data from the idling CPU from being spilled to the Hyper-Thread sibling
after the Store buffer got repartitioned and all entries are available to
the non idle sibling.
When coming out of idle the store buffer is partitioned again so each
sibling has half of it available. Now the CPU which returned from idle could be
speculatively exposed to contents of the sibling, but the buffers are
flushed either on exit to user space or on VMENTER.
When later on conditional buffer clearing is implemented on top of this,
then there is no action required either because before returning to user
space the context switch will set the condition flag which causes a flush
on the return to user path.
Note that the buffer clearing on idle is only sensible on CPUs which are
solely affected by MSBDS and not any other variant of MDS because the other
MDS variants cannot be mitigated when SMT is enabled, so the buffer
clearing on idle would be a window dressing exercise.
This intentionally does not handle the case in the acpi/processor_idle
driver which uses the legacy IO port interface for C-State transitions for
two reasons:
- The acpi/processor_idle driver was replaced by the intel_idle driver
almost a decade ago. Anything Nehalem upwards supports it and defaults
to that new driver.
- The legacy IO port interface is likely to be used on older and therefore
unaffected CPUs or on systems which do not receive microcode updates
anymore, so there is no point in adding that.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Borislav Petkov <bp@suse.de> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Reviewed-by: Jon Masters <jcm@redhat.com> Tested-by: Jon Masters <jcm@redhat.com>
(cherry picked from commit 7af0d7a8d40fa9dbf7700ed8e78f93c90b9a9e8b)
Signed-off-by: Kanth Ghatraju <kanth.ghatraju@oracle.com> Reviewed-by: Mihai Carabas <mihai.carabas@oracle.com>
Conflicts:
Ignore added comments in the nonexistent function in mwait.h
Port the changes in bugs.c to bugs_64.c.
Move #include <cpufeature.h> in alternative.h to resolve implicit definition error.
Signed-off-by: Kanth Ghatraju <kanth.ghatraju@oracle.com> Reviewed-by: Mihai Carabas <mihai.carabas@oracle.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Conflicts:
Changes from bugs.c imported to bugs_64.c
Add a static key which controls the invocation of the CPU buffer clear
mechanism on exit to user space and add the call into
prepare_exit_to_usermode() and do_nmi() right before actually returning.
Add documentation which kernel to user space transition this covers and
explain why some corner cases are not mitigated.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Borislav Petkov <bp@suse.de> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Reviewed-by: Jon Masters <jcm@redhat.com> Tested-by: Jon Masters <jcm@redhat.com>
(cherry picked from commit 62ba379c5925f285e3ab9362761c18823e5a049e)
Signed-off-by: Kanth Ghatraju <kanth.ghatraju@oracle.com> Reviewed-by: Mihai Carabas <mihai.carabas@oracle.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Conflicts:
UEK4 uses bugs_64.c instead of bugs.c.
arch/x86/entry/common.c doesn't exist.
Make the corresponding changes to arch/x86/kernel/entry_64.S.
The Microarchitectural Data Sampling (MDS) vulnerabilities are mitigated by
clearing the affected CPU buffers. The mechanism for clearing the buffers
uses the unused and obsolete VERW instruction in combination with a
microcode update which triggers a CPU buffer clear when VERW is executed.
Provide an inline function with the assembly magic. The argument of the VERW
instruction must be a memory operand as documented:
"MD_CLEAR enumerates that the memory-operand variant of VERW (for
example, VERW m16) has been extended to also overwrite buffers affected
by MDS. This buffer overwriting functionality is not guaranteed for the
register operand variant of VERW."
Documentation also recommends to use a writable data segment selector:
"The buffer overwriting occurs regardless of the result of the VERW
permission check, as well as when the selector is null or causes a
descriptor load segment violation. However, for lowest latency we
recommend using a selector that indicates a valid writable data
segment."
Add x86 specific documentation about MDS and the internal workings of the
mitigation.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Borislav Petkov <bp@suse.de> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Reviewed-by: Jon Masters <jcm@redhat.com> Tested-by: Jon Masters <jcm@redhat.com>
(cherry picked from commit 2f7a0abf72199b1c9cbaae99270343993b104921)
Signed-off-by: Kanth Ghatraju <kanth.ghatraju@oracle.com> Reviewed-by: Mihai Carabas <mihai.carabas@oracle.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Conflicts:
Added an assembly version of mds_clear_cpu_buffers
X86_FEATURE_MD_CLEAR is a new CPUID bit which is set when microcode
provides the mechanism to invoke a flush of various exploitable CPU buffers
by invoking the VERW instruction.
Hand it through to guests so they can adjust their mitigations.
This also requires corresponding qemu changes, which are available
separately.
[ tglx: Massaged changelog ]
Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Borislav Petkov <bp@suse.de> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Reviewed-by: Jon Masters <jcm@redhat.com> Tested-by: Jon Masters <jcm@redhat.com>
(cherry picked from commit 0908473b20312b30f2600e4b16027d6c7facef4a)
Signed-off-by: Kanth Ghatraju <kanth.ghatraju@oracle.com> Reviewed-by: Mihai Carabas <mihai.carabas@oracle.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Conflicts:
arch/x86/kvm/cpuid.c
Different initial content of cpuid bits.
This bug bit is set on CPUs which are only affected by Microarchitectural
Store Buffer Data Sampling (MSBDS) and not by any other MDS variant.
This is important because the Store Buffers are partitioned between
Hyper-Threads so cross thread forwarding is not possible. But if a thread
enters or exits a sleep state the store buffer is repartitioned which can
expose data from one thread to the other. This transition can be mitigated.
That means that for CPUs which are only affected by MSBDS SMT can be
enabled, if the CPU is not affected by other SMT sensitive vulnerabilities,
e.g. L1TF. The XEON PHI variants fall into that category. Also the
Silvermont/Airmont ATOMs, but for them it's not really relevant as they do
not support SMT, but mark them for completeness' sake.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Reviewed-by: Jon Masters <jcm@redhat.com> Tested-by: Jon Masters <jcm@redhat.com>
(cherry picked from commit 24a8b2c167461f94542ecce1b0e48b30a9ef1a3b)
Microarchitectural Data Sampling (MDS) is a class of side channel attacks
on internal buffers in Intel CPUs. The variants are:
- Microarchitectural Store Buffer Data Sampling (MSBDS) (CVE-2018-12126)
- Microarchitectural Fill Buffer Data Sampling (MFBDS) (CVE-2018-12130)
- Microarchitectural Load Port Data Sampling (MLPDS) (CVE-2018-12127)
MSBDS leaks Store Buffer Entries which can be speculatively forwarded to a
dependent load (store-to-load forwarding) as an optimization. The forward
can also happen to a faulting or assisting load operation for a different
memory address, which can be exploited under certain conditions. Store
buffers are partitioned between Hyper-Threads so cross thread forwarding is
not possible. But if a thread enters or exits a sleep state the store
buffer is repartitioned which can expose data from one thread to the other.
MFBDS leaks Fill Buffer Entries. Fill buffers are used internally to manage
L1 miss situations and to hold data which is returned or sent in response
to a memory or I/O operation. Fill buffers can forward data to a load
operation and also write data to the cache. When the fill buffer is
deallocated it can retain the stale data of the preceding operations which
can then be forwarded to a faulting or assisting load operation, which can
be exploited under certain conditions. Fill buffers are shared between
Hyper-Threads so cross thread leakage is possible.
MLPDS leaks Load Port Data. Load ports are used to perform load operations
from memory or I/O. The received data is then forwarded to the register
file or a subsequent operation. In some implementations the Load Port can
contain stale data from a previous operation which can be forwarded to
faulting or assisting loads under certain conditions, which again can be
exploited eventually. Load ports are shared between Hyper-Threads so cross
thread leakage is possible.
All variants have the same mitigation for single CPU thread case (SMT off),
so the kernel can treat them as one MDS issue.
Add the basic infrastructure to detect if the current CPU is affected by
MDS.
Greg pointed out that speculation related bit defines are using (1 << N)
format instead of BIT(N). Aside from that (1 << N) is wrong as it should use
1UL at least.
Clean it up.
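Why (1 << N) with a plain int is a problem can be shown by modelling C integer widths with ctypes. This is an illustration under assumptions: it models a 32-bit int being widened to a 64-bit register value, and int_shift is a hypothetical helper. In C itself, shifting into the sign bit of a signed int is not well defined; the wrap modelled here is the commonly observed result.

```python
import ctypes


def int_shift(n):
    """Model `1 << n` computed as a 32-bit signed int, then widened to u64."""
    as_int = ctypes.c_int32(1 << n).value  # wraps negative when n == 31
    return ctypes.c_uint64(as_int).value   # sign extension on widening


def BIT(n):
    """Model `1UL << n` on a 64-bit target."""
    return ctypes.c_uint64(1 << n).value
```

For small N both forms agree; at N = 31 the int form sets all upper bits after widening, while the BIT form yields the single intended bit.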
[ Josh Poimboeuf: Fix tools build ]
Reported-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Borislav Petkov <bp@suse.de> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Reviewed-by: Jon Masters <jcm@redhat.com> Tested-by: Jon Masters <jcm@redhat.com>
(cherry picked from commit 4b2fc235844db92090c944b92174afb341395c97)
Only CPUs which speculate can speculate. Therefore, it seems prudent
to test for cpu_no_speculation first and only then determine whether
a specific speculating CPU is susceptible to store bypass speculation.
This is underlined by the fact that all CPUs currently listed in
cpu_no_speculation were present in cpu_no_spec_store_bypass as well.
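The check order argued for above can be sketched with two sets. The model names used in the test are made-up placeholders, and cpu_has_ssb_bug is a hypothetical helper, not the kernel function.

```python
def cpu_has_ssb_bug(model, no_speculation, no_spec_store_bypass):
    """Test for non-speculating CPUs first, then for SSB-immune ones."""
    if model in no_speculation:
        return False  # a CPU that does not speculate cannot be affected
    return model not in no_spec_store_bypass
```

With this order, the second list only needs to name speculating CPUs that are immune to store bypass.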
Henry Willard [Thu, 14 Mar 2019 18:01:01 +0000 (11:01 -0700)]
x86/apic: Make arch_setup_hwirq NUMA node aware
This is a second attempt to fix bug 29292411. The first attempt ran
afoul of special behavior in cluster_vector_allocation_domain() explained
below. This was discovered by QA immediately before QU7 was to be released,
so the original was reverted.
In a xen VM with vNUMA enabled, irq affinity for a device on node 1
may become stuck on CPU 0. /proc/irq/nnn/smp_affinity_list may show
affinity for all the CPUs on node 1, but this is wrong. All interrupts
are on the first CPU of node 0 which is usually CPU 0.
The problem is caused when __assign_irq_vector() is called by
arch_setup_hwirq() with a mask of all online CPUs, and then called later
with a mask including only the node 1 CPUs. The first call assigns affinity
to CPU 0, and the second tries to move affinity to the first online node 1
CPU. In the reported case this is always CPU 2. For some reason, the
CPU 0 affinity is never cleaned up, and all interrupts remain with CPU 0.
Since an incomplete move appears to be in progress, all attempts to
reassign affinity for the irq fail. Because of a quirk in how affinity is
displayed in /proc/irq/nnn/smp_affinity_list, changes may appear to work
temporarily.
It was not reproducible on baremetal on the machine I had available for
testing, but it is possible that it was observed on other machines. It
does not appear in UEK5. The APIC and IRQ code is very different in UEK5,
and the code changed here doesn't exist in UEK5. Upstream has completely
abandoned the UEK4 approach to IRQ management. It is unknown whether KVM
guests might see the same problem with UEK4.
Making arch_setup_hwirq() NUMA sensitive eliminates the problem by
using the correct cpumask for the node for the initial assignment. The
second assignment becomes a noop. After initialization is complete,
affinity can be moved to any CPU on any node and back without a problem.
However, cluster_vector_allocation_domain() contains a hack designed to
reduce vector pressure in cluster x2apic. Specifically, it relies on
the address of the cpumask passed to it to determine if this allocation
is for the default device bringup case or explicit migration. If the
address of the cpumask does not match what is returned by
apic->target_cpus(), it assumes it is the explicit migration case and goes
into cluster mode which uses up vectors on multiple CPUs. Since the
original patch modifies arch_setup_hwirq() to pass a cpumask with only
local CPUs in it, cluster_vector_allocation_domain() allocates for the
entire cluster rather than a single CPU. This can cause vector allocation
failures when there are a very large number of devices such as can be the
case when there are a large number of VFs (see bug 29534769).
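The address comparison described above behaves like an identity test rather than an equality test. The sketch below models that in Python; ONLINE_CPUS, target_cpus and allocation_domain are stand-in names for illustration, and the cluster layout is invented.

```python
ONLINE_CPUS = frozenset({0, 1, 2, 3})


def target_cpus():
    """Stand-in for apic->target_cpus(): always returns the same object."""
    return ONLINE_CPUS


def allocation_domain(cpu, mask, cluster_of):
    """Pick the CPUs on which a vector gets allocated."""
    if mask is target_cpus():    # identity, like comparing addresses
        return {cpu}             # default bringup: a single CPU
    return set(cluster_of(cpu))  # treated as explicit migration: whole cluster
```

Passing a node-local mask, as the first fix attempt did, fails the identity test even during bringup, so vectors are consumed on every CPU of the cluster.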
Orabug: 29534769 Signed-off-by: Henry Willard <henry.willard@oracle.com> Reviewed-by: Shan Hai <shan.hai@oracle.com> Signed-off-by: Brian Maly <brian.maly@oracle.com>
Eric Biggers [Thu, 8 Jun 2017 13:48:18 +0000 (14:48 +0100)]
KEYS: encrypted: fix buffer overread in valid_master_desc()
With the 'encrypted' key type it was possible for userspace to provide a
data blob ending with a master key description shorter than expected,
e.g. 'keyctl add encrypted desc "new x" @s'. When validating such a
master key description, valid_master_desc() could read beyond the end
of the buffer. Fix this by using strncmp() instead of memcmp(). [Also
clean up the code to deduplicate some logic.]
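The difference between the two comparisons can be modelled with a bounds-checked buffer. This is a simulation, not the kernel code: Overread, read_byte, memcmp_equal and strncmp_equal are hypothetical names, and the "trusted:" prefix is assumed to be the one being matched.

```python
class Overread(Exception):
    """Raised when the model reads past the end of the buffer."""


def read_byte(buf, i):
    if i >= len(buf):
        raise Overread(i)
    return buf[i]


def memcmp_equal(buf, start, prefix):
    # memcmp() may examine all len(prefix) bytes, wherever the string ends.
    window = bytes(read_byte(buf, start + k) for k in range(len(prefix)))
    return window == prefix


def strncmp_equal(buf, start, prefix):
    # strncmp() stops at the first difference, which includes the NUL
    # terminator of a description shorter than the prefix.
    for k, want in enumerate(prefix):
        if read_byte(buf, start + k) != want:
            return False
    return True
```

For a buffer holding only "tru" plus its terminator, the strncmp model returns False after four bytes, while the memcmp model runs off the end.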
Cc: Mimi Zohar <zohar@linux.vnet.ibm.com> Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: James Morris <james.l.morris@oracle.com>
(cherry picked from commit 794b4bc292f5d31739d89c0202c54e7dc9bc3add)
Reviewed-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com> Signed-off-by: Allen Pais <allen.pais@oracle.com> Signed-off-by: Brian Maly <brian.maly@oracle.com>
Signed-off-by: Alan Adamson <alan.adamson@oracle.com> Reviewed-by: Bijan Mottahedeh <bijan.mottahedeh@oracle.com> Signed-off-by: Brian Maly <brian.maly@oracle.com>
Alan Adamson [Fri, 29 Mar 2019 18:01:30 +0000 (11:01 -0700)]
scsi: target: add device vendor id, product id and revision configfs attributes
The vendor_id, product_id and revision attributes will allow for the
modification of the T10 Model and Revision strings returned in
inquiry responses. Their values can be viewed and modified via the
ConfigFS path at:
Signed-off-by: Alan Adamson <alan.adamson@oracle.com> Reviewed-by: Bijan Mottahedeh <bijan.mottahedeh@oracle.com> Signed-off-by: Brian Maly <brian.maly@oracle.com>
In preparation for supporting user provided vendor strings, add an extra
byte to the vendor, model and revision arrays in struct t10_wwn. This
ensures that the full INQUIRY data can be carried in the arrays along with
a null-terminator.
Change a number of array readers and writers so that they account for
explicit null-termination:
- The pscsi_set_inquiry_info() and emulate_model_alias_store() codepaths
don't currently explicitly null-terminate; fix this.
- Existing t10_wwn field dumps use for-loops which step over
null-terminators for right-padding.
+ Use printf with width specifiers instead.
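A width specifier replaces the per-character padding loop. The sketch below uses Python's printf-style formatting to show the idea; dump_field is a hypothetical helper, and the 8-byte width is the standard INQUIRY vendor field size.

```python
def dump_field(label, raw, width):
    """Format a null-terminated field left-aligned in a fixed width."""
    text = raw.split(b"\x00", 1)[0].decode("ascii")  # stop at the terminator
    # "%-*.*s": left-align, pad to `width`, and never print more than `width`.
    return "%s: %-*.*s" % (label, width, width, text)
```

For example, dump_field("Vendor", b"LIO-ORG\x00\x00", 8) pads the seven-character string with one trailing space.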
Signed-off-by: Alan Adamson <alan.adamson@oracle.com> Reviewed-by: Bijan Mottahedeh <bijan.mottahedeh@oracle.com> Signed-off-by: Brian Maly <brian.maly@oracle.com>
David Disseldorp [Wed, 5 Dec 2018 12:18:34 +0000 (13:18 +0100)]
scsi: target: use consistent left-aligned ASCII INQUIRY data
spc5r17.pdf specifies:
4.3.1 ASCII data field requirements
ASCII data fields shall contain only ASCII printable characters (i.e.,
code values 20h to 7Eh) and may be terminated with one or more ASCII null
(00h) characters. ASCII data fields described as being left-aligned
shall have any unused bytes at the end of the field (i.e., highest
offset) and the unused bytes shall be filled with ASCII space characters
(20h).
LIO currently space-pads the T10 VENDOR IDENTIFICATION and PRODUCT
IDENTIFICATION fields in the standard INQUIRY data. However, the PRODUCT
REVISION LEVEL field in the standard INQUIRY data as well as the T10 VENDOR
IDENTIFICATION field in the INQUIRY Device Identification VPD Page are
zero-terminated/zero-padded.
Fix this inconsistency by using space-padding for all of the above fields.
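Space-padding a left-aligned ASCII field, as required by the SPC text quoted above, can be sketched as follows. The field widths are the standard INQUIRY sizes; left_aligned_ascii is a hypothetical helper, not the LIO function.

```python
INQUIRY_VENDOR_LEN = 8     # T10 VENDOR IDENTIFICATION
INQUIRY_PRODUCT_LEN = 16   # PRODUCT IDENTIFICATION
INQUIRY_REVISION_LEN = 4   # PRODUCT REVISION LEVEL


def left_aligned_ascii(value, width):
    """Return `value` truncated to `width` and right-padded with 20h."""
    data = value.encode("ascii")[:width]
    return data.ljust(width, b" ")
```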
Signed-off-by: David Disseldorp <ddiss@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Bryant G. Ly <bly@catalogicsoftware.com> Reviewed-by: Lee Duncan <lduncan@suse.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Roman Bolshakov <r.bolshakov@yadro.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 0de263577de5d5e052be5f4f93334e63cc8a7f0b) Signed-off-by: Brian Maly <brian.maly@oracle.com>
Conflicts:
drivers/target/target_core_spc.c
Signed-off-by: Alan Adamson <alan.adamson@oracle.com> Reviewed-by: Bijan Mottahedeh <bijan.mottahedeh@oracle.com> Signed-off-by: Brian Maly <brian.maly@oracle.com>
Ext4 needs to serialize unaligned direct AIO because the zeroing of
partial blocks of two competing unaligned AIOs can result in data
corruption.
However it decides not to serialize if the potentially unaligned aio is
past i_size with the rationale that no pending writes are possible past
i_size. Unfortunately if the i_size is not block aligned and the second
unaligned write lands past i_size, but still into the same block, it has
the potential of corrupting the previous unaligned write to the same
block.
Without this patch the 512B range from 40960 up to the start of the
second unaligned write (41472) is going to be zeroed overwriting the data
written by the first write. This is a data corruption.
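The arithmetic behind the numbers above can be checked with a short sketch. The 4096-byte block size, the 512-byte write length and the i_size of 41472 are assumptions made for illustration. Comparing against i_size rounded up to the block size is this sketch's reading of the fix, not a quote of the patch.

```python
def round_up(value, align):
    return (value + align - 1) // align * align


def is_unaligned(pos, length, block_size):
    return pos % block_size != 0 or (pos + length) % block_size != 0


def serialize_old(pos, length, i_size, block_size):
    # Old rule: writes starting at or past i_size are never serialized.
    return is_unaligned(pos, length, block_size) and pos < i_size


def serialize_new(pos, length, i_size, block_size):
    # Assumed fix: also serialize while still inside the block holding i_size.
    return is_unaligned(pos, length, block_size) and pos < round_up(i_size, block_size)


def zeroed_head(pos, block_size):
    """Byte range zeroed in front of an unaligned write within its block."""
    return (pos // block_size * block_size, pos)
```

With a write at 41472 and 4096-byte blocks, zeroed_head() returns (40960, 41472), the 512B range mentioned in the text.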
swiotlb: checking whether swiotlb buffer is full with io_tlb_used
This patch uses io_tlb_used to help check whether swiotlb buffer is full.
io_tlb_used is no longer used only for debugfs. It is also used to help
optimize swiotlb_tbl_map_single().
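The early full check can be sketched with a simple counter. This is a simplified model under assumptions: SwiotlbPool is a hypothetical class, and the real allocator must still search for a contiguous run of free slots after the check passes.

```python
class SwiotlbPool:
    def __init__(self, nslabs):
        self.nslabs = nslabs  # total number of slots
        self.used = 0         # slots currently mapped (io_tlb_used)

    def map_single(self, nslots):
        """Return True if the request was accounted, False if the pool is full."""
        if nslots > self.nslabs - self.used:
            return False      # fail fast, no need to scan the slot table
        self.used += nslots
        return True

    def unmap_single(self, nslots):
        self.used -= nslots
```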
Suggested-by: Joe Jin <joe.jin@oracle.com> Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Orabug: 29582587
(cherry picked from commit 60513ed06a41049768a6875229b025b6e726e148)
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Brian Maly <brian.maly@oracle.com>
swiotlb: add debugfs to track swiotlb buffer usage
The device driver will not be able to do dma operations once the swiotlb buffer
is full, either because the driver has too many IO TLB blocks in flight,
or because there is a memory leak in the device driver. Exporting the
swiotlb buffer usage via debugfs helps the user estimate the size of the
swiotlb buffer to pre-allocate or analyze a device driver memory leak.
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Orabug: 29582587
(cherry picked from commit 71602fe6d4e9291af105adfef8e893b57c735906) Signed-off-by: Brian Maly <brian.maly@oracle.com>
Conflicts:
debugfs_create_ulong() is not available. debugfs_create_file() is used instead.
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Brian Maly <brian.maly@oracle.com>
Fix the comment: swiotlb_bounce() copies from the original DMA location
to the swiotlb buffer during swiotlb_tbl_map_single(), and from the
swiotlb buffer back to the original DMA location during
swiotlb_tbl_unmap_single().
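The two copy directions can be illustrated with a small user-space sketch.
The enum and the bounce() helper below are hypothetical simplifications; the
kernel function works on physical addresses and enum dma_data_direction.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical direction flag standing in for the DMA direction. */
enum bounce_dir {
	BOUNCE_TO_DEVICE,	/* map: original buffer -> bounce buffer */
	BOUNCE_FROM_DEVICE,	/* unmap: bounce buffer -> original buffer */
};

/* Copy size bytes between the original buffer and the bounce buffer. */
void bounce(void *orig, void *tlb, size_t size, enum bounce_dir dir)
{
	if (dir == BOUNCE_TO_DEVICE)
		memcpy(tlb, orig, size);
	else
		memcpy(orig, tlb, size);
}
```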
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Orabug: 29582587
(cherry picked from commit 6442ca2abf882d9d838fb844d852ba6acd1db7f4)
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Brian Maly <brian.maly@oracle.com>
NeilBrown [Mon, 19 Dec 2016 00:19:31 +0000 (11:19 +1100)]
NFSv4.1: nfs4_fl_prepare_ds must be careful about reporting success.
Various places assume that if nfs4_fl_prepare_ds() returns a non-NULL 'ds',
then ds->ds_clp will also be non-NULL.
This is not necessarily true when the process received a fatal signal
while nfs4_pnfs_ds_connect() was waiting in nfs4_wait_ds_connect().
In that case ->ds_clp may not be set, and the devid may not recently have been marked
unavailable.
So add a test for ds_clp == NULL and return NULL in that case.
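The shape of the added test can be sketched as follows. The structures and
the prepare_ds() helper are hypothetical, stripped-down stand-ins for the
pNFS data structures, shown only to illustrate the guard:

```c
#include <stddef.h>

/* Hypothetical stand-ins for the client and data server structures. */
struct client {
	int id;
};

struct data_server {
	struct client *ds_clp;	/* NULL until the connection is established */
};

/*
 * Return ds only when it is fully usable. A data server whose connection
 * attempt was interrupted may exist with ds_clp still unset, and callers
 * dereference ds_clp without checking.
 */
struct data_server *prepare_ds(struct data_server *ds)
{
	if (ds == NULL || ds->ds_clp == NULL)
		return NULL;
	return ds;
}
```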
Fixes: c23266d532b4 ("NFS4.1 Fix data server connection race") Signed-off-by: NeilBrown <neilb@suse.com> Acked-by: Olga Kornievskaia <aglo@umich.edu> Acked-by: Adamson, Andy <William.Adamson@netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
(cherry picked from commit cfd278c280f997cf2fe4662e0acab0fe465f637b)
Orabug: 29617508 Signed-off-by: Calum Mackay <calum.mackay@oracle.com> Reviewed-by: John Sobecki <john.sobecki@oracle.com> Signed-off-by: Brian Maly <brian.maly@oracle.com>
Mukesh Kacker [Wed, 20 Feb 2019 19:10:04 +0000 (11:10 -0800)]
ib_core: initialize shpd field when allocating 'struct ib_pd'
The shared pd feature in Oracle Linux was added by
commit a1911c2c180d ("IB/Shared PD support from Oracle").
It adds a field named shpd to 'struct ib_pd' but fails to
initialize it in all the places it is allocated.
This results in its uninitialized content being referenced
in mlx4_ib_dealloc_pd() and actions being taken based on it,
which eventually leads to resource leaks even when the shared
pd feature is not being used.
This fix initializes it to NULL in ib_alloc_pd(), where
the ib_core module allocates the data structure.
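The pattern can be sketched in user-space C. The structure and function
names below are hypothetical, not the ib_core API; the sketch only shows why
a pointer field must be set explicitly when the allocator does not zero
memory.

```c
#include <stdlib.h>

/* Hypothetical stand-ins for the shared PD and protection domain objects. */
struct shared_pd {
	int refs;
};

struct prot_domain {
	int handle;
	struct shared_pd *shpd;	/* NULL when the PD is not shared */
};

/* Allocate a protection domain with every field initialized. */
struct prot_domain *alloc_pd(int handle)
{
	struct prot_domain *pd = malloc(sizeof(*pd));

	if (pd == NULL)
		return NULL;
	pd->handle = handle;
	/* malloc() does not zero memory, so set the pointer explicitly. */
	pd->shpd = NULL;
	return pd;
}

/* Free a protection domain; return 1 if it took the shared path, else 0. */
int dealloc_pd(struct prot_domain *pd)
{
	int was_shared = (pd->shpd != NULL);

	free(pd);
	return was_shared;
}
```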
qlcnic: fix Tx descriptor corruption on 82xx devices
In the regular NIC transmission flow, the driver always configures the MAC
using the Tx queue zero descriptor as part of the MAC learning flow.
But on a NIC with multiple Tx queues, regular transmission can occur on
any non-zero Tx queue, and from that context the driver still uses the
Tx queue zero descriptor to configure the MAC. At the same time, Tx queue
zero could be in use by another CPU for regular transmission, which
could lead to Tx queue zero descriptor corruption and cause a FW abort.
This patch fixes this by making the driver always configure the
learned MAC address from the same Tx queue that is used for
regular transmission.
Fixes: 7e2cf4feba05 ("qlcnic: change driver hardware interface mechanism") Signed-off-by: Shahed Shaikh <shahed.shaikh@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Orabug: 27708787
(cherry picked from commit c333fa0c4f220f8f7ea5acd6b0ebf3bf13fd684d) Signed-off-by: aru kolappan <aru.kolappan@oracle.com> Reviewed-by: Jack Vogel <jack.vogel@oracle.com> Signed-off-by: Brian Maly <brian.maly@oracle.com>
Reviewed-by: Junxiao Bi <junxiao.bi@oracle.com> Signed-off-by: Jianchao Wang <jianchao.w.wang@oracle.com> Signed-off-by: Brian Maly <brian.maly@oracle.com>
Oliver Hartkopp [Fri, 4 Jan 2019 14:55:26 +0000 (15:55 +0100)]
can: gw: ensure DLC boundaries after CAN frame modification
Muyu Yu provided a POC where user root with CAP_NET_ADMIN can create a CAN
frame modification rule that makes the data length code a higher value than
the available CAN frame data size. In combination with a configured checksum
calculation where the result is stored relative to the end of the data
(e.g. cgw_csum_xor_rel) the tail of the skb (e.g. frag_list pointer in
skb_shared_info) can be rewritten, which can ultimately cause a system crash.
Michal Kubecek suggested dropping frames that have a DLC exceeding the
available space after the modification process and provided a patch that can
handle CAN FD frames too. Within this patch we also limit the length for the
checksum calculations to the maximum of Classic CAN data length (8).
CAN frames that are dropped by these additional checks are counted with the
CGW_DELETED counter which indicates misconfigurations in can-gw rules.
This fixes CVE-2019-3701.
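The checks described above can be sketched in user-space C. Everything here
is a simplified assumption: struct frame, struct gw_stats and
frame_ok_after_modification() are hypothetical names, and max_len stands for
the data space available for the frame type being processed.

```c
#include <stdbool.h>
#include <stdint.h>

#define CLASSIC_CAN_MAX_DLEN 8

/* Hypothetical, simplified frame: only the length code matters here. */
struct frame {
	uint8_t dlc;
	uint8_t data[CLASSIC_CAN_MAX_DLEN];
};

/* Hypothetical counter playing the role of CGW_DELETED. */
struct gw_stats {
	unsigned long deleted;
};

/*
 * Return true if a modified frame may be forwarded. A frame is dropped and
 * counted when its DLC exceeds the available data space, or when a checksum
 * is configured and the DLC exceeds the Classic CAN maximum.
 */
bool frame_ok_after_modification(const struct frame *f, unsigned int max_len,
				 bool csum_configured, struct gw_stats *stats)
{
	if (f->dlc > max_len ||
	    (csum_configured && f->dlc > CLASSIC_CAN_MAX_DLEN)) {
		stats->deleted++;
		return false;
	}
	return true;
}
```

Running the check after the modification step, rather than when the rule is
installed, matters because the resulting DLC depends on the incoming frame
as well as on the rule.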
Reported-by: Muyu Yu <ieatmuttonchuan@gmail.com> Reported-by: Marcus Meissner <meissner@suse.de> Suggested-by: Michal Kubecek <mkubecek@suse.cz> Tested-by: Muyu Yu <ieatmuttonchuan@gmail.com> Tested-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Cc: linux-stable <stable@vger.kernel.org> # >= v3.2 Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
Orabug: 29215299
CVE: CVE-2019-3701
(cherry picked from commit 0aaa81377c5a01f686bcdb8c7a6929a7bf330c68) Signed-off-by: Dan Duval <dan.duval@oracle.com> Reviewed-by: Jack Vogel <jack.vogel@oracle.com> Signed-off-by: Brian Maly <brian.maly@oracle.com>
Signed-off-by: John Donnelly <John.P.Donnelly@oracle.com> Reviewed-by: Allen Pais <allen.pais@oracle.com> Signed-off-by: Brian Maly <brian.maly@oracle.com>
Conflicts:
fs/cifs/smb2pdu.c
When converting an inode from storing the data in-line to a data
block, ext4_destroy_inline_data_nolock() was only clearing the on-disk
copy of the i_blocks[] array. It was not clearing the copy of the
i_blocks[] in ext4_inode_info, in i_data[], which is the copy actually
used by ext4_map_blocks().
This didn't matter much if we are using extents, since the extents
header would be invalid and thus the extents code would re-initialize
the extents tree. But if we are using indirect blocks, the previous
contents of the i_blocks array will be treated as block numbers, with
potentially catastrophic results to the file system integrity and/or
user data.
This gets worse if the file system is using a 1k block size and
s_first_data is zero, but even without this, the file system can get
quite badly corrupted.
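A minimal sketch of the idea, assuming simplified structures (struct
raw_inode, struct inode_info and destroy_inline_data() are hypothetical
stand-ins, not the ext4 definitions): both the on-disk copy and the
in-memory copy have to be cleared, because the block mapping code reads the
in-memory one.

```c
#include <stdint.h>
#include <string.h>

#define N_BLOCKS 15

/* Hypothetical on-disk inode: holds the block array that also stores inline data. */
struct raw_inode {
	uint32_t i_block[N_BLOCKS];
};

/* Hypothetical in-memory inode: holds the copy used for block mapping. */
struct inode_info {
	uint32_t i_data[N_BLOCKS];
};

/* Clear both copies so stale inline bytes are never read as block numbers. */
void destroy_inline_data(struct raw_inode *raw, struct inode_info *ei)
{
	memset(raw->i_block, 0, sizeof(raw->i_block));
	memset(ei->i_data, 0, sizeof(ei->i_data));
}
```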
Signed-off-by: John Donnelly <John.P.Donnelly@oracle.com> Reviewed-by: Jack Vogel <jack.vogel@oracle.com> Signed-off-by: Brian Maly <brian.maly@oracle.com>
Signed-off-by: John Donnelly <John.P.Donnelly@oracle.com> Reviewed-by: Allen Pais <allen.pais@oracle.com> Signed-off-by: Brian Maly <brian.maly@oracle.com>
Conflicts:
fs/ext4/inode.c
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Reviewed-by: Mihai Carabas <mihai.carabas@oracle.com> Signed-off-by: Brian Maly <brian.maly@oracle.com>