Some displays are not lighting up when put in LTTPR Transparent Mode
Signed-off-by: Wesley Chalmers <Wesley.Chalmers@amd.com> Reviewed-by: Jun Lei <Jun.Lei@amd.com> Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Nicholas Kazlauskas [Wed, 16 Jun 2021 21:11:12 +0000 (17:11 -0400)]
drm/amd/display: Fix updating infoframe for DCN3.1 eDP
[Why]
We're only treating TMDS as a valid target for infoframe updates which
results in PSR being unable to transition from state 4 to state 5.
[How]
Also allow infoframe updates for DCN3.1 - following how we handle
this path for earlier ASIC as well.
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Reviewed-by: Eric Yang <eric.yang2@amd.com> Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Stylon Wang [Sat, 29 May 2021 06:19:20 +0000 (14:19 +0800)]
drm/amd/display: Add Freesync HDMI support to DM with DMUB
[Why]
Changes in DM needed to support Freesync HDMI on DMUB.
[How]
Change implementation to parse CEA blocks in case of DMUB-enabled ASICs.
Signed-off-by: Stylon Wang <stylon.wang@amd.com> Reviewed-by: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com> Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Wang [Tue, 15 Jun 2021 21:22:57 +0000 (17:22 -0400)]
drm/amd/display: Add null checks
Added NULL checks before two problematic statements
Signed-off-by: Wang <anguwang@amd.com> Reviewed-by: Martin Leung <Martin.Leung@amd.com> Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
dmub would notify x86 response time violation by GPINT_DATAOUT
[How]
1. Use GPINT_DATAOUT to trigger x86 interrupt
2. Register GPINT_DATAOUT interrupt handler.
3. Trigger ACR while GPINT_DATAOUT occurred.
Signed-off-by: Chun-Liang Chang <Chun-Liang.Chang@amd.com> Reviewed-by: Jun Lei <Jun.Lei@amd.com> Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Wenjing Liu [Tue, 4 May 2021 20:39:08 +0000 (16:39 -0400)]
drm/amd/display: isolate link training setting override to its own function
There is a difference between our default behavior and override
behavior. For default behavior we need to decide link training settings
within specs' limitation and mandates.
For override behavior we do not need to follow all these requirements.
We are isolating override decision to its own function to maintain the
integrity of our specs compliant default behavior.
Signed-off-by: Wenjing Liu <wenjing.liu@amd.com> Reviewed-by: George Shen <George.Shen@amd.com> Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
In amdgpu_ras_query_error_count() return an error
if the device doesn't support RAS. This prevents
that function from having to always set the values
of the integer pointers (if set), and thus
prevents function side effects--always to have to
set values of integers if integer pointers set,
regardless of whether RAS is supported or
not--with this change this side effect is
mitigated.
Also, if no pointers are set, don't count, since
we've no way of reporting the counts.
Also, give this function a kernel-doc.
Cc: Alexander Deucher <Alexander.Deucher@amd.com> Cc: John Clements <john.clements@amd.com> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Reported-by: Tom Rix <trix@redhat.com> Fixes: a46751fbcde505 ("drm/amdgpu: Fix RAS function interface") Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Reviewed-by: Alexander Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm/amdgpu: The I2C IP doesn't support 0 writes/reads
The I2C IP doesn't support writes or reads of 0 bytes.
In order for a START/STOP transaction to take
place on the bus, the data written/read has to be
at least one byte.
That is, you cannot generate a write with 0 bytes,
just to get the ACK from a device, just so you can
probe that device if it is on the bus and so to
discover all devices on the bus--you'll have to
read at least one byte. Writes of 0 bytes generate
no START/STOP on this I2C IP--the bus is not
engaged at all.
Set the I2C_AQ_NO_ZERO_LEN to the existing I2C
quirk tables for Aldebaran, Arcturus, Navi10 and
Sienna Cichlid, and add a quirk table to the I2C
driver which drives the bus when the SMU
doesn't--for instance on Vega20.
YuBiao Wang [Tue, 29 Jun 2021 03:21:25 +0000 (11:21 +0800)]
drm/amdgpu: Read clock counter via MMIO to reduce delay (v5)
[Why]
GPU timing counters are read via KIQ under sriov, which will introduce
a delay.
[How]
It could be directly read by MMIO.
v2: Add additional check to prevent carryover issue.
v3: Only check for carryover for once to prevent performance issue.
v4: Add comments of the rough frequency where carryover happens.
v5: Remove mutex and gfxoff ctrl unused with current timing registers.
Signed-off-by: YuBiao Wang <YuBiao.Wang@amd.com> Acked-by: Horace Chen <horace.chen@amd.com> Reviewed-by: Christian König <christian.koenig@amd.co> Reviewed-by: Monk Liu <monk.liu@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Eric Huang [Wed, 30 Jun 2021 21:58:12 +0000 (17:58 -0400)]
drm/amdkfd: Only apply TLB flush optimization on ALdebaran
It is based on reverting two patches back.
drm/amdkfd: Make TLB flush conditional on mapping
drm/amdgpu: Add table_freed parameter to amdgpu_vm_bo_update
Signed-off-by: Eric Huang <jinhuieric.huang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Nirmoy Das [Fri, 2 Jul 2021 09:10:52 +0000 (11:10 +0200)]
drm/amdgpu: separate out vm pasid assignment
Use new helper function amdgpu_vm_set_pasid() to
assign vm pasid value. This also ensures that we don't free
a pasid from vm code as pasids are allocated somewhere else.
Signed-off-by: Nirmoy Das <nirmoy.das@amd.com> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Nirmoy Das [Mon, 28 Jun 2021 21:29:39 +0000 (23:29 +0200)]
drm/amdgpu: use xarray for storing pasid in vm
Replace idr with xarray as we actually need hash functionality.
Cleanup code related to vm pasid by adding helper function.
Signed-off-by: Nirmoy Das <nirmoy.das@amd.com> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Jiri Kosina [Thu, 24 Jun 2021 11:11:36 +0000 (13:11 +0200)]
drm/amdgpu: Avoid printing of stack contents on firmware load error
In case when psp_init_asd_microcode() fails to load ASD microcode file,
psp_v12_0_init_microcode() tries to print the firmware filename that
failed to load before bailing out.
This is wrong because:
- the firmware filename it would want it print is an incorrect one as
psp_init_asd_microcode() and psp_v12_0_init_microcode() are loading
different filenames
- it tries to print fw_name, but that's not yet been initialized by that
time, so it prints random stack contents, e.g.
amdgpu 0000:04:00.0: Direct firmware load for amdgpu/renoir_asd.bin failed with error -2
amdgpu 0000:04:00.0: amdgpu: fail to initialize asd microcode
amdgpu 0000:04:00.0: amdgpu: psp v12.0: Failed to load firmware "\xfeTO\x8e\xff\xff"
Fix that by bailing out immediately, instead of priting the bogus error
message.
It is not true (as stated in the reverted commit changelog) that we never
unmap the BAR on failure; it actually does happen properly on
amdgpu_driver_load_kms() -> amdgpu_driver_unload_kms() ->
amdgpu_device_fini() error path.
What's worse, this commit actually completely breaks resource freeing on
probe failure (like e.g. failure to load microcode), as
amdgpu_driver_unload_kms() notices adev->rmmio being NULL and bails too
early, leaving all the resources that'd normally be freed in
amdgpu_acpi_fini() and amdgpu_device_fini() still hanging around, leading
to all sorts of oopses when someone tries to, for example, access the
sysfs and procfs resources which are still around while the driver is
gone.
Fixes: 4192f7b57689 ("drm/amdgpu: unmap register bar on device init failure") Reported-by: Vojtech Pavlik <vojtech@ucw.cz> Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Lukas Bulwahn [Mon, 28 Jun 2021 10:53:34 +0000 (12:53 +0200)]
drm/amdgpu: rectify line endings in umc v8_7_0 IP headers
Commit 6b36fa6143f6 ("drm/amdgpu: add umc v8_7_0 IP headers") adds the new
file ./drivers/gpu/drm/amd/include/asic_reg/umc/umc_8_7_0_sh_mask.h with
DOS line endings, which is very uncommon for the kernel repository.
Rectify the line endings in this file with dos2unix.
Identified by a checkpatch evaluation on the whole kernel repository and
spot-checking for really unexpected checkpatch rule violations.
Reported-by: Dwaipayan Ray <dwaipayanray1@gmail.com> Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Luben Tuikov [Fri, 11 Jun 2021 06:15:49 +0000 (02:15 -0400)]
drm/amdgpu: Correctly disable the I2C IP block
On long transfers to the EEPROM device,
i.e. write, it is observed that the driver aborts
the transfer.
The reason for this is that the driver isn't
patient enough--the IC_STATUS register's contents
is 0x27, which is MST_ACTIVITY | TFE | TFNF |
ACTIVITY. That is, while the transmission FIFO is
empty, we, the I2C master device, are still
driving the bus.
Implement the correct procedure to disable
the block, as described in the DesignWare I2C
Databook, section 3.8.3 Disabling DW_apb_i2c on
page 56. Now there are no premature aborts on long
data transfers.
Cc: Alexander Deucher <Alexander.Deucher@amd.com> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Acked-by: Alexander Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Debugfs RAS EEPROM files are available when
the ASIC supports RAS, and when the debugfs is
enabled, an also when "ras_enable" module
parameter is set to 0. However in this case,
we get a kernel oops when accessing some of
the "ras_..." controls in debugfs. The reason
for this is that struct amdgpu_ras::adev is
unset. This commit sets it, thus enabling access
to those facilities. Note that this facilitates
EEPROM access and not necessarily RAS features or
functionality.
Cc: Alexander Deucher <Alexander.Deucher@amd.com> Cc: John Clements <john.clements@amd.com> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Acked-by: Alexander Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Wed, 30 Jun 2021 15:19:14 +0000 (11:19 -0400)]
drm/amdgpu: fix 64 bit divide in eeprom code
pos is 64 bits.
Fixes: c65b0805e77919 ("drm/amdgpu: RAS EEPROM table is now in debugfs") Cc: luben.tuikov@amd.com Reported-by: kernel test robot <lkp@intel.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add "ras_eeprom_size" file in debugfs, which
reports the maximum size allocated to the RAS
table in EEROM, as the number of bytes and the
number of records it could store. For instance,
$cat /sys/kernel/debug/dri/0/ras/ras_eeprom_size
262144 bytes or 10921 records
$_
Add "ras_eeprom_table" file in debugfs, which
dumps the RAS table stored EEPROM, in a formatted
way. For instance,
Split functionality between read and write, which
simplifies the code and exposes areas of
optimization and more or less complexity, and take
advantage of that.
Read and write the table in one go; use a separate
stage to decode or encode the data, as opposed to
on the fly, which keeps the I2C bus busy. Use a
single read/write to read/write the table or at
most two if the number of records we're
reading/writing wraps around.
Check the check-sum of a table in EEPROM on init.
Update the checksum at the same time as when
updating the table header signature, when the
threshold was increased on boot.
Take advantage of arithmetic modulo 256, that is,
use a byte!, to greatly simplify checksum
arithmetic.
Cc: Alexander Deucher <Alexander.Deucher@amd.com> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Acked-by: Alexander Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Luben Tuikov [Sat, 27 Mar 2021 07:57:49 +0000 (03:57 -0400)]
drm/amdgpu: Use explicit cardinality for clarity
RAS_MAX_RECORD_NUM may mean the maximum record
number, as in the maximum house number on your
street, or it may mean the maximum number of
records, as in the count of records, which is also
a number. To make this distinction whether the
number is ordinal (index) or cardinal (count),
rename this macro to RAS_MAX_RECORD_COUNT.
This makes it easy to understand what it refers
to, especially when we compute quantities such as,
how many records do we have left in the table,
especially when there are so many other numbers,
quantities and numerical macros around.
Also rename the long,
amdgpu_ras_eeprom_get_record_max_length() to the
more succinct and clear,
amdgpu_ras_eeprom_max_record_count().
When computing the threshold, which also deals
with counts, i.e. "how many", use cardinal
"max_eeprom_records_count", than the quantitative
"max_eeprom_records_len".
Simplify the logic here and there, as well.
Cc: Guchun Chen <guchun.chen@amd.com> Cc: John Clements <john.clements@amd.com> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Cc: Alexander Deucher <Alexander.Deucher@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Acked-by: Alexander Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Luben Tuikov [Thu, 11 Mar 2021 15:34:28 +0000 (10:34 -0500)]
drm/amdgpu: Return result fix in RAS
The low level EEPROM write method, doesn't return
1, but the number of bytes written. Thus do not
compare to 1, instead, compare to greater than 0
for success.
Other cleanup: if the lower layers returned
-errno, then return that, as opposed to
overwriting the error code with one-fits-all
-EINVAL. For instance, some return -EAGAIN.
Cc: Jean Delvare <jdelvare@suse.de> Cc: Alexander Deucher <Alexander.Deucher@amd.com> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> Cc: Lijo Lazar <Lijo.Lazar@amd.com> Cc: Stanley Yang <Stanley.Yang@amd.com> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Reviewed-by: Alexander Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Luben Tuikov [Tue, 9 Mar 2021 22:29:22 +0000 (17:29 -0500)]
drm/amd/pm: Simplify managed I2C transfer functions
Now that we have an I2C quirk table for
SMU-managed I2C controllers, the I2C core does the
checks for us, so we don't need to do them, and so
simplify the managed I2C transfer functions.
Also, for Arcturus and Navi10, fix setting the
command type from "cmd->CmdConfig" to "cmd->Cmd".
The latter is what appears to be taking in
the enumeration I2C_CMD_... as an integer,
not a bit-flag.
For Sienna, the "Cmd" field seems to have been
eliminated, and command type and flags all live in
the "CmdConfig" field--this is left untouched.
Fix: Detect and add changing of direction
bit-flag, as this is necessary for the SMU to
detect the direction change in the 1-d array of
data it gets.
Cc: Jean Delvare <jdelvare@suse.de> Cc: Alexander Deucher <Alexander.Deucher@amd.com> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> Cc: Lijo Lazar <Lijo.Lazar@amd.com> Cc: Stanley Yang <Stanley.Yang@amd.com> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Acked-by: Alexander Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Luben Tuikov [Sat, 20 Feb 2021 04:37:28 +0000 (23:37 -0500)]
drm/amd/pm: Extend the I2C quirk table
Extend the I2C quirk table for SMU access
controlled I2C adapters. Let the kernel I2C layer
check that the messages all have the same address,
and that their combined size doesn't exceed the
maximum size of a SMU software I2C request.
Suggested-by: Jean Delvare <jdelvare@suse.de> Cc: Jean Delvare <jdelvare@suse.de> Cc: Alexander Deucher <Alexander.Deucher@amd.com> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> Cc: Lijo Lazar <Lijo.Lazar@amd.com> Cc: Stanley Yang <Stanley.Yang@amd.com> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Acked-by: Alexander Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Luben Tuikov [Thu, 11 Mar 2021 16:20:15 +0000 (11:20 -0500)]
drm/amdgpu: RAS xfer to read/write
Wrap amdgpu_ras_eeprom_xfer(..., bool write),
into amdgpu_ras_eeprom_read() and
amdgpu_ras_eeprom_write(), as that makes reading
and understanding the code clearer.
Cc: Jean Delvare <jdelvare@suse.de> Cc: Alexander Deucher <Alexander.Deucher@amd.com> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> Cc: Lijo Lazar <Lijo.Lazar@amd.com> Cc: Stanley Yang <Stanley.Yang@amd.com> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Acked-by: Alexander Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Luben Tuikov [Fri, 5 Feb 2021 23:48:01 +0000 (18:48 -0500)]
drm/amdgpu: Rename misspelled function
Instead of fixing the spelling in
amdgpu_ras_eeprom_process_recods(),
rename it to,
amdgpu_ras_eeprom_xfer(),
to look similar to other I2C and protocol
transfer (read/write) functions.
Also to keep the column span to within reason by
using a shorter name.
Change the "num" function parameter from "int" to
"const u32" since it is the number of items
(records) to xfer, i.e. their count, which cannot
be a negative number.
Also swap the order of parameters, keeping the
pointer to records and their number next to each
other, while the direction now becomes the last
parameter.
Cc: Jean Delvare <jdelvare@suse.de> Cc: Alexander Deucher <Alexander.Deucher@amd.com> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> Cc: Lijo Lazar <Lijo.Lazar@amd.com> Cc: Stanley Yang <Stanley.Yang@amd.com> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Acked-by: Alexander Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Luben Tuikov [Thu, 4 Feb 2021 19:15:01 +0000 (14:15 -0500)]
drm/amdgpu: RAS: EEPROM --> RAS
In amdgpu_ras_eeprom.c--the interface from RAS to
EEPROM, rename macros from EEPROM to RAS, to
indicate that the quantities and objects are RAS
specific, not EEPROM. We can decrease the RAS
table, or put it in different offset of EEPROM as
needed in the future.
Remove EEPROM_ADDRESS_SIZE macro definition, equal
to 2, from the file and calculations, as that
quantity is computed and added on the stack,
in the lower layer, amdgpu_eeprom_xfer().
Cc: Jean Delvare <jdelvare@suse.de> Cc: Alexander Deucher <Alexander.Deucher@amd.com> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> Cc: Lijo Lazar <Lijo.Lazar@amd.com> Cc: Stanley Yang <Stanley.Yang@amd.com> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Acked-by: Alexander Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Luben Tuikov [Tue, 2 Feb 2021 23:36:25 +0000 (18:36 -0500)]
drm/amdgpu: Fix wrap-around bugs in RAS
Fix the size of the EEPROM from 256000 bytes
to 262144 bytes (256 KiB).
Fix a couple or wrap around bugs. If a valid
value/address is 0 <= addr < size, the inverse of
this inequality (barring negative values which
make no sense here) is addr >= size. Fix this in
the RAS code.
Cc: Jean Delvare <jdelvare@suse.de> Cc: Alexander Deucher <Alexander.Deucher@amd.com> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> Cc: Lijo Lazar <Lijo.Lazar@amd.com> Cc: Stanley Yang <Stanley.Yang@amd.com> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Acked-by: Alexander Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Luben Tuikov [Mon, 1 Feb 2021 18:38:54 +0000 (13:38 -0500)]
drm/amdgpu: I2C EEPROM full memory addressing
* "eeprom_addr" is now 32-bit wide.
* Remove "slave_addr" from the I2C EEPROM driver
interface. The I2C EEPROM Device Type Identifier
is fixed at 1010b, and the rest of the bits
of the Device Address Byte/Device Select Code,
are memory address bits, where the first three
of those bits are the hardware selection bits.
All this is now a 19-bit address and passed
as "eeprom_addr". This abstracts the I2C bus
for EEPROM devices for this I2C EEPROM driver.
Now clients only pass the 19-bit EEPROM memory
address, to the I2C EEPROM driver, as the 32-bit
"eeprom_addr", from which they want to read from
or write to.
Cc: Jean Delvare <jdelvare@suse.de> Cc: Alexander Deucher <Alexander.Deucher@amd.com> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> Cc: Lijo Lazar <Lijo.Lazar@amd.com> Cc: Stanley Yang <Stanley.Yang@amd.com> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Acked-by: Alexander Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Luben Tuikov [Wed, 27 Jan 2021 22:09:52 +0000 (17:09 -0500)]
drm/amdgpu: Fixes to the AMDGPU EEPROM driver
* When reading from the EEPROM device, there is no
device limitation on the number of bytes
read--they're simply sequenced out. Thus, read
the whole data requested in one go.
* When writing to the EEPROM device, there is a
256-byte page limit to write to before having to
generate a STOP on the bus, as well as the
address written to mustn't cross over the page
boundary (it actually rolls over). Maximize the
data written to per bus acquisition.
* Return the number of bytes read/written, or -errno.
* Add kernel doc.
Cc: Jean Delvare <jdelvare@suse.de> Cc: Alexander Deucher <Alexander.Deucher@amd.com> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> Cc: Lijo Lazar <Lijo.Lazar@amd.com> Cc: Stanley Yang <Stanley.Yang@amd.com> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Acked-by: Alexander Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Andrey Grodzovsky [Sat, 23 Jan 2021 04:25:31 +0000 (23:25 -0500)]
drm/amdgpu: Send STOP for the last byte of msg only
Let's just ignore the I2C_M_STOP hint from upper
layer for SMU I2C code as there is no clean
mapping between single per I2C message STOP flag
at the kernel I2C layer and the SMU, per each byte
STOP flag. We will just by default set it at the
end of the SMU I2C message.
Cc: Jean Delvare <jdelvare@suse.de> Cc: Alexander Deucher <Alexander.Deucher@amd.com> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> Cc: Lijo Lazar <Lijo.Lazar@amd.com> Cc: Stanley Yang <Stanley.Yang@amd.com> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Suggested-by: Lazar Lijo <Lijo.Lazar@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Acked-by: Alexander Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Sierra [Wed, 10 Mar 2021 04:12:15 +0000 (22:12 -0600)]
drm/amdkfd: Maintain svm_bo reference in page->zone_device_data
Each zone-device page holds a reference to the SVM BO that manages its
backing storage. This is necessary to correctly hold on to the BO in
case zone_device pages are shared with a child-process.
Signed-off-by: Alex Sierra <alex.sierra@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Sierra [Wed, 12 May 2021 16:02:33 +0000 (11:02 -0500)]
drm/amdkfd: add invalid pages debug at vram migration
This is for debug purposes only.
It conditionally generates partial migrations to test mixed
CPU/GPU memory domain pages in a prange easily.
Signed-off-by: Alex Sierra <alex.sierra@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Sierra [Wed, 30 Jun 2021 19:09:10 +0000 (15:09 -0400)]
drm/amdkfd: skip migration for pages already in VRAM
Migration skipped for pages that are already in VRAM
domain. These could be the result of previous partial
migrations to SYS RAM, and prefetch back to VRAM.
Ex. Coherent pages in VRAM that were not written/invalidated after
a copy-on-write.
Signed-off-by: Alex Sierra <alex.sierra@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Sierra [Wed, 12 May 2021 15:54:58 +0000 (10:54 -0500)]
drm/amdkfd: skip invalid pages during migrations
Invalid pages can be the result of pages that have been migrated
already due to copy-on-write procedure or pages that were never
migrated to VRAM in first place. This is not an issue anymore,
as pranges now support mixed memory domains (CPU/GPU).
Signed-off-by: Alex Sierra <alex.sierra@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Sierra [Wed, 5 May 2021 19:15:50 +0000 (14:15 -0500)]
drm/amdkfd: classify and map mixed svm range pages in GPU
[Why]
svm ranges can have mixed pages from device or system memory.
A good example is, after a prange has been allocated in VRAM and a
copy-on-write is triggered by a fork. This invalidates some pages
inside the prange. Endding up in mixed pages.
[How]
By classifying each page inside a prange, based on its type. Device or
system memory, during dma mapping call. If page corresponds
to VRAM domain, a flag is set to its dma_addr entry for each GPU.
Then, at the GPU page table mapping. All group of contiguous pages within
the same type are mapped with their proper pte flags.
v2:
Instead of using ttm_res to calculate vram pfns in the svm_range. It is now
done by setting the vram real physical address into drm_addr array.
This makes more flexible VRAM management, plus removes the need to have
a BO reference in the svm_range.
v3:
Remove mapping member from svm_range
Signed-off-by: Alex Sierra <alex.sierra@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Sierra [Wed, 23 Jun 2021 22:06:22 +0000 (17:06 -0500)]
drm/amdkfd: use hmm range fault to get both domain pfns
Now that prange could have mixed domains (VRAM or SYSRAM),
actual_loc nor svm_bo can not be used to check its current
domain and eventually get its pfns to map them in GPU.
Instead, pfns from both domains, are now obtained from
hmm_range_fault through amdgpu_hmm_range_get_pages
call. This is done everytime a GPU map occur.
Signed-off-by: Alex Sierra <alex.sierra@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Sierra [Thu, 6 May 2021 18:18:40 +0000 (13:18 -0500)]
drm/amdgpu: get owner ref in validate and map
Get the proper owner reference for amdgpu_hmm_range_get_pages function.
This is useful for partial migrations. To avoid migrating back to
system memory, VRAM pages, that are accessible by all devices in the
same memory domain.
Ex. multiple devices in the same hive.
Signed-off-by: Alex Sierra <alex.sierra@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Sierra [Thu, 6 May 2021 18:06:54 +0000 (13:06 -0500)]
drm/amdkfd: set owner ref to svm range prefault
svm_range_prefault is called right before migrations to VRAM,
to make sure pages are resident in system memory before the migration.
With partial migrations, this reference is used by hmm range get pages
to avoid migrating pages that are already in the same VRAM domain.
Signed-off-by: Alex Sierra <alex.sierra@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Sierra [Thu, 6 May 2021 17:23:07 +0000 (12:23 -0500)]
drm/amdkfd: add owner ref param to get hmm pages
The parameter is used in the dev_private_owner to decide if device
pages in the range require to be migrated back to system memory, based
if they are or not in the same memory domain.
In this case, this reference could come from the same memory domain
with devices connected to the same hive.
Signed-off-by: Alex Sierra <alex.sierra@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Sierra [Wed, 5 May 2021 17:43:10 +0000 (12:43 -0500)]
drm/amdkfd: device pgmap owner at the svm migrate init
GPUs in the same XGMI hive have direct access to all
members'VRAM. When mapping memory to a GPU, we don't need
hmm_range_fault to fault device-private pages in the same
hive back to the host. Identifying the page owner as the hive,
rather than the individual GPU, accomplishes this.
Signed-off-by: Alex Sierra <alex.sierra@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Sierra [Tue, 29 Jun 2021 16:40:47 +0000 (11:40 -0500)]
drm/amdkfd: inc counter on child ranges with xnack off
During GPU page table invalidation with xnack off, new ranges
split may occur concurrently in the same prange. Creating a new
child per split. Each child should also increment its
invalid counter, to assure GPU page table updates in these
ranges.
Signed-off-by: Alex Sierra <alex.sierra@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Nicholas Kazlauskas [Wed, 30 Jun 2021 14:07:29 +0000 (10:07 -0400)]
drm/amd/display: Extend DMUB diagnostic logging to DCN3.1
[Why & How]
Extend existing support for DCN2.1 DMUB diagnostic logging to
DCN3.1 so we can collect useful information if the DMUB hangs.
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Joseph Greathouse [Wed, 30 Jun 2021 02:08:52 +0000 (21:08 -0500)]
drm/amdgpu: Update NV SIMD-per-CU to 2
Navi series GPUs have 2 SIMDs per CU (and then 2 CUs per WGP).
The NV enum headers incorrectly listed this as 4, which later meant
we were incorrectly reporting the number of SIMDs in the HSA
topology. This could cause problems down the line for user-space
applications that want to launch a fixed amount of work to each
SIMD.
Signed-off-by: Joseph Greathouse <Joseph.Greathouse@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
Shyam Sundar S K [Mon, 28 Jun 2021 07:54:40 +0000 (13:24 +0530)]
drm/amd/pm: skip PrepareMp1ForUnload message in s0ix
The documentation around PrepareMp1ForUnload message says that
anything sent to SMU after this command would be stalled as the
PMFW would not be in a state to take further job requests.
Technically this is right in case of S3 scenario. But, this might
not be the case during s0ix as the PMC driver would be the last
to send the SMU on the OS_HINT. If SMU gets a PrepareMp1ForUnload
message before the OS_HINT, this would stall the entire S0ix process.
Results show that, this message to SMU is not required during S0ix
and hence skip it.
Reviewed-by: Prike Liang <Prike.Liang@amd.com> Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com> Acked-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reka Norman [Tue, 29 Jun 2021 01:27:18 +0000 (11:27 +1000)]
drm/amd/display: Respect CONFIG_FRAME_WARN=0 in dml Makefile
Setting CONFIG_FRAME_WARN=0 should disable 'stack frame larger than'
warnings. This is useful for example in KASAN builds. Make the dml
Makefile respect this config.
Fixes the following build warnings with CONFIG_KASAN=y and
CONFIG_FRAME_WARN=0:
drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn30/display_mode_vba_30.c:3642:6:
warning: stack frame size of 2216 bytes in function
'dml30_ModeSupportAndSystemConfigurationFull' [-Wframe-larger-than=]
drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn31/display_mode_vba_31.c:3957:6:
warning: stack frame size of 2568 bytes in function
'dml31_ModeSupportAndSystemConfigurationFull' [-Wframe-larger-than=]
Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Reka Norman <rekanorman@google.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Jing Xiangfeng [Tue, 29 Jun 2021 11:44:55 +0000 (19:44 +0800)]
drm/radeon: Add the missed drm_gem_object_put() in radeon_user_framebuffer_create()
radeon_user_framebuffer_create() misses to call drm_gem_object_put() in
an error path. Add the missed function call to fix it.
Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Jing Xiangfeng <jingxiangfeng@huawei.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
Michal Suchanek [Wed, 23 Jun 2021 10:30:39 +0000 (12:30 +0200)]
drm/amdgpu/dc: Really fix DCN3.1 Makefile for PPC64
Also copy over the part that makes old gcc handling cross-platform.
Fixes: df7a1658f257 ("drm/amdgpu/dc: fix DCN3.1 Makefile for PPC64") Fixes: 926d6972efb6 ("drm/amd/display: Add DCN3.1 blocks to the DC Makefile") Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Michal Suchanek <msuchanek@suse.de> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Oak Zeng [Mon, 28 Jun 2021 22:53:38 +0000 (17:53 -0500)]
drm/amdgpu: Set ttm caching flags during bo allocation
The ttm caching flags (ttm_cached, ttm_write_combined etc) are
used to determine a buffer object's mapping attributes in both
CPU page table and GPU page table (when that buffer is also
accessed by GPU). Currently the ttm caching flags are set in
function amdgpu_ttm_io_mem_reserve which is called during
DRM_AMDGPU_GEM_MMAP ioctl. This has a problem since the GPU
mapping of the buffer object (ioctl DRM_AMDGPU_GEM_VA) can
happen earlier than the mmap time, thus the GPU page table
update code can't pick up the right ttm caching flags to
decide the right GPU page table attributes.
This patch moves the ttm caching flags setting to function
amdgpu_vram_mgr_new - this function is called during the
first step of a buffer object create (eg, DRM_AMDGPU_GEM_CREATE)
so the later both CPU and GPU mapping function calls will
pick up this flag for CPU/GPU page table set up.
v2: rebase (Alex)
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com> Suggested-by: Christian Koenig <Christian.Koenig@amd.com> Reviewed-by: Christian Koenig <Christian.Koenig@amd.com> Reviewed-by: Feifei Xu <Feifei.Xu@amd.com> Tested-by: Po Huang <Po.Huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Guchun Chen [Mon, 28 Jun 2021 09:08:00 +0000 (17:08 +0800)]
drm/amd/display: fix null pointer access in gpu reset
During GPU reset, when receiving a DMCUB OUTBUX0 interrupt,
DAL code will set it to be OUTBOX interrupt and sets hw interrupt.
However, OUTBOX interrupt is not registered yet, so a NULL pointer
access will be executed.
Fixes: effbf6ca7eafda ("drm/amdgpu/display: remove an old DCN3 guard") Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-and-tested-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Aaron Liu [Fri, 25 Jun 2021 05:50:19 +0000 (13:50 +0800)]
drm/amdgpu: enable sdma0 tmz for Raven/Renoir(V2)
Without driver loaded, SDMA0_UTCL1_PAGE.TMZ_ENABLE is set to 1
by default for all asic. On Raven/Renoir, the sdma goldsetting
changes SDMA0_UTCL1_PAGE.TMZ_ENABLE to 0.
This patch restores SDMA0_UTCL1_PAGE.TMZ_ENABLE to 1.
Signed-off-by: Aaron Liu <aaron.liu@amd.com> Acked-by: Luben Tuikov <luben.tuikov@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
for f in $FILES
do
echo $f = `cat $HWMON_DIR/$f` >> pp_show_power_cap.test.log
done
Signed-off-by: Darren Powell <darren.powell@amd.com> Reviewed-by: Kevin Wang <kevin1.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The patch that we are reverting here was originally applied because it
fixes multiple IGT issues and flickering in Android. However, after a
discussion with Sean Paul and Mark, it looks like that this patch might
cause problems on ChromeOS. For this reason, we decided to revert this
patch.
Cc: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com> Cc: Harry Wentland <Harry.Wentland@amd.com> Cc: Hersen Wu <hersenxs.wu@amd.com> Cc: Sean Paul <seanpaul@chromium.org> Cc: Mark Yacoub <markyacoub@chromium.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Reviewed-by: Sean Paul <seanpaul@chromium.org> Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
Veerabadhran Gopalakrishnan [Fri, 18 Jun 2021 18:40:46 +0000 (00:10 +0530)]
amdgpu/nv.c - Added codec query for Beige Goby
Added the Beige Goby capabilities in codec query.
v2: fix build error and indent (James)
Signed-off-by: Veerabadhran Gopalakrishnan <veerabadhran.gopalakrishnan@amd.com> Reviewed-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>