]> www.infradead.org Git - users/hch/block.git/log
users/hch/block.git
4 years agopowerpc/32s: Handle PROTFAULT in hash_page() also for CONFIG_PPC_KUAP
Christophe Leroy [Mon, 16 Nov 2020 16:09:31 +0000 (16:09 +0000)]
powerpc/32s: Handle PROTFAULT in hash_page() also for CONFIG_PPC_KUAP

On hash 32 bits, handling minor protection faults like unsetting
dirty flag is heavy if done from the normal page_fault processing,
because it implies hash table software lookup for flushing the entry
and then a DSI is taken anyway to add the entry back.

When KUAP was implemented, as explained in commit a68c31fc01ef
("powerpc/32s: Implement Kernel Userspace Access Protection"),
protection faults has been diverted from hash_page() because
hash_page() was not able to identify a KUAP fault.

Implement KUAP verification in hash_page(), by clearing write
permission when the access is a kernel access and Ks is 1.
This works regardless of the address because kernel segments always
have Ks set to 0 while user segments have Ks set to 0 only
when kernel write to userspace is granted.

Then protection faults can be handled by hash_page() even for KUAP.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/8a4ffe4798e9ea32aaaccdf85e411bb1beed3500.1605542955.git.christophe.leroy@csgroup.eu
4 years agopowerpc/32s: Make support for 603 and 604+ selectable
Christophe Leroy [Thu, 22 Oct 2020 06:29:44 +0000 (06:29 +0000)]
powerpc/32s: Make support for 603 and 604+ selectable

book3s/32 has two main families:
- CPU with 603 cores that don't have HASH PTE table and
perform SW TLB loading.
- Other CPUs based on 604+ cores that have HASH PTE table.

This leads to some complex logic and additionnal code to
support both. This makes sense for distribution kernels
that aim at running on any CPU, but when you are fine
tuning a kernel for an embedded 603 based board you
don't need all the HASH logic.

Allow selection of support for each family, in order to opt
out unneeded parts of code. At least one must be selected.

Note that some of the CPU supporting HASH also support SW TLB
loading, however it is not supported by Linux kernel at the
time being, because they do not have alternate registers in
the TLB miss exception handlers.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/8dde0cdb629a71abc29b0d85a52a86e920376cb6.1603348103.git.christophe.leroy@csgroup.eu
4 years agopowerpc/32s: Regroup 603 based CPUs in cputable
Christophe Leroy [Thu, 22 Oct 2020 06:29:43 +0000 (06:29 +0000)]
powerpc/32s: Regroup 603 based CPUs in cputable

In order to selectively build the kernel for 603 SW TLB handling,
regroup all 603 based CPUs together.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/45065263fdb9f5cc2a2d210ec2a762ac8bf5b2bc.1603348103.git.christophe.leroy@csgroup.eu
4 years agopowerpc/32s: Remove CONFIG_PPC_BOOK3S_6xx
Christophe Leroy [Thu, 22 Oct 2020 06:29:42 +0000 (06:29 +0000)]
powerpc/32s: Remove CONFIG_PPC_BOOK3S_6xx

As 601 is gone, CONFIG_PPC_BOO3S_6xx and CONFIG_PPC_BOOK3S_32
are dedundant.

Remove CONFIG_PPC_BOOK3S_6xx.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/f18c16af37f6f77b577bed8d9e12831b695617ae.1603348103.git.christophe.leroy@csgroup.eu
4 years agopowerpc/32s: Move early_mmu_init() into mmu.c
Christophe Leroy [Thu, 22 Oct 2020 06:29:41 +0000 (06:29 +0000)]
powerpc/32s: Move early_mmu_init() into mmu.c

early_mmu_init() is independent of MMU type and not
directly linked to tlb handling.

In a following patch, tlb.c will be restricted to HASH mmu.

Move early_mmu_init() into mmu.c which is common.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/e51b5e2fe6bca623b33116403043d3a1b5eaf826.1603348103.git.christophe.leroy@csgroup.eu
4 years agopowerpc/32s: Inline flush_hash_entry()
Christophe Leroy [Thu, 22 Oct 2020 06:29:40 +0000 (06:29 +0000)]
powerpc/32s: Inline flush_hash_entry()

flush_hash_entry() is a simple function calling
flush_hash_pages() if it's a hash MMU or doing nothing otherwise.

Inline it.

And use it also in __ptep_test_and_clear_young().

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/9af895be7d4b404d40e749a2659552fd138e62c4.1603348103.git.christophe.leroy@csgroup.eu
4 years agopowerpc/32s: Inline tlb_flush()
Christophe Leroy [Thu, 22 Oct 2020 06:29:39 +0000 (06:29 +0000)]
powerpc/32s: Inline tlb_flush()

On book3s/32, tlb_flush() does nothing when the CPU has a hash table,
it calls _tlbia() otherwise.

Inline it.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/ebc933d1c530a19ef3cf7983f6ae94814f6e92ac.1603348103.git.christophe.leroy@csgroup.eu
4 years agopowerpc/32s: Split and inline flush_range()
Christophe Leroy [Thu, 22 Oct 2020 06:29:38 +0000 (06:29 +0000)]
powerpc/32s: Split and inline flush_range()

flush_range() handle both the MMU_FTR_HPTE_TABLE case and
the other case.

The non MMU_FTR_HPTE_TABLE case is trivial as it is only a call
to _tlbie()/_tlbia() which is not worth a dedicated function.

Make flush_range() a hash specific and call it from tlbflush.h based
on mmu_has_feature(MMU_FTR_HPTE_TABLE).

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/132ab19aae52abc8e06ab524ec86d4229b5b9c3d.1603348103.git.christophe.leroy@csgroup.eu
4 years agopowerpc/32s: Inline flush_tlb_range() and flush_tlb_kernel_range()
Christophe Leroy [Thu, 22 Oct 2020 06:29:37 +0000 (06:29 +0000)]
powerpc/32s: Inline flush_tlb_range() and flush_tlb_kernel_range()

flush_tlb_range() and flush_tlb_kernel_range() are trivial calls to
flush_range().

Make flush_range() global and inline flush_tlb_range()
and flush_tlb_kernel_range().

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/c7029a78e78709ad9272d7a44260e06b649169b2.1603348103.git.christophe.leroy@csgroup.eu
4 years agopowerpc/32s: Split and inline flush_tlb_mm() and flush_tlb_page()
Christophe Leroy [Thu, 22 Oct 2020 06:29:36 +0000 (06:29 +0000)]
powerpc/32s: Split and inline flush_tlb_mm() and flush_tlb_page()

flush_tlb_mm() and flush_tlb_page() handle both the MMU_FTR_HPTE_TABLE
case and the other case.

The non MMU_FTR_HPTE_TABLE case is trivial as it is only a call
to _tlbie()/_tlbia() which is not worth a dedicated function.

Make flush_tlb_mm() and flush_tlb_page() hash specific and call
them from tlbflush.h based on mmu_has_feature(MMU_FTR_HPTE_TABLE).

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/11e932ded41ba6d9b251d89b7afa33cc060d3aa4.1603348103.git.christophe.leroy@csgroup.eu
4 years agopowerpc/32s: Move _tlbie() and _tlbia() in a new file
Christophe Leroy [Thu, 22 Oct 2020 06:29:35 +0000 (06:29 +0000)]
powerpc/32s: Move _tlbie() and _tlbia() in a new file

_tlbie() and _tlbia() are used only on 603 cores while the
other functions are used only on cores having a hash table.

Move them into a new file named nohash_low.S

Add mmu_hash_lock var is used by both, it needs to go
in a common file.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/9a265b1b17a64153463d361280cb4b43eb1266a4.1603348103.git.christophe.leroy@csgroup.eu
4 years agopowerpc/32s: Inline _tlbie() on non SMP
Christophe Leroy [Thu, 22 Oct 2020 06:29:34 +0000 (06:29 +0000)]
powerpc/32s: Inline _tlbie() on non SMP

On non SMP, _tlbie() is just a tlbie plus a sync instruction.

Make it static inline.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/475136425541db5c7c8a0395d19d400525b251bc.1603348103.git.christophe.leroy@csgroup.eu
4 years agopowerpc/32s: Move _tlbie() and _tlbia() prototypes to tlbflush.h
Christophe Leroy [Thu, 22 Oct 2020 06:29:33 +0000 (06:29 +0000)]
powerpc/32s: Move _tlbie() and _tlbia() prototypes to tlbflush.h

In order to use _tlbie() and _tlbia() directly
from asm/book3s/32/tlbflush.h, move their prototypes
from mm/mm_decl.h to there.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/867587af929973ad65f8ef6972f2474a80c1737a.1603348103.git.christophe.leroy@csgroup.eu
4 years agopowerpc/32s: Declare Hash related vars as __initdata
Christophe Leroy [Thu, 22 Oct 2020 06:29:32 +0000 (06:29 +0000)]
powerpc/32s: Declare Hash related vars as __initdata

Hash related vars are used at init only.

Declare them in __initdata.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/3878ea30706839fcff9196790ff3f99c128c3f6a.1603348103.git.christophe.leroy@csgroup.eu
4 years agopowerpc/32s: Make Hash var static
Christophe Leroy [Thu, 22 Oct 2020 06:29:31 +0000 (06:29 +0000)]
powerpc/32s: Make Hash var static

Hash var is used only locally in mmu.c now.

No need to set it in head_32.S anymore.

Make it static and initialises it to the early hash table.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/786c82a89cdfdaabb32b72a44f7c312fa81d192b.1603348103.git.christophe.leroy@csgroup.eu
4 years agopowerpc/32s: Use mmu_has_feature(MMU_FTR_HPTE_TABLE) instead of checking Hash var
Christophe Leroy [Thu, 22 Oct 2020 06:29:30 +0000 (06:29 +0000)]
powerpc/32s: Use mmu_has_feature(MMU_FTR_HPTE_TABLE) instead of checking Hash var

We now have an early hash table on hash MMU, so no need to check
Hash var to know if the Hash table is set of not.

Use mmu_has_feature(MMU_FTR_HPTE_TABLE) instead. This will allow
optimisation via jump_label.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/f1766631a9e014b6433f1a3c12c726ddfce34220.1603348103.git.christophe.leroy@csgroup.eu
4 years agopowerpc/32s: Make bat_addrs[] static
Christophe Leroy [Thu, 22 Oct 2020 06:29:29 +0000 (06:29 +0000)]
powerpc/32s: Make bat_addrs[] static

This table is used only locally. Declare it static.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/054fec0c139fc4c0a306360b360784733c0a6e65.1603348103.git.christophe.leroy@csgroup.eu
4 years agopowerpc/mm: Remove flush_tlb_page_nohash() prototype.
Christophe Leroy [Thu, 22 Oct 2020 06:29:28 +0000 (06:29 +0000)]
powerpc/mm: Remove flush_tlb_page_nohash() prototype.

flush_tlb_page_nohash() was removed by
commit 703b41ad1a87 ("powerpc/mm: remove flush_tlb_page_nohash")

Remove stale prototype and comment.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/4a58831da6d6ba4fe309b94aa1dd8f02982d46b2.1603348103.git.christophe.leroy@csgroup.eu
4 years agopowerpc/mm: Add mask of always present MMU features
Christophe Leroy [Thu, 22 Oct 2020 06:29:27 +0000 (06:29 +0000)]
powerpc/mm: Add mask of always present MMU features

On the same principle as commit 773edeadf672 ("powerpc/mm: Add mask
of possible MMU features"), add mask for MMU features that are
always there in order to optimise out dead branches.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/4943775fbe91885eb3e09133b093aaf62e55c715.1603348103.git.christophe.leroy@csgroup.eu
4 years agopowerpc/rtas: Fix typo of ibm,open-errinjct in RTAS filter
Tyrel Datwyler [Tue, 8 Dec 2020 19:54:34 +0000 (13:54 -0600)]
powerpc/rtas: Fix typo of ibm,open-errinjct in RTAS filter

Commit bd59380c5ba4 ("powerpc/rtas: Restrict RTAS requests from userspace")
introduced the following error when invoking the errinjct userspace
tool:

  [root@ltcalpine2-lp5 librtas]# errinjct open
  [327884.071171] sys_rtas: RTAS call blocked - exploit attempt?
  [327884.071186] sys_rtas: token=0x26, nargs=0 (called by errinjct)
  errinjct: Could not open RTAS error injection facility
  errinjct: librtas: open: Unexpected I/O error

The entry for ibm,open-errinjct in rtas_filter array has a typo where
the "j" is omitted in the rtas call name. After fixing this typo the
errinjct tool functions again as expected.

  [root@ltcalpine2-lp5 linux]# errinjct open
  RTAS error injection facility open, token = 1

Fixes: bd59380c5ba4 ("powerpc/rtas: Restrict RTAS requests from userspace")
Cc: stable@vger.kernel.org
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201208195434.8289-1-tyreld@linux.ibm.com
4 years agopowerpc/powermac: Fix low_sleep_handler with CONFIG_VMAP_STACK
Christophe Leroy [Tue, 8 Dec 2020 05:24:19 +0000 (05:24 +0000)]
powerpc/powermac: Fix low_sleep_handler with CONFIG_VMAP_STACK

low_sleep_handler() can't restore the context from standard
stack because the stack can hardly be accessed with MMU OFF.

Store everything in a global storage area instead of storing
a pointer to the stack in that global storage area.

To avoid a complete churn of the function, still use r1 as
the pointer to the storage area during restore.

Fixes: cd08f109e262 ("powerpc/32s: Enable CONFIG_VMAP_STACK")
Reported-by: Giuseppe Sacco <giuseppe@sguazz.it>
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Tested-by: Giuseppe Sacco <giuseppe@sguazz.it>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/e3e0d8042a3ba75cb4a9546c19c408b5b5b28994.1607404931.git.christophe.leroy@csgroup.eu
4 years agopowerpc: fix spelling mistake in Kconfig "seleted" -> "selected"
Colin Ian King [Mon, 7 Dec 2020 15:54:20 +0000 (15:54 +0000)]
powerpc: fix spelling mistake in Kconfig "seleted" -> "selected"

There is a spelling mistake in the help text of the Kconfig. Fix it.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201207155420.172370-1-colin.king@canonical.com
4 years agopowerpc/pseries/mobility: refactor node lookup during DT update
Nathan Lynch [Mon, 7 Dec 2020 21:52:00 +0000 (15:52 -0600)]
powerpc/pseries/mobility: refactor node lookup during DT update

In pseries_devicetree_update(), with each call to ibm,update-nodes the
partition firmware communicates the node to be deleted or updated by
placing its phandle in the work buffer. Each of delete_dt_node(),
update_dt_node(), and add_dt_node() have duplicate lookups using the
phandle value and corresponding refcount management.

Move the lookup and of_node_put() into pseries_devicetree_update(),
and emit a warning on any failed lookups.

Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201207215200.1785968-29-nathanl@linux.ibm.com
4 years agopowerpc/rtas: remove unused rtas_suspend_me_data
Nathan Lynch [Mon, 7 Dec 2020 21:51:59 +0000 (15:51 -0600)]
powerpc/rtas: remove unused rtas_suspend_me_data

All code which used this type has been removed.

Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201207215200.1785968-28-nathanl@linux.ibm.com
4 years agopowerpc/pseries/hibernation: remove prepare_late() callback
Nathan Lynch [Mon, 7 Dec 2020 21:51:58 +0000 (15:51 -0600)]
powerpc/pseries/hibernation: remove prepare_late() callback

The pseries hibernate code no longer calls into the original
join/suspend code in kernel/rtas.c, so pseries_prepare_late() and
related code don't accomplish anything now.

Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201207215200.1785968-27-nathanl@linux.ibm.com
4 years agopowerpc/pseries/hibernation: perform post-suspend fixups later
Nathan Lynch [Mon, 7 Dec 2020 21:51:57 +0000 (15:51 -0600)]
powerpc/pseries/hibernation: perform post-suspend fixups later

The pseries hibernate code calls post_mobility_fixup() which is sort
of a dumping ground of fixups that need to run after resuming from
suspend regardless of whether suspend was a hibernation or a
migration. Calling post_mobility_fixup() from
pseries_suspend_enable_irqs() runs this code early in resume with
devices suspended and only one CPU up, while the much more commonly
used migration case runs these fixups in a more typical process
context.

Call post_mobility_fixup() after the suspend core returns a success
status to the hibernate sysfs store method and remove
pseries_suspend_enable_irqs().

Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201207215200.1785968-26-nathanl@linux.ibm.com
4 years agopowerpc/pseries/hibernation: remove redundant cacheinfo update
Nathan Lynch [Mon, 7 Dec 2020 21:51:56 +0000 (15:51 -0600)]
powerpc/pseries/hibernation: remove redundant cacheinfo update

Partitions with cache nodes in the device tree can encounter the
following warning on resume:

CPU 0 already accounted in PowerPC,POWER9@0(Data)
WARNING: CPU: 0 PID: 3177 at arch/powerpc/kernel/cacheinfo.c:197 cacheinfo_cpu_online+0x640/0x820

These calls to cacheinfo_cpu_offline/online have been redundant since
commit e610a466d16a ("powerpc/pseries/mobility: rebuild cacheinfo
hierarchy post-migration").

Fixes: e610a466d16a ("powerpc/pseries/mobility: rebuild cacheinfo hierarchy post-migration")
Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201207215200.1785968-25-nathanl@linux.ibm.com
4 years agopowerpc/rtas: remove unused rtas_suspend_last_cpu()
Nathan Lynch [Mon, 7 Dec 2020 21:51:55 +0000 (15:51 -0600)]
powerpc/rtas: remove unused rtas_suspend_last_cpu()

rtas_suspend_last_cpu() is now unused, remove it and
__rtas_suspend_last_cpu() which also becomes unused.

Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201207215200.1785968-24-nathanl@linux.ibm.com
4 years agopowerpc/pseries/hibernation: switch to rtas_ibm_suspend_me()
Nathan Lynch [Mon, 7 Dec 2020 21:51:54 +0000 (15:51 -0600)]
powerpc/pseries/hibernation: switch to rtas_ibm_suspend_me()

rtas_suspend_last_cpu() and related code perform a lot of work that
isn't relevant to the hibernation workflow. All other CPUs are offline
when called so there is no need to place them in H_JOIN or prod them
on resume, nor is there need for retries or operations on shared
state.

Call the rtas_ibm_suspend_me() wrapper function directly from
pseries_suspend_enter() instead of using rtas_suspend_last_cpu().

Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201207215200.1785968-23-nathanl@linux.ibm.com
4 years agopowerpc/rtas: remove rtas_suspend_cpu()
Nathan Lynch [Mon, 7 Dec 2020 21:51:53 +0000 (15:51 -0600)]
powerpc/rtas: remove rtas_suspend_cpu()

rtas_suspend_cpu() no longer has users; remove it and
__rtas_suspend_cpu() which now becomes unused as well.

Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201207215200.1785968-22-nathanl@linux.ibm.com
4 years agopowerpc/machdep: remove suspend_disable_cpu()
Nathan Lynch [Mon, 7 Dec 2020 21:51:52 +0000 (15:51 -0600)]
powerpc/machdep: remove suspend_disable_cpu()

There are no users left of the suspend_disable_cpu() callback, remove
it.

Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201207215200.1785968-21-nathanl@linux.ibm.com
4 years agopowerpc/pseries/hibernation: remove pseries_suspend_cpu()
Nathan Lynch [Mon, 7 Dec 2020 21:51:51 +0000 (15:51 -0600)]
powerpc/pseries/hibernation: remove pseries_suspend_cpu()

Since commit 48f6e7f6d948 ("powerpc/pseries: remove cede offline state
for CPUs"), ppc_md.suspend_disable_cpu() is no longer used and all
CPUs (save one) are placed into true offline state as opposed to
H_JOIN. So pseries_suspend_cpu() is effectively unused; remove it.

Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201207215200.1785968-20-nathanl@linux.ibm.com
4 years agopowerpc/pseries/hibernation: pass stream id via function arguments
Nathan Lynch [Mon, 7 Dec 2020 21:51:50 +0000 (15:51 -0600)]
powerpc/pseries/hibernation: pass stream id via function arguments

There is no need for the stream id to be a file-global variable; pass
it from hibernate_store() to pseries_suspend_begin() for the
H_VASI_STATE call.

Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201207215200.1785968-19-nathanl@linux.ibm.com
4 years agopowerpc/pseries/hibernation: drop pseries_suspend_begin() from suspend ops
Nathan Lynch [Mon, 7 Dec 2020 21:51:49 +0000 (15:51 -0600)]
powerpc/pseries/hibernation: drop pseries_suspend_begin() from suspend ops

There are three ways pseries_suspend_begin() can be reached:

1. When "mem" is written to /sys/power/state:

kobj_attr_store()
-> state_store()
  -> pm_suspend()
    -> suspend_devices_and_enter()
      -> pseries_suspend_begin()

This never works because there is no way to supply a valid stream id
using this interface, and H_VASI_STATE is called with a stream id of
zero. So this call path is useless at best.

2. When a stream id is written to /sys/devices/system/power/hibernate.
pseries_suspend_begin() is polled directly from store_hibernate()
until the stream is in the "Suspending" state (i.e. the platform is
ready for the OS to suspend execution):

dev_attr_store()
-> store_hibernate()
  -> pseries_suspend_begin()

3. When a stream id is written to /sys/devices/system/power/hibernate
(continued). After #2, pseries_suspend_begin() is called once again
from the pm core:

dev_attr_store()
-> store_hibernate()
  -> pm_suspend()
    -> suspend_devices_and_enter()
      -> pseries_suspend_begin()

This is redundant because the VASI suspend state is already known to
be Suspending.

The begin() callback of platform_suspend_ops is optional, so we can
simply remove that assignment with no loss of function.

Fixes: 32d8ad4e621d ("powerpc/pseries: Partition hibernation support")
Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201207215200.1785968-18-nathanl@linux.ibm.com
4 years agopowerpc/rtas: remove rtas_ibm_suspend_me_unsafe()
Nathan Lynch [Mon, 7 Dec 2020 21:51:48 +0000 (15:51 -0600)]
powerpc/rtas: remove rtas_ibm_suspend_me_unsafe()

rtas_ibm_suspend_me_unsafe() is now unused; remove it and
rtas_percpu_suspend_me() which becomes unused as a result.

Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201207215200.1785968-17-nathanl@linux.ibm.com
4 years agopowerpc/rtas: dispatch partition migration requests to pseries
Nathan Lynch [Mon, 7 Dec 2020 21:51:47 +0000 (15:51 -0600)]
powerpc/rtas: dispatch partition migration requests to pseries

sys_rtas() cannot call ibm,suspend-me directly in the same way it
handles other inputs. Instead it must dispatch the request to code
that can first perform the H_JOIN sequence before any call to
ibm,suspend-me can succeed. Over time kernel/rtas.c has accreted a fair
amount of platform-specific code to implement this.

Since a different, more robust implementation of the suspend sequence
is now in the pseries platform code, we want to dispatch the request
there.

Note that invoking ibm,suspend-me via the RTAS syscall is all but
deprecated; this change preserves ABI compatibility for old programs
while providing to them the benefit of the new partition suspend
implementation. This is a behavior change in that the kernel performs
the device tree update and firmware activation before returning, but
experimentation indicates this is tolerated fine by legacy user space.

Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201207215200.1785968-16-nathanl@linux.ibm.com
4 years agopowerpc/pseries/mobility: retry partition suspend after error
Nathan Lynch [Mon, 7 Dec 2020 21:51:46 +0000 (15:51 -0600)]
powerpc/pseries/mobility: retry partition suspend after error

This is a mitigation for the relatively rare occurrence where a
virtual IOA can be in a transient state that prevents the
suspend/migration from succeeding, resulting in an error from
ibm,suspend-me.

If the join/suspend sequence returns an error, it is acceptable to
retry as long as the VASI suspend session state is still
"Suspending" (i.e. the platform is still waiting for the OS to
suspend).

Retry a few times on suspend failure while this condition holds,
progressively increasing the delay between attempts. We don't want to
retry indefinitey because firmware emits an error log event on each
unsuccessful attempt.

Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201207215200.1785968-15-nathanl@linux.ibm.com
4 years agopowerpc/pseries/mobility: signal suspend cancellation to platform
Nathan Lynch [Mon, 7 Dec 2020 21:51:45 +0000 (15:51 -0600)]
powerpc/pseries/mobility: signal suspend cancellation to platform

If we're returning an error to user space, use H_VASI_SIGNAL to send a
cancellation request to the platform. This isn't strictly required but
it communicates that Linux will not attempt to complete the suspend,
which allows the various entities involved to promptly end the
operation in progress.

Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201207215200.1785968-14-nathanl@linux.ibm.com
4 years agopowerpc/pseries/mobility: use stop_machine for join/suspend
Nathan Lynch [Mon, 7 Dec 2020 21:51:44 +0000 (15:51 -0600)]
powerpc/pseries/mobility: use stop_machine for join/suspend

The partition suspend sequence as specified in the platform
architecture requires that all active processor threads call
H_JOIN, which:

- suspends the calling thread until it is the target of
  an H_PROD; or
- immediately returns H_CONTINUE, if the calling thread is the last to
  call H_JOIN. This thread is expected to call ibm,suspend-me to
  completely suspend the partition.

Upon returning from ibm,suspend-me the calling thread must wake all
others using H_PROD.

rtas_ibm_suspend_me_unsafe() uses on_each_cpu() to implement this
protocol, but because of its synchronizing nature this is susceptible
to deadlock versus users of stop_machine() or other callers of
on_each_cpu().

Not only is stop_machine() intended for use cases like this, it
handles error propagation and allows us to keep the data shared
between CPUs minimal: a single atomic counter which ensures exactly
one CPU will wake the others from their joined states.

Switch the migration code to use stop_machine() and a less complex
local implementation of the H_JOIN/ibm,suspend-me logic, which
carries additional benefits:

- more informative error reporting, appropriately ratelimited
- resets the lockup detector / watchdog on resume to prevent lockup
  warnings when the OS has been suspended for a time exceeding the
  threshold.

Fixes: 91dc182ca6e2 ("[PATCH] powerpc: special-case ibm,suspend-me RTAS call")
Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201207215200.1785968-13-nathanl@linux.ibm.com
4 years agopowerpc/pseries/mobility: extract VASI session polling logic
Nathan Lynch [Mon, 7 Dec 2020 21:51:43 +0000 (15:51 -0600)]
powerpc/pseries/mobility: extract VASI session polling logic

The behavior of rtas_ibm_suspend_me_unsafe() is to return -EAGAIN to
the caller until the specified VASI suspend session state makes the
transition from H_VASI_ENABLED to H_VASI_SUSPENDING. In the interest
of separating concerns to prepare for a new implementation of the
join/suspend sequence, extract VASI session polling logic into a
couple of local functions. Waiting for the session state to reach
H_VASI_SUSPENDING before calling rtas_ibm_suspend_me_unsafe() ensures
that we will never get an EAGAIN result necessitating a retry. No
user-visible change in behavior is intended.

Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201207215200.1785968-12-nathanl@linux.ibm.com
4 years agopowerpc/pseries/mobility: use rtas_activate_firmware() on resume
Nathan Lynch [Mon, 7 Dec 2020 21:51:42 +0000 (15:51 -0600)]
powerpc/pseries/mobility: use rtas_activate_firmware() on resume

It's incorrect to abort post-suspend processing if
ibm,activate-firmware isn't available. Use rtas_activate_firmware(),
which logs this condition appropriately and allows us to proceed.

Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201207215200.1785968-11-nathanl@linux.ibm.com
4 years agopowerpc/pseries/mobility: error message improvements
Nathan Lynch [Mon, 7 Dec 2020 21:51:41 +0000 (15:51 -0600)]
powerpc/pseries/mobility: error message improvements

- Convert printk(KERN_ERR) to pr_err().
- Include errno in property update failure message.
- Remove reference to "Post-mobility" from device tree update message:
  with pr_err() it will have a "mobility:" prefix.

Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201207215200.1785968-10-nathanl@linux.ibm.com
4 years agopowerpc/pseries/mobility: add missing break to default case
Nathan Lynch [Mon, 7 Dec 2020 21:51:40 +0000 (15:51 -0600)]
powerpc/pseries/mobility: add missing break to default case

update_dt_node() has a switch statement where the default case lacks a
break statement.

Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201207215200.1785968-9-nathanl@linux.ibm.com
4 years agopowerpc/pseries/mobility: don't error on absence of ibm, update-nodes
Nathan Lynch [Mon, 7 Dec 2020 21:51:39 +0000 (15:51 -0600)]
powerpc/pseries/mobility: don't error on absence of ibm, update-nodes

Treat the absence of the ibm,update-nodes function as benign instead
of reporting an error. If the platform does not provide that facility,
it's not a problem for Linux.

Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201207215200.1785968-8-nathanl@linux.ibm.com
4 years agopowerpc/hvcall: add token and codes for H_VASI_SIGNAL
Nathan Lynch [Mon, 7 Dec 2020 21:51:38 +0000 (15:51 -0600)]
powerpc/hvcall: add token and codes for H_VASI_SIGNAL

H_VASI_SIGNAL can be used by a partition to request cancellation of
its migration. To be used in future changes.

Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201207215200.1785968-7-nathanl@linux.ibm.com
4 years agopowerpc/rtas: add rtas_activate_firmware()
Nathan Lynch [Mon, 7 Dec 2020 21:51:37 +0000 (15:51 -0600)]
powerpc/rtas: add rtas_activate_firmware()

Provide a documented wrapper function for the ibm,activate-firmware
service, which must be called after a partition migration or
hibernation.

If the function is absent or the call fails, the OS will continue to
run normally with the current firmware, so there is no need to perform
any recovery. Just log it and continue.

Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201207215200.1785968-6-nathanl@linux.ibm.com
4 years agopowerpc/rtas: add rtas_ibm_suspend_me()
Nathan Lynch [Mon, 7 Dec 2020 21:51:36 +0000 (15:51 -0600)]
powerpc/rtas: add rtas_ibm_suspend_me()

Now that the name is available, provide a simple wrapper for
ibm,suspend-me which returns both a Linux errno and optionally the
actual RTAS status to the caller.

Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201207215200.1785968-5-nathanl@linux.ibm.com
4 years agopowerpc/rtas: rtas_ibm_suspend_me -> rtas_ibm_suspend_me_unsafe
Nathan Lynch [Mon, 7 Dec 2020 21:51:35 +0000 (15:51 -0600)]
powerpc/rtas: rtas_ibm_suspend_me -> rtas_ibm_suspend_me_unsafe

The pseries partition suspend sequence requires that all active CPUs
call H_JOIN, which suspends all but one of them with interrupts
disabled. The "chosen" CPU is then to call ibm,suspend-me to complete
the suspend. Upon returning from ibm,suspend-me, the chosen CPU is to
use H_PROD to wake the joined CPUs.

Using on_each_cpu() for this, as rtas_ibm_suspend_me() does to
implement partition migration, is susceptible to deadlock with other
users of on_each_cpu() and with users of stop_machine APIs. The
callback passed to on_each_cpu() is not allowed to synchronize with
other CPUs in the way it is used here.

Complicating the fix is the fact that rtas_ibm_suspend_me() also
occupies the function name that should be used to provide a more
conventional wrapper for ibm,suspend-me. Rename rtas_ibm_suspend_me()
to rtas_ibm_suspend_me_unsafe() to free up the name and indicate that
it should not gain users.

Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201207215200.1785968-4-nathanl@linux.ibm.com
4 years agopowerpc/rtas: complete ibm,suspend-me status codes
Nathan Lynch [Mon, 7 Dec 2020 21:51:34 +0000 (15:51 -0600)]
powerpc/rtas: complete ibm,suspend-me status codes

We don't completely account for the possible return codes for
ibm,suspend-me. Add definitions for these.

Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201207215200.1785968-3-nathanl@linux.ibm.com
4 years agopowerpc/rtas: prevent suspend-related sys_rtas use on LE
Nathan Lynch [Mon, 7 Dec 2020 21:51:33 +0000 (15:51 -0600)]
powerpc/rtas: prevent suspend-related sys_rtas use on LE

While drmgr has had work in some areas to make its RTAS syscall
interactions endian-neutral, its code for performing partition
migration via the syscall has never worked on LE. While it is able to
complete ibm,suspend-me successfully, it crashes when attempting the
subsequent ibm,update-nodes call.

drmgr is the only known (or plausible) user of ibm,suspend-me,
ibm,update-nodes, and ibm,update-properties, so allow them only in
big-endian configurations.

Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201207215200.1785968-2-nathanl@linux.ibm.com
4 years agopowerpc/book3s64/kuap: Improve error reporting with KUAP
Aneesh Kumar K.V [Tue, 8 Dec 2020 03:15:39 +0000 (08:45 +0530)]
powerpc/book3s64/kuap: Improve error reporting with KUAP

This partially reverts commit eb232b162446 ("powerpc/book3s64/kuap: Improve
error reporting with KUAP") and update the fault handler to print

[   55.022514] Kernel attempted to access user page (7e6725b70000) - exploit attempt? (uid: 0)
[   55.022528] BUG: Unable to handle kernel data access on read at 0x7e6725b70000
[   55.022533] Faulting instruction address: 0xc000000000e8b9bc
[   55.022540] Oops: Kernel access of bad area, sig: 11 [#1]
....

when the kernel access userspace address without unlocking AMR.

bad_kuap_fault() is added as part of commit 5e5be3aed230 ("powerpc/mm: Detect
bad KUAP faults") to catch userspace access incorrectly blocked by AMR. Hence
retain the full stack dump there even with hash translation. Also, add a comment
explaining the difference between hash and radix.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201208031539.84878-1-aneesh.kumar@linux.ibm.com
4 years agopowerpc/powernv/idle: Restore CIABR after idle for Power9
Jordan Niethe [Mon, 7 Dec 2020 01:05:19 +0000 (12:05 +1100)]
powerpc/powernv/idle: Restore CIABR after idle for Power9

On Power9, CIABR is lost after idle. This means that instruction
breakpoints set by xmon which use CIABR do not work. Fix this by
restoring CIABR after idle.

Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201207010519.15597-2-jniethe5@gmail.com
4 years agopowerpc/book3s64/kexec: Clear CIABR on kexec
Jordan Niethe [Mon, 7 Dec 2020 01:05:18 +0000 (12:05 +1100)]
powerpc/book3s64/kexec: Clear CIABR on kexec

The value in CIABR persists across kexec which can lead to unintended
results when the new kernel hits the old kernel's breakpoint. For
example:

0:mon> bi $loadavg_proc_show
0:mon> b
   type            address
1 inst   c000000000519060  loadavg_proc_show+0x0/0x130
0:mon> x

$ kexec -l /mnt/vmlinux --initrd=/mnt/rootfs.cpio.gz --append='xmon=off'
$ kexec -e

$ cat /proc/loadavg
Trace/breakpoint trap

Make sure CIABR is cleared so this does not happen.

Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201207010519.15597-1-jniethe5@gmail.com
4 years agopowerpc: Remove ucache_bsize
Christophe Leroy [Tue, 17 Nov 2020 05:07:59 +0000 (05:07 +0000)]
powerpc: Remove ucache_bsize

ppc601 and e200 were the users of ucache_bsize.
ppc601 and e200 are now gone.

Remove ucache_bsize.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/288b6048597c0fdc495b203fda57a223d89499d2.1605589460.git.christophe.leroy@csgroup.eu
4 years agopowerpc: Retire e200 core (mpc555x processor)
Christophe Leroy [Tue, 17 Nov 2020 05:07:58 +0000 (05:07 +0000)]
powerpc: Retire e200 core (mpc555x processor)

There is no defconfig selecting CONFIG_E200, and no platform.

e200 is an earlier version of booke, a predecessor of e500,
with some particularities like an unified cache instead of both an
instruction cache and a data cache.

Remove it.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Acked-by: Scott Wood <oss@buserror.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/34ebc3ba2c768d97f363bd5f2deea2356e9ae127.1605589460.git.christophe.leroy@csgroup.eu
4 years agopowerpc: Fix update form addressing in inline assembly
Christophe Leroy [Thu, 22 Oct 2020 09:29:21 +0000 (09:29 +0000)]
powerpc: Fix update form addressing in inline assembly

In several places, inline assembly uses the "%Un" modifier
to enable the use of instruction with update form addressing,
but the associated "<>" constraint is missing.

As mentioned in previous patch, this fails with gcc 4.9, so
"<>" can't be used directly.

Use UPD_CONSTR macro everywhere %Un modifier is used.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Reviewed-by: Segher Boessenkool <segher@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/62eab5ca595485c192de1765bdac099f633a21d0.1603358942.git.christophe.leroy@csgroup.eu
4 years agopowerpc: Fix incorrect stw{, ux, u, x} instructions in __set_pte_at
Mathieu Desnoyers [Thu, 22 Oct 2020 09:29:20 +0000 (09:29 +0000)]
powerpc: Fix incorrect stw{, ux, u, x} instructions in __set_pte_at

The placeholder for instruction selection should use the second
argument's operand, which is %1, not %0. This could generate incorrect
assembly code if the memory addressing of operand %0 is a different
form from that of operand %1.

Also remove the %Un placeholder because having %Un placeholders
for two operands which are based on the same local var (ptep) doesn't
make much sense. By the way, it doesn't change the current behaviour
because "<>" constraint is missing for the associated "=m".

[chleroy: revised commit log iaw segher's comments and removed %U0]

Fixes: 9bf2b5cdc5fe ("powerpc: Fixes for CONFIG_PTE_64BIT for SMP support")
Cc: <stable@vger.kernel.org> # v2.6.28+
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Acked-by: Segher Boessenkool <segher@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/96354bd77977a6a933fe9020da57629007fdb920.1603358942.git.christophe.leroy@csgroup.eu
4 years agopowerpc/xmon: Change printk() to pr_cont()
Christophe Leroy [Fri, 4 Dec 2020 10:35:38 +0000 (10:35 +0000)]
powerpc/xmon: Change printk() to pr_cont()

Since some time now, printk() adds carriage return, leading to
unusable xmon output if there is no udbg backend available:

  [   54.288722] sysrq: Entering xmon
  [   54.292209] Vector: 0  at [cace3d2c]
  [   54.292274]     pc:
  [   54.292331] c0023650
  [   54.292468] : xmon+0x28/0x58
  [   54.292519]
  [   54.292574]     lr:
  [   54.292630] c0023724
  [   54.292749] : sysrq_handle_xmon+0xa4/0xfc
  [   54.292801]
  [   54.292867]     sp: cace3de8
  [   54.292931]    msr: 9032
  [   54.292999]   current = 0xc28d0000
  [   54.293072]     pid   = 377, comm = sh
  [   54.293157] Linux version 5.10.0-rc6-s3k-dev-01364-gedf13f0ccd76-dirty (root@po17688vm.idsi0.si.c-s.fr) (powerpc64-linux-gcc (GCC) 10.1.0, GNU ld (GNU Binutils) 2.34) #4211 PREEMPT Fri Dec 4 09:32:11 UTC 2020
  [   54.293287] enter ? for help
  [   54.293470] [cace3de8]
  [   54.293532] c0023724
  [   54.293654]  sysrq_handle_xmon+0xa4/0xfc
  [   54.293711]  (unreliable)
  ...
  [   54.296002]
  [   54.296159] --- Exception: c01 (System Call) at
  [   54.296217] 0fd4e784
  [   54.296303]
  [   54.296375] SP (7fca6ff0) is in userspace
  [   54.296431] mon>
  [   54.296484]  <no input ...>

Use pr_cont() instead.

Fixes: 4bcc595ccd80 ("printk: reinstate KERN_CONT for printing continuation lines")
Cc: stable@vger.kernel.org # v4.9+
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
[mpe: Mention that it only happens when udbg is not available]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/c8a6ec704416ecd5ff2bd26213c9bc026bdd19de.1607077340.git.christophe.leroy@csgroup.eu
4 years agopowerpc/powernv/npu: Do not attempt NPU2 setup on POWER8NVL NPU
Alexey Kardashevskiy [Sun, 22 Nov 2020 07:38:28 +0000 (18:38 +1100)]
powerpc/powernv/npu: Do not attempt NPU2 setup on POWER8NVL NPU

We execute certain NPU2 setup code (such as mapping an LPID to a device
in NPU2) unconditionally if an Nvlink bridge is detected. However this
cannot succeed on POWER8NVL machines and errors appear in dmesg. This is
harmless as skiboot returns an error and the only place we check it is
vfio-pci but that code does not get called on P8+ either.

This adds a check if pnv_npu2_xxx helpers are called on a machine with
NPU2 which initializes pnv_phb::npu in pnv_npu2_init();
pnv_phb::npu==NULL on POWER8/NVL (Naples).

While at this, fix NULL derefencing in pnv_npu_peers_take_ownership/
pnv_npu_peers_release_ownership which occurs when GPUs on mentioned P8s
cause EEH which happens if "vfio-pci" disables devices using
the D3 power state; the vfio-pci's disable_idle_d3 module parameter
controls this and must be set on Naples. The EEH handling clears
the entire pnv_ioda_pe struct in pnv_ioda_free_pe() hence
the NULL derefencing. We cannot recover from that but at least we stop
crashing.

Tested on
- POWER9 pvr=004e1201, Ubuntu 19.04 host, Ubuntu 18.04 vm,
  NVIDIA GV100 10de:1db1 driver 418.39
- POWER8 pvr=004c0100, RHEL 7.6 host, Ubuntu 16.10 vm,
  NVIDIA P100 10de:15f9 driver 396.47

Fixes: 1b785611e119 ("powerpc/powernv/npu: Add release_ownership hook")
Cc: stable@vger.kernel.org # 5.0
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201122073828.15446-1-aik@ozlabs.ru
4 years agolkdtm/powerpc: Add SLB multihit test
Ganesh Goudar [Mon, 30 Nov 2020 08:30:57 +0000 (14:00 +0530)]
lkdtm/powerpc: Add SLB multihit test

To check machine check handling, add support to inject slb
multihit errors.

Co-developed-by: Mahesh Salgaonkar <mahesh@linux.ibm.com>
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.ibm.com>
Signed-off-by: Ganesh Goudar <ganeshgr@linux.ibm.com>
[mpe: Use CONFIG_PPC_BOOK3S_64 to fix compile errors reported by lkp@intel.com]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201130083057.135610-1-ganeshgr@linux.ibm.com
4 years agopowernv/pci: Print an error when device enable is blocked
Oliver O'Halloran [Thu, 9 Apr 2020 06:13:37 +0000 (16:13 +1000)]
powernv/pci: Print an error when device enable is blocked

If the platform decides to block enabling the device nothing is printed
currently. This can lead to some confusion since the dmesg output will
usually print an error with no context e.g.

e1000e: probe of 0022:01:00.0 failed with error -22

This shouldn't be spammy since pci_enable_device() already prints a
messages when it succeeds.

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200409061337.9187-1-oohall@gmail.com
4 years agopowerpc/pci: Remove LSI mappings on device teardown
Oliver O'Halloran [Wed, 2 Dec 2020 00:52:22 +0000 (11:52 +1100)]
powerpc/pci: Remove LSI mappings on device teardown

When a passthrough IO adapter is removed from a pseries machine using hash
MMU and the XIVE interrupt mode, the POWER hypervisor expects the guest OS
to clear all page table entries related to the adapter. If some are still
present, the RTAS call which isolates the PCI slot returns error 9001
"valid outstanding translations" and the removal of the IO adapter fails.
This is because when the PHBs are scanned, Linux maps automatically the
INTx interrupts in the Linux interrupt number space but these are never
removed.

This problem can be fixed by adding the corresponding unmap operation when
the device is removed. There's no pcibios_* hook for the remove case, but
the same effect can be achieved using a bus notifier.

Because INTx are shared among PHBs (and potentially across the system),
this adds tracking of virq to unmap them only when the last user is gone.

[aik: added refcounter]

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Tested-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Frederic Barrat <fbarrat@linux.ibm.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201202005222.5477-1-aik@ozlabs.ru
4 years agopowerpc: add security.config, enforcing lockdown=integrity
Daniel Axtens [Thu, 3 Dec 2020 04:28:07 +0000 (15:28 +1100)]
powerpc: add security.config, enforcing lockdown=integrity

It's sometimes handy to have a config that boots a bit like a system
under secure boot (forcing lockdown=integrity, without needing any
extra stuff like a command line option).

This config file allows that, and also turns on a few assorted security
and hardening options for good measure.

Suggested-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201203042807.1293655-1-dja@axtens.net
4 years agopowerpc/44x: Don't support 47x code and non 47x code at the same time
Christophe Leroy [Sun, 18 Oct 2020 17:25:18 +0000 (17:25 +0000)]
powerpc/44x: Don't support 47x code and non 47x code at the same time

440/460 variants and 470 variants are not compatible, no
need to make code supporting both and using MMU features.

Just use CONFIG_PPC_47x to decide what to build.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/c3e64da3d5d068c69a201e03bbae7da055761e5b.1603041883.git.christophe.leroy@csgroup.eu
4 years agopowerpc/44x: Don't support 440 when CONFIG_PPC_47x is set
Christophe Leroy [Sun, 18 Oct 2020 17:25:17 +0000 (17:25 +0000)]
powerpc/44x: Don't support 440 when CONFIG_PPC_47x is set

As stated in platform/44x/Kconfig, CONFIG_PPC_47x is not
compatible with 440 and 460 variants.

This is confirmed in asm/cache.h as L1_CACHE_SHIFT is different
for 47x, meaning a kernel built for 47x will not run correctly
on a 440.

In cputable, opt out all 440 and 460 variants when CONFIG_PPC_47x
is set. Also add a default match dedicated to 470.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/822833ce3dc10634339818f7d1ab616edf63b0c6.1603041883.git.christophe.leroy@csgroup.eu
4 years agopowerpc/feature: Remove CPU_FTR_NODSISRALIGN
Christophe Leroy [Tue, 13 Oct 2020 11:11:21 +0000 (11:11 +0000)]
powerpc/feature: Remove CPU_FTR_NODSISRALIGN

CPU_FTR_NODSISRALIGN has not been used since
commit 31bfdb036f12 ("powerpc: Use instruction emulation
infrastructure to handle alignment faults")

Remove it.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/05d98136b24bbf11525445414bb18cffe2724f48.1602587470.git.christophe.leroy@csgroup.eu
4 years agopowerpc/mm: Desintegrate MMU_FTR_PPCAS_ARCH_V2
Christophe Leroy [Mon, 12 Oct 2020 08:04:24 +0000 (08:04 +0000)]
powerpc/mm: Desintegrate MMU_FTR_PPCAS_ARCH_V2

MMU_FTR_PPCAS_ARCH_V2 is defined in cpu_table.h
as MMU_FTR_TLBIEL | MMU_FTR_16M_PAGE.

MMU_FTR_TLBIEL and MMU_FTR_16M_PAGE are defined in mmu.h

MMU_FTR_PPCAS_ARCH_V2 is used only in mmu.h and it is used only once.

Remove MMU_FTR_PPCAS_ARCH_V2 and use
directly MMU_FTR_TLBIEL | MMU_FTR_16M_PAGE

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/829ae1aed1d2fc6b5fc5818362e573dee5d6ecde.1602489852.git.christophe.leroy@csgroup.eu
4 years agopowerpc/mm: MMU_FTR_NEED_DTLB_SW_LRU is only possible with CONFIG_PPC_83xx
Christophe Leroy [Mon, 12 Oct 2020 08:05:49 +0000 (08:05 +0000)]
powerpc/mm: MMU_FTR_NEED_DTLB_SW_LRU is only possible with CONFIG_PPC_83xx

Only mpc83xx will set MMU_FTR_NEED_DTLB_SW_LRU and its
definition is enclosed in #ifdef CONFIG_PPC_83xx.

Make MMU_FTR_NEED_DTLB_SW_LRU possible only when
CONFIG_PPC_83xx is set.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/d01d7613664fafa43de1f1ae89924075bc24241c.1602489931.git.christophe.leroy@csgroup.eu
4 years agopowerpc/mm: Remove useless #ifndef CPU_FTR_COHERENT_ICACHE in mem.c
Christophe Leroy [Mon, 12 Oct 2020 08:02:30 +0000 (08:02 +0000)]
powerpc/mm: Remove useless #ifndef CPU_FTR_COHERENT_ICACHE in mem.c

Since commit 10b35d9978ac ("[PATCH] powerpc: merged asm/cputable.h"),
CPU_FTR_COHERENT_ICACHE has always been defined.

Remove the #ifndef CPU_FTR_COHERENT_ICACHE block.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/e26ddc1d6f6aca739dd8d2b7c67351ead559b084.1602489664.git.christophe.leroy@csgroup.eu
4 years agopowerpc/feature: Add CPU_FTR_NOEXECUTE to G2_LE
Christophe Leroy [Mon, 12 Oct 2020 08:02:13 +0000 (08:02 +0000)]
powerpc/feature: Add CPU_FTR_NOEXECUTE to G2_LE

G2_LE has a 603 core, add CPU_FTR_NOEXECUTE.

Fixes: 385e89d5b20f ("powerpc/mm: add exec protection on powerpc 603")
Cc: stable@vger.kernel.org
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/39a530ee41d83f49747ab3af8e39c056450b9b4d.1602489653.git.christophe.leroy@csgroup.eu
4 years agopowerpc/mm: Fix verification of MMU_FTR_TYPE_44x
Christophe Leroy [Sat, 10 Oct 2020 17:30:59 +0000 (17:30 +0000)]
powerpc/mm: Fix verification of MMU_FTR_TYPE_44x

MMU_FTR_TYPE_44x cannot be checked by cpu_has_feature()

Use mmu_has_feature() instead

Fixes: 23eb7f560a2a ("powerpc: Convert flush_icache_range & friends to C")
Cc: stable@vger.kernel.org
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/ceede82fadf37f3b8275e61fcf8cf29a3e2ec7fe.1602351011.git.christophe.leroy@csgroup.eu
4 years agopowerpc/time: Remove ifdef in get_vtb()
Christophe Leroy [Thu, 1 Oct 2020 10:59:20 +0000 (10:59 +0000)]
powerpc/time: Remove ifdef in get_vtb()

SPRN_VTB and CPU_FTR_ARCH_207S are always defined,
no need of an ifdef.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/a0fc81cd85121407726bcf480fc9a0d8e7617fce.1601549933.git.christophe.leroy@csgroup.eu
4 years agopowerpc/32: Use SPRN_SPRG_SCRATCH2 in exception prologs
Christophe Leroy [Wed, 25 Nov 2020 07:10:53 +0000 (07:10 +0000)]
powerpc/32: Use SPRN_SPRG_SCRATCH2 in exception prologs

Use SPRN_SPRG_SCRATCH2 as a third scratch register in
exception prologs in order to simplify them and avoid
data going back and forth from/to CR.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/6f5c8a7faa8cc54acb89c55c20aa579a2f30a4e9.1606285014.git.christophe.leroy@csgroup.eu
4 years agopowerpc/32s: Use SPRN_SPRG_SCRATCH2 in DSI prolog
Christophe Leroy [Wed, 25 Nov 2020 07:10:52 +0000 (07:10 +0000)]
powerpc/32s: Use SPRN_SPRG_SCRATCH2 in DSI prolog

Use SPRN_SPRG_SCRATCH2 as an alternative scratch register in
the early part of DSI prolog in order to avoid clobbering
SPRN_SPRG_SCRATCH0/1 used by other prologs.

The 603 doesn't like a jump from DataLoadTLBMiss to the 10 nops
that are now in the beginning of DSI exception as a result of
the feature section. To workaround this, add a jump as alternative.
It also avoids fetching 10 nops for nothing.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/f9f8df2a2be93568768ef1ac793639f7914cf103.1606285014.git.christophe.leroy@csgroup.eu
4 years agopowerpc/32: Simplify EXCEPTION_PROLOG_1 macro
Christophe Leroy [Wed, 25 Nov 2020 07:10:51 +0000 (07:10 +0000)]
powerpc/32: Simplify EXCEPTION_PROLOG_1 macro

Make code more readable with a clear CONFIG_VMAP_STACK
section and a clear non CONFIG_VMAP_STACK section.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/c0f16cf432d22fc80097264d94649460d3dd761d.1606285014.git.christophe.leroy@csgroup.eu
4 years agopowerpc/603: Use SPRN_SDR1 to store the pgdir phys address
Christophe Leroy [Wed, 25 Nov 2020 07:10:50 +0000 (07:10 +0000)]
powerpc/603: Use SPRN_SDR1 to store the pgdir phys address

On the 603, SDR1 is not used.

In order to free SPRN_SPRG2, use SPRN_SDR1 to store the pgdir
phys addr.

But only some bits of SDR1 can be used (0xffff01ff).
As the pgdir is 4k aligned, rotate it by 4 bits to the left.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/7370574b49d8476878ce5480726197993cb76108.1606285014.git.christophe.leroy@csgroup.eu
4 years agopowerpc/32s: Don't use SPRN_SPRG_PGDIR in hash_page
Christophe Leroy [Wed, 25 Nov 2020 07:10:49 +0000 (07:10 +0000)]
powerpc/32s: Don't use SPRN_SPRG_PGDIR in hash_page

SPRN_SPRG_PGDIR is there mainly to speedup SW TLB miss handlers
for powerpc 603.

We need to free SPRN_SPRG2 to reduce the mess with CONFIG_VMAP_STACK.

In hash_page(), reading PGDIR from thread_struct will be in the noise
performance wise.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/4adca19b7120cdf619956768ed09e74fc6a558f3.1606285014.git.christophe.leroy@csgroup.eu
4 years agopowerpc/32s: Fix an FTR_SECTION_ELSE
Christophe Leroy [Wed, 25 Nov 2020 07:10:48 +0000 (07:10 +0000)]
powerpc/32s: Fix an FTR_SECTION_ELSE

An FTR_SECTION_ELSE is in the middle of
BEGIN_MMU_FTR_SECTION/ALT_MMU_FTR_SECTION_END_IFSET

Change it to MMU_FTR_SECTION_ELSE

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/61790f1a91692950a6bb5bb53d6d514d9bcdad74.1606285014.git.christophe.leroy@csgroup.eu
4 years agopowerpc/32s: Don't hash_preload() kernel text
Christophe Leroy [Wed, 25 Nov 2020 07:10:47 +0000 (07:10 +0000)]
powerpc/32s: Don't hash_preload() kernel text

We now always map kernel text with BATs. Neither need to preload
hash with kernel text addresses nor ensure they are never evicted.

This is more or less a revert of commit ee4f2ea48674 ("[POWERPC] Fix
32-bit mm operations when not using BATs")

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/0a0bab7fadd89aa829e33420fbc10d60c59040a7.1606285014.git.christophe.leroy@csgroup.eu
4 years agopowerpc/32s: Always map kernel text and rodata with BATs
Christophe Leroy [Wed, 25 Nov 2020 07:10:46 +0000 (07:10 +0000)]
powerpc/32s: Always map kernel text and rodata with BATs

Since commit 2b279c0348af ("powerpc/32s: Allow mapping with BATs with
DEBUG_PAGEALLOC"), there is no real situation where mapping without
BATs is required.

In order to simplify memory handling, always map kernel text
and rodata with BATs even when "nobats" kernel parameter is set.

Also fix the 603 TLB miss exceptions that don't require anymore
kernel page table if DEBUG_PAGEALLOC.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/da51f7ec632825a4ce43290a904aad61648408c0.1606285013.git.christophe.leroy@csgroup.eu
4 years agoocxl: Add new kernel traces
Christophe Lombard [Wed, 25 Nov 2020 15:50:13 +0000 (16:50 +0100)]
ocxl: Add new kernel traces

Add specific kernel traces which provide information on mmu notifier and on
pages range.

Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Acked-by: Frederic Barrat <fbarrat@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201125155013.39955-6-clombard@linux.vnet.ibm.com
4 years agoocxl: Add mmu notifier
Christophe Lombard [Wed, 25 Nov 2020 15:50:12 +0000 (16:50 +0100)]
ocxl: Add mmu notifier

Add invalidate_range mmu notifier, when required (ATSD access of MMIO
registers is available), to initiate TLB invalidation commands.
For the time being, the ATSD0 set of registers is used by default.

The pasid and bdf values have to be configured in the Process Element
Entry.
The PEE must be set up to match the BDF/PASID of the AFU.

Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Acked-by: Frederic Barrat <fbarrat@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201125155013.39955-5-clombard@linux.vnet.ibm.com
4 years agoocxl: Update the Process Element Entry
Christophe Lombard [Wed, 25 Nov 2020 15:50:11 +0000 (16:50 +0100)]
ocxl: Update the Process Element Entry

To complete the MMIO based mechanism, the fields: PASID, bus, device and
function of the Process Element Entry have to be filled. (See
OpenCAPI Power Platform Architecture document)

                   Hypervisor Process Element Entry
Word
    0 1 .... 7  8  ...... 12  13 ..15  16.... 19  20 ........... 31
0                  OSL Configuration State (0:31)
1                  OSL Configuration State (32:63)
2               PASID                      |    Reserved
3       Bus   |   Device    |Function |        Reserved
4                             Reserved
5                             Reserved
6                               ....

Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Acked-by: Frederic Barrat <fbarrat@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201125155013.39955-4-clombard@linux.vnet.ibm.com
4 years agoocxl: Initiate a TLB invalidate command
Christophe Lombard [Wed, 25 Nov 2020 15:50:10 +0000 (16:50 +0100)]
ocxl: Initiate a TLB invalidate command

When a TLB Invalidate is required for the Logical Partition, the following
sequence has to be performed:

1. Load MMIO ATSD AVA register with the necessary value, if required.
2. Write the MMIO ATSD launch register to initiate the TLB Invalidate
command.
3. Poll the MMIO ATSD status register to determine when the TLB Invalidate
   has been completed.

Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Acked-by: Frederic Barrat <fbarrat@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201125155013.39955-3-clombard@linux.vnet.ibm.com
4 years agoocxl: Assign a register set to a Logical Partition
Christophe Lombard [Wed, 25 Nov 2020 15:50:09 +0000 (16:50 +0100)]
ocxl: Assign a register set to a Logical Partition

Platform specific function to assign a register set to a Logical Partition.
The "ibm,mmio-atsd" property, provided by the firmware, contains the 16
base ATSD physical addresses (ATSD0 through ATSD15) of the set of MMIO
registers (XTS MMIO ATSDx LPARID/AVA/launch/status register).

For the time being, the ATSD0 set of registers is used by default.

Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Acked-by: Frederic Barrat <fbarrat@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201125155013.39955-2-clombard@linux.vnet.ibm.com
4 years agopowerpc/perf: MMCR0 control for PMU registers under PMCC=00
Athira Rajeev [Thu, 26 Nov 2020 16:54:44 +0000 (11:54 -0500)]
powerpc/perf: MMCR0 control for PMU registers under PMCC=00

PowerISA v3.1 introduces new control bit (PMCCEXT) for restricting
access to group B PMU registers in problem state when
MMCR0 PMCC=0b00. In problem state and when MMCR0 PMCC=0b00,
setting the Monitor Mode Control Register bit 54 (MMCR0 PMCCEXT),
will restrict read permission on Group B Performance Monitor
Registers (SIER, SIAR, SDAR and MMCR1). When this bit is set to zero,
group B registers will be readable. In other platforms (like power9),
the older behaviour is retained where group B PMU SPRs are readable.

Patch adds support for MMCR0 PMCCEXT bit in power10 by enabling
this bit during boot and during the PMU event enable/disable callback
functions.

Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1606409684-1589-8-git-send-email-atrajeev@linux.vnet.ibm.com
4 years agopowerpc/perf: Fix to update cache events with l2l3 events in power10
Athira Rajeev [Thu, 26 Nov 2020 16:54:43 +0000 (11:54 -0500)]
powerpc/perf: Fix to update cache events with l2l3 events in power10

Export l2l3 events (PM_L2_ST_MISS and PM_L2_ST) and LLC-prefetches
(PM_L3_PF_MISS_L3) via sysfs, and also add these to list of
cache_events.

Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1606409684-1589-7-git-send-email-atrajeev@linux.vnet.ibm.com
4 years agopowerpc/perf: Fix to update generic event codes for power10
Athira Rajeev [Thu, 26 Nov 2020 16:54:42 +0000 (11:54 -0500)]
powerpc/perf: Fix to update generic event codes for power10

Fix the event code for events: branch-instructions (to PM_BR_FIN),
branch-misses (to PM_MPRED_BR_FIN) and cache-misses (to
PM_LD_DEMAND_MISS_L1_FIN) for power10 PMU. Update the
list of generic events with this modified event code.

Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1606409684-1589-6-git-send-email-atrajeev@linux.vnet.ibm.com
4 years agopowerpc/perf: Add generic and cache event list for power10 DD1
Athira Rajeev [Thu, 26 Nov 2020 16:54:41 +0000 (11:54 -0500)]
powerpc/perf: Add generic and cache event list for power10 DD1

There are event code updates for some of the generic events
and cache events for power10. Inorder to maintain the current
event codes work with DD1 also, create a new array of generic_events,
cache_events and pmu_attr_groups with suffix _dd1, example,
power10_events_attr_dd1. So that further updates to event codes
can be made in the original list, ie, power10_events_attr. Update the
power10 pmu init code to pick the dd1 list while registering
the power PMU, based on the pvr (Processor Version Register) value.

Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1606409684-1589-5-git-send-email-atrajeev@linux.vnet.ibm.com
4 years agopowerpc/perf: Fix the PMU group constraints for threshold events in power10
Athira Rajeev [Thu, 26 Nov 2020 16:54:40 +0000 (11:54 -0500)]
powerpc/perf: Fix the PMU group constraints for threshold events in power10

The PMU group constraints mask for threshold events covers
all thresholding bits which includes threshold control value
(start/stop), select value as well as thresh_cmp value (MMCRA[9:18].
In power9, thresh_cmp bits were part of the event code. But in case
of power10, thresh_cmp bits are not part of event code due to
inclusion of MMCR3 bits. Hence thresh_cmp is not valid for
group constraints for power10.

Fix the PMU group constraints checking for threshold events in
power10 by using constraint mask and value for only threshold control
and select bits.

Fixes: a64e697cef23 ("powerpc/perf: power10 Performance Monitoring support")
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1606409684-1589-4-git-send-email-atrajeev@linux.vnet.ibm.com
4 years agopowerpc/perf: Update the PMU group constraints for l2l3 events in power10
Athira Rajeev [Thu, 26 Nov 2020 16:54:39 +0000 (11:54 -0500)]
powerpc/perf: Update the PMU group constraints for l2l3 events in power10

In Power9, L2/L3 bus events are always available as a
"bank" of 4 events. To obtain the counts for any of the
l2/l3 bus events in a given bank, the user will have to
program PMC4 with corresponding l2/l3 bus event for that
bank.

Commit 59029136d750 ("powerpc/perf: Add constraints for power9 l2/l3 bus events")
enforced this rule in Power9. But this is not valid for
Power10, since in Power10 Monitor Mode Control Register2
(MMCR2) has bits to configure l2/l3 event bits. Hence remove
this PMC4 constraint check from power10.

Since the l2/l3 bits in MMCR2 are not per-pmc, patch handles
group constrints checks for l2/l3 bits in MMCR2.

Fixes: a64e697cef23 ("powerpc/perf: power10 Performance Monitoring support")
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1606409684-1589-3-git-send-email-atrajeev@linux.vnet.ibm.com
4 years agopowerpc/perf: Fix to update radix_scope_qual in power10
Athira Rajeev [Thu, 26 Nov 2020 16:54:38 +0000 (11:54 -0500)]
powerpc/perf: Fix to update radix_scope_qual in power10

power10 uses bit 9 of the raw event code as RADIX_SCOPE_QUAL.
This bit is used for enabling the radix process events.
Patch fixes the PMU counter support functions to program bit
18 of MMCR1 ( Monitor Mode Control Register1 ) with the
RADIX_SCOPE_QUAL bit value. Since this field is not per-pmc,
add this to PMU group constraints to make sure events in a
group will have same bit value for this field. Use bit 21 as
constraint bit field for radix_scope_qual. Patch also updates
the power10 raw event encoding layout information, format field
and constraints bit layout to include the radix_scope_qual bit.

Fixes: a64e697cef23 ("powerpc/perf: power10 Performance Monitoring support")
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1606409684-1589-2-git-send-email-atrajeev@linux.vnet.ibm.com
4 years agopowerpc/book3s64/pkeys: Optimize KUAP and KUEP feature disabled case
Aneesh Kumar K.V [Fri, 27 Nov 2020 04:44:24 +0000 (10:14 +0530)]
powerpc/book3s64/pkeys: Optimize KUAP and KUEP feature disabled case

If FTR_BOOK3S_KUAP is disabled, kernel will continue to run with the same AMR
value with which it was entered. Hence there is a high chance that
we can return without restoring the AMR value. This also helps the case
when applications are not using the pkey feature. In this case, different
applications will have the same AMR values and hence we can avoid restoring
AMR in this case too.

Also avoid isync() if not really needed.

Do the same for IAMR.

null-syscall benchmark results:

With smap/smep disabled:
Without patch:
957.95 ns    2778.17 cycles
With patch:
858.38 ns    2489.30 cycles

With smap/smep enabled:
Without patch:
1017.26 ns    2950.36 cycles
With patch:
1021.51 ns    2962.44 cycles

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201127044424.40686-23-aneesh.kumar@linux.ibm.com
4 years agopowerpc/book3s64/kup: Check max key supported before enabling kup
Aneesh Kumar K.V [Wed, 2 Dec 2020 04:38:54 +0000 (10:08 +0530)]
powerpc/book3s64/kup: Check max key supported before enabling kup

Don't enable KUEP/KUAP if we support less than or equal to 3 keys.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201202043854.76406-1-aneesh.kumar@linux.ibm.com
4 years agopowerpc/book3s64/hash/kuep: Enable KUEP on hash
Aneesh Kumar K.V [Fri, 27 Nov 2020 04:44:22 +0000 (10:14 +0530)]
powerpc/book3s64/hash/kuep: Enable KUEP on hash

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Reviewed-by: Sandipan Das <sandipan@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201127044424.40686-21-aneesh.kumar@linux.ibm.com
4 years agopowerpc/book3s64/hash/kuap: Enable kuap on hash
Aneesh Kumar K.V [Fri, 27 Nov 2020 04:44:21 +0000 (10:14 +0530)]
powerpc/book3s64/hash/kuap: Enable kuap on hash

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Reviewed-by: Sandipan Das <sandipan@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201127044424.40686-20-aneesh.kumar@linux.ibm.com
4 years agopowerpc/book3s64/kuep: Use Key 3 to implement KUEP with hash translation.
Aneesh Kumar K.V [Fri, 27 Nov 2020 04:44:20 +0000 (10:14 +0530)]
powerpc/book3s64/kuep: Use Key 3 to implement KUEP with hash translation.

Radix use IAMR Key 0 and hash translation use IAMR key 3.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Reviewed-by: Sandipan Das <sandipan@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201127044424.40686-19-aneesh.kumar@linux.ibm.com
4 years agopowerpc/book3s64/kuap: Use Key 3 to implement KUAP with hash translation.
Aneesh Kumar K.V [Fri, 27 Nov 2020 04:44:19 +0000 (10:14 +0530)]
powerpc/book3s64/kuap: Use Key 3 to implement KUAP with hash translation.

Radix use AMR Key 0 and hash translation use AMR key 3.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Reviewed-by: Sandipan Das <sandipan@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201127044424.40686-18-aneesh.kumar@linux.ibm.com
4 years agopowerpc/book3s64/kuap: Improve error reporting with KUAP
Aneesh Kumar K.V [Fri, 27 Nov 2020 04:44:18 +0000 (10:14 +0530)]
powerpc/book3s64/kuap: Improve error reporting with KUAP

With hash translation use DSISR_KEYFAULT to identify a wrong access.
With Radix we look at the AMR value and type of fault.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201127044424.40686-17-aneesh.kumar@linux.ibm.com
4 years agopowerpc/book3s64/kuap: Restrict access to userspace based on userspace AMR
Aneesh Kumar K.V [Fri, 27 Nov 2020 04:44:17 +0000 (10:14 +0530)]
powerpc/book3s64/kuap: Restrict access to userspace based on userspace AMR

If an application has configured address protection such that read/write is
denied using pkey even the kernel should receive a FAULT on accessing the same.

This patch use user AMR value stored in pt_regs.amr to achieve the same.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Reviewed-by: Sandipan Das <sandipan@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201127044424.40686-16-aneesh.kumar@linux.ibm.com