arch/sparc: Use new misaligned load instructions for memcpy and copy_from_user
Use the new Load Misaligned Integer and Load Misaligned Integer from
Alternate Space instructions on the M8 architecture.
Choose between the FP path and the ldm instructions based on copy length.
The FP load/alignaddr logic carries a fixed FP save/restore overhead
regardless of memcpy length, whereas the overhead of the ldm
instructions grows with the size of the copy. In our tests, the ldm
instructions performed significantly better than the FP alignaddr/load
logic for lengths up to about 4096 bytes. Based on that, use the new ldm
instructions for lengths of 4096 or less and continue to use the FP
alignaddr/load logic for lengths above 4096.
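The length-based selection described above can be sketched in C as
follows. This is only an illustration of the heuristic: the real
M8memcpy code is SPARC assembly, and the names M8_LDM_MAX_LEN,
m8_copy_path and m8_pick_misaligned_path are hypothetical and do not
appear in the patch.

```c
#include <stddef.h>

/* Hypothetical cutoff name; the value 4096 comes from the test results above. */
#define M8_LDM_MAX_LEN 4096

/* Hypothetical labels for the two misaligned-copy strategies. */
enum m8_copy_path {
	M8_COPY_LDM,      /* misaligned loads via the ldm instructions */
	M8_COPY_FP_ALIGN, /* FP alignaddr/load logic, pays FP save/restore */
};

/*
 * Pick the strategy for a misaligned copy of len bytes:
 * ldm for 4096 bytes or less, FP alignaddr/load above that.
 */
static enum m8_copy_path m8_pick_misaligned_path(size_t len)
{
	return len <= M8_LDM_MAX_LEN ? M8_COPY_LDM : M8_COPY_FP_ALIGN;
}
```

The cutoff is inclusive, matching "4096 or less" above: a copy of
exactly 4096 bytes takes the ldm path and 4097 bytes takes the FP path.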
Also include the fix for the crypto key corruption noticed while running
AES crypto tests. This is the same problem that was reported for
NG4memcpy and fixed by commit f4da3628dc7c ("sparc64: Fix FPU register
corruption with AES crypto offload."). Port those changes to M8memcpy;
the fix has been verified.
TODO: The ldmx and ldmxa instructions are hand-encoded for now because
our build servers do not yet have the latest M8 instruction set. Convert
them back to assembly mnemonics once these instructions are available.
Orabug: 25381567
Signed-off-by: Babu Moger <babu.moger@oracle.com>
Signed-off-by: Allen Pais <allen.pais@oracle.com>