Igor Mammedov [Thu, 5 Oct 2017 13:51:09 +0000 (15:51 +0200)]
mips: use object_new() instead of gnew()+object_initialize()
object_initialize() is intended for inplace initialization of
objects, but here it's first allocated with g_new0() and then
initialized with object_initialize(). QEMU already has API
to do this (object_new), so do object creation with suitable
for usecase API.
Signed-off-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <1507211474-188400-36-git-send-email-imammedo@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Igor Mammedov [Thu, 5 Oct 2017 13:51:03 +0000 (15:51 +0200)]
tricore: cleanup cpu type name composition
introduce TRICORE_CPU_TYPE_NAME macro and use it to construct
cpu type names. While at it move cpu type_infos into one
array and register it directly with type_init_from_array()
instead of custom tricore_cpu_register_types()/cpu_register()
Signed-off-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <1507211474-188400-30-git-send-email-imammedo@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Igor Mammedov [Thu, 5 Oct 2017 13:51:01 +0000 (15:51 +0200)]
unicore32: cleanup cpu type name composition
use new UNICORE32_CPU_TYPE_NAME to compose CPU type
name and get rid of intermediate
UniCore32CPUInfo/uc32_cpu_register_types()
which is replaced by static TypeInfo array and
type_init_from_array()
Signed-off-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <1507211474-188400-28-git-send-email-imammedo@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Igor Mammedov [Thu, 5 Oct 2017 13:50:57 +0000 (15:50 +0200)]
sh4: remove SuperHCPUClass::name field
the field contains upper-cased cpu model name and is used
for printing supported cpu model names for '-cpu help'.
Considering that cpu model lookup in superh_cpu_class_by_name()
is case-insensitive, we can drop upper-casing when
printing supported cpus list and use cpu type directly
to do the same by cutting out SUPERH_CPU_TYPE_SUFFIX from
typename.
It allows to remove SuperHCPUClass::name, which practically
duplicates names defined by TYPE_SH*_CPU definitions and
simplify sh*_class_init()/SuperHCPUClass a bit.
Signed-off-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <1507211474-188400-24-git-send-email-imammedo@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Igor Mammedov [Thu, 5 Oct 2017 13:50:56 +0000 (15:50 +0200)]
sh4: simplify superh_cpu_class_by_name()
currently for sh4 cpu_model argument for '-cpu' option
could be either 'cpu model' name or cpu_typename.
however typically '-cpu' takes 'cpu model' name and
cpu type for sh4 target isn't advertised publicly
('-cpu help' prints only 'cpu model' names) so we
shouldn't care about this use case (it's more of a bug).
1. Drop '-cpu cpu_typename' to align with the rest of
targets.
2. Compose searched for typename from cpu model and use
it with object_class_by_name() directly instead of
over-complicated
object_class_get_list()
g_slist_find_custom() + superh_cpu_name_compare()
With #1 droped, #2 could be used for both lookups which
simplifies superh_cpu_class_by_name() quite a bit.
Signed-off-by: Igor Mammedov <imammedo@redhat.com> Acked-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <1507211474-188400-23-git-send-email-imammedo@redhat.com>
[ehabkost: Include fixup sent by Igor] Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Igor Mammedov [Thu, 5 Oct 2017 13:50:55 +0000 (15:50 +0200)]
sh4: cleanup cpu type name composition
introduce SUPERH_CPU_TYPE_NAME macro and use it to construct
cpu type names. While at it move cpu type_infos into one
array and register it directly with type_init_from_array()
instead of custom superh_cpu_register_types()
Signed-off-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <1507211474-188400-22-git-send-email-imammedo@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Igor Mammedov [Thu, 5 Oct 2017 13:50:51 +0000 (15:50 +0200)]
openrisc: cleanup cpu type name composition
use new OPENRISC_CPU_TYPE_NAME to compose CPU type name and get
rid of intermediate OpenRISCCPUInfo/openrisc_cpu_register_types()
which is replaced by static TypeInfo array.
Signed-off-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <1507211474-188400-18-git-send-email-imammedo@redhat.com> Acked-by: Stafford Horne <shorne@gmail.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Igor Mammedov [Thu, 5 Oct 2017 13:50:49 +0000 (15:50 +0200)]
moxie: cleanup cpu type name composition
introduce MOXIE_CPU_TYPE_NAME macro and consistently use it
to construct cpu type names. While at it replace dynamic
cpu type name composition with static data.
Signed-off-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <1507211474-188400-16-git-send-email-imammedo@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Igor Mammedov [Thu, 5 Oct 2017 13:50:48 +0000 (15:50 +0200)]
moxie: fix qemu-system-moxie failing to start with CLI "-cpu MoxieLite"
It 'works' with default CPU only because of bug in
moxie_cpu_class_by_name() where it treats cpu_model
as type name and default cpu_model also happens to be
type name. But specifying explicitly cpu on CLI,
ex: '-cpu MoxieLite', makes QEMU fail since
moxie_cpu_class_by_name() doesn't traslate cpu_model
to cpu type and fails to find corresponding object class.
Fix moxie_cpu_class_by_name() to do proper
cpu_model -> cpu type
translation and fix default cpu_model to be cpu_model
instead of being typename.
Signed-off-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <1507211474-188400-15-git-send-email-imammedo@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Igor Mammedov [Thu, 5 Oct 2017 13:50:45 +0000 (15:50 +0200)]
m68k: cleanup cpu type name composition
use new M68K_CPU_TYPE_NAME to compose CPU type names
and get rid of intermediate M68kCPUInfo/register_cpu_type()
which is replaced by static TypeInfo array.
Signed-off-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Acked-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <1507211474-188400-12-git-send-email-imammedo@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Igor Mammedov [Thu, 5 Oct 2017 13:50:42 +0000 (15:50 +0200)]
lm32: cleanup cpu type name composition
introduce LM32_CPU_TYPE_NAME macro and consistently use it
to construct cpu type names. While at it replace dynamic
cpu type name composition with static data.
Signed-off-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Acked-by: Michael Walle <michael@walle.cc>
Message-Id: <1507211474-188400-9-git-send-email-imammedo@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Igor Mammedov [Thu, 5 Oct 2017 13:50:38 +0000 (15:50 +0200)]
alpha: cleanup cpu type name composition
Introduce ALPHA_CPU_TYPE_NAME macro to replace rather ununique
TYPE macro that alpha uses. With new macro it will follow
the same naming convention as other targets.
While at it put scattered TypeInfo into one array which places
type desriptions at one place and reduces code a bit.
Signed-off-by: Igor Mammedov <imammedo@redhat.com> Acked-by: Richard Henderson <rth@twiddle.net> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <1507211474-188400-5-git-send-email-imammedo@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Peter Maydell [Thu, 26 Oct 2017 08:20:11 +0000 (09:20 +0100)]
Merge remote-tracking branch 'remotes/stefanberger/tags/pull-tpm-2017-10-24-1' into staging
Merge tpm 2017/10/24 v1
# gpg: Signature made Wed 25 Oct 2017 06:06:55 BST
# gpg: using RSA key 0x75AD65802A0B4211
# gpg: Good signature from "Stefan Berger <stefanb@linux.vnet.ibm.com>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: B818 B9CA DF90 89C2 D5CE C66B 75AD 6580 2A0B 4211
* remotes/stefanberger/tags/pull-tpm-2017-10-24-1:
tpm: print buffers received from TPM when debugging
vl: remove unnecessary #ifdef CONFIG_TPM
tpm: remove unnecessary #ifdef CONFIG_TPM
tpm: add stubs
tpm: add missing include
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Peter Maydell [Wed, 25 Oct 2017 15:38:57 +0000 (16:38 +0100)]
Merge remote-tracking branch 'remotes/rth/tags/pull-tcg-20171025' into staging
TCG patch queue
# gpg: Signature made Wed 25 Oct 2017 10:30:18 BST
# gpg: using RSA key 0x64DF38E8AF7E215F
# gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>"
# Primary key fingerprint: 7A48 1E78 868B 4DB6 A85A 05C0 64DF 38E8 AF7E 215F
* remotes/rth/tags/pull-tcg-20171025: (51 commits)
translate-all: exit from tb_phys_invalidate if qht_remove fails
tcg: Initialize cpu_env generically
tcg: enable multiple TCG contexts in softmmu
tcg: introduce regions to split code_gen_buffer
translate-all: use qemu_protect_rwx/none helpers
osdep: introduce qemu_mprotect_rwx/none
tcg: allocate optimizer temps with tcg_malloc
tcg: distribute profiling counters across TCGContext's
tcg: introduce **tcg_ctxs to keep track of all TCGContext's
gen-icount: fold exitreq_label into TCGContext
tcg: define tcg_init_ctx and make tcg_ctx a pointer
tcg: take tb_ctx out of TCGContext
translate-all: report correct avg host TB size
exec-all: rename tb_free to tb_remove
translate-all: use a binary search tree to track TBs in TBContext
tcg: Remove CF_IGNORE_ICOUNT
tcg: Add CF_LAST_IO + CF_USE_ICOUNT to CF_HASH_MASK
cpu-exec: lookup/generate TB outside exclusive region during step_atomic
tcg: check CF_PARALLEL instead of parallel_cpus
target/sparc: check CF_PARALLEL instead of parallel_cpus
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
# gpg: Signature made Mon 23 Oct 2017 17:05:14 BST
# gpg: using RSA key 0xF487EF185872D723
# gpg: Good signature from "Juan Quintela <quintela@redhat.com>"
# gpg: aka "Juan Quintela <quintela@trasno.org>"
# Primary key fingerprint: 1899 FF8E DEBF 58CC EE03 4B82 F487 EF18 5872 D723
* remotes/juanquintela/tags/migration/20171023: (21 commits)
migration: Improve migration thread error handling
qapi: Fix grammar in x-multifd-page-count descriptions
migration: add bitmap for received page
migration: introduce qemu_ufd_copy_ioctl helper
migration: postcopy_place_page factoring out
migration: new ram_init_bitmaps()
migration: clean up xbzrle cache init/destroy
migration: provide ram_state_cleanup
migration: provide ram_state_init()
migration: pause-before-switchover for postcopy
migration: allow cancel to unpause
migrate: HMP migate_continue
migration: migrate-continue
migration: Wait for semaphore before completing migration
migration: Add 'pre-switchover' and 'device' statuses
migration: Add 'pause-before-switchover' capability
migration: Make cache_init() take an error parameter
migration: Move xbzrle cache resize error handling to xbzrle_cache_resize
migration: Make cache size elements use the right types
migratiom: Remove max_item_age parameter
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Philippe Mathieu-Daudé [Tue, 24 Oct 2017 12:20:45 +0000 (09:20 -0300)]
vl: remove unnecessary #ifdef CONFIG_TPM
a stub is now provided.
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Tested-by: Richard W.M. Jones <rjones@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Philippe Mathieu-Daudé [Tue, 24 Oct 2017 12:20:44 +0000 (09:20 -0300)]
tpm: remove unnecessary #ifdef CONFIG_TPM
Makefile.objs now checks for $(CONFIG_TPM).
Suggested-by: Stefan Berger <stefanb@linux.vnet.ibm.com> Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Stefan Berger <stefanb@linux.vnet.ibm.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Philippe Mathieu-Daudé [Tue, 24 Oct 2017 12:20:43 +0000 (09:20 -0300)]
tpm: add stubs
Commit c37cacabf22 moved tpm_cleanup() in the main loop exit, however this
function is not available when compiling with --disable-tpm.
Provides necessary stubs to keep code clean of #ifdef'fery.
Reported-by: BALATON Zoltan <balaton@eik.bme.hu>
Message-Id: <20171023102903.256AF7456A0@zero.eik.bme.hu> Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Tested-by: Richard W.M. Jones <rjones@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Emilio G. Cota [Thu, 19 Oct 2017 20:31:54 +0000 (16:31 -0400)]
translate-all: exit from tb_phys_invalidate if qht_remove fails
Two or more threads might race while invalidating the same TB. We currently
do not check for this at all despite taking tb_lock, which means we would
wrongly invalidate the same TB more than once. This bug has actually been
hit by users: I recently saw a report on IRC, although I have yet to see
the corresponding test case.
Fix this by using qht_remove as the synchronization point; if it fails,
that means the TB has already been invalidated, and therefore there
is nothing left to do in tb_phys_invalidate.
Note that this solution works now that we still have tb_lock, and will
continue working once we remove tb_lock.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1508445114-4717-1-git-send-email-cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Richard Henderson [Tue, 10 Oct 2017 21:34:37 +0000 (14:34 -0700)]
tcg: Initialize cpu_env generically
This is identical for each target. So, move the initialization to
common code. Move the variable itself out of tcg_ctx and name it
cpu_env to minimize changes within targets.
This also means we can remove tcg_global_reg_new_{ptr,i32,i64},
since there are no longer global-register temps created by targets.
Reviewed-by: Emilio G. Cota <cota@braap.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This enables parallel TCG code generation. However, we do not take
advantage of it yet since tb_lock is still held during tb_gen_code.
In user-mode we use a single TCG context; see the documentation
added to tcg_region_init for the rationale.
Note that targets do not need any conversion: targets initialize a
TCGContext (e.g. defining TCG globals), and after this initialization
has finished, the context is cloned by the vCPU threads, each of
them keeping a separate copy.
TCG threads claim one entry in tcg_ctxs[] by atomically increasing
n_tcg_ctxs. Do not be too annoyed by the subsequent atomic_read's
of that variable and tcg_ctxs; they are there just to play nice with
analysis tools such as thread sanitizer.
Note that we do not allocate an array of contexts (we allocate
an array of pointers instead) because when tcg_context_init
is called, we do not know yet how many contexts we'll use since
the bool behind qemu_tcg_mttcg_enabled() isn't set yet.
Previous patches folded some TCG globals into TCGContext. The non-const
globals remaining are only set at init time, i.e. before the TCG
threads are spawned. Here is a list of these set-at-init-time globals
under tcg/:
Only written by tcg_context_init:
- indirect_reg_alloc_order
- tcg_op_defs
Only written by tcg_target_init (called from tcg_context_init):
- tcg_target_available_regs
- tcg_target_call_clobber_regs
- arm: arm_arch, use_idiv_instructions
- i386: have_cmov, have_bmi1, have_bmi2, have_lzcnt,
have_movbe, have_popcnt
- mips: use_movnz_instructions, use_mips32_instructions,
use_mips32r2_instructions, got_sigill (tcg_target_detect_isa)
- ppc: have_isa_2_06, have_isa_3_00, tb_ret_addr
- s390: tb_ret_addr, s390_facilities
- sparc: qemu_ld_trampoline, qemu_st_trampoline (build_trampolines),
use_vis3_instructions
Only written by tcg_prologue_init:
- 'struct jit_code_entry one_entry'
- aarch64: tb_ret_addr
- arm: tb_ret_addr
- i386: tb_ret_addr, guest_base_flags
- ia64: tb_ret_addr
- mips: tb_ret_addr, bswap32_addr, bswap32u_addr, bswap64_addr
Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This is groundwork for supporting multiple TCG contexts.
The naive solution here is to split code_gen_buffer statically
among the TCG threads; this however results in poor utilization
if translation needs are different across TCG threads.
What we do here is to add an extra layer of indirection, assigning
regions that act just like pages do in virtual memory allocation.
(BTW if you are wondering about the chosen naming, I did not want
to use blocks or pages because those are already heavily used in QEMU).
We use a global lock to serialize allocations as well as statistics
reporting (we now export the size of the used code_gen_buffer with
tcg_code_size()). Note that for the allocator we could just use
a counter and atomic_inc; however, that would complicate the gathering
of tcg_code_size()-like stats. So given that the region operations are
not a fast path, a lock seems the most reasonable choice.
The effectiveness of this approach is clear after seeing some numbers.
I used the bootup+shutdown of debian-arm with '-tb-size 80' as a benchmark.
Note that I'm evaluating this after enabling per-thread TCG (which
is done by a subsequent commit).
Again, 8 flushes. Note how buffer utilization is not 100%, but it
is close. Smaller region sizes would yield higher utilization,
but we want region allocation to be rare (it acquires a lock), so
we do not want to go too small.
That is, 20 flushes. Note how a static partitioning approach uses
the code buffer poorly, leading to many unnecessary flushes.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The helpers require the address and size to be page-aligned, so
do that before calling them.
Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
exec time (s) Relative slowdown wrt original (%)
---------------------------------------------------------------
original 20.213321616 0.
tcg_malloc 20.441130078 1.1270214
TCGContext 20.477846517 1.3086662
g_malloc 20.780527895 2.8061013
The other two alternatives shown in the table are:
- TCGContext: embed temps[TCG_MAX_TEMPS] and TCGTempSet used_temps
in TCGContext. This is simple enough but it isn't faster than using
tcg_malloc; moreover, it wastes memory.
- g_malloc: allocate/deallocate both temps and used_temps every time
tcg_optimize is executed.
Suggested-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
tcg: distribute profiling counters across TCGContext's
This is groundwork for supporting multiple TCG contexts.
To avoid scalability issues when profiling info is enabled, this patch
makes the profiling info counters distributed via the following changes:
1) Consolidate profile info into its own struct, TCGProfile, which
TCGContext also includes. Note that tcg_table_op_count is brought
into TCGProfile after dropping the tcg_ prefix.
2) Iterate over the TCG contexts in the system to obtain the total counts.
This change also requires updating the accessors to TCGProfile fields to
use atomic_read/set whenever there may be conflicting accesses (as defined
in C11) to them.
Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
tcg: introduce **tcg_ctxs to keep track of all TCGContext's
Groundwork for supporting multiple TCG contexts.
Note that having n_tcg_ctxs is unnecessary. However, it is
convenient to have it, since it will simplify iterating over the
array: we'll have just a for loop instead of having to iterate
over a NULL-terminated array (which would require n+1 elems)
or having to check with ifdef's for usermode/softmmu.
Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Note that for now we set *tcg_ctx to whatever TCGContext is passed
to tcg_context_init -- in this case &tcg_init_ctx.
Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Emilio G. Cota [Sat, 24 Jun 2017 00:04:43 +0000 (20:04 -0400)]
tcg: take tb_ctx out of TCGContext
Groundwork for supporting multiple TCG contexts.
Reviewed-by: Richard Henderson <rth@twiddle.net> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Emilio G. Cota [Sat, 24 Jun 2017 00:57:44 +0000 (20:57 -0400)]
translate-all: report correct avg host TB size
Since commit 6e3b2bfd6 ("tcg: allocate TB structs before the
corresponding translated code") we are not fully utilizing
code_gen_buffer for translated code, and therefore are
incorrectly reporting the amount of translated code as well as
the average host TB size. Address this by:
- Making the conscious choice of misreporting the total translated code;
doing otherwise would mislead users into thinking "-tb-size" is not
honoured.
- Expanding tb_tree_stats to accurately count the bytes of translated code on
the host, and using this for reporting the average tb host size,
as well as the expansion ratio.
In the future we might want to consider reporting the accurate numbers for
the total translated code, together with a "bookkeeping/overhead" field to
account for the TB structs.
Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We don't really free anything in this function anymore; we just remove
the TB from the binary search tree.
Suggested-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Emilio G. Cota [Fri, 23 Jun 2017 23:00:11 +0000 (19:00 -0400)]
translate-all: use a binary search tree to track TBs in TBContext
This is a prerequisite for supporting multiple TCG contexts, since
we will have threads generating code in separate regions of
code_gen_buffer.
For this we need a new field (.size) in struct tb_tc to keep
track of the size of the translated code. This field uses a size_t
to avoid adding a hole to the struct, although really an unsigned
int would have been enough.
The comparison function we use is optimized for the common case:
insertions. Profiling shows that upon booting debian-arm, 98%
of comparisons are between existing tb's (i.e. a->size and b->size
are both !0), which happens during insertions (and removals, but
those are rare). The remaining cases are lookups. From reading the glib
sources we see that the first key is always the lookup key. However,
the code does not assume this to always be the case because this
behaviour is not guaranteed in the glib docs. However, we embed
this knowledge in the code as a branch hint for the compiler.
Note that tb_free does not free space in the code_gen_buffer anymore,
since we cannot easily know whether the tb is the last one inserted
in code_gen_buffer. The next patch in this series renames tb_free
to tb_remove to reflect this.
Performance-wise, lookups in tb_find_pc are the same as before:
O(log n). However, insertions are O(log n) instead of O(1), which
results in a small slowdown when booting debian-arm:
Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
cpu-exec: lookup/generate TB outside exclusive region during step_atomic
Now that all code generation has been converted to check CF_PARALLEL, we can
generate !CF_PARALLEL code without having yet set !parallel_cpus --
and therefore without having to be in the exclusive region during
cpu_exec_step_atomic.
While at it, merge cpu_exec_step into cpu_exec_step_atomic.
Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Thereby decoupling the resulting translated code from the current state
of the system.
The tb->cflags field is not passed to tcg generation functions. So
we add a field to TCGContext, storing there a copy of tb->cflags.
Most architectures have <= 32 registers, which results in a 4-byte hole
in TCGContext. Use this hole for the new field.
Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
target/sparc: check CF_PARALLEL instead of parallel_cpus
Thereby decoupling the resulting translated code from the current state
of the system.
Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
target/sh4: check CF_PARALLEL instead of parallel_cpus
Thereby decoupling the resulting translated code from the current state
of the system.
Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
target/s390x: check CF_PARALLEL instead of parallel_cpus
Thereby decoupling the resulting translated code from the current state
of the system.
Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
target/m68k: check CF_PARALLEL instead of parallel_cpus
Thereby decoupling the resulting translated code from the current state
of the system.
Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
target/i386: check CF_PARALLEL instead of parallel_cpus
Thereby decoupling the resulting translated code from the current state
of the system.
Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
target/hppa: check CF_PARALLEL instead of parallel_cpus
Thereby decoupling the resulting translated code from the current state
of the system.
Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
target/arm: check CF_PARALLEL instead of parallel_cpus
Thereby decoupling the resulting translated code from the current state
of the system.
Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Then manually fixed the few errors that checkpatch reported.
Compile-tested for all targets.
Suggested-by: Richard Henderson <rth@twiddle.net> Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Richard Henderson [Fri, 13 Oct 2017 17:50:02 +0000 (10:50 -0700)]
tcg: Add CPUState cflags_next_tb
We were generating code during tb_invalidate_phys_page_range,
check_watchpoint, cpu_io_recompile, and (seemingly) discarding
the TB, assuming that it would magically be picked up during
the next iteration through the cpu_exec loop.
Instead, record the desired cflags in CPUState so that we request
the proper TB so that there is no more magic.
Reviewed-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
tcg: define CF_PARALLEL and use it for TB hashing along with CF_COUNT_MASK
This will enable us to decouple code translation from the value
of parallel_cpus at any given time. It will also help us minimize
TB flushes when generating code via EXCP_ATOMIC.
Note that the declaration of parallel_cpus is brought to exec-all.h
to be able to define there the "curr_cflags" inline.
Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Richard Henderson [Fri, 20 Oct 2017 19:08:19 +0000 (12:08 -0700)]
tcg: Use offsets not indices for TCGv_*
Using the offset of a temporary, relative to TCGContext, rather than
its index means that we don't use 0. That leaves offset 0 free for
a NULL representation without having to leave index 0 unused.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Richard Henderson [Mon, 16 Oct 2017 02:02:42 +0000 (19:02 -0700)]
qom: Introduce CPUClass.tcg_initialize
Move target cpu tcg initialization to common code,
called from cpu_exec_realizefn.
Acked-by: Andreas Färber <afaerber@suse.de> Reviewed-by: Emilio G. Cota <cota@braap.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Richard Henderson [Fri, 20 Oct 2017 03:27:27 +0000 (20:27 -0700)]
tcg: Remove TCGV_EQUAL*
When we used structures for TCGv_*, we needed a macro in order to
perform a comparison. Now that we use pointers, this is just clutter.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Richard Henderson [Fri, 20 Oct 2017 07:30:24 +0000 (00:30 -0700)]
tcg: Remove GET_TCGV_* and MAKE_TCGV_*
The GET and MAKE functions weren't really specific enough.
We now have a full complement of functions that convert exactly
between temporaries, arguments, tcgv pointers, and indices.
The target/sparc change is also a bug fix, which would have affected
a host that defines TCG_TARGET_HAS_extr[lh]_i64_i32, i.e. MIPS64.
Reviewed-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Richard Henderson [Fri, 20 Oct 2017 07:05:45 +0000 (00:05 -0700)]
tcg: Introduce temp_tcgv_{i32,i64,ptr}
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Richard Henderson [Sun, 15 Oct 2017 20:27:56 +0000 (13:27 -0700)]
tcg: Introduce tcgv_{i32,i64,ptr}_{arg,temp}
Transform TCGv_* to an "argument" or a temporary.
For now, an argument is simply the temporary index.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Richard Henderson [Sun, 15 Oct 2017 19:19:01 +0000 (12:19 -0700)]
tcg: Push tcg_ctx into tcg_gen_callN
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Richard Henderson [Sun, 15 Oct 2017 18:50:16 +0000 (11:50 -0700)]
tcg: Push tcg_ctx into generator functions
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Richard Henderson [Wed, 2 Nov 2016 17:21:44 +0000 (11:21 -0600)]
tcg: Avoid loops against variable bounds
Copy s->nb_globals or s->nb_temps to a local variable for the purposes
of iteration. This should allow the compiler to use low-overhead
looping constructs on some hosts.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Emilio G. Cota <cota@braap.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>
Richard Henderson [Thu, 8 Dec 2016 18:52:57 +0000 (10:52 -0800)]
tcg: Merge opcode arguments into TCGOp
Rather than have a separate buffer of 10*max_ops entries,
give each opcode 10 entries. The result is actually a bit
smaller and should have slightly more cache locality.
Reviewed-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <rth@twiddle.net>
Philippe Mathieu-Daudé [Tue, 24 Oct 2017 12:20:42 +0000 (09:20 -0300)]
tpm: add missing include
else file including "sysemu/tpm.h" fails to compile:
In file included from qemu/stubs/tpm.c:2:0:
qemu/include/sysemu/tpm.h:36:19: error: implicit declaration of function ‘object_resolve_path_type’ [-Werror=implicit-function-declaration]
Object *obj = object_resolve_path_type("", TYPE_TPM_TIS, NULL);
^~~~~~~~~~~~~~~~~~~~~~~~
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Tested-by: Richard W.M. Jones <rjones@redhat.com> Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
* remotes/kraxel/tags/input-20171023-pull-request:
ui: pull in latest keycodemapdb
ui: normalize the 'sysrq' key into the 'print' key
ps2: fix scancodes sent for Ctrl+Pause key combination
ps2: fix scancodess sent for Pause key in AT set 1
ps2: fix scancodes sent for Shift/Ctrl+Print key combination
ps2: fix scancodes sent for Alt-Print key combination (aka SysRq)
ui: use correct union field for key number
ui: fix crash with sendkey and raw key numbers
input: use hex in ps2 keycode trace events
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
* remotes/kraxel/tags/fixes-20171023-pull-request:
scripts: don't throw away stderr when checking out git submodules
ui: add qemu-keymap and shader to .gitignore
configure: disable qemu-keymap for linux-user qemu
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>