Solaris layers its own protocol on top of domain service registration. This
matters because the control domain that Linux talks to runs Solaris. The
hypervisor specs say that the handle used for service identification is simply
an opaque 64-bit number. The only constraint is that a handle never be used
twice (within a reasonable time frame), to prevent connection to a stale,
previously registered handle. Solaris, on the other hand, reserves bit
0x80000000 to indicate what it calls client registration requests. These
requests are sent to the guest domain to prod it into sending its own
registration requests to the control domain.

When a guest (Linux in this case) sends its own registration requests with this
bit set, Solaris assumes that they come from clients running in the guest,
which should not be sending them since there can only be one control domain.
Linux, not knowing this, uses the top 32 bits as a quick lookup index and sets
the bottom 32 bits based off jiffies. As a result, sometimes a handle is
constructed with the Solaris client bit clear and everything appears to work
with no errors or warnings, and sometimes the client bit is set and everything
still works except that the Solaris kernel logs a bunch of warnings to its
dmesg buffer.
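
The handle layout described above can be sketched as follows. This is only an
illustration, not the driver code: the helper names, the macro name, and the
exact way the two halves are combined are assumptions; only the 0x80000000 bit
and the index/clock split come from the description.

```c
#include <stdint.h>

/* Bit Solaris reserves for client registration requests (per the text above). */
#define SOLARIS_CLIENT_REG_BIT 0x80000000ULL

/*
 * Hypothetical helper: upper 32 bits hold the lookup index, lower 32 bits
 * hold the low bits of a clock value (the pre-fix behaviour).
 */
static uint64_t make_handle_32(uint32_t index, uint64_t clock)
{
	return ((uint64_t)index << 32) | (clock & 0xffffffffULL);
}

/* Nonzero when Solaris would take the handle for a client registration. */
static int looks_like_client_request(uint64_t handle)
{
	return (handle & SOLARIS_CLIENT_REG_BIT) != 0;
}
```

Whether the warnings appear depends only on bit 31 of the clock value at the
moment the handle is built, which is why the behaviour looks intermittent.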

The fix is literally 1 character: change the mask used to grab the bottom 32
bits of sched_clock() (jiffy based) so that it keeps only the bottom 31 bits.
Halving the roll-over time should not be an issue. Worst case, additional jiffy
bits can be shifted into the upper 32 bits of the handle.
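
A minimal sketch of the masking change, assuming the old mask was written as
0xffffffff and the new one as 0x7fffffff (which would make it a one-character
edit). The function and macro names are hypothetical and the clock value is
passed in rather than read from sched_clock():

```c
#include <stdint.h>

/* Assumed new mask: keep 31 bits so bit 0x80000000 is never set. */
#define HANDLE_CLOCK_MASK 0x7fffffffULL

/* Hypothetical helper: index in the upper 32 bits, masked clock below. */
static uint64_t make_handle_31(uint32_t index, uint64_t clock)
{
	return ((uint64_t)index << 32) | (clock & HANDLE_CLOCK_MASK);
}
```

With this mask the Solaris client bit stays clear for any clock value, at the
cost of the low field rolling over twice as often.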

Addresses: BZ 15161
Orabug: 18038829
Signed-off-by: Chris Hyser <chris.hyser@oracle.com>
Acked-by: Karl Volz <karl.volz@oracle.com>
Signed-off-by: Allen Pais <allen.pais@oracle.com>
(cherry picked from commit 01b84806a126706ed5b725ae716608019eda24c8)
(cherry picked from commit 29965550ad60982c510435a7afbba338446986c9)