Kris Van Hees [Mon, 23 May 2016 17:38:49 +0000 (10:38 -0700)]
dtrace: ensure pdata and sdt_tab handling works on module reload
The handling of the sdt_tab member in pdata caused crashes when the sdt
module was unloaded and then reloaded. This member holds the trampolines
for SDT probe points in kernel modules, and is allocated when a module is
loaded (and resides in modules-only address space). This memory block was
incorrectly freed from the sdt module code when sdt was unloaded.
This commit also adds verification that the anticipated trampoline max
size as used in the kernel is sufficient for what the sdt module needs.
This commit also adds verification (at runtime, with an assert) that the
reserved allocation to hold the pdata for a module is of a sufficient
size.
Orabug: 23331667 Signed-off-by: Kris Van Hees <kris.van.hees@oracle.com> Acked-by: Nick Alcock <nick.alcock@oracle.com>
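A minimal sketch of the runtime size check described above, not the module's actual code. The struct mirrors the members documented for dtrace_module_t elsewhere in this log; the name dtrace_module_sketch_t, the constant PDATA_RESERVED_SIZE and its value, and both helper functions are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the arch-specific per-module DTrace data. */
typedef struct dtrace_module_sketch {
    size_t sdt_probe_cnt;
    int    sdt_enabled;
    size_t fbt_probe_cnt;
    void  *sdt_tab;   /* trampoline block; allocated at module load, not by sdt */
} dtrace_module_sketch_t;

/* Assumed size of the area reserved for pdata (illustrative value only). */
#define PDATA_RESERVED_SIZE 64

/* Returns 1 if an area of `reserved` bytes can hold the structure, else 0. */
static int pdata_fits(size_t reserved)
{
    return sizeof(dtrace_module_sketch_t) <= reserved;
}

/* Runtime verification, as one might run it once at initialisation. */
static void pdata_check(void)
{
    assert(pdata_fits(PDATA_RESERVED_SIZE));
}
```

Because sdt_tab is allocated when the module is loaded rather than by the sdt module, an unload path built on this sketch would leave that pointer alone.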
Nick Alcock [Fri, 29 Jan 2016 14:53:46 +0000 (14:53 +0000)]
dtrace: use copy_from_user() when walking userspace stacks
We were using get_user(), but that doesn't reliably work on all
platforms (such as SPARC64) and cannot trap faults, which meant we were
jumping through extra hoops to trap faults when copy_from_user() does
that anyway.
The extra copy is notably less efficient (since we end up looping over,
and copying, essentially the entire user stack in one-word increments),
but has the advantage of actually working.
Orabug: 22629102 Signed-off-by: Nick Alcock <nick.alcock@oracle.com> Acked-by: Kris Van Hees <kris.van.hees@oracle.com>
Nick Alcock [Fri, 29 Jan 2016 14:47:03 +0000 (14:47 +0000)]
dtrace: do not overrun the start of the user stack
When scanning user stacks in dtrace_getufpstack(), we iterate from the
current stack pointer back to the start of the stack, getting the
unsigned long at each location and seeing if we can interpret it as a
pointer.
However, since the stack grows down on all platforms supported by
DTrace, the 'start' of the stack is the end of the VMA -- so we should
stop one unsigned long before the beginning, or we'll try to read off
the end (harmlessly, but still.)
Orabug: 22629102 Signed-off-by: Nick Alcock <nick.alcock@oracle.com> Acked-by: Kris Van Hees <kris.van.hees@oracle.com>
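The bound described above can be sketched as follows. This is an illustration only: scan_words() and its parameters are hypothetical names, and the real dtrace_getufpstack() does considerably more than count words.

```c
#include <stddef.h>

/*
 * Count the words visited when scanning upwards from sp. The last word
 * read starts at vma_end - sizeof(unsigned long), so no read ever
 * crosses the end of the VMA.
 */
static size_t scan_words(unsigned long sp, unsigned long vma_end)
{
    size_t n = 0;

    /* Continue only while a whole unsigned long still fits below vma_end. */
    while (sp < vma_end && vma_end - sp >= sizeof(unsigned long)) {
        n++;
        sp += sizeof(unsigned long);
    }
    return n;
}
```

Writing the condition as a subtraction from vma_end avoids overflow when sp is close to the top of the address space.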
Nick Alcock [Tue, 26 Jan 2016 17:53:50 +0000 (17:53 +0000)]
dtrace: fix access to uregs[R_L7]
An off-by-one bug causes this access to happen relative to REG_I0 rather
than REG_L0, leading to an invalid memory access (trapped by DTrace, so
no undefined behaviour is incurred, only a spurious ERROR firing).
Orabug: 22602870 Signed-off-by: Nick Alcock <nick.alcock@oracle.com> Acked-by: Kris Van Hees <kris.van.hees@oracle.com>
Kris Van Hees [Mon, 18 Jan 2016 10:28:30 +0000 (05:28 -0500)]
dtrace: correct probe disable behaviour for syscalls
Previously, when both entry and return probes were enabled for a
syscall, upon disabling one of them, the function pointer in the
syscall table would already be reset to the default, removing the
interceptor. This resulted in an inconsistent state when the
2nd probe would get removed, and could cause a nasty race if one
were to try to enable one of the probes in between.
We now only remove the interceptor when we know the last probe
is being disabled.
Orabug: 22352636 Signed-off-by: Kris Van Hees <kris.van.hees@oracle.com> Acked-by: Nick Alcock <nick.alcock@oracle.com>
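A sketch of the corrected behaviour, assuming a per-syscall record with one flag per probe. The struct syscall_slot layout and the slot_enable()/slot_disable() helpers are hypothetical and only model the rule that the interceptor is removed when the last probe goes away.

```c
#include <stddef.h>

typedef long (*sys_fn_t)(void);

/* Hypothetical per-syscall bookkeeping; the real module's layout may differ. */
struct syscall_slot {
    sys_fn_t *table_entry;   /* the slot in the syscall table */
    sys_fn_t  stock;         /* default handler */
    sys_fn_t  interceptor;   /* DTrace wrapper */
    int       entry_enabled;
    int       return_enabled;
};

/* Install the interceptor when the first probe for this syscall is enabled. */
static void slot_enable(struct syscall_slot *s, int is_return)
{
    if (!s->entry_enabled && !s->return_enabled)
        *s->table_entry = s->interceptor;
    if (is_return)
        s->return_enabled = 1;
    else
        s->entry_enabled = 1;
}

/* Restore the default handler only once both probes are disabled. */
static void slot_disable(struct syscall_slot *s, int is_return)
{
    if (is_return)
        s->return_enabled = 0;
    else
        s->entry_enabled = 0;
    if (!s->entry_enabled && !s->return_enabled)
        *s->table_entry = s->stock;
}
```

This sketch ignores locking; real code would have to serialize these updates against concurrent enable and disable requests.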
The dtrace modules packages have long depended on their associated kernel, in
order to cause them to be removed when the associated kernel is removed. It
turns out that the installonly feature (which forces installation of kernel
packages and removal of old ones rather than upgrades on 'yum upgrade') fails to
cope with this situation: you get a broken-packages notice rather than a removal
of the dependent package.
So remove the dependency, and instead install an at job from a %postun trigger
that removes old modules a little later. If this hits the rpm lock, it will fail
and leave modules around; if the kernel is later reinstalled, it will remove a
module that has a corresponding kernel still installed. However, both of these
cases are harmless: the first case is expected to be extremely rare (you'd have
to be, by chance, doing an rpm run precisely four hours after the upgrade that
removed the old kernel) and has no negative consequences but the loss of a bit
of disk space to a useless kernel module; the second case is harmless because
the next dtrace run against that kernel will reinstall the module anyway.
Orabug: 21669543 Signed-off-by: Nick Alcock <nick.alcock@oracle.com> Acked-by: Kris Van Hees <kris.van.hees@oracle.com>
Kris Van Hees [Wed, 14 Oct 2015 12:09:29 +0000 (08:09 -0400)]
dtrace: Support Linux-specific handling of envp / argv in psinfo
The implementation of retrievable envp and argv psinfo in Linux
requires those arrays to be located in kernel memory whereas in
traditional systems with DTrace implementations this was found in
userspace memory. Therefore, scripts expect to be able to access
this memory using copyin(). We look at the address passed in for
a copyin operation (or copyinstr) and if it is one of these special
cases, we simply pretend to retrieve data from userspace while in
reality we're simply retrieving the data from kernel space.
Orabug: 21984854 Signed-off-by: Kris Van Hees <kris.van.hees@oracle.com> Acked-by: Nick Alcock <nick.alcock@oracle.com>
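The address check described above might look roughly like this userspace model. Everything here is an assumption for illustration: struct psinfo_sketch, addr_in_psinfo() and copyin_sketch() are invented names, and a plain memcpy() stands in for the kernel-side copy.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical description of a kernel-resident argv/envp area. */
struct psinfo_sketch {
    const char *base;
    size_t      len;
};

/* Is the range [addr, addr + n) entirely inside the special area? */
static int addr_in_psinfo(const struct psinfo_sketch *ps,
                          const void *addr, size_t n)
{
    uintptr_t a = (uintptr_t)addr;
    uintptr_t b = (uintptr_t)ps->base;

    return a >= b && n <= ps->len && a - b <= ps->len - n;
}

/*
 * Returns 1 if the request was served from the kernel-resident copy,
 * or 0 if it would have to go down the ordinary userspace path
 * (not modelled here).
 */
static int copyin_sketch(const struct psinfo_sketch *ps,
                         void *dst, const void *src, size_t n)
{
    if (addr_in_psinfo(ps, src, n)) {
        memcpy(dst, src, n);
        return 1;
    }
    return 0;
}
```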
Nick Alcock [Tue, 6 Oct 2015 21:06:28 +0000 (22:06 +0100)]
dtrace: add missing dtrace_*canload() for copyout() and copyoutstr().
On Solaris, where unprivileged tracing is permitted and zone tracing is
implemented, this is a security hole since it allows breaking through
both zone and unprivileged-dtrace boundaries. Linux does not implement
either of these, so this fix is currently unobservable here.
Originally reported as a Solaris DTrace bug, it seems worth fixing here
too, against the day when we implement unprivileged tracing.
Signed-off-by: Nick Alcock <nick.alcock@oracle.com> Acked-by: Kris Van Hees <kris.van.hees@oracle.com>
dtrace: ensure dt_perf does not clash with dt_test
The dt_perf provider module (only used for internal testing) mistakenly
registered its device file with the same minor number as dt_test. This
made it impossible for both to be loaded at the same time.
Orabug: 21814949 Signed-off-by: Kris Van Hees <kris.van.hees@oracle.com> Acked-by: Nick Alcock <nick.alcock@oracle.com>
Kris Van Hees [Tue, 18 Aug 2015 21:47:27 +0000 (17:47 -0400)]
dtrace: provide OL6 and OL7 spec file with new features
Because of some small differences in building the DTrace modules for OL6
vs OL7, two different spec files are currently used. This commit removes
the old single spec file, and introduces the two specific ones.
This commit also adds support for module signing.
Signed-off-by: Kris Van Hees <kris.van.hees@oracle.com>
This commit introduces pdata_init() and pdata_cleanup() to allow
an architecture to perform arch-dependent operations on the pdata
information prior to assigning it to the module, and right before
getting rid of it (at module unloading time).
Signed-off-by: Kris Van Hees <kris.van.hees@oracle.com> Acked-by: Nick Alcock <nick.alcock@oracle.com>
Kris Van Hees [Tue, 9 Jun 2015 05:07:54 +0000 (01:07 -0400)]
dtrace: kernel provides SDT trampoline area on SPARC
The allocation of the SDT trampolines was done previously using vmalloc
which may cause the trampolines to be too far away from the code that
they provide a call to dtrace_probe() for, making it impossible to put
a jump to the trampoline in a single instruction at the probe location.
By using module_alloc on SPARC, the trampolines are allocated in the
memory region where modules live, which is by design within the jump
range.
The allocated memory is known to be of sufficient size for trampolines,
yet its actual use is not determined at the kernel level. It is simply
provided as a chunk of memory in the appropriate range.
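To illustrate the jump-range constraint: a single-instruction transfer encodes a signed, word-aligned displacement of limited width. The commit does not say which instruction is patched in at the probe site, so the width is a parameter here. On SPARC, the call instruction carries a 30-bit word displacement and the classic branch instructions a 22-bit one. disp_fits() is a hypothetical helper, not part of the module.

```c
#include <stdint.h>

/*
 * Can a transfer from address `from` to address `to` be encoded as a
 * signed displacement of `bits` bits, counted in 4-byte words?
 */
static int disp_fits(int64_t from, int64_t to, unsigned bits)
{
    int64_t disp = to - from;
    int64_t max = ((int64_t)1 << (bits - 1)) - 1;
    int64_t min = -((int64_t)1 << (bits - 1));
    int64_t words;

    if (disp % 4 != 0)   /* instructions are word-aligned */
        return 0;
    words = disp / 4;
    return words >= min && words <= max;
}
```

With a 22-bit displacement the reach is roughly 8 MiB either way, which shows why an arbitrary vmalloc address can end up out of range while memory from the module region stays within it.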
When requesting a userspace stack trace, the initial frame instruction
pointer should be recorded as frame 0, with the remainder of the stack
trace being filled in based on the stack content. Previously, all the
IP values were taken from the stack. Special handling is provided for
obtaining the correct value of the stack pointer because in pre-4.1
kernels, there isn't an arch-independent way to do so. Once support
for 4.0 is no longer necessary, this can be generalized by using the
current_user_stack_pointer() macro.
If the current task is not a userspace task, an empty stack trace is
returned (and ustackdepth will also report 0).
Signed-off-by: Kris Van Hees <kris.van.hees@oracle.com> Acked-by: Nick Alcock <nick.alcock@oracle.com>
Kris Van Hees [Tue, 23 Jun 2015 10:51:35 +0000 (06:51 -0400)]
dtrace: validate argument pointer to d_path()
When an invalid pointer was being passed to d_path(), the system could
crash with an OOPS. This was the result of the kernel implementation
(reasonably) expecting the pointer to be referencing a valid path
struct. We now validate the argument passed to d_path() against the
paths for files known to the current task.
Kris Van Hees [Thu, 4 Jun 2015 14:08:05 +0000 (10:08 -0400)]
dtrace: support USDT for 32-bit applications on 64-bit hosts
A 32-bit application on a 64-bit host was not able to register USDT
probes because the helper ioctl interface was not hooked up to the
compat_ioctl file operation. This has been corrected.
Nick Alcock [Fri, 8 May 2015 13:20:37 +0000 (14:20 +0100)]
dtrace: use the initial user namespace in suitable {from,make}_kuid() calls
There are several places in DTrace (mostly related to privileged or destructive
operations or unprivileged tracing) where we try to compare uids for equality,
thus need to convert them from or to kuid_ts so we can do that. We want to look
in the initial user namespace for this (since it is only in that namespace that
all uids on the system are unambiguous). We were doing this by passing a NULL
to from_kuid() / make_kuid(), but in the presence of CONFIG_USER_NS this results
in dereferencing a null pointer.
So acquire the initial user namespace from a temporary kernel-thread creds
structure, and use it in all such places.
Signed-off-by: Nick Alcock <nick.alcock@oracle.com> Acked-by: Kris Van Hees <kris.van.hees@oracle.com>
Nick Alcock [Thu, 7 May 2015 14:19:07 +0000 (15:19 +0100)]
dtrace: use the current user namespace for DIF_VAR_[UG]ID lookups
These lookups are not used for authentication, but rather are passed back
to DTrace itself: it seems reasonable that in this case the user would expect
them to be relative to the user namespace of the current process.
Signed-off-by: Nick Alcock <nick.alcock@oracle.com> Acked-by: Kris Van Hees <kris.van.hees@oracle.com>
Nick Alcock [Mon, 16 Feb 2015 15:38:52 +0000 (15:38 +0000)]
Revise dependencies to get out of the shadow of dtrace-modules-headers.
Before the big dependency revamp in 0.4.3 (in the kernel 3.8.13-22 era), the
headers shared between kernel and userland resided in a versioned package named
something like dtrace-modules-3.8.13-21-headers, which provided a
dtrace-modules-headers symbol as well for users to pull in. Unfortunately some
very old packages both had unversioned provides of the same symbol, and had
versioned provides with a numeric scheme indicating compatibility, starting at
'1'. We could use epochs to force 0.4.5-5 to be greater than 1, but nothing
will get us out of the shadow of the unversioned symbol: these are always
considered both greater and less than all other symbols, leading to wildly
counterintuitive behaviour when yum does dependency resolution on them.
So get out from under their shadow: rename the dtrace-modules-headers package to
dtrace-modules-shared-headers, obsolete the old package so that
already-installed copies are upgraded appropriately, and provide
dtrace-modules-headers 1:1 -- epoched, so that it's higher in version than any
we have ever provided. Older userspace should pick up that epoch and upgrade
accordingly, newer userspace will use the new name. Unfortunately nothing can
stop older packages from attempting to pick up ancient kernels -- the
unversioned provide is out there, and nothing can remove it. But installs of
new dtrace-utils, at least, will work, as will updates: all that may break
is explicitly putting the older packages (but not the newer) into your own
yum repo and installing from that: an unimportant use case.
As usual with package configuration changes, we have to bump the module version
number and introduce the new name only in the new version: the older stanzas
will still be used when building old security errata modules, and we don't
want to introduce the new name there. Even there, so little has changed that
we can share nearly all the RPM headers between 0.4.3 and 0.4.4: only
the Obsoletes/Provides needs to be special-cased.
Orabug: 20508087 Signed-off-by: Nick Alcock <nick.alcock@oracle.com> Acked-by: Kris Van Hees <kris.van.hees@oracle.com>
Nick Alcock [Fri, 10 Apr 2015 23:00:04 +0000 (00:00 +0100)]
dtrace: no longer expose kuid_t in the userspace dtrace API
The public header installed as <linux/dtrace/stability.h> exposed
<linux/uidgid.h> to userspace as part of the dtrace_ppriv_t.dtpp_uid member.
This member (used for unprivileged tracing) is part of a facility that is not
yet ported, but using a kuid_t for this is clearly wrong, and as of kernel 4.0
won't compile when used in userspace either.
Fix by migrating to a uid_t and converting it to a kuid at the point of use.
Signed-off-by: Nick Alcock <nick.alcock@oracle.com> Acked-by: Kris Van Hees <kris.van.hees@oracle.com>
Kris Van Hees [Tue, 10 Feb 2015 17:31:14 +0000 (12:31 -0500)]
dtrace: fix dtrace_helptrace_buffer memory leak
When the help tracing facility is enabled in DTrace, upon loading the
DTrace core module, a buffer was being allocated using vmalloc(), yet
it was never freed upon unloading of the dtrace module. This caused a
leak of (by default) 64K with every load of the dtrace module. This
commit ensures that the memory is freed.
The commit also fixes the problem that the help tracing facility
variables in DTrace were defined in two places.
Kris Van Hees [Tue, 10 Feb 2015 17:18:39 +0000 (12:18 -0500)]
dtrace: support building on UEK4
Support building DTrace modules on UEK4. Various things changed at
the kernel level between UEK3 and UEK4 that require adjustments in the
building of the DTrace modules.
- ARCH no longer reflects the difference between x86 and x86_64. So,
we now use UTS_MACHINE to drive the architecture-specific portions
of DTrace during the building process.
- The trick used to implement a direct call probe in dt_test_probe()
required updating to avoid compiler warnings/errors. It is a little
bit less "ugly" now :)
- The uid and gid used in the task structure now use kuid_t and kgid_t
as datatypes, which are no longer numeric values but rather structs.
- The API for the IDR facility in the Linux kernel changed.
- The flush_delayed_work_sync() function has been removed. Source code
has been updated to use flush_delayed_work().
- The mechanism to enforce turning preemption on and off has been
updated.
Kris Van Hees [Tue, 10 Feb 2015 17:12:27 +0000 (12:12 -0500)]
dtrace: add support for DTrace on sparc64
This commit adds support for sparc64 to the DTrace modules. It also
includes some changes to the arch-independent code, to account for
some extra support pieces that are necessary for sparc64 without needing
to unnecessarily increase the portion of arch-dependent code.
- Add sparc64 implementations for arch-specific portions of DTrace.
- Add support for a provider API function (dtps_cleanup_module) to be
called for modules when a provider module is being unloaded. When
defined, this function can take care of any final cleanup that may
be necessary. This facility is used by the SDT code on sparc64 to
clean up the trampolines for the SDT probes.
- Add support for the pdata member in the module struct. This member
(generic pointer) can be populated with a pointer to a structure that
holds implementation specific DTrace data for the module. Each arch
must define dtrace_module_t (in include/<arch>/dtrace/mod_arch.h),
containing at a minimum:
size_t sdt_probe_cnt
int sdt_enabled
size_t fbt_probe_cnt
For sparc64 there is also a sdt_instr_t *sdt_tab member that will
hold a memory block for SDT trampolines.
The dtrace_module_t structs are allocated from a kmem cache. For
modules that exist before dtrace is loaded, the pdata member is
populated during the loading of dtrace. Modules loaded after dtrace
get it populated from a module notifier. When modules are unloaded,
the module notifier cleans up the pdata member. When dtrace itself
is unloaded, all remaining modules have their pdata member cleaned
up.
- Provide a generic method for calling a function on every loaded
module in the absence of a kernel facility to allow modules access
to the actual list of loaded modules. This adds an exported function
Kris Van Hees [Tue, 10 Feb 2015 16:33:19 +0000 (11:33 -0500)]
dtrace: restructuring to support DTrace on multiple architectures
Restructure the DTrace modules code to facilitate supporting multiple
architectures (rather than just x86_64).
- The assembler implementation of support functions is now in a file
named dtrace_asm_<arch>.S and arch-specific aspects are found in
dtrace_isa_<arch>.c. The SDT provider requires an arch-specific
portion of code as well (in sdt_<arch>.c).
- The number of frames to skip for specific probes has been updated
to be more accurate (mistakes in this area were found during code
review).
- The mechanism for direct calling the test probe in dt_test_probe()
has been updated to work around compiler warnings.
- Removed dtrace_modload and dtrace_modunload. They were expected to
be needed for multi-arch support but it turns out that was not the
case.
- Add conditionals so that code for providers that are not supported on
every platform is only built where it is supported.
- Various fixes for variable datatype issues that were not noticed on
x86 because they mapped to the same or similar numeric datatypes.
- Pass the dtrace_mstate_t struct to dtrace_getstackdepth() to support
the limitation that memory allocation cannot be done from probe
context. The dtrace_getstackdepth() function uses the dtrace_mstate_t
information to obtain a scratch area of memory to use as temporary
storage for PCs in the processing of dtrace_stacktrace().
- Handle the fact that on x86, the user sp for the current task can be
obtained using current_user_stack_pointer() whereas other platforms
use user_stack_pointer(current_pt_regs).
- Support the fact that the current instruction pointer is not always
an 'ip' member of the pt_regs struct. Always obtain the value of
the instruction pointer using the instruction_pointer(regs) function.
- Support the use of asm/dtrace_syscall.h to list the system calls
that are implemented using an assembler stub.
- Ensure that membar functions use the SMP-versions.
- Clean up byte order conditionals.
- Remove dead code.
- Ensure needed header files are explicitly included.
dtrace: ensure one can try to get user pages without locking or faulting
This commit changes the FOLL_NOFAULT flag into a FOLL_IMMED flag, to
more accurately convey its meaning, i.e. to request user pages without
waiting for any locks and without servicing any page faults as a result
of the request. This is necessary in order to request user pages from
interrupt context.
This also completes the implementation by ensuring that the PTE spinlock
is checked rather than trying to lock it (and possibly get stuck in a
deadlock spinning for it).
dtrace_getufpstack() had several flaws exposed by ustack() of multithreaded
processes. All the flaws touch the same small body of code, and none could be
verified to work until all were in place: hence this rather do-everything
commit.
Firstly, it was detecting the end of the stack using mm->start_stack. This is
incorrect for all threads but the first, and is even incorrect for the first
thread in languages such as Go with split stacks. As it is, this causes the
stack traversal to attempt to walk over a gap with no VMAs, causing a crash.
The correct solution is of course to look at the VMAs to find the VMA which
covers the user's stack address. We are already looking at the VMAs in
is_code_addr(), but this is both a linear scan when all but no-mmu platforms
have better ways, and a *lockless* scan. This is barely safe in the
single-threaded case, but in the multithreaded case other tasks sharing the same
mm may well be executing in parallel, and it becomes crucial that scanning the
VMAs be done under the mmap_sem. Unfortunately we cannot always take the
mmap_sem: DTrace may well be invoked in contexts in which sleeping is
prohibited, and in which other threads have the semaphore. So we must do a
down_read_trylock() on the mmap_sem, aborting the ustack() if we cannot take it
just as we already do if this is a process with no mm at all. (We also need to
boost the mm_users to prevent problems with group exits.)
We are also accessing the pages themselves without pinning, which means
concurrent memory pressure could swap them out, or memory compaction move them
around. We can use __get_user_pages() to get the VMA and pin the pages we need
simultaneously, as long as we use the newly-introduced FOLL_NOFAULT to ensure
that __get_user_pages() does not incur page faults. We wrap __get_user_pages()
in a local find_user_vma(), which also arranges to optionally fail if particular
pages (such as the stack pages) are not in core. (We need the VMA for some
pages so we can see if they are likely to be text-segment VMAs or not: such
pages do not need to be in core and ustack() need not fail if they are swapped
out.)
For efficiency's sake, we pin each stack page as we cross the page boundary into
it, releasing it afterwards.
But even this does not suffice. FOLL_NOFAULT ensures that __get_user_pages()
will not fault, but does not ensure that a page fault will not happen when
accessing the page itself. So we use the newly-introduced CPU_DTRACE_NOPF
machinery to entirely suppress page faults inside get_user() (and nowhere else),
and check it afterwards.
As an additional feature, dtrace_getufpstack() can now be called with a NULL
pcstack and a pcstack_limit of zero, meaning that the stack frame entries are
only counted, not recorded. We use this feature to reimplement
dtrace_getustackdepth() in terms of dtrace_getufpstack().
With this change, multithreaded ustack()s appear to work, even in the presence
of non-glibc stack layouts (as used by Java and other non-glibc threading
libraries) and concurrent group exits and VMA changes.
Orabug: 18412802 Signed-off-by: Nick Alcock <nick.alcock@oracle.com> Reviewed-by: Kris Van Hees <kris.van.hees@oracle.com> Acked-by: Chuck Anderson <chuck.anderson@oracle.com>
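The count-only mode described above (NULL pcstack, limit of zero) can be sketched like this. The array of already-discovered program counters replaces the real stack walk, and walk_frames()/stack_depth() are hypothetical names rather than the module's functions.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Walk the candidate program counters in `found`. When pcstack is
 * non-NULL, record at most pcstack_limit of them; when it is NULL,
 * only count them. Returns the number of frames recorded or counted.
 */
static size_t walk_frames(const uint64_t *found, size_t nfound,
                          uint64_t *pcstack, size_t pcstack_limit)
{
    size_t depth = 0;
    size_t i;

    for (i = 0; i < nfound; i++) {
        if (pcstack != NULL) {
            if (depth >= pcstack_limit)
                break;
            pcstack[depth] = found[i];
        }
        depth++;
    }
    return depth;
}

/* The depth query is the same walk in count-only mode. */
static size_t stack_depth(const uint64_t *found, size_t nfound)
{
    return walk_frames(found, nfound, NULL, 0);
}
```

Sharing one walker means any fix to frame detection applies to both the trace and the depth query.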
Updated the NEWS and specfile to add a note that there is a known
regression on test stress/buffering/tst.resize1.d due to the memory
allocation checking changes that were made a while ago. This
non-harmful regression will be fixed in the next release.
Signed-off-by: Kris Van Hees <kris.van.hees@oracle.com>
Nick Alcock [Mon, 24 Mar 2014 22:51:43 +0000 (22:51 +0000)]
Drop CPU_DTRACE_NOFAULT manipulation in progenyof().
This is only doing a traversal of task_structs via real_parent. This is
nonswappable, so faults are impossible, and blocking faults unnecessary.
Orabug: 18412802 Signed-off-by: Nick Alcock <nick.alcock@oracle.com> Reviewed-by: Kris Van Hees <kris.van.hees@oracle.com> Acked-by: Chuck Anderson <chuck.anderson@oracle.com>
Nick Alcock [Mon, 24 Mar 2014 22:50:06 +0000 (22:50 +0000)]
Drop CPU_DTRACE_NOFAULT manipulation around ustack calls.
dtrace_getufpstack() and (as of the last commit) dtrace_getustackdepth() both
manipulate the CPU_DTRACE_NOFAULT flag themselves: clearing it after calling
those functions is redundant, and setting it is actually dangerous, since
other functions dtrace_getustackdepth() calls (such as __get_user_pages()) do
not expect to have instructions that incur page faults silently skipped without
faulting.
Orabug: 18412802 Signed-off-by: Nick Alcock <nick.alcock@oracle.com> Reviewed-by: Kris Van Hees <kris.van.hees@oracle.com> Acked-by: Chuck Anderson <chuck.anderson@oracle.com>
Nick Alcock [Mon, 17 Mar 2014 16:39:07 +0000 (16:39 +0000)]
Pass down the tgid to userspace in u{stack,sym,mod,addr}().
Userspace does not know how to attach to threads, only processes (thread group
leaders). All it's doing after attaching is looking up symbols, which are per-
process anyway, so rather than go to the effort of teaching userspace to grab
and release non-thread-group-leaders, simply pass the tgid to userspace so that
it can grab everything the same way.
Also pass the pid (== tid) down, because DTrace consumers could reasonably want
to know the actual thread ID in which the u*() fired (though our userspace does
not care).
This means we are passing one extra item on the buffer for ustack() et al:
internal uses are adjusted accordingly.
Orabug: 18412802 Signed-off-by: Nick Alcock <nick.alcock@oracle.com> Reviewed-by: Kris Van Hees <kris.van.hees@oracle.com> Acked-by: Chuck Anderson <chuck.anderson@oracle.com>
Nick Alcock [Mon, 17 Mar 2014 16:29:34 +0000 (16:29 +0000)]
Fix the pid and ppid variables in multithreaded processes.
pid is currently equal to the Linux-side PID: i.e., from userspace's
perspective, the thread ID. ppid is equal to the thread ID of the parent. Both
of these are at best inconvenient and at worst wrong: they should both use the
thread group ID of their respective task, which corresponds to the
userspace-visible PID.
Orabug: 18412802 Signed-off-by: Nick Alcock <nick.alcock@oracle.com> Reviewed-by: Kris Van Hees <kris.van.hees@oracle.com>
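A small model of the corrected lookups, assuming a task record with the pid, tgid and real_parent fields known from the kernel's task_struct. The struct task_sketch type and the d_pid()/d_ppid() helpers are invented for this sketch.

```c
/* Hypothetical, cut-down stand-in for the kernel's task_struct. */
struct task_sketch {
    int pid;    /* Linux-side PID, i.e. the thread ID */
    int tgid;   /* thread group ID, i.e. the userspace-visible PID */
    const struct task_sketch *real_parent;
};

/* D 'pid': the thread group ID of the task itself. */
static int d_pid(const struct task_sketch *t)
{
    return t->tgid;
}

/* D 'ppid': the thread group ID of the parent, not the parent's thread ID. */
static int d_ppid(const struct task_sketch *t)
{
    return t->real_parent->tgid;
}
```

The distinction matters whenever the parent task is itself a non-leader thread, since its thread ID then differs from the PID that userspace sees.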
Kris Van Hees [Fri, 14 Mar 2014 15:40:53 +0000 (11:40 -0400)]
dtrace: add support for profile-* probes
This commit adds support in the profile provider for profile-*
probes, i.e. probes that fire at a specified frequency/interval on
all active CPUs. Support is also added for passing the appropriate
program counter (kernel or user) as probe argument, as required for
tick-* and profile-* probes.
Nick Alcock [Wed, 29 Jan 2014 20:35:12 +0000 (20:35 +0000)]
Have the new dtrace-modules-provider-headers obsolete the old.
The package name has changed but the new package contains the same files as the
old, so we need to Obsolete: the old ones so that yum will remove them.
(Because the old scheme generated package names on the fly according to the
running kernel, the list in this patch may well be missing a few packages.)
Caveat: this fixes 'yum update' but cannot fix direct RPM installation.
You'll have to uninstall the old package manually if you do that.
Orabug: 18061595 Reviewed-by: Kris Van Hees <kris.van.hees@oracle.com> Signed-off-by: Nick Alcock <nick.alcock@oracle.com>
Nick Alcock [Thu, 16 Jan 2014 13:22:08 +0000 (13:22 +0000)]
Remove kernel version from name of dtrace-modules-provider-headers package.
This package had a kernel-version-dependent name on the grounds that it
consisted of kernel headers meant to be included by a single kernel version.
This reasoning was flawed: the headers do not change as the kernel is rebuilt,
and as the package provides files that are not under a kernel-version-specific
path, 'yum update' can attempt to install two versions at once, and conflict.
The right solution is to name the package without a kernel-version-specific part.
(We keep the name unchanged when built against earlier kernels, to avoid
sneaking unrelated changes into security errata releases.)
Orabug: 18061595 Reviewed-by: Kris Van Hees <kris.van.hees@oracle.com> Signed-off-by: Nick Alcock <nick.alcock@oracle.com>
Kris Van Hees [Fri, 20 Dec 2013 16:19:01 +0000 (11:19 -0500)]
dtrace: Fix RPM dependencies.
Userspace depends on dtrace-modules-headers so that it can #include the headers
shared between kernel and userspace. However, it is crucial that this inclusion
not drag in the dtrace module itself, nor the kernel on which it depends,
because that module might be of a version different to that already on the
system (likely older, which would cause yum upgrade to fail).
So drop the dependency between dtrace-modules-headers and the module itself.
Also, userspace has ceased depending on the dtrace-kernel-interface capability,
in favour of automatic but explicit yum installation of module RPMs when needed:
so drop that capability, unversion the dtrace-modules-headers capability, and
remove the kernel version from the dtrace-modules-headers package's name, since
it is not dependent on the running kernel in any way. Unversion the
modules-provider-headers capability too, but leave its name versioned: since it
is meant for provider authors, and providers are kernel modules, it is
necessarily kernel-version-dependent.
--
Modified to allow building of modules prior to 0.4.2 using the older scheme
for dependencies, and use the new scheme starting with 0.4.2.
Orabug: 17804881 Signed-off-by: Nick Alcock <nick.alcock@oracle.com> Signed-off-by: Kris Van Hees <kris.van.hees@oracle.com>
Kris Van Hees [Tue, 17 Dec 2013 23:08:17 +0000 (18:08 -0500)]
dtrace: vtimestamp implementation
This commit adds DTrace vtimestamp support. It keeps track of how much
time a task has spent actually processing on a CPU. The time is set to
zero at task creation, and is updated whenever the task leaves a CPU
(gets scheduled off), and when the dtrace_probe() function is entered,
to ensure that the most recent value of consumed time is reported.
Some code got moved around for consistency of the implementation.
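The accounting rule above can be modelled in a few lines. This is a sketch under assumptions: struct vtime_sketch, its fields and the vt_*() helpers are invented names, and timestamps are passed in explicitly instead of being read from a clock.

```c
#include <stdint.h>

/* Hypothetical per-task accounting state. */
struct vtime_sketch {
    uint64_t vtime;     /* accumulated on-CPU time; zero at creation */
    uint64_t last_on;   /* time of the most recent accounting point */
    int      on_cpu;
};

static void vt_create(struct vtime_sketch *t)
{
    t->vtime = 0;
    t->last_on = 0;
    t->on_cpu = 0;
}

static void vt_sched_in(struct vtime_sketch *t, uint64_t now)
{
    t->last_on = now;
    t->on_cpu = 1;
}

/* Fold the time since the last accounting point into vtime. */
static void vt_update(struct vtime_sketch *t, uint64_t now)
{
    if (t->on_cpu) {
        t->vtime += now - t->last_on;
        t->last_on = now;
    }
}

/* Updated when the task leaves a CPU ... */
static void vt_sched_out(struct vtime_sketch *t, uint64_t now)
{
    vt_update(t, now);
    t->on_cpu = 0;
}

/* ... and on probe entry, so the reported value is current. */
static uint64_t vt_probe(struct vtime_sketch *t, uint64_t now)
{
    vt_update(t, now);
    return t->vtime;
}
```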
Kris Van Hees [Tue, 17 Dec 2013 23:06:57 +0000 (18:06 -0500)]
dtrace: implement SDT in kernel modules
Full implementation of SDT probes in kernel modules.
The dtrace_sdt.sh script has been modified to handle both the creation
of the SDT stubs and the SDT info. Its syntax has therefore changed:
dtrace_sdt.sh sdtstub <stubfile> <object-file> <object-file>*
or
dtrace_sdt.sh sdtinfo <infofile> vmlinux.o
or
dtrace_sdt.sh sdtinfo <infofile> vmlinux.o .tmp_vmlinux1
or
dtrace_sdt.sh sdtinfo <infofile> <kmod>.o kmod
The first form generates a stub file in assembler to ensure that the
(fake) functions that are called from SDT probe points will no longer
be reported as undefined symbols, and to ensure that when SDT is not
enabled, the probes become calls to a function that simply returns.
The second form creates the initial (dummy) SDT info file for the kernel
linking process, mainly to ensure that its size is known. The third
form then creates the true SDT info file for the kernel, based on the
kernel object file and the first stage linked kernel image.
The fourth and final form generates SDT info for a kernel module, based
on its initial linked object.
This commit also enables the test probes in the dt_test module.
Kris Van Hees [Mon, 16 Dec 2013 19:42:07 +0000 (14:42 -0500)]
dtrace: fix conditionals for changelog composition
Build failure indicated that under some conditions, the changelog created in
the specfile by means of build version conditionals resulted in out-of-order
entries in the changelog. This has been corrected.
Signed-off-by: Kris Van Hees <kris.van.hees@oracle.com>
Kris Van Hees [Thu, 31 Oct 2013 09:22:56 +0000 (05:22 -0400)]
dtrace: provide a corrected implementation of the 'errno' D variable
This commit provides a corrected implementation for the 'errno' D variable.
It is defined as holding the error code (if non-zero) during the current
system call execution. If the system call is successful, or if no system
call is being executed, its value is to be 0. On (Open)Solaris, this was
retrieved from a task variable that is assigned an error code as soon as
an error is encountered during the processing of a system call, i.e. system
calls use a task variable to store any error code encountered during
execution, and this is used upon return from the system call to alert
userspace of the error code status of the system call. In Linux, system
calls are implemented in the more regular fashion (for Linux at least)
of returning error codes as return values of functions, and therefore
there is no task level variable to consult. So, instead we recognize that
(at this point) 'errno' only has meaning during the processing of syscall
return probes, which are handled from the system call wrapper, after the
system call implementation has been executed.
It would therefore be sufficient and correct to assign the value of 'errno'
at that point, but that would require a task variable to be added to the
task struct in order for this value to be recorded.
In order to avoid adding a member to the task struct, we (ab)use the fact
that we can recognize whether we are executing a D action for a syscall
return probe, and if we are *and* if 'errno' is being retrieved, we look
at the arg0 value for the probe (which is defined as the return value of
the syscall), and if the value is between 0 and -2048, we return the error
code it represents as errno.
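As a minimal sketch of the rule described above (not the code from this commit), the mapping from arg0 to 'errno' could look like the following. The helper name, the SKETCH_ERRNO_LIMIT macro, and the in_syscall_return_probe flag are hypothetical, and the sketch assumes the "between 0 and -2048" range excludes both end points.

```c
#include <stdint.h>

/* Hypothetical bound, taken from the "between 0 and -2048" wording above. */
#define SKETCH_ERRNO_LIMIT 2048

/*
 * Hypothetical helper: derive the value of the D variable 'errno' from arg0
 * of a syscall return probe.  Outside such a probe the value is always 0.
 */
static int sketch_errno_from_arg0(int in_syscall_return_probe, uint64_t arg0)
{
    /* arg0 carries the syscall return value; reinterpret it as signed. */
    int64_t ret = (int64_t)arg0;

    if (!in_syscall_return_probe)
        return 0;

    /* Assumption: both ends of the range are excluded. */
    if (ret < 0 && ret > -SKETCH_ERRNO_LIMIT)
        return (int)-ret;

    /* Successful syscall, or a value outside the error range. */
    return 0;
}
```

Under these assumptions, a syscall return probe whose arg0 is -2 would report an errno of 2, while a non-negative arg0 would report 0.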
Kris Van Hees [Thu, 17 Oct 2013 23:18:44 +0000 (19:18 -0400)]
dtrace: fix lock ordering issues, mutex_owned(), and mutex debugging
Several cases of potential lock ordering issues were identified and resolved.
Both static and dynamic analysis of locking comes clean for DTrace after this
commit is applied.
The mutex_owned() function was not accounting for the possibility that a lock
might have an owner registered while unlocked.
Kris Van Hees [Thu, 10 Oct 2013 20:17:13 +0000 (16:17 -0400)]
dtrace: update getufpstack implementation to be safer
The dtrace_getufpstack() function was a death trap when called for cases where
current happened to be in a transitional state (no mm) or a kthread. It was
also using find_vma() when that was not quite necessary. Finally, it was not
using the saved stack pointer from userspace correctly (in one place
it used old_rsp as appropriate, but in another p->thread.usersp). The code
has been rewritten to make use of the fact that the only valid stack addresses
that can be in use when this function is called must appear between the current
stack pointer position (old_rsp) and the bottom of the stack (mm->start_stack).
Therefore, no vma is necessary anymore.
The new implementation also ensures that when there is no mm, or we're dealing
with a kthread, the resulting data is still formatted correctly, i.e. with a
PID in the first slot, and zeros in all other slots.
This commit effectively builds on top of the fix applied by Nick Alcock.
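The walk and the output layout described above can be sketched in userspace C as follows. This is an illustration under stated assumptions, not the kernel implementation: struct sketch_mm, its fields, and the "looks like a code address" range check are all hypothetical stand-ins, and the simulated stack is a plain array rather than user memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the user address space of a task. */
struct sketch_mm {
    const uint64_t *stack_words; /* simulated stack contents, lowest address first */
    uint64_t stack_lo;           /* address of stack_words[0] */
    uint64_t start_stack;        /* bottom of the stack (highest address, exclusive) */
    uint64_t text_lo, text_hi;   /* range treated as plausible code addresses */
};

/*
 * Fill pcstack[0..limit-1]: the PID goes in the first slot, followed by every
 * word between sp and start_stack that looks like a code address.  Unused
 * slots are zeroed.  With no mm (e.g. a kthread) only the PID is recorded.
 */
static void sketch_getufpstack(uint64_t *pcstack, size_t limit, uint64_t pid,
                               const struct sketch_mm *mm, uint64_t sp)
{
    size_t n = 0;

    assert(pcstack != NULL);
    if (limit == 0)
        return;
    pcstack[n++] = pid;

    /* Only walk when there is an mm and sp lies inside the simulated stack. */
    if (mm != NULL && sp >= mm->stack_lo) {
        for (uint64_t addr = sp; addr < mm->start_stack && n < limit;
             addr += sizeof(uint64_t)) {
            uint64_t word = mm->stack_words[(addr - mm->stack_lo) / sizeof(uint64_t)];

            if (word >= mm->text_lo && word < mm->text_hi)
                pcstack[n++] = word;
        }
    }

    /* Zero all remaining slots so the result is always well formed. */
    while (n < limit)
        pcstack[n++] = 0;
}
```

Because the loop is bounded by the stack pointer and start_stack alone, no vma lookup is needed in this sketch, which mirrors the reasoning given in the commit message.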
Nick Alcock [Thu, 10 Oct 2013 23:32:25 +0000 (00:32 +0100)]
dtrace: armour ustack() against kernel threads, !task->mm, and corrupt usersp.
Kernel threads have no userspace stack, by definition: we should not assume they
do. Further, tasks with no mm (whether because they are kernel threads or for
any other reason) should not be ustack()ed, nor tasks in which find_vma() cannot
find the vma corresponding to the usersp. (Possible causes for this might be a
task which just smashed its own userspace sp or a task which is in the middle of
exiting or exec()ing.)
dtrace: prevent Oops caused by preemption issues with probes
It was possible (specifically with direct-call probes) for the execution of
actions (using the DIF emulator) to get preempted, causing interesting side
effects because dtrace_probe() (and the functions it calls) are designed to
run on a specific CPU without any interruption (especially not from another
call to dtrace_probe()). Since the UEK3 kernel uses voluntary preemption,
the behaviour was not as expected and explicit preemption protection had to
be added to resolve this.
Orabug: 17403196 Signed-off-by: Kris Van Hees <kris.van.hees@oracle.com>
Allocate the psinfo structure from a slab (like other structures related to
the task_struct), and use kmalloc() for the argv and envp members (with a size
limit to avoid allocation issues).
Orabug: 17407069 Signed-off-by: Kris Van Hees <kris.van.hees@oracle.com>
Kris Van Hees [Thu, 29 Aug 2013 22:07:21 +0000 (18:07 -0400)]
dtrace: Ensure that USDT probes are carried over correctly across fork().
When a process forks, its child will have a copy of the address space of the
parent, and therefore any enabled USDT probes from the parent will also fire
for the child. In order for those probe firings to be valid, the child must
have its own pid-specific providers (created by duplicating the parent's
providers).
Kris Van Hees [Tue, 27 Aug 2013 19:49:50 +0000 (15:49 -0400)]
dtrace: fix retrieval of arg5 through arg9
Fix the retrieval of arguments passed on the stack for SDT, USDT, and direct
call probes. This commit also adds trivial support for testcases related to
this fix.
Kris Van Hees [Wed, 14 Aug 2013 12:44:01 +0000 (08:44 -0400)]
Bug fix for logic to determine the (inode, offset) pair for uprobes.
The logic used to determine the (inode, offset) pair needed by uprobes, and
calculated based on an address in a process memory space, was flawed. This
caused USDT probes in shared libraries to not work correctly.
Signed-off-by: Kris Van Hees <kris.van.hees@oracle.com>
Kris Van Hees [Wed, 7 Aug 2013 20:09:27 +0000 (16:09 -0400)]
Bug fix for fasttrap module unloading.
Various scenarios have been uncovered where unloading of the fasttrap module
would result in an assertion failure.
Essentially, what is going on is that an executable may register providers
for USDT probes (i.e. pass a DOF object to DTrace through the helper interface)
while there isn't any consumer active. Any attempt to remove the fasttrap
provider at this point in time causes an assertion failure on the reference
count for all providers instantiated by the meta-provider (fasttrap),
because module removal cannot be stopped once it has been initiated.
The solution is to take a reference on the meta-provider module whenever a new
provider is instantiated in it, and to put the reference back when that
provider is retired (removed from use). I.e. the module will be listed as in
use as long as there are providers associated with it.
Signed-off-by: Kris Van Hees <kris.van.hees@oracle.com>
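The reference scheme described in the fasttrap unloading fix above can be modelled with a small userspace sketch. All names here (struct sketch_module and the sketch_* functions) are hypothetical. In the kernel, module references are normally taken and released with try_module_get() and module_put(), but the commit message does not say which calls the fix uses, so treat that mapping as an assumption.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a module's "in use" state. */
struct sketch_module {
    int refcount;   /* one reference per live provider */
    bool unloading; /* set once removal has been initiated */
};

/* Take a reference; fails if the module is already on its way out. */
static bool sketch_module_get(struct sketch_module *m)
{
    if (m->unloading)
        return false;
    m->refcount++;
    return true;
}

/* Drop a reference previously taken with sketch_module_get(). */
static void sketch_module_put(struct sketch_module *m)
{
    assert(m->refcount > 0);
    m->refcount--;
}

/* Instantiating a provider pins the meta-provider module. */
static bool sketch_provider_create(struct sketch_module *meta)
{
    return sketch_module_get(meta);
}

/* Retiring a provider releases the reference it held. */
static void sketch_provider_retire(struct sketch_module *meta)
{
    sketch_module_put(meta);
}

/* The module may only be removed when no providers remain. */
static bool sketch_module_can_unload(const struct sketch_module *m)
{
    return m->refcount == 0;
}
```

In this model the module stays pinned for as long as any provider exists, so a removal request is refused up front instead of tripping an assertion after removal has already started.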
Kris Van Hees [Wed, 7 Aug 2013 07:44:08 +0000 (03:44 -0400)]
Bug fix for module unloading.
Once a consumer has opened the /dev/dtrace/dtrace device file, providers can
no longer be safely unloaded until the last consumer has closed the device file.
This commit adds code to ensure this behaviour. Failing to do so results in
almost certain OOPSes because Linux does not support kernel modules "refusing"
to be unloaded in response to an rmmod.
Signed-off-by: Kris Van Hees <kris.van.hees@oracle.com>
Nick Alcock [Wed, 31 Jul 2013 19:01:52 +0000 (20:01 +0100)]
Fix fasttrap ioctls and headers_check.
These were being included before their structure definitions, leading to the
possibility of wrong values for FASTTRAPIOC_*.
Instead, split the FASTTRAPIOC* definitions into a new header, fasttrap_ioctl.h:
this includes fasttrap.h to get the structure definitions, and is also included
by it, so that either header can be included to get the ioctl definitions. We
then extend "make headers_check" so that it does the extended ioctl checks on
all headers named *ioctl.h, not just ioctl.h. (These checks are quite grotesque:
we don't want to run them on every DTrace uapi header if that can be avoided.)
Signed-off-by: Nick Alcock <nick.alcock@oracle.com>
Nick Alcock [Wed, 31 Jul 2013 15:17:30 +0000 (16:17 +0100)]
Re-enable DTrace ioctl()-size debugging.
We move the shared user/kernel ioctl-debugging function to a new
<linux/dtrace/ioctl_debug.h> header, to avoid problems with other
<linux/dtrace/ioctl.h> users.
Signed-off-by: Nick Alcock <nick.alcock@oracle.com>
Nick Alcock [Tue, 23 Jul 2013 18:54:32 +0000 (19:54 +0100)]
Fix provider header requirements.
This should depend on dtrace-modules-headers with a version equal to the dtrace
kernel interface, not on dtrace-modules-headers-%{kver}, which doesn't exist.
Signed-off-by: Nick Alcock <nick.alcock@oracle.com>