www.infradead.org Git - users/mchehab/rasdaemon.git/log
Mauro Carvalho Chehab [Wed, 29 May 2013 14:10:44 +0000 (11:10 -0300)]
Bump to version 0.4.1
The sqlite3 bugfix is important enough to deserve a version.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Wed, 29 May 2013 14:03:04 +0000 (11:03 -0300)]
README: update to reflect the need for perl DBI sqlite
This is now needed by ras-mc-ctl.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Wed, 29 May 2013 13:59:43 +0000 (10:59 -0300)]
Makefile.am: create ${prefix}/var/lib/rasdaemon on install
rasdaemon -r requires that directory to exist; otherwise, opening
the sqlite3 database will fail.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Wed, 29 May 2013 12:33:45 +0000 (09:33 -0300)]
ras-mc-ctl: add support for querying the errors
As the mc_event table is filled by rasdaemon, we need a tool to
extract data from it.
So, use the existing perl script for the basic queries.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Wed, 29 May 2013 10:41:30 +0000 (07:41 -0300)]
ras-record: use sqlite3_reset to allow reusing the prepared statement
Instead of using sqlite3_finalize, we should use sqlite3_reset;
otherwise the prepared statement is de-allocated and cannot be reused.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Wed, 29 May 2013 10:40:46 +0000 (07:40 -0300)]
rasdaemon.spec.in: Require sqlite-devel
This library is needed on builds when --enable-sqlite3 is used.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Tony Luck [Tue, 28 May 2013 18:20:36 +0000 (11:20 -0700)]
ras-events: Fence-post error when reporting number of cpus we listen to
I see:
rasdaemon: Listening to events for cpus 0 to 64
which would be 65 total cpus - I only have 64.
Fix the log message to use "n_cpus - 1" rather than "n_cpus".
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
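The off-by-one described above can be sketched as follows. This is a minimal illustration, not the code from the commit: format_cpu_range() is a hypothetical helper, and only the message text and the "n_cpus - 1" fix come from the commit message.

```c
#include <stdio.h>

/*
 * Hypothetical helper (not taken from rasdaemon): format the range of
 * CPUs being monitored. CPUs are numbered from 0, so the last index is
 * n_cpus - 1, not n_cpus.
 */
int format_cpu_range(char *buf, size_t len, int n_cpus)
{
	return snprintf(buf, len,
			"rasdaemon: Listening to events for cpus 0 to %d",
			n_cpus - 1);
}
```

With 64 CPUs the message then ends in "0 to 63", which matches the number of CPUs actually present.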
Mauro Carvalho Chehab [Tue, 28 May 2013 18:10:05 +0000 (15:10 -0300)]
Add a tool to automate releasing new versions
This small script automates the process of building newer
versions of the tool.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Tue, 28 May 2013 18:09:29 +0000 (15:09 -0300)]
Replace some hard-coded strings by the autotools macro names
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Tue, 28 May 2013 18:00:22 +0000 (15:00 -0300)]
Bump version to 0.4.0
There are too many changes already. Bump it to version 0.4.0.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Tue, 28 May 2013 17:58:36 +0000 (14:58 -0300)]
ras-events: parse errors at select_tracing_timestamp()
This fixes the following warnings:
ras-events.c: In function 'select_tracing_timestamp':
ras-events.c:501:6: warning: ignoring return value of 'read', declared with attribute warn_unused_result [-Wunused-result]
ras-events.c:531:8: warning: ignoring return value of 'fscanf', declared with attribute warn_unused_result [-Wunused-result]
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
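The usual way to silence -Wunused-result is to test the return value rather than discard it. The sketch below shows that pattern for fscanf(); the same idea applies to read(). The function name read_value() and the "%lu" format are assumptions made for illustration; the commit message does not show what select_tracing_timestamp() actually parses.

```c
#include <stdio.h>

/*
 * Hypothetical example: read one decimal value from a stream, checking
 * the fscanf() result instead of ignoring it.
 * Returns 0 on success, -1 if no value could be parsed.
 */
int read_value(FILE *fp, unsigned long *value)
{
	if (fscanf(fp, "%lu", value) != 1)
		return -1;

	return 0;
}
```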
Mauro Carvalho Chehab [Tue, 28 May 2013 17:08:07 +0000 (14:08 -0300)]
Store RAS sqlite3 db file on a proper place
Instead of creating it in the directory from which rasdaemon is
called, put it in the ${prefix}/var/lib/rasdaemon directory.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Tue, 28 May 2013 14:37:50 +0000 (11:37 -0300)]
ras-events: use sysconf to get the number of CPUs
There are several "per-cpu" files at sysfs that seem to be
utterly bogus, as trying to poll from them just returns POLLERR.
Let's use sysconf() instead to get the number of CPUs, avoiding
that bug.
Not sure if this would work with hotplugged CPUs, though, so
let's preserve the old code there, for now.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
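A minimal sketch of counting CPUs with sysconf() is shown below. The commit message does not say which sysconf name is used; _SC_NPROCESSORS_CONF is assumed here, and count_cpus() is a hypothetical wrapper.

```c
#include <unistd.h>

/*
 * Return the number of configured CPUs, or -1 on failure.
 * _SC_NPROCESSORS_CONF is a widely available extension (glibc and
 * others), not a name required by POSIX.
 */
long count_cpus(void)
{
	long n = sysconf(_SC_NPROCESSORS_CONF);

	return n > 0 ? n : -1;
}
```

_SC_NPROCESSORS_ONLN would instead report only the CPUs currently online, which may be relevant to the hotplug concern raised in the commit message.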
Mauro Carvalho Chehab [Tue, 28 May 2013 11:47:57 +0000 (08:47 -0300)]
ras-events: Only use pthreads for collect if poll() not available
Before kernel 3.10, one pthread per cpu was used, as the code
needed to run an endless loop in order to get events.
With kernel 3.10 and later, we can simply use poll() there.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
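The poll()-based approach lets a single thread wait on all per-CPU descriptors at once. The sketch below illustrates the idea under stated assumptions: wait_first_readable() is a hypothetical helper, the 64-descriptor limit is arbitrary, and the real code would read trace data from the descriptor that becomes ready.

```c
#include <poll.h>

/*
 * Wait until at least one descriptor in fds[] is readable.
 * Returns the index of the first readable descriptor, or -1 on
 * timeout or error. The limit of 64 descriptors is arbitrary.
 */
int wait_first_readable(const int *fds, int n, int timeout_ms)
{
	struct pollfd pfd[64];
	int i;

	if (n <= 0 || n > 64)
		return -1;

	for (i = 0; i < n; i++) {
		pfd[i].fd = fds[i];
		pfd[i].events = POLLIN;
		pfd[i].revents = 0;
	}

	if (poll(pfd, n, timeout_ms) <= 0)
		return -1;

	for (i = 0; i < n; i++)
		if (pfd[i].revents & POLLIN)
			return i;

	return -1;
}
```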
Mauro Carvalho Chehab [Tue, 28 May 2013 11:13:17 +0000 (08:13 -0300)]
ras-mce-handler: change the test order to avoid leaked memory
As getdelim allocates memory, it is better to swap the
tests; otherwise the code will allocate some memory that
will never be de-allocated.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
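The commit message does not show the code, so the following is only an illustration of the general pattern: test the cheap "are we done?" condition before calling getdelim(), and free the buffer once on the way out. has_line_with_prefix() is a hypothetical function.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Return 1 if some line in fp starts with key, 0 otherwise. */
int has_line_with_prefix(FILE *fp, const char *key)
{
	char *line = NULL;
	size_t len = 0;
	int found = 0;

	/*
	 * The flag is tested first: once found is set, getdelim() is not
	 * called again, so no extra allocation happens after the match.
	 */
	while (!found && getdelim(&line, &len, '\n', fp) > 0)
		found = strncmp(line, key, strlen(key)) == 0;

	free(line);	/* free(NULL) is a no-op */
	return found;
}
```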
Mauro Carvalho Chehab [Tue, 28 May 2013 10:47:53 +0000 (07:47 -0300)]
ras-mce-handler: Fix /proc/cpuinfo parser
The test for the parsing completion is wrong. Fix it.
While here, change the namespace to avoid later
conflicts.
Reported-by: Chen Gong <gong.chen@intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Mon, 27 May 2013 21:19:08 +0000 (18:19 -0300)]
ras-mce-handler: Fix a warning
ras-mce-handler.c: In function ‘register_mce_handler’:
ras-mce-handler.c:200:13: warning: ‘mce’ may be used uninitialized in this function [-Wuninitialized]
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Mon, 27 May 2013 20:47:15 +0000 (17:47 -0300)]
Enable MCE parsing at RPM files
As this is known to work, enable it.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Mon, 27 May 2013 20:46:56 +0000 (17:46 -0300)]
README: update to reflect the current status
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Mon, 27 May 2013 20:26:04 +0000 (17:26 -0300)]
Update TODO list
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Mon, 27 May 2013 20:23:48 +0000 (17:23 -0300)]
mce-intel-sb: add memory controller decoding
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Mon, 27 May 2013 20:19:11 +0000 (17:19 -0300)]
Add support to decode memory controller data on Nehalem
xeon75xx code can be dropped as it doesn't exist anyway on
mcelog. According to the code there, the Kernel lacks the support
it needs to work.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Mon, 27 May 2013 19:46:12 +0000 (16:46 -0300)]
mce-intel: Enable iMC log where available
Add code to enable iMC log where available.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Mon, 27 May 2013 18:50:51 +0000 (15:50 -0300)]
mce-intel-ivb: enable the code that parses memory controller errors
Enable the code that parses the memory controller errors.
This code assumes that iMC log is already enabled.
A later patch will add support for enabling it.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Tony Luck [Fri, 24 May 2013 16:55:40 +0000 (09:55 -0700)]
spelling: Fix spelling in ras-record.c
s/interted/inserted/
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Tony Luck [Fri, 24 May 2013 16:29:06 +0000 (09:29 -0700)]
configure: Fix help string for sqlite3
The AS_HELP_STRING has a typo and says to use "--enable-sqlite" when
it should say "--enable-sqlite3"
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Fri, 24 May 2013 14:21:32 +0000 (11:21 -0300)]
mce: Some improvements at the output format
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Fri, 24 May 2013 11:21:51 +0000 (08:21 -0300)]
ras-mce-handler: fix /proc/cpuinfo parser
The scanf parsers for /proc/cpuinfo were broken, as they
got a "mce->" prefix by mistake. Remove it to fix.
With that, MCE parser will successfully register.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
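To see why a stray prefix breaks the parser: the literal part of a scanf format must match the input text, so a format beginning with "mce->" can never match a /proc/cpuinfo line. The sketch below parses a single field; parse_cpu_family() and the choice of the "cpu family" field are assumptions for illustration, not the commit's code.

```c
#include <stdio.h>

/*
 * Parse one /proc/cpuinfo line of the form "cpu family\t: 6".
 * Whitespace in the format matches any run of blanks or tabs.
 * Returns 1 and stores the value if the line matches, 0 otherwise.
 */
int parse_cpu_family(const char *line, unsigned *family)
{
	return sscanf(line, "cpu family : %u", family) == 1;
}
```

A format such as "mce->cpu family : %u" would fail on the very first character of every line, so no field would ever be filled in.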
Mauro Carvalho Chehab [Fri, 24 May 2013 11:18:48 +0000 (08:18 -0300)]
event-parse: Remove a temporary debug message
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Fri, 24 May 2013 11:16:57 +0000 (08:16 -0300)]
Don't require all tracing types to be supported
Not all systems support all 3 types of RAS (EDAC, PCIe AER, MCELOG).
Don't bail out if at least one of them is supported.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Fri, 24 May 2013 10:37:06 +0000 (07:37 -0300)]
Update edac-tests to use ras-mc-ctl instead of ./edac-ctl
All functionality previously found in my test version of
edac-ctl is present in ras-mc-ctl. So, let's rename it.
The test code still tries to run edac-util. This tool,
which is part of edac-utils, uses the edac error counters to
check the errors. For now, let's keep it, as it might be useful,
although this will likely be removed in future versions of this
testing script.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Fri, 24 May 2013 09:18:54 +0000 (06:18 -0300)]
ras-events: Fix the logic that retrieves the debugfs mount point
While on Fedora/RHEL the mount device for debugfs is called "debugfs",
it is usual to use "none" on some other distros or for manually
mounted debugfs.
So, fix the logic to look at the filesystem type instead, as it should
always be "debugfs" in both cases.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
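Matching on the filesystem type rather than the device name can be sketched as below. This is an illustrative version, not the commit's code: find_debugfs() is a hypothetical function that takes an already opened stream in /proc/mounts format, and the buffer sizes are arbitrary.

```c
#include <stdio.h>
#include <string.h>

/*
 * Scan a stream in /proc/mounts format ("device mountpoint fstype ...")
 * and copy the mount point of the first entry whose filesystem type is
 * "debugfs". The device field is ignored, so both "debugfs" and "none"
 * devices are accepted. Returns 0 on success, -1 if none is found.
 */
int find_debugfs(FILE *fp, char *path, size_t len)
{
	char line[1024], dev[256], dir[512], type[64];

	while (fgets(line, sizeof(line), fp)) {
		if (sscanf(line, "%255s %511s %63s", dev, dir, type) != 3)
			continue;
		if (strcmp(type, "debugfs") == 0) {
			snprintf(path, len, "%s", dir);
			return 0;
		}
	}

	return -1;
}
```

A caller would typically pass the result of fopen("/proc/mounts", "r") and fall back to an error message if -1 is returned.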
Tony Luck [Thu, 23 May 2013 20:27:31 +0000 (13:27 -0700)]
ras-record: Avoid NULL pointer when running without sqlite
When running "rasdaemon -f" we can dereference a NULL pointer in
ras_store_mc_event() since "ras->db_priv" is NULL.
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Thu, 23 May 2013 19:42:08 +0000 (16:42 -0300)]
ras-events: Fix MCE binding
The #ifdef for detecting MCE was wrong. Due to that, the MCE
handler was not being enabled.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Thu, 23 May 2013 19:37:54 +0000 (16:37 -0300)]
Make the enable function more generic
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Thu, 23 May 2013 17:58:21 +0000 (14:58 -0300)]
Get rid of ras-record warnings
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Thu, 23 May 2013 17:44:36 +0000 (14:44 -0300)]
get rid of MCE warnings
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Thu, 23 May 2013 17:26:07 +0000 (14:26 -0300)]
Cleanup warnings at ras-aer-handler.c
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Thu, 23 May 2013 16:35:07 +0000 (13:35 -0300)]
Fix event handler parser logic
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Thu, 23 May 2013 14:48:02 +0000 (11:48 -0300)]
ras-events: Add some hacks to make it work with 3.6.10-rc2
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Thu, 23 May 2013 14:07:29 +0000 (11:07 -0300)]
libtrace: sync with the latest code from trace-cmd
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Thu, 23 May 2013 13:24:03 +0000 (10:24 -0300)]
edac-fake-inject: Check if the Kernel supports error injection
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Thu, 23 May 2013 12:35:38 +0000 (09:35 -0300)]
Get rid of mc_event_error_type
Somehow, the tracing library is not finding it on some systems:
overriding event (710) ras:mc_event with new print handler
trace-cmd: File exists
function mc_event_error_type not defined
Let's just get rid of it.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Thu, 23 May 2013 12:09:19 +0000 (09:09 -0300)]
Better handle parser errors with MC events
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Thu, 23 May 2013 12:01:10 +0000 (09:01 -0300)]
edac-fake-inject: Make it more generic
The tool used to support only 2 or 3 layer memory controllers,
failing with the edac_ghes driver. Make it more generic to also work
there.
Also, don't assume that debugfs is mounted at /sys/kernel/debug,
but look up its mount location via /proc/mounts.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Thu, 23 May 2013 11:21:55 +0000 (08:21 -0300)]
Fix rasdaemon -d
We need to get the debugfs pointer in order to toggle the MC events.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Thu, 23 May 2013 10:25:54 +0000 (07:25 -0300)]
Get rid of the remaining warnings
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Thu, 23 May 2013 10:23:11 +0000 (07:23 -0300)]
libtrace: get rid of breakpoint() function
This isn't used anywhere.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Thu, 23 May 2013 10:22:40 +0000 (07:22 -0300)]
Get rid of most warnings at libtrace
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Thu, 23 May 2013 10:10:53 +0000 (07:10 -0300)]
Fix usage of toggle_ras_mc_event() by -d parameter
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Thu, 23 May 2013 10:07:44 +0000 (07:07 -0300)]
Enable gcc warnings
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Tue, 21 May 2013 02:37:23 +0000 (23:37 -0300)]
rasdaemon.spec: specify the root directory in a consistent way
As fedora-review tool complained:
- Package consistently uses macros (instead of hard-coded directory names).
Note: Using both %{buildroot} and $RPM_BUILD_ROOT
See: http://fedoraproject.org/wiki/Packaging/Guidelines#macros
Let's just use %{buildroot}.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Tue, 21 May 2013 02:24:09 +0000 (23:24 -0300)]
Update it to point it to fedorapeople
The tarball used to generate the src.rpm is the one produced
by "make dist-bz2", which doesn't contain .gitignore files,
while fedorahosted only generates a snapshot with them.
As a result, its hash does not match the one used at .src.rpm.
Fix it by using the uploaded file.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Tue, 21 May 2013 01:54:54 +0000 (22:54 -0300)]
Add a target to upload a new version
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Tue, 21 May 2013 01:51:05 +0000 (22:51 -0300)]
Update the spec file to require autotools for building it
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Tue, 21 May 2013 01:00:50 +0000 (22:00 -0300)]
rasdaemon.spec: Don't install INSTALL file
rpmlint complains about that:
rasdaemon.x86_64: W: install-file-in-docs /usr/share/doc/rasdaemon-0.3.0/INSTALL
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Tue, 21 May 2013 00:45:04 +0000 (21:45 -0300)]
rpmlint: fix version complaint
rasdaemon.x86_64: W: incoherent-version-in-changelog 0.2.0-1 ['0.3.0-1.fc18', '0.3.0-1']
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Tue, 21 May 2013 00:40:01 +0000 (21:40 -0300)]
ras-mc-ctl.8.in: fix rpmlint complaints
rasdaemon.x86_64: W: manual-page-warning /usr/share/man/man8/ras-mc-ctl.8.gz 79: a space character is not allowed in an escape name
rasdaemon.x86_64: W: manual-page-warning /usr/share/man/man8/ras-mc-ctl.8.gz 122: warning: macro `EL' not defined
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Tue, 21 May 2013 00:34:17 +0000 (21:34 -0300)]
Whitespace cleanups
No functional changes here, just whitespace cleanups.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Tue, 21 May 2013 00:30:54 +0000 (21:30 -0300)]
rpmlint target: RPMS files are wrong. Fix it.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Tue, 21 May 2013 00:28:26 +0000 (21:28 -0300)]
Fix rpmlint check line
It should not be checking the .tar.bz2, but, instead, the generated
rpm files.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Tue, 21 May 2013 00:03:53 +0000 (21:03 -0300)]
Add missing header files to Makefile.am
This is needed, in order to generate the proper dist tar files.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Mon, 20 May 2013 23:53:09 +0000 (20:53 -0300)]
Bump it to version 0.3.0
As we now have initial mcelog/PCIe AER parsing, bump version
to 0.3.0.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Mon, 20 May 2013 23:52:40 +0000 (20:52 -0300)]
Add a rule to build a source rpm file
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Mon, 20 May 2013 22:28:34 +0000 (19:28 -0300)]
Auto-fill the rasdaemon.spec version
Instead of keeping it static, let ./configure fill in the
version of the rasdaemon.spec. That makes it a little easier
to use on rpm-based distros.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Sat, 18 May 2013 19:49:33 +0000 (16:49 -0300)]
Add decoder for Ivy Bridge
The code came from mcelog. For now, let's disable the part that
handles the memory controller.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Sat, 18 May 2013 19:43:58 +0000 (16:43 -0300)]
Add decoder for Sandy Bridge
The code came from mcelog. For now, let's disable the part that
handles the memory controller.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Sat, 18 May 2013 15:16:46 +0000 (12:16 -0300)]
Add decoder for Intel MCE tulsa
The code came almost as-is from mcelog.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Sat, 18 May 2013 15:10:53 +0000 (12:10 -0300)]
Add decoder for Intel Dunnington CPUs
The code came almost as-is from mcelog.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Sat, 18 May 2013 14:35:55 +0000 (11:35 -0300)]
Add a decoder for Nehalem-specific types
Note: Memory Controller-specific decoding was excluded.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Sat, 18 May 2013 14:20:37 +0000 (11:20 -0300)]
Add a parser for Intel P4/P6 specific CPU error messages
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Sat, 18 May 2013 14:13:07 +0000 (11:13 -0300)]
Add a parser for Intel P4/P6 processors
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Sat, 18 May 2013 09:26:01 +0000 (06:26 -0300)]
mce-intel: Add support to decode MCI/MCA
Like almost all of the mce decoding code, this code came from
Andi Kleen's mcelog application.
While the code added there came from p4.c and nehalem.c, they're
used by all Intel CPUs so far.
Intel CPU-specific code parsing is still not implemented.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Sat, 18 May 2013 08:23:48 +0000 (05:23 -0300)]
mce-intel: simplify code and add a user_action field
While for pure print messages, the user recommended action can be
together with the error message, having it in a separate field
helps to handle the error later. So, split it.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Sat, 18 May 2013 08:15:30 +0000 (05:15 -0300)]
mce-amd-k8: Code cleanups
Instead of doing the error_msg buffer filling logic everywhere,
move it to a common routine.
That cleans up the code a lot, and makes it easier to use the same
code to also handle other *_msg fields in later patches.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Thu, 16 May 2013 14:54:13 +0000 (11:54 -0300)]
mce-intel: add support to decode thermal bank and mcg
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Thu, 16 May 2013 11:16:12 +0000 (08:16 -0300)]
Improve MCE parser for AMD k8
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Wed, 15 May 2013 20:43:32 +0000 (17:43 -0300)]
mce-amd-k8: add status decoding logic
Add the status decoding logic from mcelog's k8.c file.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Wed, 15 May 2013 19:34:49 +0000 (16:34 -0300)]
Add per-cpu-type handlers for MCE log
For now, only the bank information is handled.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Wed, 15 May 2013 18:16:53 +0000 (15:16 -0300)]
Add a basic handler for MCE logs
For now, this handler just detects the CPU type and parses all
fields at the MCE event trace.
Later patches will add decoding capabilities to the event.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Wed, 15 May 2013 11:56:25 +0000 (08:56 -0300)]
ras-events: prepare to handle MCE events
Parsing MCE events is hard, as it requires per-cpu-type parsing.
We can at least get those events and send them to syslog/journald.
So, ask tracing to collect them as well and add a hook for the
future mcelog parsing code.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Wed, 15 May 2013 11:07:08 +0000 (08:07 -0300)]
Add support for PCIe AER events
The code is currently untested, as I'm missing a testing
system where I could inject PCIe AER events.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Wed, 15 May 2013 11:27:06 +0000 (08:27 -0300)]
Fix dummy function arguments when compiled without sqlite3 support
That shuts up a warning.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Wed, 15 May 2013 10:24:56 +0000 (07:24 -0300)]
ras-mc-handler: remove some unused headers
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Fri, 10 May 2013 19:07:15 +0000 (16:07 -0300)]
rasdaemon: Better handle error conditions
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Fri, 10 May 2013 14:35:36 +0000 (11:35 -0300)]
Print cpu number at event records log
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Fri, 10 May 2013 14:28:59 +0000 (11:28 -0300)]
ras-record: retry open if busy
As we'll have several concurrent opens at the same time, we
need to retry if race conditions happen.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Fri, 10 May 2013 14:23:56 +0000 (11:23 -0300)]
ras-events: make the error path do the right thing
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Fri, 10 May 2013 13:49:56 +0000 (10:49 -0300)]
README: add project goals
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Fri, 10 May 2013 13:11:26 +0000 (10:11 -0300)]
Update README
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Thu, 9 May 2013 16:11:26 +0000 (13:11 -0300)]
ras-events: open database on each thread
sqlite3 is only able to prevent race issues between different
threads if each thread opens its own connection to the database.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Wed, 8 May 2013 18:17:03 +0000 (15:17 -0300)]
Update tarball URL
That makes rpmlint happy
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Wed, 8 May 2013 18:14:31 +0000 (15:14 -0300)]
Fix make dist-* targets
Those targets need to know which header files exist.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Wed, 8 May 2013 17:52:05 +0000 (14:52 -0300)]
Add a RPM spec file to build it with rpmbuild
Add a rasdaemon.spec template file useful for compiling it with
Fedora. It may require changes to work with other distributions
that also use rpm files, as each distro has its own rules for
rpms, but at least this file can be used as a reference.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Wed, 8 May 2013 17:35:57 +0000 (14:35 -0300)]
Add a service to register EDAC labels
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Wed, 8 May 2013 15:24:36 +0000 (12:24 -0300)]
Add a manpage for the rasdaemon
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Wed, 8 May 2013 15:12:56 +0000 (12:12 -0300)]
Modify the ras-mc-ctl manpage to reflect the current tool
The tool has changed in several ways since it was
part of edac-utils. Also, a few new options got added there.
Add the missing parts and change it to reflect its new name.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Wed, 8 May 2013 14:30:44 +0000 (11:30 -0300)]
Add a man page for ras-mc-ctl
This is currently the same as edac-utils, but needs to be
re-written.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Wed, 8 May 2013 14:24:43 +0000 (11:24 -0300)]
Parse ras-mc-ctl via autoconf tools
Instead of using fixed directory prefixes, let the build
system provide them, via ./configure.
This uses the very same solution as edac-utils does.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Mauro Carvalho Chehab [Wed, 8 May 2013 11:42:35 +0000 (08:42 -0300)]
Add more autotools stuff into .gitignore
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>