The main repository for the rasdaemon is at Fedora hosted:
- http://git.infradead.org/users/mchehab/rasdaemon.git
+- <http://git.infradead.org/users/mchehab/rasdaemon.git>
And two mirrors are available:
- https://github.com/mchehab/rasdaemon
- https://gitlab.com/mchehab_kernel/rasdaemon
+- <https://github.com/mchehab/rasdaemon>
+- <https://gitlab.com/mchehab_kernel/rasdaemon>
Tarballs for each release can be found at:
- http://www.infradead.org/~mchehab/rasdaemon/
+- <http://www.infradead.org/~mchehab/rasdaemon/>
GOALS
=====
Its initial goal is to replace the edac-tools that got bitrotted after
-the addition of the HERM (Hardware Events Report Method )patches[1] at
+the addition of the HERM (Hardware Events Report Method) patches[^1] at
the EDAC Kernel drivers.
-[1] http://lkml.indiana.edu/hypermail/linux/kernel/1205.1/02075.html
+[^1]: <http://lkml.indiana.edu/hypermail/linux/kernel/1205.1/02075.html>
Its long term goal is to be the userspace tool that will collect all
hardware error events reported by the Linux Kernel from several sources
It is not meant to provide tools for doing error injection, as there are
other tools already covering it, like:
- git://git.kernel.org/pub/scm/linux/kernel/git/gong.chen/mce-test.git
+<git://git.kernel.org/pub/scm/linux/kernel/git/gong.chen/mce-test.git>
Yet, a small set of testing scripts is provided under the /contrib dir.
In other words, the rationale for not exposing such information is that:
- 1) can be easily accounted on userspace;
- 2) they're not really meaningful. E. g. one system with, let's say
-10 corrected errors can be fine, while another one with the same amount
-of errors can have problems, as the error counters don't take into
-account things like system uptime, memory error bursts (that could be
-caused by a solar storm, for example), etc.
+1. can be easily accounted on userspace;
+2. they're not really meaningful. E. g. one system with, let's say
+ 10 corrected errors can be fine, while another one with the same amount
+ of errors can have problems, as the error counters don't take into
+ account things like system uptime, memory error bursts (that could be
+ caused by a solar storm, for example), etc.
So, the idea since then was to make the kernel-userspace interface
simpler and move the policy to the userspace daemon. It is up to the
COMPILING AND INSTALLING
========================
-sqlite3 and autoconf needs to be installed. On Fedora, this is done
-by installing the following packages:
- make
- gcc
- autoconf
- automake
- libtool
- libtraceevent-devel
- tar
- sqlite-devel (if sqlite3 will be used)
- perl-DBD-SQLite (if sqlite3 will be used)
+sqlite3 and autoconf need to be installed. On Fedora, this is done by
+installing the following packages:
+
+```
+ make
+ gcc
+ autoconf
+ automake
+ libtool
+ libtraceevent-devel
+ tar
+ sqlite-devel (if sqlite3 will be used)
+ perl-DBD-SQLite (if sqlite3 will be used)
+```
To install them on Fedora, run:
- dnf install -y make gcc autoconf automake libtool tar perl-dbd-sqlite libtraceevent-devel
-
+```
+ $ dnf install -y make gcc autoconf automake libtool tar perl-dbd-sqlite \
+ libtraceevent-devel
+```
Or, if sqlite3 database will be used to store data:
- dnf install -y make gcc autoconf automake libtool tar sqlite-devel libtraceevent-devel
+
+```
+ $ dnf install -y make gcc autoconf automake libtool tar sqlite-devel \
+ libtraceevent-devel
+```
There are currently 3 features that are enabled optionally, via
./configure parameters:
- --enable-sqlite3 enable storing data at SQL lite database (currently
+```
+ --enable-sqlite3 enable storing data in an SQLite database (currently
experimental)
- --enable-aer enable PCIe AER events (currently experimental)
- --enable-mce enable MCE events (currently experimental)
+ --enable-aer enable PCIe AER events (currently experimental)
+ --enable-mce enable MCE events (currently experimental)
+```
In order to compile it, run:
- $ autoreconf -vfi
- $ ./configure [parameters]
- $ make
+```
+ $ autoreconf -vfi
+ $ ./configure [parameters]
+ $ make
+```
So, for example, to enable everything but sqlite3:
- $ autoreconf -vfi && ./configure --enable-aer --enable-mce && make
+```
+ $ autoreconf -vfi && ./configure --enable-aer --enable-mce && make
+```
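As a minimal sketch of how the optional features map onto `./configure` switches, the helper below builds the argument list from feature names. The function name `configure_args` is hypothetical and not part of rasdaemon; only the three `--enable-*` switches listed above are assumed to exist.

```shell
#!/bin/sh
# Hypothetical helper (not part of rasdaemon): turn feature names into
# the ./configure switches documented above.
configure_args() {
    args=""
    for feature in "$@"; do
        case "$feature" in
            sqlite3|aer|mce)
                # Append, separating with a space only when args is non-empty.
                args="${args:+$args }--enable-$feature" ;;
            *)
                echo "unknown feature: $feature" >&2
                return 1 ;;
        esac
    done
    printf '%s\n' "$args"
}

# Print (rather than run) the command line for "everything but sqlite3".
echo "./configure $(configure_args aer mce)"
```

The sketch only prints the command, so it can be tried safely outside a source tree.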
After compiling, run, as root:
- # make install
+```
+ # make install
+```
-COMPILING AND INSTALLING
-========================
+RPM-based compilation
+=====================
If the distribution is rpm-based, an alternative method would be to do:
- $ autoreconf -vfi && ./configure
+```
+ $ autoreconf -vfi && ./configure
+```
The above procedure will generate a file at misc/rasdaemon.spec.
-You may edit it, in order to add/remove the --enable-[option]
+You may edit it, in order to add/remove the --enable-\[option\]
parameters.
To generate the rpm files, do:
- $ make dist-rpm
+```
+ # make mock
+```
To install the rpm files, run, as root:
- # rpm -i `rpm --eval %{_topdir}`/RPMS/x86_64/rasdaemon-0.*.fc18.x86_64.rpm
+```
+ # rpm -i $(ls SRPMS/rasdaemon-*.rpm|tail -1)
+```
RUNNING
=======
To run the rasdaemon in background, just call it without any parameters:
- # rasdaemon
+```
+ # rasdaemon
+```
The output will be available via syslog. Or, to run it in foreground and see
the logs in console, run it as:
- # rasdaemon -f
+
+```
+ # rasdaemon -f
+```
or, if you also want to record errors in the database (--enable-sqlite3 is
required):
- # rasdaemon -f -r
+
+```
+ # rasdaemon -f -r
+```
You may also start it via systemd:
- # systemctl start rasdaemon
+
+```
+ # systemctl start rasdaemon
+```
The rasdaemon will then output the messages to journald.
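The launch modes described above can be summarised in a small sketch. The function `ras_launch_cmd` and its mode names are hypothetical, chosen only for illustration; the command lines it prints are the ones shown in this section.

```shell
#!/bin/sh
# Hypothetical helper: map a run mode to the command line documented above.
ras_launch_cmd() {
    case "$1" in
        background) echo "rasdaemon" ;;               # logs go to syslog
        foreground) echo "rasdaemon -f" ;;            # logs go to the console
        record)     echo "rasdaemon -f -r" ;;         # also store errors in the database
        systemd)    echo "systemctl start rasdaemon" ;; # logs go to journald
        *)
            echo "usage: ras_launch_cmd background|foreground|record|systemd" >&2
            return 1 ;;
    esac
}

# Show the command for foreground mode with database recording.
ras_launch_cmd record
```

When the daemon is started via systemd, its messages can usually be read with `journalctl -u rasdaemon`, assuming the unit is named `rasdaemon` as in the `systemctl` call above.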
A script is provided under /contrib, in order to test the daemon EDAC
handler. While the daemon is running, just run:
- # contrib/edac-fake-inject
+
+```
+# contrib/edac-fake-inject
+```
The script requires a Kernel compiled with CONFIG_EDAC_DEBUG and a running
EDAC driver.
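Before running the script, it may help to check whether the running kernel was built with the required option. The sketch below assumes the kernel configuration is available at `/boot/config-$(uname -r)`, which is common but distribution-dependent; the helper name `has_kernel_option` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed if OPTION ($2) is set to y or m in the
# kernel config FILE ($1).
has_kernel_option() {
    grep -Eq "^$2=(y|m)\$" "$1"
}

# Assumption: the running kernel's config lives at this path.
cfg="/boot/config-$(uname -r)"
if [ -r "$cfg" ] && has_kernel_option "$cfg" CONFIG_EDAC_DEBUG; then
    echo "CONFIG_EDAC_DEBUG is enabled"
else
    echo "CONFIG_EDAC_DEBUG not found in $cfg (or the file is not readable)"
fi
```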
MCE error handling can use the MCE inject:
- https://git.kernel.org/pub/scm/utils/cpu/mce/mce-inject.git
+<https://git.kernel.org/pub/scm/utils/cpu/mce/mce-inject.git>
+
For it to work, the Kernel mce-inject module should be compiled and loaded.
APEI error injection can use this tool:
- https://git.kernel.org/pub/scm/linux/kernel/git/gong.chen/mce-test.git/
+<https://git.kernel.org/pub/scm/linux/kernel/git/gong.chen/mce-test.git/>
AER error injection can use this tool:
- https://git.kernel.org/pub/scm/linux/kernel/git/gong.chen/aer-inject.git/
+<https://git.kernel.org/pub/scm/linux/kernel/git/gong.chen/aer-inject.git/>
-SUBMITTING PATCHES
-==================
+# SUBMITTING PATCHES
If you want to help improve this tool, be my guest! We try to follow
the Kernel's CodingStyle and submission rules as a reference.
In order to contribute to rasdaemon, please send a pull request via the
GitHub repository at:
- https://github.com/mchehab/rasdaemon
+- <https://github.com/mchehab/rasdaemon>
Or, alternatively, send a merge request against the GitLab repository at:
- https://gitlab.com/mchehab_kernel/rasdaemon
+- <https://gitlab.com/mchehab_kernel/rasdaemon>
-Or send patches enclosed in an email, in plain text, to:
+It is also recommended to send patches to <linux-edac@vger.kernel.org>
+with a copy to:
- linux-edac@vger.kernel.org
-
-With a copy to:
-
- Mauro Carvalho Chehab <mchehab@kernel.org>
+- Mauro Carvalho Chehab \<<mchehab@kernel.org>\>
Please note that GitHub is the preferred way. If you're not using
it, please be kind enough to add an issue there for us to track the
We use Signed-off-by the same way as in the kernel, so I'm transcribing
below the same text as found under Kernel's Documentation/SubmittingPatches:
- "To improve tracking of who did what, especially with patches that can
- percolate to their final resting place in the kernel through several
- layers of maintainers, we've introduced a "sign-off" procedure on
- patches that are being emailed around.
+```
+ "To improve tracking of who did what, especially with patches that can
+ percolate to their final resting place in the kernel through several
+ layers of maintainers, we've introduced a "sign-off" procedure on
+ patches that are being emailed around.
- The sign-off is a simple line at the end of the explanation for the
- patch, which certifies that you wrote it or otherwise have the right to
- pass it on as an open-source patch. The rules are pretty simple: if you
- can certify the below:
+ The sign-off is a simple line at the end of the explanation for the
+ patch, which certifies that you wrote it or otherwise have the right to
+ pass it on as an open-source patch. The rules are pretty simple: if you
+ can certify the below:
- Developer's Certificate of Origin 1.1
+ Developer's Certificate of Origin 1.1
- By making a contribution to this project, I certify that:
+ By making a contribution to this project, I certify that:
- (a) The contribution was created in whole or in part by me and I
- have the right to submit it under the open source license
+ (a) The contribution was created in whole or in part by me and I
+ have the right to submit it under the open source license
indicated in the file; or
- (b) The contribution is based upon previous work that, to the best
- of my knowledge, is covered under an appropriate open source
- license and I have the right under that license to submit that
- work with modifications, whether created in whole or in part
- by me, under the same open source license (unless I am
- permitted to submit under a different license), as indicated
- in the file; or
+ (b) The contribution is based upon previous work that, to the best
+ of my knowledge, is covered under an appropriate open source
+ license and I have the right under that license to submit that
+ work with modifications, whether created in whole or in part
+ by me, under the same open source license (unless I am
+ permitted to submit under a different license), as indicated
+ in the file; or
+
+ (c) The contribution was provided directly to me by some other
+ person who certified (a), (b) or (c) and I have not modified
+ it.
- (c) The contribution was provided directly to me by some other
- person who certified (a), (b) or (c) and I have not modified
- it.
+ (d) I understand and agree that this project and the contribution
+ are public and that a record of the contribution (including all
+ personal information I submit with it, including my sign-off) is
+ maintained indefinitely and may be redistributed consistent with
+ this project or the open source license(s) involved.
- (d) I understand and agree that this project and the contribution
- are public and that a record of the contribution (including all
- personal information I submit with it, including my sign-off) is
- maintained indefinitely and may be redistributed consistent with
- this project or the open source license(s) involved.
+ then you just add a line saying
- then you just add a line saying
+ Signed-off-by: Random J Developer <random@developer.example.org>
- Signed-off-by: Random J Developer <random@developer.example.org>
+ using your real name (sorry, no pseudonyms or anonymous contributions.)"
- using your real name (sorry, no pseudonyms or anonymous contributions.)"
+```
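In practice, `git commit -s` appends the sign-off line automatically, using the name and email from the git configuration. As a rough sketch, a commit message can be checked for such a line before sending; the helper `has_signoff` and its pattern are illustrative assumptions, not a rule enforced by the project.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the commit message read from stdin
# contains a line of the form "Signed-off-by: Name <user@host>".
has_signoff() {
    grep -Eq '^Signed-off-by: .+ <[^<>@ ]+@[^<>@ ]+>$'
}

# Check a sample message that uses the example identity from the text above.
printf 'fix: example\n\nSigned-off-by: Random J Developer <random@developer.example.org>\n' |
    has_signoff && echo "sign-off present"
```

To check the most recent commit in a working tree, the same function can be fed from `git log -1 --format=%B`.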