www.infradead.org Git - users/mchehab/edac.git/log

]> www.infradead.org Git - users/mchehab/edac.git/log

projects / users / mchehab / edac.git / log

Mauro Carvalho Chehab [Thu, 10 May 2012 15:43:01 +0000 (12:43 -0300)]

edac: Increase version to 3.0.0 (aka: "HERM" version)

There were lots of changes introduced to justify renaming it to
3.0.0 (HERM - Hardware Events Report Method):

  - EDAC core were redesigned to represent all types of
    memory controllers;

  - EDAC API were redesigned to properly represent the memory
    controller hierarchy;

  - a tracepoint-based API were added to report memory errors.

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

commit | commitdiff | tree

Mauro Carvalho Chehab [Mon, 30 Apr 2012 13:24:43 +0000 (10:24 -0300)]

edac_mc: Cleanup per-dimm_info debug messages

The edac_mc_alloc() routine allocates one dimm_info device for all
possible memories, including the non-filled ones. The debug messages
there are somewhat confusing. So, cleans them, by moving the code
that prints the memory location to edac_mc, and using it on both
edac_mc_sysfs and edac_mc.

Also, only dumps information when DIMM/ranks are actually
filled.

After this patch, a dimm-based memory controller will print the debug
info as:

[ 1011.380027] EDAC DEBUG: edac_mc_dump_csrow: csrow->csrow_idx = 0
[ 1011.380029] EDAC DEBUG: edac_mc_dump_csrow:   csrow = ffff8801169be000
[ 1011.380031] EDAC DEBUG: edac_mc_dump_csrow:   csrow->first_page = 0x0
[ 1011.380032] EDAC DEBUG: edac_mc_dump_csrow:   csrow->last_page = 0x0
[ 1011.380034] EDAC DEBUG: edac_mc_dump_csrow:   csrow->page_mask = 0x0
[ 1011.380035] EDAC DEBUG: edac_mc_dump_csrow:   csrow->nr_channels = 3
[ 1011.380037] EDAC DEBUG: edac_mc_dump_csrow:   csrow->channels = ffff8801149c2840
[ 1011.380039] EDAC DEBUG: edac_mc_dump_csrow:   csrow->mci = ffff880117426000
[ 1011.380041] EDAC DEBUG: edac_mc_dump_channel:   channel->chan_idx = 0
[ 1011.380042] EDAC DEBUG: edac_mc_dump_channel:     channel = ffff8801149c2860
[ 1011.380044] EDAC DEBUG: edac_mc_dump_channel:     channel->csrow = ffff8801169be000
[ 1011.380046] EDAC DEBUG: edac_mc_dump_channel:     channel->dimm = ffff88010fe90400
...
[ 1011.380095] EDAC DEBUG: edac_mc_dump_dimm: dimm0: channel 0 slot 0 mapped as virtual row 0, chan 0
[ 1011.380097] EDAC DEBUG: edac_mc_dump_dimm:   dimm = ffff88010fe90400
[ 1011.380099] EDAC DEBUG: edac_mc_dump_dimm:   dimm->label = 'CPU#0Channel#0_DIMM#0'
[ 1011.380101] EDAC DEBUG: edac_mc_dump_dimm:   dimm->nr_pages = 0x40000
[ 1011.380103] EDAC DEBUG: edac_mc_dump_dimm:   dimm->grain = 8
[ 1011.380104] EDAC DEBUG: edac_mc_dump_dimm:   dimm->nr_pages = 0x40000
...

(a rank-based memory controller would print, instead of "dimm?", "rank?"
on the above debug info)

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

commit | commitdiff | tree

Joe Perches [Sun, 29 Apr 2012 20:08:39 +0000 (17:08 -0300)]

edac: Convert debugfX to edac_dbg(X,

Use a more common debugging style.

Remove __FILE__ uses, add missing newlines,
coalesce formats and align arguments.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

commit | commitdiff | tree

Joe Perches [Sat, 28 Apr 2012 19:41:46 +0000 (16:41 -0300)]

edac: Use more normal debugging macro style

Convert macros to a simpler style and enforce appropriate
format checking when not CONFIG_EDAC_DEBUG.

Use fmt and __VA_ARGS__, neaten macros.

Move some string arrays to the debugfx uses and remove the
now unnecessary CONFIG_EDAC_DEBUG variable block definitions.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

commit | commitdiff | tree

Mauro Carvalho Chehab [Sun, 29 Apr 2012 14:59:14 +0000 (11:59 -0300)]

edac: Don't add __func__ or __FILE__ for debugf[0-9] msgs

The debug macro already adds that. Most of the work here was
made by this small script:

$f .=$_ while (<>);

$f =~ s/(debugf[0-9]\s*$\s*)__FILE__\s*": /\1"/g;
$f =~ s/(debugf[0-9]\s*\(\s*)__FILE__\s*/\1/g;
$f =~ s/(debugf[0-9]\s*\(\s*)__FILE__\s*"MC: /\1"/g;

$f =~ s/(debugf[0-9]\s*\(\")\%s[\:\,\($]*\s*([^\"]*\s*[^\)]+)__func__\s*\,\s*/\1\2/g;
$f =~ s/(debugf[0-9]\s*$\")\%s[\:\,\($]*\s*([^\"]*\s*[^\)]+),\s*__func__\s*\)/\1\2)/g;
$f =~ s/(debugf[0-9]\s*$\"MC\:\s*)\%s[\:\,\($]*\s*([^\"]*\s*[^\)]+)__func__\s*\,\s*/\1\2/g;
$f =~ s/(debugf[0-9]\s*$\"MC\:\s*)\%s[\:\,\($]*\s*([^\"]*\s*[^\)]+),\s*__func__\s*\)/\1\2)/g;

$f =~ s/\"MC\: \\n\"/"MC:\\n"/g;

print $f;

After running the script, manual cleanups were done to fix it the remaining
places.

While here, removed the __LINE__ on most places, as it doesn't actually give
useful info on most places.

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

commit | commitdiff | tree

Mauro Carvalho Chehab [Thu, 26 Apr 2012 14:47:29 +0000 (11:47 -0300)]

i7core: fix ranks information at the per-channel struct

There is a flag at the per-channel struct that indicates if there are
any 4R dimm on it. The way the presence of this flag were reported
is not ok, as it might give the false idea that the channel were filled
with 2R memories:

[ 580.588701] EDAC DEBUG: get_dimm_config: Ch1 phy rd1, wr1 (0x063f7431): 2 ranks, UDIMMs
[ 580.588704] EDAC DEBUG: get_dimm_config: dimm 0 1024 Mb offset: 0, bank: 8, rank: 1, row: 0x4000, col: 0x400

(in this case, just one 1R memory is filled on channel 1)

So, use a better way to represent the per-channel ranks information.
After the patch, it will show:

[ 2002.233978] EDAC DEBUG: get_dimm_config: Ch0 phy rd0, wr0 (0x063f7431): UDIMMs
[ 2002.233982] EDAC DEBUG: get_dimm_config: dimm 0 1024 Mb offset: 0, bank: 8, rank: 1, row: 0x4000, col: 0x400
[ 2002.233988] EDAC DEBUG: get_dimm_config: dimm 1 1024 Mb offset: 4, bank: 8, rank: 1, row: 0x4000, col: 0x400

(in this case, there isn't any 4R memories)

Reported-by: Borislav Petkov <borislav.petkov@amd.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

commit | commitdiff | tree

Mauro Carvalho Chehab [Wed, 25 Apr 2012 14:47:36 +0000 (11:47 -0300)]

i5000: Fix the fatal error handling

The fatal error channel bits point to a single channel, and not
to a range of channels. Fix the code to properly report it,
instead of printing messages like:
kernel: EDAC MC0: INTERNAL ERROR: channel-b out of range (4 >= 4)

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

commit | commitdiff | tree

Mauro Carvalho Chehab [Tue, 17 Apr 2012 14:13:10 +0000 (11:13 -0300)]

Edac: Add ABI Documentation for the new device nodes

The EDAC ABI were extended to add support for per-DIMM or per-rank
information and silkscreen labels. Properly document them.

Most of the comments there came from edac.txt descriptions of the
fields that are part of the legacy csrowX ABI (e. g.
/sys/devices/system/edac/mc/mc*/csrow*/*).

Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

commit | commitdiff | tree

Mauro Carvalho Chehab [Tue, 17 Apr 2012 11:53:34 +0000 (08:53 -0300)]

edac: move documentation ABI to ABI/testing/sysfs-devices-edac

The EDAC MC API is currently stored at the wrong place. Move the
parts of the EDAC MC ABI that will be kept to
ABI/testing/sysfs-devices-edac.

The Date: field were added based on git timestamps for the git
commit patches that added the functionality at edac.txt.

Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

commit | commitdiff | tree

Mauro Carvalho Chehab [Fri, 30 Mar 2012 19:10:51 +0000 (16:10 -0300)]

i7core_edac: change the mem allocation scheme to make Documentation/kobject.txt happy

Kernel kobjects have rigid rules: each container object should be
dynamically allocated, and can't be allocated into a single kmalloc.

EDAC never obeyed this rule: it has a single malloc function that
allocates all needed data into a single kzalloc.

As this is not accepted anymore, change the allocation schema of the
EDAC *_info structs to enforce this Kernel standard.

Cc: Aristeu Rozanski <arozansk@redhat.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

commit | commitdiff | tree

Mauro Carvalho Chehab [Tue, 24 Apr 2012 18:05:43 +0000 (15:05 -0300)]

edac: change the mem allocation scheme to make Documentation/kobject.txt happy

Kernel kobjects have rigid rules: each container object should be
dynamically allocated, and can't be allocated into a single kmalloc.

EDAC never obeyed this rule: it has a single malloc function that
allocates all needed data into a single kzalloc.

As this is not accepted anymore, change the allocation schema of the
EDAC *_info structs to enforce this Kernel standard.

Acked-by: Chris Metcalf <cmetcalf@tilera.com>
Cc: Aristeu Rozanski <arozansk@redhat.com>
Cc: Doug Thompson <norsk5@yahoo.com>
Cc: Greg K H <gregkh@linuxfoundation.org>
Cc: Borislav Petkov <borislav.petkov@amd.com>
Cc: Mark Gross <mark.gross@intel.com>
Cc: Tim Small <tim@buttersideup.com>
Cc: Ranganathan Desikan <ravi@jetztechnologies.com>
Cc: "Arvind R." <arvino55@gmail.com>
Cc: Olof Johansson <olof@lixom.net>
Cc: Egor Martovetsky <egor@pasemi.com>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Hitoshi Mitake <h.mitake@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Shaohui Xie <Shaohui.Xie@freescale.com>
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

commit | commitdiff | tree

Mauro Carvalho Chehab [Thu, 29 Mar 2012 15:20:22 +0000 (12:20 -0300)]

edac: Only expose csrows/channels on legacy API if they're populated

This patch actually fixes a bug with the legacy API, where, at the
same csrow, some channels may have different DIMMs. This can happen
on FB-DIMM/RAMBUS and modern Intel controllers.

This is the case, for example, of Nehalem machines:

$ ./edac-ctl --layout
       +-----------------------------------+
       |                mc0                |
       | channel0  | channel1  | channel2  |
-------+-----------------------------------+
slot2: |     0 MB  |     0 MB  |     0 MB  |
slot1: |  1024 MB  |     0 MB  |     0 MB  |
slot0: |  1024 MB  |  1024 MB  |  1024 MB  |
-------+-----------------------------------+

Before this patch, non-filled memories were shown. Now, only what's
filled is there:

grep . /sys/devices/system/edac/mc/mc0/csrow*/ch?*
/sys/devices/system/edac/mc/mc0/csrow0/ch0_ce_count:0
/sys/devices/system/edac/mc/mc0/csrow0/ch0_dimm_label:CPU#0Channel#0_DIMM#0
/sys/devices/system/edac/mc/mc0/csrow0/ch1_ce_count:0
/sys/devices/system/edac/mc/mc0/csrow0/ch1_dimm_label:CPU#0Channel#0_DIMM#1
/sys/devices/system/edac/mc/mc0/csrow1/ch0_ce_count:0
/sys/devices/system/edac/mc/mc0/csrow1/ch0_dimm_label:CPU#0Channel#1_DIMM#0
/sys/devices/system/edac/mc/mc0/csrow2/ch0_ce_count:0
/sys/devices/system/edac/mc/mc0/csrow2/ch0_dimm_label:CPU#0Channel#2_DIMM#0

Thanks-to: Aristeu Rozanski Filho <arozansk@redhat.com>
Reviewed-by: Aristeu Rozanski <arozansk@redhat.com>
Cc: Doug Thompson <norsk5@yahoo.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

commit | commitdiff | tree

Mauro Carvalho Chehab [Thu, 29 Mar 2012 13:41:38 +0000 (10:41 -0300)]

i7300_edac: Get rid of some wrongly-solved rebase conflict

Drivers should not increment tot_dimms manually. This code were
there for an older approach that got removed. After one of the
several rebases that this code suffered until the final version,
the conflict was wrongly merged, and this code returned.

Reviewed-by: Aristeu Rozanski <arozansk@redhat.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

commit | commitdiff | tree

Mauro Carvalho Chehab [Thu, 29 Mar 2012 11:41:08 +0000 (08:41 -0300)]

i5100_edac: Fix a warning when compiled with 32 bits

drivers/edac/i5100_edac.c: In function ‘i5100_init_csrows’:
drivers/edac/i5100_edac.c:862:3: warning: format ‘%zd’ expects argument of type ‘signed size_t’, but argument 5 has type ‘long unsigned int’ [-Wformat]

Reviewed-by: Aristeu Rozanski <arozansk@redhat.com>
Cc: "Niklas Söderlund" <niklas.soderlund@ericsson.com>
Cc: Borislav Petkov <borislav.petkov@amd.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

commit | commitdiff | tree

Mauro Carvalho Chehab [Wed, 28 Mar 2012 23:13:36 +0000 (20:13 -0300)]

i82975x_edac: Test nr_pages earlier to save a few CPU cycles

Avoid test nr_pages twice, and initializing some data that won't
be used.

Cleanup patch only.

Reported-by: Aristeu Rozanski Filho <arozansk@redhat.com>
Reviewed-by: Aristeu Rozanski <arozansk@redhat.com>
Cc: Ranganathan Desikan <ravi@jetztechnologies.com>
Cc: "Arvind R." <arvino55@gmail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

commit | commitdiff | tree

Mauro Carvalho Chehab [Wed, 28 Mar 2012 22:37:59 +0000 (19:37 -0300)]

edac: Move grain/dtype/edac_type calculus to be out of channel loop

The 3e7bddc changeset (edac: move dimm properties to struct memset_info)
moved the calculus inside a loop. However, at those stuff are common to
all channels, on several drivers, it is better to put the calculus
outside the loop, to optimize the code.

Reported-by: Aristeu Rozanski Filho <arozansk@redhat.com>
Reviewed-by: Aristeu Rozanski <arozansk@redhat.com>
Cc: Mark Gross <mark.gross@intel.com>
Cc: Doug Thompson <norsk5@yahoo.com>
Cc: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Michal Marek <mmarek@suse.cz>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

commit | commitdiff | tree

Mauro Carvalho Chehab [Mon, 26 Mar 2012 12:35:11 +0000 (09:35 -0300)]

edac: Add debufs nodes to allow doing fake error inject

Sometimes, it is useful to have a mechanism that generates fake
errors, in order to test the EDAC core code, and the userspace
tools.

Provide such mechanism by adding a few debugfs nodes.

Reviewed-by: Aristeu Rozanski <arozansk@redhat.com>
Cc: Doug Thompson <norsk5@yahoo.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

commit | commitdiff | tree

Mauro Carvalho Chehab [Wed, 21 Mar 2012 20:13:24 +0000 (17:13 -0300)]

edac: add a sysfs node to report the maximum location for the system

The userspace tools need to know what's the maximum location on each
system, as it helps to create nice maps showing how the memory was
filled at the system.

Reviewed-by: Aristeu Rozanski <arozansk@redhat.com>
Cc: Doug Thompson <norsk5@yahoo.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

commit | commitdiff | tree

Mauro Carvalho Chehab [Wed, 21 Mar 2012 20:06:53 +0000 (17:06 -0300)]

edac: add a new per-dimm API and make the old per-virtual-rank API obsolete

The old EDAC API is broken. It only works fine for systems manufatured
before 2005 and for AMD 64. The reason is that it forces all memory
controller drivers to discover rank info.

Also, it doesn't allow grouping the several ranks into a DIMM.

So, what almost all modern drivers do is to create a fake virtual-rank
information, and use it to cheat the EDAC core to accept the driver.

While this works if the user has enough time to discover what DIMM slot
corresponds to each "virtual-rank" information, it prevents EDAC usage
for users with less available time. It also makes life hard for vendors
that may want to provide a table with their motherboards to the userspace
tool (edac-utils) as each driver has its own logic for the virtual
mapping.

So, the old API should be removed, in favor of a more flexible API that
allows newer drivers to not lie to the EDAC core.

Reviewed-by: Aristeu Rozanski <arozansk@redhat.com>
Cc: Doug Thompson <norsk5@yahoo.com>
Cc: Borislav Petkov <borislav.petkov@amd.com>
Cc: Randy Dunlap <rdunlap@xenotime.net>
Cc: Josh Boyer <jwboyer@redhat.com>
Cc: Hui Wang <jason77.wang@gmail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

commit | commitdiff | tree

Mauro Carvalho Chehab [Wed, 21 Mar 2012 19:55:02 +0000 (16:55 -0300)]

edac: Get rid of the old kobj's from the edac mc code

Now that al users for the old kobj raw access are gone,
we can get rid of the legacy kobj-based structures and
data.

Reviewed-by: Aristeu Rozanski <arozansk@redhat.com>
Cc: Doug Thompson <norsk5@yahoo.com>
Cc: Michal Marek <mmarek@suse.cz>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

commit | commitdiff | tree

Mauro Carvalho Chehab [Wed, 21 Mar 2012 14:08:06 +0000 (11:08 -0300)]

i7core_edac: convert it to use struct device

Instead of relying on a complex logic inside the edac core to create
a "device tree-like" sysfs struct, just use device_add.

Reviewed-by: Aristeu Rozanski <arozansk@redhat.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

commit | commitdiff | tree

Mauro Carvalho Chehab [Wed, 21 Mar 2012 17:00:44 +0000 (14:00 -0300)]

amd64_edac: convert sysfs logic to use struct device

Now that the EDAC core supports struct device, there's no sense
on having any logic at the EDAC core to simulate it. So, instead
of adding such logic there, change the logic at amd64_edac to
use it.

Reviewed-by: Aristeu Rozanski <arozansk@redhat.com>
Cc: Doug Thompson <norsk5@yahoo.com>
Cc: Borislav Petkov <borislav.petkov@amd.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

commit | commitdiff | tree

Mauro Carvalho Chehab [Wed, 21 Mar 2012 18:16:20 +0000 (15:16 -0300)]

mpc85xx_edac: convert sysfs logic to use struct device

Now that the EDAC core supports struct device, there's no sense on
having any logic at the EDAC core to simulate it. So, instead of adding
such logic there, change the logic at mpc85xx_edac to use it

compile-tested only.

Reviewed-by: Aristeu Rozanski <arozansk@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Shaohui Xie <Shaohui.Xie@freescale.com>
Cc: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

commit | commitdiff | tree

Mauro Carvalho Chehab [Mon, 16 Apr 2012 19:41:11 +0000 (16:41 -0300)]

edac: rewrite the sysfs code to use struct device

The EDAC subsystem uses the old struct sysdev approach,
creating all nodes using the raw sysfs API. This is bad,
as the API is deprecated.

As we'll be changing the EDAC API, let's first port the existing
code to struct device.

There's one drawback on this patch: driver-specific sysfs
nodes, used by mpc85xx_edac, amd64_edac and i7core_edac
won't be created anymore. While it would be possible to
also port the device-specific code, that would mix kobj with
struct device, with is not recommended. Also, it is easier and nicer
to move the code to the drivers, instead, as the core can get rid
of some complex logic that just emulates what the device_add()
and device_create_file() already does.

The next patches will convert the driver-specific code to use
the device-specific calls. Then, the remaining bits of the old
sysfs API will be removed.

NOTE: a per-MC bus is required, otherwise devices with more than
one memory controller will hit a bug like the one below:

[  819.094946] EDAC DEBUG: find_mci_by_dev: find_mci_by_dev()
[  819.094948] EDAC DEBUG: edac_create_sysfs_mci_device: edac_create_sysfs_mci_device() idx=1
[  819.094952] EDAC DEBUG: edac_create_sysfs_mci_device: edac_create_sysfs_mci_device(): creating device mc1
[  819.094967] EDAC DEBUG: edac_create_sysfs_mci_device: edac_create_sysfs_mci_device creating dimm0, located at channel 0 slot 0
[  819.094984] ------------[ cut here ]------------
[  819.100142] WARNING: at fs/sysfs/dir.c:481 sysfs_add_one+0xc1/0xf0()
[  819.107282] Hardware name: S2600CP
[  819.111078] sysfs: cannot create duplicate filename '/bus/edac/devices/dimm0'
[  819.119062] Modules linked in: sb_edac(+) edac_core ip6table_filter ip6_tables ebtable_nat ebtables ipt_MASQUERADE iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack ipt_REJECT xt_CHECKSUM iptable_mangle iptable_filter ip_tables bridge stp llc sunrpc binfmt_misc dm_mirror dm_region_hash dm_log vhost_net macvtap macvlan tun kvm microcode pcspkr iTCO_wdt iTCO_vendor_support igb i2c_i801 i2c_core sg ioatdma dca sr_mod cdrom sd_mod crc_t10dif ahci libahci isci libsas libata scsi_transport_sas scsi_mod wmi dm_mod [last unloaded: scsi_wait_scan]
[  819.175748] Pid: 10902, comm: modprobe Not tainted 3.3.0-0.11.el7.v12.2.x86_64 #1
[  819.184113] Call Trace:
[  819.186868]  [<ffffffff8105adaf>] warn_slowpath_common+0x7f/0xc0
[  819.193573]  [<ffffffff8105aea6>] warn_slowpath_fmt+0x46/0x50
[  819.200000]  [<ffffffff811f53d1>] sysfs_add_one+0xc1/0xf0
[  819.206025]  [<ffffffff811f5cf5>] sysfs_do_create_link+0x135/0x220
[  819.212944]  [<ffffffff811f7023>] ? sysfs_create_group+0x13/0x20
[  819.219656]  [<ffffffff811f5df3>] sysfs_create_link+0x13/0x20
[  819.226109]  [<ffffffff813b04f6>] bus_add_device+0xe6/0x1b0
[  819.232350]  [<ffffffff813ae7cb>] device_add+0x2db/0x460
[  819.238300]  [<ffffffffa0325634>] edac_create_dimm_object+0x84/0xf0 [edac_core]
[  819.246460]  [<ffffffffa0325e18>] edac_create_sysfs_mci_device+0xe8/0x290 [edac_core]
[  819.255215]  [<ffffffffa0322e2a>] edac_mc_add_mc+0x5a/0x2c0 [edac_core]
[  819.262611]  [<ffffffffa03412df>] sbridge_register_mci+0x1bc/0x279 [sb_edac]
[  819.270493]  [<ffffffffa03417a3>] sbridge_probe+0xef/0x175 [sb_edac]
[  819.277630]  [<ffffffff813ba4e8>] ? pm_runtime_enable+0x58/0x90
[  819.284268]  [<ffffffff812f430c>] local_pci_probe+0x5c/0xd0
[  819.290508]  [<ffffffff812f5ba1>] __pci_device_probe+0xf1/0x100
[  819.297117]  [<ffffffff812f5bea>] pci_device_probe+0x3a/0x60
[  819.303457]  [<ffffffff813b1003>] really_probe+0x73/0x270
[  819.309496]  [<ffffffff813b138e>] driver_probe_device+0x4e/0xb0
[  819.316104]  [<ffffffff813b149b>] __driver_attach+0xab/0xb0
[  819.322337]  [<ffffffff813b13f0>] ? driver_probe_device+0xb0/0xb0
[  819.329151]  [<ffffffff813af5d6>] bus_for_each_dev+0x56/0x90
[  819.335489]  [<ffffffff813b0d7e>] driver_attach+0x1e/0x20
[  819.341534]  [<ffffffff813b0980>] bus_add_driver+0x1b0/0x2a0
[  819.347884]  [<ffffffffa0347000>] ? 0xffffffffa0346fff
[  819.353641]  [<ffffffff813b19f6>] driver_register+0x76/0x140
[  819.359980]  [<ffffffff8159f18b>] ? printk+0x51/0x53
[  819.365524]  [<ffffffffa0347000>] ? 0xffffffffa0346fff
[  819.371291]  [<ffffffff812f5896>] __pci_register_driver+0x56/0xd0
[  819.378096]  [<ffffffffa0347054>] sbridge_init+0x54/0x1000 [sb_edac]
[  819.385231]  [<ffffffff8100203f>] do_one_initcall+0x3f/0x170
[  819.391577]  [<ffffffff810bcd2e>] sys_init_module+0xbe/0x230
[  819.397926]  [<ffffffff815bb529>] system_call_fastpath+0x16/0x1b
[  819.404633] ---[ end trace 1654fdd39556689f ]---

This happens because the bus is not being properly initialized.
Instead of putting the memory sub-devices inside the memory controller,
it is putting everything under the same directory:

$ tree /sys/bus/edac/
/sys/bus/edac/
├── devices
│   ├── all_channel_counts -> ../../../devices/system/edac/mc/mc0/all_channel_counts
│   ├── csrow0 -> ../../../devices/system/edac/mc/mc0/csrow0
│   ├── csrow1 -> ../../../devices/system/edac/mc/mc0/csrow1
│   ├── csrow2 -> ../../../devices/system/edac/mc/mc0/csrow2
│   ├── dimm0 -> ../../../devices/system/edac/mc/mc0/dimm0
│   ├── dimm1 -> ../../../devices/system/edac/mc/mc0/dimm1
│   ├── dimm3 -> ../../../devices/system/edac/mc/mc0/dimm3
│   ├── dimm6 -> ../../../devices/system/edac/mc/mc0/dimm6
│   ├── inject_addrmatch -> ../../../devices/system/edac/mc/mc0/inject_addrmatch
│   ├── mc -> ../../../devices/system/edac/mc
│   └── mc0 -> ../../../devices/system/edac/mc/mc0
├── drivers
├── drivers_autoprobe
├── drivers_probe
└── uevent

On a multi-memory controller system, the names "csrow%d" and "dimm%d"
should be under "mc%d", and not at the main hierarchy level.

So, we need to create a per-MC bus, in order to have its own namespace.

Reviewed-by: Aristeu Rozanski <arozansk@redhat.com>
Cc: Doug Thompson <norsk5@yahoo.com>
Cc: Greg K H <gregkh@linuxfoundation.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

commit | commitdiff | tree

Mauro Carvalho Chehab [Wed, 21 Mar 2012 19:21:07 +0000 (16:21 -0300)]

edac: use Documentation-nano format for some data structs

No functional changes. Just comment improvements.

Reviewed-by: Aristeu Rozanski <arozansk@redhat.com>
Cc: Doug Thompson <norsk5@yahoo.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

commit | commitdiff | tree

Mauro Carvalho Chehab [Fri, 16 Mar 2012 10:44:18 +0000 (07:44 -0300)]

edac: Rename the parent dev to pdev

As EDAC doesn't use struct device itself, it created a parent dev
pointer called as "pdev". Now that we'll be converting it to use
struct device, instead of struct devsys, this needs to be fixed.

No functional changes.

Reviewed-by: Aristeu Rozanski <arozansk@redhat.com>
Acked-by: Chris Metcalf <cmetcalf@tilera.com>
Cc: Doug Thompson <norsk5@yahoo.com>
Cc: Borislav Petkov <borislav.petkov@amd.com>
Cc: Mark Gross <mark.gross@intel.com>
Cc: Jason Uhlenkott <juhlenko@akamai.com>
Cc: Tim Small <tim@buttersideup.com>
Cc: Ranganathan Desikan <ravi@jetztechnologies.com>
Cc: "Arvind R." <arvino55@gmail.com>
Cc: Olof Johansson <olof@lixom.net>
Cc: Egor Martovetsky <egor@pasemi.com>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Joe Perches <joe@perches.com>
Cc: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Hitoshi Mitake <h.mitake@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: "Niklas Söderlund" <niklas.soderlund@ericsson.com>
Cc: Shaohui Xie <Shaohui.Xie@freescale.com>
Cc: Josh Boyer <jwboyer@gmail.com>
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

commit | commitdiff | tree

Mauro Carvalho Chehab [Thu, 15 Mar 2012 16:41:17 +0000 (13:41 -0300)]

e752x_edac: provide more info about how DIMMS/ranks are mapped

No funtional changes here. Only the comments got updated.

Reviewed-by: Aristeu Rozanski <arozansk@redhat.com>
Cc: Mark Gross <mark.gross@intel.com>
Cc: Doug Thompson <norsk5@yahoo.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

commit | commitdiff | tree

Mauro Carvalho Chehab [Fri, 24 Feb 2012 12:34:54 +0000 (09:34 -0300)]

i5000_edac: Fix the logic that retrieves memory information

The logic there is broken: it basically creates two csrows for
each DIMM and assumes that all DIMM's are dual rank. Only one of
the csrows will contain the entire DIMM size. If single rank
memories are found, they'll be marked with 0 bytes.

The check if the AMB is present were also wrong.

Yet, as the error reports don't use the memory size in order to
credit an error to the right DIMM, that part of the driver seems
to work. That's why probably nobody detected the issue yet.

After this patch, the memory layout is now properly reported,
when debug mode is enabled, and the number of ranks per dimm is
now shown:

calculate_dimm_size: ----------------------------------------------------------
calculate_dimm_size: slot  3       0 MB   |    0 MB   |    0 MB   |    0 MB   |
calculate_dimm_size: slot  2       0 MB   |    0 MB   |    0 MB   |    0 MB   |
calculate_dimm_size: ----------------------------------------------------------
calculate_dimm_size: slot  1       0 MB   |    0 MB   |    0 MB   |    0 MB   |
calculate_dimm_size: slot  0     512 MB 1R|  512 MB 1R|  512 MB 1R|  512 MB 1R|
calculate_dimm_size: ----------------------------------------------------------
calculate_dimm_size:            channel 0 | channel 1 | channel 2 | channel 3 |
calculate_dimm_size:                   branch 0       |        branch 1       |

(1R above means that all memories on my test machine are single-ranked)

Reviewed-by: Aristeu Rozanski <arozansk@redhat.com>
Cc: Doug Thompson <norsk5@yahoo.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

commit | commitdiff | tree

Mauro Carvalho Chehab [Thu, 23 Feb 2012 11:10:34 +0000 (08:10 -0300)]

RAS: Add a tracepoint for reporting memory controller events

Add a new tracepoint-based hardware events report method for
reporting Memory Controller events.

Part of the description bellow is shamelessly copied from Tony
Luck's notes about the Hardware Error BoF during LPC 2010 [1].
Tony, thanks for your notes and discussions to generate the
h/w error reporting requirements.

[1] http://lwn.net/Articles/416669/

    We have several subsystems & methods for reporting hardware errors:

    1) EDAC ("Error Detection and Correction").  In its original form
    this consisted of a platform specific driver that read topology
    information and error counts from chipset registers and reported
    the results via a sysfs interface.

    2) mcelog - x86 specific decoding of machine check bank registers
    reporting in binary form via /dev/mcelog. Recent additions make use
    of the APEI extensions that were documented in version 4.0a of the
    ACPI specification to acquire more information about errors without
    having to rely reading chipset registers directly. A user level
    programs decodes into somewhat human readable format.

    3) drivers/edac/mce_amd.c - this driver hooks into the mcelog path and
    decodes errors reported via machine check bank registers in AMD
    processors to the console log using printk();

    Each of these mechanisms has a band of followers ... and none
    of them appear to meet all the needs of all users.

As part of a RAS subsystem, let's encapsulate the memory error hardware
events into a trace facility.

The tracepoint printk will be displayed like:

mc_event: (Corrected|Uncorrected|Fatal) error:[error msg] on memory stick "[label]" ([location] [edac_mc detail] [driver_detail])

Where:
[error msg] is the driver-specific error message
    (e. g. "memory read", "bus error", ...);
[location] is the location in terms of memory controller and
   branch/channel/slot, channel/slot or csrow/channel;
[label] is the memory stick label;
[edac_mc detail] describes the address location of the error
and the syndrome;
[driver detail] is driver-specifig error message details,
when needed/provided (e. g. "area:DMA", ...)

For example:

mc_event: Corrected error:memory read on memory stick "DIMM_1A" (mc:0 channel:0 slot:0 page:0x586b6e offset:0xa66 grain:32 syndrome:0x0 area:DMA)

Of course, any userspace tools meant to handle errors should not parse
the above data. They should, instead, use the binary fields provided by
the tracepoint, mapping them directly into their MIBs.

NOTE: The original patch was providing an additional mechanism for
MCA-based trace events that also contained MCA error register data.
Hoever, as no agreement was reached so far for the MCA-based trace
events, for now, let's add events only for memory errors.
A latter patch is planned to change the tracepoint, for those types
of event.

Reviewed-by: Aristeu Rozanski <arozansk@redhat.com>
Cc: Doug Thompson <norsk5@yahoo.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

commit | commitdiff | tree

Mauro Carvalho Chehab [Sun, 12 Feb 2012 20:18:06 +0000 (17:18 -0300)]

i5400_edac: improve debug messages to better represent the filled memory

Improves the debug output message, in order to better represent the
memory controller hierarchy, when outputing the debug messages.

No functional changes when debug is disabled.

Reviewed-by: Aristeu Rozanski <arozansk@redhat.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

commit | commitdiff | tree

Mauro Carvalho Chehab [Fri, 11 May 2012 14:41:45 +0000 (11:41 -0300)]

edac: Cleanup the logs for i7core and sb edac drivers

Remove some information that it is duplicated at the MCE log,
and don't have much usage for the error. Those data will be
added again, when creating a trace function that outputs both
memory errors and MCE fields.

Cc: Aristeu Rozanski <arozansk@redhat.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

commit | commitdiff | tree

Mauro Carvalho Chehab [Thu, 9 Feb 2012 14:05:20 +0000 (11:05 -0300)]

edac: Initialize the dimm label with the known information

While userspace doesn't fill the dimm labels, add there the dimm location,
as described by the used memory model. This could eventually match what
is described at the dmidecode, making easier for people to identify the
memory.

For example, on an Intel motherboard, the memory is described as:

Memory Device
Array Handle: 0x0029
Error Information Handle: Not Provided
Total Width: 64 bits
Data Width: 64 bits
Size: 2048 MB
Form Factor: DIMM
Set: 1
Locator: A1_DIMM0
Bank Locator: A1_Node0_Channel0_Dimm0
Type: <OUT OF SPEC>
Type Detail: Synchronous
Speed: 800 MHz
Manufacturer: A1_Manufacturer0
Serial Number: A1_SerNum0
Asset Tag: A1_AssetTagNum0
Part Number: A1_PartNum0

After this patch, the memory label will be filled with:
/sys/devices/system/edac/mc/mc0/dimm0/dimm_label:mc#0channel#0slot#0

With somewhat matches what it is at the Bank Locator DMI information.
So, it is easier to associate the dimm labels, of course assuming that
the DMI has the Bank Locator filled, and the BIOS doesn't have any bugs.

Yet, even without it, several motherboards are provided with enough
info to map from channel/slot (or branch/channel/slot) into the DIMM
label. So, letting the EDAC core fill it, by default is a good thing.

It should noticed that, as the label filling happens at the
edac_mc_alloc(), drivers can override it to better describe the memories
(and some actually do it).

Cc: Aristeu Rozanski <arozansk@redhat.com>
Cc: Doug Thompson <norsk5@yahoo.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

commit | commitdiff | tree

Mauro Carvalho Chehab [Wed, 2 May 2012 17:37:00 +0000 (14:37 -0300)]

edac: Remove the legacy EDAC ABI

Now that all drivers got converted to use the new ABI, we can
drop the old one.

Acked-by: Chris Metcalf <cmetcalf@tilera.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

commit | commitdiff | tree

Mauro Carvalho Chehab [Mon, 16 Apr 2012 18:12:35 +0000 (15:12 -0300)]

x38_edac: convert driver to use the new edac ABI

The legacy edac ABI is going to be removed. Port the driver to use
and benefit from the new API functionality.

Cc: Borislav Petkov <borislav.petkov@amd.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

commit | commitdiff | tree

Mauro Carvalho Chehab [Mon, 16 Apr 2012 18:12:28 +0000 (15:12 -0300)]

tile_edac: convert driver to use the new edac ABI

The legacy edac ABI is going to be removed. Port the driver to use
and benefit from the new API functionality.

Acked-by: Chris Metcalf <cmetcalf@tilera.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

commit | commitdiff | tree

Mauro Carvalho Chehab [Mon, 16 Apr 2012 18:12:22 +0000 (15:12 -0300)]

sb_edac: convert driver to use the new edac ABI

The legacy edac ABI is going to be removed. Port the driver to use
and benefit from the new API functionality.

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

commit | commitdiff | tree

Mauro Carvalho Chehab [Mon, 16 Apr 2012 18:12:13 +0000 (15:12 -0300)]

r82600_edac: convert driver to use the new edac ABI

The legacy edac ABI is going to be removed. Port the driver to use
and benefit from the new API functionality.

Cc: Tim Small <tim@buttersideup.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

commit | commitdiff | tree

Mauro Carvalho Chehab [Mon, 16 Apr 2012 18:11:59 +0000 (15:11 -0300)]

ppc4xx_edac: convert driver to use the new edac ABI

The legacy edac ABI is going to be removed. Port the driver to use
and benefit from the new API functionality.

Cc: Josh Boyer <jwboyer@gmail.com>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Borislav Petkov <borislav.petkov@amd.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

commit | commitdiff | tree

Mauro Carvalho Chehab [Mon, 16 Apr 2012 18:11:36 +0000 (15:11 -0300)]

pasemi_edac: convert driver to use the new edac ABI

The legacy edac ABI is going to be removed. Port the driver to use
and benefit from the new API functionality.

Cc: Olof Johansson <olof@lixom.net>
Cc: Egor Martovetsky <egor@pasemi.com>
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

commit | commitdiff | tree

Mauro Carvalho Chehab [Mon, 16 Apr 2012 18:11:20 +0000 (15:11 -0300)]

mv64x60_edac: convert driver to use the new edac ABI

The legacy edac ABI is going to be removed. Port the driver to use
and benefit from the new API functionality.

Cc: Borislav Petkov <borislav.petkov@amd.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

commit | commitdiff | tree

Mauro Carvalho Chehab [Mon, 16 Apr 2012 18:11:08 +0000 (15:11 -0300)]

mpc85xx_edac: convert driver to use the new edac ABI

The legacy edac ABI is going to be removed. Port the driver to use
and benefit from the new API functionality.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Shaohui Xie <Shaohui.Xie@freescale.com>
Cc: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

commit | commitdiff | tree

Mauro Carvalho Chehab [Mon, 16 Apr 2012 18:10:55 +0000 (15:10 -0300)]

i82975x_edac: convert driver to use the new edac ABI

The legacy edac ABI is going to be removed. Port the driver to use
and benefit from the new API functionality.

Cc: Ranganathan Desikan <ravi@jetztechnologies.com>
Cc: "Arvind R." <arvino55@gmail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

commit | commitdiff | tree

Mauro Carvalho Chehab [Mon, 16 Apr 2012 18:10:43 +0000 (15:10 -0300)]

i82875p_edac: convert driver to use the new edac ABI

The legacy edac ABI is going to be removed. Port the driver to use
and benefit from the new API functionality.

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

commit | commitdiff | tree

Mauro Carvalho Chehab [Mon, 16 Apr 2012 18:10:31 +0000 (15:10 -0300)]

i82860_edac: convert driver to use the new edac ABI

The legacy edac ABI is going to be removed. Port the driver to use
and benefit from the new API functionality.

Cc: Michal Marek <mmarek@suse.cz>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

commit | commitdiff | tree

Mauro Carvalho Chehab [Mon, 16 Apr 2012 18:10:18 +0000 (15:10 -0300)]

i82443bxgx_edac: convert driver to use the new edac ABI

The legacy edac ABI is going to be removed. Port the driver to use
and benefit from the new API functionality.

Cc: Tim Small <tim@buttersideup.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

commit | commitdiff | tree

Mauro Carvalho Chehab [Mon, 16 Apr 2012 18:10:12 +0000 (15:10 -0300)]

i7core_edac: convert driver to use the new edac ABI

The legacy edac ABI is going to be removed. Port the driver to use
and benefit from the new API functionality.

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

commit | commitdiff | tree

Mauro Carvalho Chehab [Mon, 16 Apr 2012 18:10:05 +0000 (15:10 -0300)]

i7300_edac: convert driver to use the new edac ABI

The legacy edac ABI is going to be removed. Port the driver to use
and benefit from the new API functionality.

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

commit | commitdiff | tree

Mauro Carvalho Chehab [Mon, 16 Apr 2012 18:09:58 +0000 (15:09 -0300)]

i5400_edac: convert driver to use the new edac ABI

The legacy edac ABI is going to be removed. Port the driver to use
and benefit from the new API functionality.

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

commit | commitdiff | tree

Mauro Carvalho Chehab [Mon, 16 Apr 2012 18:09:52 +0000 (15:09 -0300)]

i5100_edac: convert driver to use the new edac ABI

The legacy edac ABI is going to be removed. Port the driver to use
and benefit from the new API functionality.

Cc: "Niklas Söderlund" <niklas.soderlund@ericsson.com>
Cc: Borislav Petkov <borislav.petkov@amd.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

commit | commitdiff | tree

Mauro Carvalho Chehab [Mon, 16 Apr 2012 18:09:46 +0000 (15:09 -0300)]

i5000_edac: convert driver to use the new edac ABI

The legacy edac ABI is going to be removed. Port the driver to use
and benefit from the new API functionality.

Cc: Doug Thompson <norsk5@yahoo.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

commit | commitdiff | tree

Mauro Carvalho Chehab [Mon, 16 Apr 2012 18:09:38 +0000 (15:09 -0300)]

i3200_edac: convert driver to use the new edac ABI

The legacy edac ABI is going to be removed. Port the driver to use
and benefit from the new API functionality.

Cc: Hitoshi Mitake <h.mitake@gmail.com>
Cc: Borislav Petkov <borislav.petkov@amd.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

commit | commitdiff | tree

Mauro Carvalho Chehab [Mon, 16 Apr 2012 18:09:00 +0000 (15:09 -0300)]

i3000_edac: convert driver to use the new edac ABI

The legacy edac ABI is going to be removed. Port the driver to use
and benefit from the new API functionality.

Cc: Jason Uhlenkott <juhlenko@akamai.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

commit | commitdiff | tree

Mauro Carvalho Chehab [Mon, 16 Apr 2012 18:07:09 +0000 (15:07 -0300)]

e7xxx_edac: convert driver to use the new edac ABI

The legacy edac ABI is going to be removed. Port the driver to use
and benefit from the new API functionality.

Cc: Doug Thompson <norsk5@yahoo.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

commit | commitdiff | tree

Mauro Carvalho Chehab [Mon, 16 Apr 2012 18:06:59 +0000 (15:06 -0300)]

e752x_edac: convert driver to use the new edac ABI

The legacy edac ABI is going to be removed. Port the driver to use
and benefit from the new API functionality.

Cc: Mark Gross <mark.gross@intel.com>
Cc: Doug Thompson <norsk5@yahoo.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

commit | commitdiff | tree

Mauro Carvalho Chehab [Mon, 16 Apr 2012 18:06:50 +0000 (15:06 -0300)]

cpc925_edac: convert driver to use the new edac ABI

The legacy edac ABI is going to be removed. Port the driver to use
and benefit from the new API functionality.

Cc: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Michal Marek <mmarek@suse.cz>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

commit | commitdiff | tree

Mauro Carvalho Chehab [Mon, 16 Apr 2012 18:06:25 +0000 (15:06 -0300)]

cell_edac: convert driver to use the new edac ABI

The legacy edac ABI is going to be removed. Port the driver to use
and benefit from the new API functionality.

Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

commit | commitdiff | tree

Mauro Carvalho Chehab [Mon, 16 Apr 2012 18:05:27 +0000 (15:05 -0300)]

amd76x_edac: convert driver to use the new edac ABI

The legacy edac ABI is going to be removed. Port the driver to use
and benefit from the new API functionality.

Cc: Michal Marek <mmarek@suse.cz>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

commit | commitdiff | tree

Mauro Carvalho Chehab [Mon, 16 Apr 2012 18:03:50 +0000 (15:03 -0300)]

amd64_edac: convert driver to use the new edac ABI

The legacy edac ABI is going to be removed. Port the driver to use
and benefit from the new API functionality.

Cc: Doug Thompson <norsk5@yahoo.com>
Cc: Borislav Petkov <borislav.petkov@amd.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

commit | commitdiff | tree

Mauro Carvalho Chehab [Wed, 18 Apr 2012 18:20:50 +0000 (15:20 -0300)]

edac: Change internal representation to work with layers

Change the EDAC internal representation to work with non-csrow
based memory controllers.

There are lots of those memory controllers nowadays, and more
are coming. So, the EDAC internal representation needs to be
changed, in order to work with those memory controllers, while
preserving backward compatibility with the old ones.

The edac core was written with the idea that memory controllers
are able to directly access csrows.

This is not true for FB-DIMM and RAMBUS memory controllers.

Also, some recent advanced memory controllers don't present a per-csrows
view. Instead, they view memories as DIMMs, instead of ranks.

So, change the allocation and error report routines to allow
them to work with all types of architectures.

This will allow the removal of several hacks with FB-DIMM and RAMBUS
memory controllers.

Also, several tests were done on different platforms using different
x86 drivers.

TODO: a multi-rank DIMMs are currently represented by multiple DIMM
entries in struct dimm_info. That means that changing a label for one
rank won't change the same label for the other ranks at the same DIMM.
This bug is present since the beginning of the EDAC, so it is not a big
deal. However, on several drivers, it is possible to fix this issue, but
it should be a per-driver fix, as the csrow => DIMM arrangement may not
be equal for all. So, don't try to fix it here yet.

I tried to make this patch as short as possible, preceding it with
several other patches that simplified the logic here. Yet, as the
internal API changes, all drivers need changes. The changes are
generally bigger in the drivers for FB-DIMMs.

Cc: Aristeu Rozanski <arozansk@redhat.com>
Cc: Doug Thompson <norsk5@yahoo.com>
Cc: Borislav Petkov <borislav.petkov@amd.com>
Cc: Mark Gross <mark.gross@intel.com>
Cc: Jason Uhlenkott <juhlenko@akamai.com>
Cc: Tim Small <tim@buttersideup.com>
Cc: Ranganathan Desikan <ravi@jetztechnologies.com>
Cc: "Arvind R." <arvino55@gmail.com>
Cc: Olof Johansson <olof@lixom.net>
Cc: Egor Martovetsky <egor@pasemi.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Joe Perches <joe@perches.com>
Cc: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Hitoshi Mitake <h.mitake@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: "Niklas Söderlund" <niklas.soderlund@ericsson.com>
Cc: Shaohui Xie <Shaohui.Xie@freescale.com>
Cc: Josh Boyer <jwboyer@gmail.com>
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

commit | commitdiff | tree

Mauro Carvalho Chehab [Mon, 16 Apr 2012 16:04:46 +0000 (13:04 -0300)]

edac.h: Add generic layers for describing a memory location

The edac core were written with the idea that memory controllers
are able to directly access csrows, and that the channels are
used inside a csrows select.

This is not true for FB-DIMM and RAMBUS memory controllers.

Also, some recent advanced memory controllers don't present a per-csrows
view. Instead, they view memories as DIMMs, instead of ranks, accessed
via csrow/channel.

So, changes are needed in order to allow the EDAC core to
work with all types of architectures.

In preparation for handling non-csrows based memory controllers,
add some memory structs and a macro:

enum hw_event_mc_err_type: describes the type of error
   (corrected, uncorrected, fatal)

To be used by the new edac_mc_handle_error function;

enum edac_mc_layer: describes the type of a given memory
architecture layer (branch, channel, slot, csrow).

struct edac_mc_layer: describes the properties of a memory
      layer (type, size, and if the layer
      will be used on a virtual csrow.

EDAC_DIMM_PTR() - as the number of layers can vary from 1 to 3,
this macro converts from an address with up to 3 layers into
a linear address.

Reviewed-by: Borislav Petkov <bp@amd64.org>
Cc: Doug Thompson <norsk5@yahoo.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

commit | commitdiff | tree

Mauro Carvalho Chehab [Mon, 16 Apr 2012 13:18:12 +0000 (10:18 -0300)]

edac: rewrite edac_align_ptr()

The edac_align_ptr() function is used to prepare data for a single
memory allocation kzalloc() call. It counts how many bytes are needed
by some data structure.

Using it as-is is not that trivial, as the quantity of memory elements
reserved is not there, but, instead, it is on a next call.

In order to avoid mistakes when using it, move the number of allocated
elements into it, making easier to use it.

Reviewed-by: Borislav Petkov <bp@amd64.org>
Cc: Aristeu Rozanski <arozansk@redhat.com>
Cc: Doug Thompson <norsk5@yahoo.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

commit | commitdiff | tree

Mauro Carvalho Chehab [Sat, 28 Jan 2012 12:09:38 +0000 (09:09 -0300)]

edac: move nr_pages to dimm struct

The number of pages is a dimm property. Move it to the dimm struct.

After this change, it is possible to add sysfs nodes for the DIMM's that
will properly represent the DIMM stick properties, including its size.

A TODO fix here is to properly represent dual-rank/quad-rank DIMMs when
the memory controller represents the memory via chip select rows.

Reviewed-by: Aristeu Rozanski <arozansk@redhat.com>
Acked-by: Borislav Petkov <borislav.petkov@amd.com>
Acked-by: Chris Metcalf <cmetcalf@tilera.com>
Cc: Doug Thompson <norsk5@yahoo.com>
Cc: Mark Gross <mark.gross@intel.com>
Cc: Jason Uhlenkott <juhlenko@akamai.com>
Cc: Tim Small <tim@buttersideup.com>
Cc: Ranganathan Desikan <ravi@jetztechnologies.com>
Cc: "Arvind R." <arvino55@gmail.com>
Cc: Olof Johansson <olof@lixom.net>
Cc: Egor Martovetsky <egor@pasemi.com>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Joe Perches <joe@perches.com>
Cc: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Hitoshi Mitake <h.mitake@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: "Niklas Söderlund" <niklas.soderlund@ericsson.com>
Cc: Shaohui Xie <Shaohui.Xie@freescale.com>
Cc: Josh Boyer <jwboyer@gmail.com>
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

commit | commitdiff | tree

Mauro Carvalho Chehab [Sat, 28 Jan 2012 00:20:32 +0000 (21:20 -0300)]

edac: Don't initialize csrow's first_page & friends when not needed

Almost all edac drivers initialize csrow_info->first_page,
csrow_info->last_page and csrow_info->page_mask. Those vars are
used inside the EDAC core, in order to calculate the csrow affected
by an error, by using the routine edac_mc_find_csrow_by_page().

However, very few drivers actually use it:
        e752x_edac.c
        e7xxx_edac.c
        i3000_edac.c
        i82443bxgx_edac.c
        i82860_edac.c
        i82875p_edac.c
        i82975x_edac.c
        r82600_edac.c

There also a few other drivers that have their own calculus
formula internally using those vars.

All the others are just wasting time by initializing those
data.

While initializing data without using them won't cause any troubles, as
those information is stored at the wrong place (at csrows structure), it
is better to remove what is unused, in order to simplify the next patch.

Reviewed-by: Aristeu Rozanski <arozansk@redhat.com>
Acked-by: Borislav Petkov <borislav.petkov@amd.com>
Acked-by: Chris Metcalf <cmetcalf@tilera.com>
Cc: Doug Thompson <norsk5@yahoo.com>
Cc: Hitoshi Mitake <h.mitake@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: "Niklas Söderlund" <niklas.soderlund@ericsson.com>
Cc: Josh Boyer <jwboyer@gmail.com>
Cc: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

commit | commitdiff | tree

Mauro Carvalho Chehab [Fri, 27 Jan 2012 21:38:08 +0000 (18:38 -0300)]

edac: move dimm properties to struct dimm_info

On systems based on chip select rows, all channels need to use memories
with the same properties, otherwise the memories on channels A and B
won't be recognized.

However, such assumption is not true for all types of memory
controllers.

Controllers for FB-DIMM's don't have such requirements.

Also, modern Intel controllers seem to be capable of handling such
differences.

So, we need to get rid of storing the DIMM information into a per-csrow
data, storing it, instead at the right place.

The first step is to move grain, mtype, dtype and edac_mode to the
per-dimm struct.

Reviewed-by: Aristeu Rozanski <arozansk@redhat.com>
Reviewed-by: Borislav Petkov <borislav.petkov@amd.com>
Acked-by: Chris Metcalf <cmetcalf@tilera.com>
Cc: Doug Thompson <norsk5@yahoo.com>
Cc: Borislav Petkov <borislav.petkov@amd.com>
Cc: Mark Gross <mark.gross@intel.com>
Cc: Jason Uhlenkott <juhlenko@akamai.com>
Cc: Tim Small <tim@buttersideup.com>
Cc: Ranganathan Desikan <ravi@jetztechnologies.com>
Cc: "Arvind R." <arvino55@gmail.com>
Cc: Olof Johansson <olof@lixom.net>
Cc: Egor Martovetsky <egor@pasemi.com>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Joe Perches <joe@perches.com>
Cc: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Hitoshi Mitake <h.mitake@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: James Bottomley <James.Bottomley@parallels.com>
Cc: "Niklas Söderlund" <niklas.soderlund@ericsson.com>
Cc: Shaohui Xie <Shaohui.Xie@freescale.com>
Cc: Josh Boyer <jwboyer@gmail.com>
Cc: Mike Williams <mike@mikebwilliams.com>
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

commit | commitdiff | tree

Mauro Carvalho Chehab [Fri, 27 Jan 2012 17:12:32 +0000 (14:12 -0300)]

edac: Create a dimm struct and move the labels into it

The way a DIMM is currently represented implies that they're
linked into a per-csrow struct. However, some drivers don't see
csrows, as they're ridden behind some chip like the AMB's
on FBDIMM's, for example.

This forced drivers to fake^Wvirtualize a csrow struct, and to create
a mess under csrow/channel original's concept.

Move the DIMM labels into a per-DIMM struct, and add there
the real location of the socket, in terms of csrow/channel.
Latter patches will modify the location to properly represent the
memory architecture.

All other drivers will use a per-csrow type of location.
Some of those drivers will require a latter conversion, as
they also fake the csrows internally.

TODO: While this patch doesn't change the existing behavior, on
csrows-based memory controllers, a csrow/channel pair points to a memory
rank. There's a known bug at the EDAC core that allows having different
labels for the same DIMM, if it has more than one rank. A latter patch
is need to merge the several ranks for a DIMM into the same dimm_info
struct, in order to avoid having different labels for the same DIMM.

The edac_mc_alloc() will now contain a per-dimm initialization loop that
will be changed by latter patches in order to match other types of
memory architectures.

Reviewed-by: Aristeu Rozanski <arozansk@redhat.com>
Reviewed-by: Borislav Petkov <borislav.petkov@amd.com>
Cc: Doug Thompson <norsk5@yahoo.com>
Cc: Ranganathan Desikan <ravi@jetztechnologies.com>
Cc: "Arvind R." <arvino55@gmail.com>
Cc: "Niklas Söderlund" <niklas.soderlund@ericsson.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

commit | commitdiff | tree

Mauro Carvalho Chehab [Fri, 27 Jan 2012 13:26:13 +0000 (10:26 -0300)]

edac: rename channel_info to rank_info

What it is pointed by a csrow/channel vector is a rank information, and
not a channel information.

On a traditional architecture, the memory controller directly access the
memory ranks, via chip select rows. Different ranks at the same DIMM is
selected via different chip select rows. So, typically, one
csrow/channel pair means one different DIMM.

On FB-DIMMs, there's a microcontroller chip at the DIMM, called Advanced
Memory Buffer (AMB) that serves as the interface between the memory
controller and the memory chips.

The AMB selection is via the DIMM slot, and not via a csrow.

It is up to the AMB to talk with the csrows of the DRAM chips.

So, the FB-DIMM memory controllers see the DIMM slot, and not the DIMM
rank. RAMBUS is similar.

Newer memory controllers, like the ones found on Intel Sandy Bridge and
Nehalem, even working with normal DDR3 DIMM's, don't use the usual
channel A/channel B interleaving schema to provide 128 bits data access.

Instead, they have more channels (3 or 4 channels), and they can use
several interleaving schemas. Such memory controllers see the DIMMs
directly on their registers, instead of the ranks, which is better for
the driver, as its main usageis to point to a broken DIMM stick (the
Field Repleceable Unit), and not to point to a broken DRAM chip.

The drivers that support such such newer memory architecture models
currently need to fake information and to abuse on EDAC structures, as
the subsystem was conceived with the idea that the csrow would always be
visible by the CPU.

To make things a little worse, those drivers don't currently fake
csrows/channels on a consistent way, as the concepts there don't apply
to the memory controllers they're talking with. So, each driver author
interpreted the concepts using a different logic.

In order to fix it, let's rename the data structure that points into a
DIMM rank to "rank_info", in order to be clearer about what's stored
there.

Latter patches will provide a better way to represent the memory
hierarchy for the other types of memory controller.

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

commit | commitdiff | tree

Mauro Carvalho Chehab [Sun, 12 Feb 2012 11:21:34 +0000 (08:21 -0300)]

i5400_edac: Avoid calling pci_put_device() twice

When i5400_edac driver is removed and re-loaded a few times, it causes
an OOPS, as it is currently decrementing some PCI device usage two
times.

When called inside a loop, pci_get_device() will call
pci_put_device(). That mangles the error count. In this specific
case, it seems easier to just duplicate the call.

Also fixes the error logic when pci_get_device fails.

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

commit | commitdiff | tree

Niklas Söderlund [Fri, 9 Dec 2011 16:12:15 +0000 (13:12 -0300)]

edac: i5100 ack error detection register after each read

If I only ack the detection register after a error have been detected
I'm unable to reliably detect errors. I have verified this behavior
using both an error injection DIMM and software to inject errors.

I can't find any documentation supporting this behavior in Intel 5100
Memory Controller Hub Chipset, see 1. So this is all based on
experimentation.

[1] Intel® 5100 Memory Controller Hub Chipset
http://www.intel.com/content/dam/doc/datasheet/5100-
memory-controller-hub-chipset-datasheet.pdf

Signed-off-by: Niklas Söderlund <niklas.soderlund@ericsson.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

commit | commitdiff | tree

Niklas Söderlund [Fri, 17 Feb 2012 10:36:54 +0000 (07:36 -0300)]

edac: i5100 fix erroneous define for M1Err

According to [1] the define for M1Err in the FERR_NF_MEM register is
wrong. It should be at position 1 not 0.

[1] Intel 5100 Memory Controller Hub Chipset Doc.Nr: 318378
http://www.intel.com/content/dam/doc/datasheet/5100-
memory-controller-hub-chipset-datasheet.pdf

Reported-by: Ba Thang Nguyen <thang.b.nguyen@dektech.com.au>
Signed-off-by: Niklas Söderlund <niklas.soderlund@ericsson.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

commit | commitdiff | tree

Hui Wang [Mon, 6 Feb 2012 07:11:01 +0000 (04:11 -0300)]

edac: sb_edac: Fix a wrong value setting for the previous value

>From the driver design, the variable limit wants to compare with its
previous value, we should set the value of limit instead of the value
of tmp_mb to the variable prev.

Signed-off-by: Hui Wang <jason77.wang@gmail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

commit | commitdiff | tree

Hui Wang [Mon, 6 Feb 2012 07:11:00 +0000 (04:11 -0300)]

edac: sb_edac: Fix a INTERLEAVE_MODE() misuse

We can identify dram interleave mode from the Dram Rule register
rather than Dram Interleave list register.

In this context, the reg of INTERLEAVE_MODE(reg) contains the Dram
Interleave list register, we can't get interleave mode from the reg,
while the variable interleave_mode saves the the mode got from the
Dram Rule register, so we use the variable to replace
INTERLEAVE_MDDE(reg) here.

Signed-off-by: Hui Wang <jason77.wang@gmail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

commit | commitdiff | tree

Hui Wang [Mon, 6 Feb 2012 07:10:59 +0000 (04:10 -0300)]

edac: sb_edac: Let the driver depend on PCI_MMCONFIG

This driver needs to access PCIe Extended Configuration Space
Registers (0x100~0xfff), to correctly access those registers, we need
to enable PCI_MMCONFIG option. Since this option is not enabled for
X86_64 by default, we let the driver depend on it to prevent users
forgetting to enable this option.

Signed-off-by: Hui Wang <jason77.wang@gmail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

commit | commitdiff | tree

Mauro Carvalho Chehab [Fri, 3 Feb 2012 16:17:48 +0000 (13:17 -0300)]

edac: Improve the comments to better describe the memory concepts

The Computer memory terminology has changed with time since EDAC was
originally written: new concepts were introduced, and some things have
different meanings, depending on the memory architecture.

Improve the definition of all related terms.

Also, describe each memory type in a more detailed fashion.

No functional changes. Just comments were touched.

Acked-by: Borislav Petkov <borislav.petkov@amd.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

commit | commitdiff | tree

Mauro Carvalho Chehab [Tue, 31 Jan 2012 19:36:08 +0000 (16:36 -0300)]

edac/ppc4xx_edac: Fix compilation

It seems that nobody is cross-compiling for this arch anymore...

drivers/edac/ppc4xx_edac.c: In function 'ppc4xx_edac_probe':
drivers/edac/ppc4xx_edac.c:188:12: error: storage class specified for parameter 'ppc4xx_edac_remove'
...
drivers/edac/ppc4xx_edac.c:1068:19: error: 'match' undeclared (first use in this function)
drivers/edac/ppc4xx_edac.c:1068:19: note: each undeclared identifier is reported only once for each function it appears in
drivers/edac/ppc4xx_edac.c:1068:36: warning: left-hand operand of comma expression has no effect [-Wunused-value]

Acked-by: Josh Boyer <jwboyer@gmail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

commit | commitdiff | tree

Mauro Carvalho Chehab [Mon, 7 Nov 2011 21:26:53 +0000 (18:26 -0300)]

Fix sb_edac compilation with 32 bits kernels

As reported by Josh Boyer <jwboyer@redhat.com>:
> drivers/edac/sb_edac.c: In function 'get_memory_error_data':
> drivers/edac/sb_edac.c:861:2: warning: left shift count >= width of type
> [enabled by default]
> <snip>
> ERROR: "__udivdi3" [drivers/edac/sb_edac.ko] undefined!
> make[1]: *** [__modpost] Error 1
> make: *** [modules] Error 2

PS.: compile-tested only

Reported-by: Josh Boyer <jwboyer@redhat.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

commit | commitdiff | tree

Linus Torvalds [Sun, 18 Mar 2012 23:15:34 +0000 (16:15 -0700)]

Linux 3.3

commit | commitdiff | tree

Jason Baron [Fri, 16 Mar 2012 20:34:03 +0000 (16:34 -0400)]

Don't limit non-nested epoll paths

Commit 28d82dc1c4ed ("epoll: limit paths") that I did to limit the
number of possible wakeup paths in epoll is causing a few applications
to longer work (dovecot for one).

The original patch is really about limiting the amount of epoll nesting
(since epoll fds can be attached to other fds). Thus, we probably can
allow an unlimited number of paths of depth 1. My current patch limits
it at 1000. And enforce the limits on paths that have a greater depth.

This is captured in: https://bugzilla.redhat.com/show_bug.cgi?id=681578

Signed-off-by: Jason Baron <jbaron@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

commit | commitdiff | tree

Linus Torvalds [Sun, 18 Mar 2012 02:22:24 +0000 (19:22 -0700)]

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Pull networking changes from David Miller:
"1) icmp6_dst_alloc() returns NULL instead of ERR_PTR() leading to
     crashes, particularly during shutdown.  Reported by Dave Jones and
     fixed by Eric Dumazet.

  2) hyperv and wimax/i2400m return NETDEV_TX_BUSY when they have
     already freed the SKB, which causes crashes as to the caller this
     means requeue the packet.  Fixes from Eric Dumazet.

  3) usbnet driver doesn't allocate the right amount of headroom on
     fresh RX SKBs, fix from Eric Dumazet.

  4) Fix regression in ip6_mc_find_dev_rcu(), as an RCU lookup it
     abolutely should not take a reference to 'dev', this leads to
     leaks.  Fix from RonQing Li.

  5) Fix netfilter ctnetlink race between delete and timeout expiration.
     From Pablo Neira Ayuso.

  6) Revert SFQ change which causes regressions, specifically queueing
     to tail can lead to unavoidable flow starvation.  From Eric
     Dumazet.

  7) Fix a memory leak and a crash on corrupt firmware files in bnx2x,
     from Michal Schmidt."

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
  netfilter: ctnetlink: fix race between delete and timeout expiration
  ipv6: Don't dev_hold(dev) in ip6_mc_find_dev_rcu.
  wimax/i2400m: fix erroneous NETDEV_TX_BUSY use
  net/hyperv: fix erroneous NETDEV_TX_BUSY use
  net/usbnet: reserve headroom on rx skbs
  bnx2x: fix memory leak in bnx2x_init_firmware()
  bnx2x: fix a crash on corrupt firmware file
  sch_sfq: revert dont put new flow at the end of flows
  ipv6: fix icmp6_dst_alloc()

commit | commitdiff | tree

Linus Torvalds [Sat, 17 Mar 2012 16:54:16 +0000 (09:54 -0700)]

Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf fixes from Ingo Molnar.

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf tools, x86: Build perf on older user-space as well
  perf tools: Use scnprintf where applicable
  perf tools: Incorrect use of snprintf results in SEGV

commit | commitdiff | tree

Pablo Neira Ayuso [Fri, 16 Mar 2012 02:00:34 +0000 (02:00 +0000)]

netfilter: ctnetlink: fix race between delete and timeout expiration

Kerin Millar reported hardlockups while running `conntrackd -c'
in a busy firewall. That system (with several processors) was
acting as backup in a primary-backup setup.

After several tries, I found a race condition between the deletion
operation of ctnetlink and timeout expiration. This patch fixes
this problem.

Tested-by: Kerin Millar <kerframil@gmail.com>
Reported-by: Kerin Millar <kerframil@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

RongQing.Li [Thu, 15 Mar 2012 22:54:14 +0000 (22:54 +0000)]

ipv6: Don't dev_hold(dev) in ip6_mc_find_dev_rcu.

ip6_mc_find_dev_rcu() is called with rcu_read_lock(), so don't
need to dev_hold().
With dev_hold(), not corresponding dev_put(), will lead to leak.

[ bug introduced in 96b52e61be1 (ipv6: mcast: RCU conversions) ]

Signed-off-by: RongQing.Li <roy.qing.li@gmail.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Linus Torvalds [Sat, 17 Mar 2012 00:14:55 +0000 (17:14 -0700)]

Merge branch 'akpm' (more patches from Andrew)

Merge some more email patches from Andrew Morton:
"A couple of nilfs fixes"

* emailed from Andrew Morton <akpm@linux-foundation.org>:
nilfs2: fix NULL pointer dereference in nilfs_load_super_block()
nilfs2: clamp ns_r_segments_percentage to [1, 99]

commit | commitdiff | tree

Ryusuke Konishi [Sat, 17 Mar 2012 00:08:39 +0000 (17:08 -0700)]

nilfs2: fix NULL pointer dereference in nilfs_load_super_block()

According to the report from Slicky Devil, nilfs caused kernel oops at
nilfs_load_super_block function during mount after he shrank the
partition without resizing the filesystem:

BUG: unable to handle kernel NULL pointer dereference at 00000048
IP: [<d0d7a08e>] nilfs_load_super_block+0x17e/0x280 [nilfs2]
*pde = 00000000
Oops: 0000 [#1] PREEMPT SMP
...
Call Trace:
  [<d0d7a87b>] init_nilfs+0x4b/0x2e0 [nilfs2]
  [<d0d6f707>] nilfs_mount+0x447/0x5b0 [nilfs2]
  [<c0226636>] mount_fs+0x36/0x180
  [<c023d961>] vfs_kern_mount+0x51/0xa0
  [<c023ddae>] do_kern_mount+0x3e/0xe0
  [<c023f189>] do_mount+0x169/0x700
  [<c023fa9b>] sys_mount+0x6b/0xa0
  [<c04abd1f>] sysenter_do_call+0x12/0x28
Code: 53 18 8b 43 20 89 4b 18 8b 4b 24 89 53 1c 89 43 24 89 4b 20 8b 43
20 c7 43 2c 00 00 00 00 23 75 e8 8b 50 68 89 53 28 8b 54 b3 20 <8b> 72
48 8b 7a 4c 8b 55 08 89 b3 84 00 00 00 89 bb 88 00 00 00
EIP: [<d0d7a08e>] nilfs_load_super_block+0x17e/0x280 [nilfs2] SS:ESP 0068:ca9bbdcc
CR2: 0000000000000048

This turned out due to a defect in an error path which runs if the
calculated location of the secondary super block was invalid.

This patch fixes it and eliminates the reported oops.

Reported-by: Slicky Devil <slicky.dvl@gmail.com>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Tested-by: Slicky Devil <slicky.dvl@gmail.com>
Cc: <stable@vger.kernel.org> [2.6.30+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

commit | commitdiff | tree

Haogang Chen [Sat, 17 Mar 2012 00:08:38 +0000 (17:08 -0700)]

nilfs2: clamp ns_r_segments_percentage to [1, 99]

ns_r_segments_percentage is read from the disk. Bogus or malicious
value could cause integer overflow and malfunction due to meaningless
disk usage calculation. This patch reports error when mounting such
bogus volumes.

Signed-off-by: Haogang Chen <haogangchen@gmail.com>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

commit | commitdiff | tree

Linus Torvalds [Sat, 17 Mar 2012 00:04:02 +0000 (17:04 -0700)]

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security

Pull maintainer update from James Morris:
"Please pull this patch which adds Serge as maintainer of the
  capabilities code, as discussed on lwn and the lsm list.

  New capabilities must be signed off by the maintainer, and new uses of
  any capabilities should at be cc'd to the maintainer."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
  MAINTAINERS: Add Serge as maintainer of capabilities

commit | commitdiff | tree

Linus Torvalds [Sat, 17 Mar 2012 00:03:15 +0000 (17:03 -0700)]

Merge tag 'for-linus' of git://linux-c6x.org/git/projects/linux-c6x-upstreaming

Pull c6x bugfix from Mark Salter:
"Remove dead code from entry.S which causes a build failure when using
a newer assembler (v2.22 complains about it, v2.20 ignores it)."

* tag 'for-linus' of git://linux-c6x.org/git/projects/linux-c6x-upstreaming:
C6X: remove dead code from entry.S

commit | commitdiff | tree

Anton Blanchard [Fri, 16 Mar 2012 10:28:19 +0000 (10:28 +0000)]

afs: Remote abort can cause BUG in rxrpc code

When writing files to afs I sometimes hit a BUG:

kernel BUG at fs/afs/rxrpc.c:179!

With a backtrace of:

afs_free_call
afs_make_call
afs_fs_store_data
afs_vnode_store_data
afs_write_back_from_locked_page
afs_writepages_region
afs_writepages

The cause is:

ASSERT(skb_queue_empty(&call->rx_queue));

Looking at a tcpdump of the session the abort happens because we
are exceeding our disk quota:

rx abort fs reply store-data error diskquota exceeded (32)

So the abort error is valid. We hit the BUG because we haven't
freed all the resources for the call.

By freeing any skbs in call->rx_queue before calling afs_free_call
we avoid hitting leaking memory and avoid hitting the BUG.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: David Howells <dhowells@redhat.com>
Cc: <stable@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

commit | commitdiff | tree

Anton Blanchard [Fri, 16 Mar 2012 10:28:07 +0000 (10:28 +0000)]

afs: Read of file returns EBADMSG

A read of a large file on an afs mount failed:

# cat junk.file > /dev/null
cat: junk.file: Bad message

Looking at the trace, call->offset wrapped since it is only an
unsigned short. In afs_extract_data:

        _enter("{%u},{%zu},%d,,%zu", call->offset, len, last, count);
...

        if (call->offset < count) {
                if (last) {
                        _leave(" = -EBADMSG [%d < %zu]", call->offset, count);
                        return -EBADMSG;
                }

Which matches the trace:

[cat   ] ==> afs_extract_data({65132},{524},1,,65536)
[cat   ] <== afs_extract_data() = -EBADMSG [0 < 65536]

call->offset went from 65132 to 0. Fix this by making call->offset an
unsigned int.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: David Howells <dhowells@redhat.com>
Cc: <stable@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

commit | commitdiff | tree

Mark Salter [Fri, 16 Mar 2012 13:27:57 +0000 (09:27 -0400)]

C6X: remove dead code from entry.S

The ENDPROC() on sys_fadvise64_c6x() in arch/c6x/kernel/entry.S is
outside of the conditional block with the matching ENTRY() macro. This
leads a newer (v2.22 vs. v2.20) assembler to complain:

/tmp/ccGZBaPT.s: Assembler messages:
/tmp/ccGZBaPT.s: Error: .size expression for sys_fadvise64_c6x does not evaluate to a constant

The conditional block became dead code when c6x switched to generic
unistd.h and should be removed along with the offending ENDPROC().

Signed-off-by: Mark Salter <msalter@redhat.com>
Acked-by: David Howells <dhowells@redhat.com>

commit | commitdiff | tree

Eric Dumazet [Wed, 14 Mar 2012 09:21:44 +0000 (09:21 +0000)]

wimax/i2400m: fix erroneous NETDEV_TX_BUSY use

A driver start_xmit() method cannot free skb and return NETDEV_TX_BUSY,
since caller is going to reuse freed skb.

In fact netif_tx_stop_queue() / netif_stop_queue() is needed before
returning NETDEV_TX_BUSY or you can trigger a ksoftirqd fatal loop.

In case of memory allocation error, only safe way is to drop the packet
and return NETDEV_TX_OK

Also increments tx_dropped counter

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Inaky Perez-Gonzalez <inaky.perez-gonzalez@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Eric Dumazet [Wed, 14 Mar 2012 08:53:34 +0000 (08:53 +0000)]

net/hyperv: fix erroneous NETDEV_TX_BUSY use

A driver start_xmit() method cannot free skb and return NETDEV_TX_BUSY,
since caller is going to reuse freed skb.

This is mostly a revert of commit bf769375c (staging: hv: fix the return
status of netvsc_start_xmit())

In fact netif_tx_stop_queue() / netif_stop_queue() is needed before
returning NETDEV_TX_BUSY or you can trigger a ksoftirqd fatal loop.

In case of memory allocation error, only safe way is to drop the packet
and return NETDEV_TX_OK

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Eric Dumazet [Wed, 14 Mar 2012 06:56:25 +0000 (06:56 +0000)]

net/usbnet: reserve headroom on rx skbs

network drivers should reserve some headroom on incoming skbs so that we
dont need expensive reallocations, eg forwarding packets in tunnels.

This NET_SKB_PAD padding is done in various helpers, like
__netdev_alloc_skb_ip_align() in this patch, combining NET_SKB_PAD and
NET_IP_ALIGN magic.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Oliver Neukum <oneukum@suse.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Oliver Neukum <oneukum@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Michal Schmidt [Thu, 15 Mar 2012 14:08:29 +0000 (14:08 +0000)]

bnx2x: fix memory leak in bnx2x_init_firmware()

When cycling the interface down and up, bnx2x_init_firmware() knows that
the firmware is already loaded, but nevertheless it allocates certain
arrays anew (init_data, init_ops, init_ops_offsets, iro_arr). The old
arrays are leaked.

Fix the leaks by returning early if the firmware was already loaded.
Because if the firmware is loaded, so are the arrays.

Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
Acked-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Michal Schmidt [Thu, 15 Mar 2012 14:08:28 +0000 (14:08 +0000)]

bnx2x: fix a crash on corrupt firmware file

If the requested firmware is deemed corrupt and then released, reset the
pointer to NULL in order to avoid double-freeing it in
bnx2x_release_firmware() or dereferencing it in bnx2x_init_firmware().

Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
Acked-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Eric Dumazet [Tue, 13 Mar 2012 18:04:25 +0000 (18:04 +0000)]

sch_sfq: revert dont put new flow at the end of flows

This reverts commit d47a0ac7b6 (sch_sfq: dont put new flow at the end of
flows)

As Jesper found out, patch sounded great but has bad side effects.

In stress situation, pushing new flows in front of the queue can prevent
old flows doing any progress. Packets can stay in SFQ queue for
unlimited amount of time.

It's possible to add heuristics to limit this problem, but this would
add complexity outside of SFQ scope.

A more sensible answer to Dave Taht concerns (who reported the issued I
tried to solve in original commit) is probably to use a qdisc hierarchy
so that high prio packets dont enter a potentially crowded SFQ qdisc.

Reported-by: Jesper Dangaard Brouer <jdb@comx.dk>
Cc: Dave Taht <dave.taht@gmail.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Eric Dumazet [Wed, 14 Mar 2012 21:13:11 +0000 (21:13 +0000)]

ipv6: fix icmp6_dst_alloc()

commit 87a115783 ( ipv6: Move xfrm_lookup() call down into
icmp6_dst_alloc().) forgot to convert one error path, leading
to crashes in mld_sendpack()

Many thanks to Dave Jones for providing a very complete bug report.

Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

James Morris [Fri, 16 Mar 2012 01:05:48 +0000 (12:05 +1100)]

MAINTAINERS: Add Serge as maintainer of capabilities

Add Serge as maintainer of capabilities, per suggestion on LWN:
http://lwn.net/Articles/486306/

Signed-off-by: James Morris <james.l.morris@oracle.com>

commit | commitdiff | tree

Linus Torvalds [Fri, 16 Mar 2012 00:16:22 +0000 (17:16 -0700)]

Merge branch 'akpm' (Andrew's patch-bomb)

Merge patches from Andrew Morton:
"Nine patches - some bug fixes and some MAINTAINERS fiddling."

* emailed from Andrew Morton <akpm@linux-foundation.org>:
  drivers/video/backlight/s6e63m0.c: fix corruption storing gamma mode
  MAINTAINERS: add entry for exynos mipi display drivers
  MAINTAINERS: fix link to Gustavo Padovans tree
  MAINTAINERS: add Johan to Bluetooth maintainers
  MAINTAINERS: Gustavo has moved
  prctl: use CAP_SYS_RESOURCE for PR_SET_MM option
  rapidio/tsi721: fix bug in register offset definitions
  MAINTAINERS: update ST's Mailing list for SPEAr
  memcg: free mem_cgroup by RCU to fix oops

commit | commitdiff | tree

Linus Torvalds [Fri, 16 Mar 2012 00:14:35 +0000 (17:14 -0700)]

Merge branch 'i2c-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging

Pull i2c subsystem fixes from Jean Delvare.

* 'i2c-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging:
i2c-algo-bit: Fix spurious SCL timeouts under heavy load
i2c-core: Comment says "transmitted" but means "received"

commit | commitdiff | tree

Linus Torvalds [Fri, 16 Mar 2012 00:13:39 +0000 (17:13 -0700)]

Merge tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging

Pull hwmon fixes from Guenter Roeck.

* tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
  hwmon: (zl6100) Enable interval between chip accesses for all chips
  hwmon: (w83627ehf) Describe undocumented pwm attributes
  hwmon: (w83627ehf) Fix temp2 source for W83627UHG
  hwmon: (w83627ehf) Fix memory leak in probe function
  hwmon: (w83627ehf) Fix writing into fan_stop_time for NCT6775F/NCT6776F

Error Detection and Correction (EDAC) mytree

RSS Atom