]> www.infradead.org Git - users/mchehab/edac.git/log
users/mchehab/edac.git
12 years agoamd64_edac: Don't pass driver name as an error parameter hw_events_v27
Mauro Carvalho Chehab [Tue, 22 May 2012 12:06:17 +0000 (09:06 -0300)]
amd64_edac: Don't pass driver name as an error parameter

The EDAC driver name doesn't help to handle EDAC errors. So,
remove it from the EDAC error messages, preserving only the
error_message.

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
12 years agoedac_mc: check for allocation failure in edac_mc_alloc()
Dan Carpenter [Fri, 18 May 2012 12:51:02 +0000 (15:51 +0300)]
edac_mc: check for allocation failure in edac_mc_alloc()

Add a check here for if kzalloc() failed.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
12 years agoedac: Increase version to 3.0.0
Mauro Carvalho Chehab [Thu, 10 May 2012 15:43:01 +0000 (12:43 -0300)]
edac: Increase version to 3.0.0

There were lots of changes introduced to justify renaming it to
3.0.0:

  - EDAC core were redesigned to represent all types of
    memory controllers;

  - EDAC API were redesigned to properly represent the memory
    controller hierarchy;

  - a tracepoint-based API were added to report memory errors.

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
12 years agoedac_mc: Cleanup per-dimm_info debug messages
Mauro Carvalho Chehab [Mon, 30 Apr 2012 13:24:43 +0000 (10:24 -0300)]
edac_mc: Cleanup per-dimm_info debug messages

The edac_mc_alloc() routine allocates one dimm_info device for all
possible memories, including the non-filled ones. The debug messages
there are somewhat confusing. So, cleans them, by moving the code
that prints the memory location to edac_mc, and using it on both
edac_mc_sysfs and edac_mc.

Also, only dumps information when DIMM/ranks are actually
filled.

After this patch, a dimm-based memory controller will print the debug
info as:

[ 1011.380027] EDAC DEBUG: edac_mc_dump_csrow: csrow->csrow_idx = 0
[ 1011.380029] EDAC DEBUG: edac_mc_dump_csrow:   csrow = ffff8801169be000
[ 1011.380031] EDAC DEBUG: edac_mc_dump_csrow:   csrow->first_page = 0x0
[ 1011.380032] EDAC DEBUG: edac_mc_dump_csrow:   csrow->last_page = 0x0
[ 1011.380034] EDAC DEBUG: edac_mc_dump_csrow:   csrow->page_mask = 0x0
[ 1011.380035] EDAC DEBUG: edac_mc_dump_csrow:   csrow->nr_channels = 3
[ 1011.380037] EDAC DEBUG: edac_mc_dump_csrow:   csrow->channels = ffff8801149c2840
[ 1011.380039] EDAC DEBUG: edac_mc_dump_csrow:   csrow->mci = ffff880117426000
[ 1011.380041] EDAC DEBUG: edac_mc_dump_channel:   channel->chan_idx = 0
[ 1011.380042] EDAC DEBUG: edac_mc_dump_channel:     channel = ffff8801149c2860
[ 1011.380044] EDAC DEBUG: edac_mc_dump_channel:     channel->csrow = ffff8801169be000
[ 1011.380046] EDAC DEBUG: edac_mc_dump_channel:     channel->dimm = ffff88010fe90400
...
[ 1011.380095] EDAC DEBUG: edac_mc_dump_dimm: dimm0: channel 0 slot 0 mapped as virtual row 0, chan 0
[ 1011.380097] EDAC DEBUG: edac_mc_dump_dimm:   dimm = ffff88010fe90400
[ 1011.380099] EDAC DEBUG: edac_mc_dump_dimm:   dimm->label = 'CPU#0Channel#0_DIMM#0'
[ 1011.380101] EDAC DEBUG: edac_mc_dump_dimm:   dimm->nr_pages = 0x40000
[ 1011.380103] EDAC DEBUG: edac_mc_dump_dimm:   dimm->grain = 8
[ 1011.380104] EDAC DEBUG: edac_mc_dump_dimm:   dimm->nr_pages = 0x40000
...

(a rank-based memory controller would print, instead of "dimm?", "rank?"
 on the above debug info)

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
12 years agoedac: Convert debugfX to edac_dbg(X,
Joe Perches [Sun, 29 Apr 2012 20:08:39 +0000 (17:08 -0300)]
edac: Convert debugfX to edac_dbg(X,

Use a more common debugging style.

Remove __FILE__ uses, add missing newlines,
coalesce formats and align arguments.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
12 years agoedac: Use more normal debugging macro style
Joe Perches [Sat, 28 Apr 2012 19:41:46 +0000 (16:41 -0300)]
edac: Use more normal debugging macro style

Convert macros to a simpler style and enforce appropriate
format checking when not CONFIG_EDAC_DEBUG.

Use fmt and __VA_ARGS__, neaten macros.

Move some string arrays to the debugfx uses and remove the
now unnecessary CONFIG_EDAC_DEBUG variable block definitions.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
12 years agoedac: Don't add __func__ or __FILE__ for debugf[0-9] msgs
Mauro Carvalho Chehab [Sun, 29 Apr 2012 14:59:14 +0000 (11:59 -0300)]
edac: Don't add __func__ or __FILE__ for debugf[0-9] msgs

The debug macro already adds that. Most of the work here was
made by this small script:

$f .=$_ while (<>);

$f =~ s/(debugf[0-9]\s*\(\s*)__FILE__\s*": /\1"/g;
$f =~ s/(debugf[0-9]\s*\(\s*)__FILE__\s*/\1/g;
$f =~ s/(debugf[0-9]\s*\(\s*)__FILE__\s*"MC: /\1"/g;

$f =~ s/(debugf[0-9]\s*\(\")\%s[\:\,\(\)]*\s*([^\"]*\s*[^\)]+)__func__\s*\,\s*/\1\2/g;
$f =~ s/(debugf[0-9]\s*\(\")\%s[\:\,\(\)]*\s*([^\"]*\s*[^\)]+),\s*__func__\s*\)/\1\2)/g;
$f =~ s/(debugf[0-9]\s*\(\"MC\:\s*)\%s[\:\,\(\)]*\s*([^\"]*\s*[^\)]+)__func__\s*\,\s*/\1\2/g;
$f =~ s/(debugf[0-9]\s*\(\"MC\:\s*)\%s[\:\,\(\)]*\s*([^\"]*\s*[^\)]+),\s*__func__\s*\)/\1\2)/g;

$f =~ s/\"MC\: \\n\"/"MC:\\n"/g;

print $f;

After running the script, manual cleanups were done to fix it the remaining
places.

While here, removed the __LINE__ on most places, as it doesn't actually give
useful info on most places.

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
12 years agoi7core: fix ranks information at the per-channel struct
Mauro Carvalho Chehab [Thu, 26 Apr 2012 14:47:29 +0000 (11:47 -0300)]
i7core: fix ranks information at the per-channel struct

There is a flag at the per-channel struct that indicates if there are
any 4R dimm on it. The way the presence of this flag were reported
is not ok, as it might give the false idea that the channel were filled
with 2R memories:

[  580.588701] EDAC DEBUG: get_dimm_config: Ch1 phy rd1, wr1 (0x063f7431): 2 ranks, UDIMMs
[  580.588704] EDAC DEBUG: get_dimm_config:  dimm 0 1024 Mb offset: 0, bank: 8, rank: 1, row: 0x4000, col: 0x400

(in this case, just one 1R memory is filled on channel 1)

So, use a better way to represent the per-channel ranks information.
After the patch, it will show:

[ 2002.233978] EDAC DEBUG: get_dimm_config: Ch0 phy rd0, wr0 (0x063f7431): UDIMMs
[ 2002.233982] EDAC DEBUG: get_dimm_config:  dimm 0 1024 Mb offset: 0, bank: 8, rank: 1, row: 0x4000, col: 0x400
[ 2002.233988] EDAC DEBUG: get_dimm_config:  dimm 1 1024 Mb offset: 4, bank: 8, rank: 1, row: 0x4000, col: 0x400

(in this case, there isn't any 4R memories)

Reported-by: Borislav Petkov <borislav.petkov@amd.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
12 years agoi5000: Fix the fatal error handling
Mauro Carvalho Chehab [Wed, 25 Apr 2012 14:47:36 +0000 (11:47 -0300)]
i5000: Fix the fatal error handling

The fatal error channel bits point to a single channel, and not
to a range of channels. Fix the code to properly report it,
instead of printing messages like:
kernel: EDAC MC0: INTERNAL ERROR: channel-b out of range (4 >= 4)

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
12 years agoEdac: Add ABI Documentation for the new device nodes
Mauro Carvalho Chehab [Tue, 17 Apr 2012 14:13:10 +0000 (11:13 -0300)]
Edac: Add ABI Documentation for the new device nodes

The EDAC ABI were extended to add support for per-DIMM or per-rank
information and silkscreen labels. Properly document them.

Most of the comments there came from edac.txt descriptions of the
fields that are part of the legacy csrowX ABI (e. g.
/sys/devices/system/edac/mc/mc*/csrow*/*).

Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
12 years agoedac: move documentation ABI to ABI/testing/sysfs-devices-edac
Mauro Carvalho Chehab [Tue, 17 Apr 2012 11:53:34 +0000 (08:53 -0300)]
edac: move documentation ABI to ABI/testing/sysfs-devices-edac

The EDAC MC API is currently stored at the wrong place. Move the
parts of the EDAC MC ABI that will be kept to
ABI/testing/sysfs-devices-edac.

The Date: field were added based on git timestamps for the git
commit patches that added the functionality at edac.txt.

Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
12 years agoi7core_edac: change the mem allocation scheme to make Documentation/kobject.txt happy
Mauro Carvalho Chehab [Fri, 30 Mar 2012 19:10:51 +0000 (16:10 -0300)]
i7core_edac: change the mem allocation scheme to make Documentation/kobject.txt happy

Kernel kobjects have rigid rules: each container object should be
dynamically allocated, and can't be allocated into a single kmalloc.

EDAC never obeyed this rule: it has a single malloc function that
allocates all needed data into a single kzalloc.

As this is not accepted anymore, change the allocation schema of the
EDAC *_info structs to enforce this Kernel standard.

Cc: Aristeu Rozanski <arozansk@redhat.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
12 years agoedac: change the mem allocation scheme to make Documentation/kobject.txt happy
Mauro Carvalho Chehab [Tue, 24 Apr 2012 18:05:43 +0000 (15:05 -0300)]
edac: change the mem allocation scheme to make Documentation/kobject.txt happy

Kernel kobjects have rigid rules: each container object should be
dynamically allocated, and can't be allocated into a single kmalloc.

EDAC never obeyed this rule: it has a single malloc function that
allocates all needed data into a single kzalloc.

As this is not accepted anymore, change the allocation schema of the
EDAC *_info structs to enforce this Kernel standard.

Acked-by: Chris Metcalf <cmetcalf@tilera.com>
Cc: Aristeu Rozanski <arozansk@redhat.com>
Cc: Doug Thompson <norsk5@yahoo.com>
Cc: Greg K H <gregkh@linuxfoundation.org>
Cc: Borislav Petkov <borislav.petkov@amd.com>
Cc: Mark Gross <mark.gross@intel.com>
Cc: Tim Small <tim@buttersideup.com>
Cc: Ranganathan Desikan <ravi@jetztechnologies.com>
Cc: "Arvind R." <arvino55@gmail.com>
Cc: Olof Johansson <olof@lixom.net>
Cc: Egor Martovetsky <egor@pasemi.com>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Hitoshi Mitake <h.mitake@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Shaohui Xie <Shaohui.Xie@freescale.com>
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
12 years agoedac: Only expose csrows/channels on legacy API if they're populated
Mauro Carvalho Chehab [Thu, 29 Mar 2012 15:20:22 +0000 (12:20 -0300)]
edac: Only expose csrows/channels on legacy API if they're populated

This patch actually fixes a bug with the legacy API, where, at the
same csrow, some channels may have different DIMMs. This can happen
on FB-DIMM/RAMBUS and modern Intel controllers.

This is the case, for example, of Nehalem machines:

$ ./edac-ctl --layout
       +-----------------------------------+
       |                mc0                |
       | channel0  | channel1  | channel2  |
-------+-----------------------------------+
slot2: |     0 MB  |     0 MB  |     0 MB  |
slot1: |  1024 MB  |     0 MB  |     0 MB  |
slot0: |  1024 MB  |  1024 MB  |  1024 MB  |
-------+-----------------------------------+

Before this patch, non-filled memories were shown. Now, only what's
filled is there:

grep . /sys/devices/system/edac/mc/mc0/csrow*/ch?*
/sys/devices/system/edac/mc/mc0/csrow0/ch0_ce_count:0
/sys/devices/system/edac/mc/mc0/csrow0/ch0_dimm_label:CPU#0Channel#0_DIMM#0
/sys/devices/system/edac/mc/mc0/csrow0/ch1_ce_count:0
/sys/devices/system/edac/mc/mc0/csrow0/ch1_dimm_label:CPU#0Channel#0_DIMM#1
/sys/devices/system/edac/mc/mc0/csrow1/ch0_ce_count:0
/sys/devices/system/edac/mc/mc0/csrow1/ch0_dimm_label:CPU#0Channel#1_DIMM#0
/sys/devices/system/edac/mc/mc0/csrow2/ch0_ce_count:0
/sys/devices/system/edac/mc/mc0/csrow2/ch0_dimm_label:CPU#0Channel#2_DIMM#0

Thanks-to: Aristeu Rozanski Filho <arozansk@redhat.com>
Reviewed-by: Aristeu Rozanski <arozansk@redhat.com>
Cc: Doug Thompson <norsk5@yahoo.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
12 years agoi7300_edac: Get rid of some wrongly-solved rebase conflict
Mauro Carvalho Chehab [Thu, 29 Mar 2012 13:41:38 +0000 (10:41 -0300)]
i7300_edac: Get rid of some wrongly-solved rebase conflict

Drivers should not increment tot_dimms manually. This code were
there for an older approach that got removed. After one of the
several rebases that this code suffered until the final version,
the conflict was wrongly merged, and this code returned.

Reviewed-by: Aristeu Rozanski <arozansk@redhat.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
12 years agoi5100_edac: Fix a warning when compiled with 32 bits
Mauro Carvalho Chehab [Thu, 29 Mar 2012 11:41:08 +0000 (08:41 -0300)]
i5100_edac: Fix a warning when compiled with 32 bits

drivers/edac/i5100_edac.c: In function ‘i5100_init_csrows’:
drivers/edac/i5100_edac.c:862:3: warning: format ‘%zd’ expects argument of type ‘signed size_t’, but argument 5 has type ‘long unsigned int’ [-Wformat]

Reviewed-by: Aristeu Rozanski <arozansk@redhat.com>
Cc: "Niklas Söderlund" <niklas.soderlund@ericsson.com>
Cc: Borislav Petkov <borislav.petkov@amd.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
12 years agoi82975x_edac: Test nr_pages earlier to save a few CPU cycles
Mauro Carvalho Chehab [Wed, 28 Mar 2012 23:13:36 +0000 (20:13 -0300)]
i82975x_edac: Test nr_pages earlier to save a few CPU cycles

Avoid test nr_pages twice, and initializing some data that won't
be used.

Cleanup patch only.

Reported-by: Aristeu Rozanski Filho <arozansk@redhat.com>
Reviewed-by: Aristeu Rozanski <arozansk@redhat.com>
Cc: Ranganathan Desikan <ravi@jetztechnologies.com>
Cc: "Arvind R." <arvino55@gmail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
12 years agoedac: Move grain/dtype/edac_type calculus to be out of channel loop
Mauro Carvalho Chehab [Wed, 28 Mar 2012 22:37:59 +0000 (19:37 -0300)]
edac: Move grain/dtype/edac_type calculus to be out of channel loop

The 3e7bddc changeset (edac: move dimm properties to struct memset_info)
moved the calculus inside a loop. However, at those stuff are common to
all channels, on several drivers, it is better to put the calculus
outside the loop, to optimize the code.

Reported-by: Aristeu Rozanski Filho <arozansk@redhat.com>
Reviewed-by: Aristeu Rozanski <arozansk@redhat.com>
Cc: Mark Gross <mark.gross@intel.com>
Cc: Doug Thompson <norsk5@yahoo.com>
Cc: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Michal Marek <mmarek@suse.cz>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
12 years agoedac: Add debufs nodes to allow doing fake error inject
Mauro Carvalho Chehab [Mon, 26 Mar 2012 12:35:11 +0000 (09:35 -0300)]
edac: Add debufs nodes to allow doing fake error inject

Sometimes, it is useful to have a mechanism that generates fake
errors, in order to test the EDAC core code, and the userspace
tools.

Provide such mechanism by adding a few debugfs nodes.

Reviewed-by: Aristeu Rozanski <arozansk@redhat.com>
Cc: Doug Thompson <norsk5@yahoo.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
12 years agoedac: add a sysfs node to report the maximum location for the system
Mauro Carvalho Chehab [Wed, 21 Mar 2012 20:13:24 +0000 (17:13 -0300)]
edac: add a sysfs node to report the maximum location for the system

The userspace tools need to know what's the maximum location on each
system, as it helps to create nice maps showing how the memory was
filled at the system.

Reviewed-by: Aristeu Rozanski <arozansk@redhat.com>
Cc: Doug Thompson <norsk5@yahoo.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
12 years agoedac: add a new per-dimm API and make the old per-virtual-rank API obsolete
Mauro Carvalho Chehab [Wed, 21 Mar 2012 20:06:53 +0000 (17:06 -0300)]
edac: add a new per-dimm API and make the old per-virtual-rank API obsolete

The old EDAC API is broken. It only works fine for systems manufatured
before 2005 and for AMD 64. The reason is that it forces all memory
controller drivers to discover rank info.

Also, it doesn't allow grouping the several ranks into a DIMM.

So, what almost all modern drivers do is to create a fake virtual-rank
information, and use it to cheat the EDAC core to accept the driver.

While this works if the user has enough time to discover what DIMM slot
corresponds to each "virtual-rank" information, it prevents EDAC usage
for users with less available time. It also makes life hard for vendors
that may want to provide a table with their motherboards to the userspace
tool (edac-utils) as each driver has its own logic for the virtual
mapping.

So, the old API should be removed, in favor of a more flexible API that
allows newer drivers to not lie to the EDAC core.

Reviewed-by: Aristeu Rozanski <arozansk@redhat.com>
Cc: Doug Thompson <norsk5@yahoo.com>
Cc: Borislav Petkov <borislav.petkov@amd.com>
Cc: Randy Dunlap <rdunlap@xenotime.net>
Cc: Josh Boyer <jwboyer@redhat.com>
Cc: Hui Wang <jason77.wang@gmail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
12 years agoedac: Get rid of the old kobj's from the edac mc code
Mauro Carvalho Chehab [Wed, 21 Mar 2012 19:55:02 +0000 (16:55 -0300)]
edac: Get rid of the old kobj's from the edac mc code

Now that al users for the old kobj raw access are gone,
we can get rid of the legacy kobj-based structures and
data.

Reviewed-by: Aristeu Rozanski <arozansk@redhat.com>
Cc: Doug Thompson <norsk5@yahoo.com>
Cc: Michal Marek <mmarek@suse.cz>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
12 years agoi7core_edac: convert it to use struct device
Mauro Carvalho Chehab [Wed, 21 Mar 2012 14:08:06 +0000 (11:08 -0300)]
i7core_edac: convert it to use struct device

Instead of relying on a complex logic inside the edac core to create
a "device tree-like" sysfs struct, just use device_add.

Reviewed-by: Aristeu Rozanski <arozansk@redhat.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
12 years agoamd64_edac: convert sysfs logic to use struct device
Mauro Carvalho Chehab [Wed, 21 Mar 2012 17:00:44 +0000 (14:00 -0300)]
amd64_edac: convert sysfs logic to use struct device

Now that the EDAC core supports struct device, there's no sense
on having any logic at the EDAC core to simulate it. So, instead
of adding such logic there, change the logic at amd64_edac to
use it.

Reviewed-by: Aristeu Rozanski <arozansk@redhat.com>
Cc: Doug Thompson <norsk5@yahoo.com>
Cc: Borislav Petkov <borislav.petkov@amd.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
12 years agompc85xx_edac: convert sysfs logic to use struct device
Mauro Carvalho Chehab [Wed, 21 Mar 2012 18:16:20 +0000 (15:16 -0300)]
mpc85xx_edac: convert sysfs logic to use struct device

Now that the EDAC core supports struct device, there's no sense on
having any logic at the EDAC core to simulate it. So, instead of adding
such logic there, change the logic at mpc85xx_edac to use it

compile-tested only.

Reviewed-by: Aristeu Rozanski <arozansk@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Shaohui Xie <Shaohui.Xie@freescale.com>
Cc: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
12 years agoedac: rewrite the sysfs code to use struct device
Mauro Carvalho Chehab [Mon, 16 Apr 2012 19:41:11 +0000 (16:41 -0300)]
edac: rewrite the sysfs code to use struct device

The EDAC subsystem uses the old struct sysdev approach,
creating all nodes using the raw sysfs API. This is bad,
as the API is deprecated.

As we'll be changing the EDAC API, let's first port the existing
code to struct device.

There's one drawback on this patch: driver-specific sysfs
nodes, used by mpc85xx_edac, amd64_edac and i7core_edac
 won't be created anymore. While it would be possible to
also port the device-specific code, that would mix kobj with
struct device, with is not recommended. Also, it is easier and nicer
to move the code to the drivers, instead, as the core can get rid
of some complex logic that just emulates what the device_add()
and device_create_file() already does.

The next patches will convert the driver-specific code to use
the device-specific calls. Then, the remaining bits of the old
sysfs API will be removed.

NOTE: a per-MC bus is required, otherwise devices with more than
one memory controller will hit a bug like the one below:

[  819.094946] EDAC DEBUG: find_mci_by_dev: find_mci_by_dev()
[  819.094948] EDAC DEBUG: edac_create_sysfs_mci_device: edac_create_sysfs_mci_device() idx=1
[  819.094952] EDAC DEBUG: edac_create_sysfs_mci_device: edac_create_sysfs_mci_device(): creating device mc1
[  819.094967] EDAC DEBUG: edac_create_sysfs_mci_device: edac_create_sysfs_mci_device creating dimm0, located at channel 0 slot 0
[  819.094984] ------------[ cut here ]------------
[  819.100142] WARNING: at fs/sysfs/dir.c:481 sysfs_add_one+0xc1/0xf0()
[  819.107282] Hardware name: S2600CP
[  819.111078] sysfs: cannot create duplicate filename '/bus/edac/devices/dimm0'
[  819.119062] Modules linked in: sb_edac(+) edac_core ip6table_filter ip6_tables ebtable_nat ebtables ipt_MASQUERADE iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack ipt_REJECT xt_CHECKSUM iptable_mangle iptable_filter ip_tables bridge stp llc sunrpc binfmt_misc dm_mirror dm_region_hash dm_log vhost_net macvtap macvlan tun kvm microcode pcspkr iTCO_wdt iTCO_vendor_support igb i2c_i801 i2c_core sg ioatdma dca sr_mod cdrom sd_mod crc_t10dif ahci libahci isci libsas libata scsi_transport_sas scsi_mod wmi dm_mod [last unloaded: scsi_wait_scan]
[  819.175748] Pid: 10902, comm: modprobe Not tainted 3.3.0-0.11.el7.v12.2.x86_64 #1
[  819.184113] Call Trace:
[  819.186868]  [<ffffffff8105adaf>] warn_slowpath_common+0x7f/0xc0
[  819.193573]  [<ffffffff8105aea6>] warn_slowpath_fmt+0x46/0x50
[  819.200000]  [<ffffffff811f53d1>] sysfs_add_one+0xc1/0xf0
[  819.206025]  [<ffffffff811f5cf5>] sysfs_do_create_link+0x135/0x220
[  819.212944]  [<ffffffff811f7023>] ? sysfs_create_group+0x13/0x20
[  819.219656]  [<ffffffff811f5df3>] sysfs_create_link+0x13/0x20
[  819.226109]  [<ffffffff813b04f6>] bus_add_device+0xe6/0x1b0
[  819.232350]  [<ffffffff813ae7cb>] device_add+0x2db/0x460
[  819.238300]  [<ffffffffa0325634>] edac_create_dimm_object+0x84/0xf0 [edac_core]
[  819.246460]  [<ffffffffa0325e18>] edac_create_sysfs_mci_device+0xe8/0x290 [edac_core]
[  819.255215]  [<ffffffffa0322e2a>] edac_mc_add_mc+0x5a/0x2c0 [edac_core]
[  819.262611]  [<ffffffffa03412df>] sbridge_register_mci+0x1bc/0x279 [sb_edac]
[  819.270493]  [<ffffffffa03417a3>] sbridge_probe+0xef/0x175 [sb_edac]
[  819.277630]  [<ffffffff813ba4e8>] ? pm_runtime_enable+0x58/0x90
[  819.284268]  [<ffffffff812f430c>] local_pci_probe+0x5c/0xd0
[  819.290508]  [<ffffffff812f5ba1>] __pci_device_probe+0xf1/0x100
[  819.297117]  [<ffffffff812f5bea>] pci_device_probe+0x3a/0x60
[  819.303457]  [<ffffffff813b1003>] really_probe+0x73/0x270
[  819.309496]  [<ffffffff813b138e>] driver_probe_device+0x4e/0xb0
[  819.316104]  [<ffffffff813b149b>] __driver_attach+0xab/0xb0
[  819.322337]  [<ffffffff813b13f0>] ? driver_probe_device+0xb0/0xb0
[  819.329151]  [<ffffffff813af5d6>] bus_for_each_dev+0x56/0x90
[  819.335489]  [<ffffffff813b0d7e>] driver_attach+0x1e/0x20
[  819.341534]  [<ffffffff813b0980>] bus_add_driver+0x1b0/0x2a0
[  819.347884]  [<ffffffffa0347000>] ? 0xffffffffa0346fff
[  819.353641]  [<ffffffff813b19f6>] driver_register+0x76/0x140
[  819.359980]  [<ffffffff8159f18b>] ? printk+0x51/0x53
[  819.365524]  [<ffffffffa0347000>] ? 0xffffffffa0346fff
[  819.371291]  [<ffffffff812f5896>] __pci_register_driver+0x56/0xd0
[  819.378096]  [<ffffffffa0347054>] sbridge_init+0x54/0x1000 [sb_edac]
[  819.385231]  [<ffffffff8100203f>] do_one_initcall+0x3f/0x170
[  819.391577]  [<ffffffff810bcd2e>] sys_init_module+0xbe/0x230
[  819.397926]  [<ffffffff815bb529>] system_call_fastpath+0x16/0x1b
[  819.404633] ---[ end trace 1654fdd39556689f ]---

This happens because the bus is not being properly initialized.
Instead of putting the memory sub-devices inside the memory controller,
it is putting everything under the same directory:

$ tree /sys/bus/edac/
/sys/bus/edac/
├── devices
│   ├── all_channel_counts -> ../../../devices/system/edac/mc/mc0/all_channel_counts
│   ├── csrow0 -> ../../../devices/system/edac/mc/mc0/csrow0
│   ├── csrow1 -> ../../../devices/system/edac/mc/mc0/csrow1
│   ├── csrow2 -> ../../../devices/system/edac/mc/mc0/csrow2
│   ├── dimm0 -> ../../../devices/system/edac/mc/mc0/dimm0
│   ├── dimm1 -> ../../../devices/system/edac/mc/mc0/dimm1
│   ├── dimm3 -> ../../../devices/system/edac/mc/mc0/dimm3
│   ├── dimm6 -> ../../../devices/system/edac/mc/mc0/dimm6
│   ├── inject_addrmatch -> ../../../devices/system/edac/mc/mc0/inject_addrmatch
│   ├── mc -> ../../../devices/system/edac/mc
│   └── mc0 -> ../../../devices/system/edac/mc/mc0
├── drivers
├── drivers_autoprobe
├── drivers_probe
└── uevent

On a multi-memory controller system, the names "csrow%d" and "dimm%d"
should be under "mc%d", and not at the main hierarchy level.

So, we need to create a per-MC bus, in order to have its own namespace.

Reviewed-by: Aristeu Rozanski <arozansk@redhat.com>
Cc: Doug Thompson <norsk5@yahoo.com>
Cc: Greg K H <gregkh@linuxfoundation.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
12 years agoedac: use Documentation-nano format for some data structs
Mauro Carvalho Chehab [Wed, 21 Mar 2012 19:21:07 +0000 (16:21 -0300)]
edac: use Documentation-nano format for some data structs

No functional changes. Just comment improvements.

Reviewed-by: Aristeu Rozanski <arozansk@redhat.com>
Cc: Doug Thompson <norsk5@yahoo.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
12 years agoedac: Rename the parent dev to pdev
Mauro Carvalho Chehab [Fri, 16 Mar 2012 10:44:18 +0000 (07:44 -0300)]
edac: Rename the parent dev to pdev

As EDAC doesn't use struct device itself, it created a parent dev
pointer called as "pdev".  Now that we'll be converting it to use
struct device, instead of struct devsys, this needs to be fixed.

No functional changes.

Reviewed-by: Aristeu Rozanski <arozansk@redhat.com>
Acked-by: Chris Metcalf <cmetcalf@tilera.com>
Cc: Doug Thompson <norsk5@yahoo.com>
Cc: Borislav Petkov <borislav.petkov@amd.com>
Cc: Mark Gross <mark.gross@intel.com>
Cc: Jason Uhlenkott <juhlenko@akamai.com>
Cc: Tim Small <tim@buttersideup.com>
Cc: Ranganathan Desikan <ravi@jetztechnologies.com>
Cc: "Arvind R." <arvino55@gmail.com>
Cc: Olof Johansson <olof@lixom.net>
Cc: Egor Martovetsky <egor@pasemi.com>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Joe Perches <joe@perches.com>
Cc: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Hitoshi Mitake <h.mitake@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: "Niklas Söderlund" <niklas.soderlund@ericsson.com>
Cc: Shaohui Xie <Shaohui.Xie@freescale.com>
Cc: Josh Boyer <jwboyer@gmail.com>
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
12 years agoe752x_edac: provide more info about how DIMMS/ranks are mapped
Mauro Carvalho Chehab [Thu, 15 Mar 2012 16:41:17 +0000 (13:41 -0300)]
e752x_edac: provide more info about how DIMMS/ranks are mapped

No funtional changes here. Only the comments got updated.

Reviewed-by: Aristeu Rozanski <arozansk@redhat.com>
Cc: Mark Gross <mark.gross@intel.com>
Cc: Doug Thompson <norsk5@yahoo.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
12 years agoi5000_edac: Fix the logic that retrieves memory information
Mauro Carvalho Chehab [Fri, 24 Feb 2012 12:34:54 +0000 (09:34 -0300)]
i5000_edac: Fix the logic that retrieves memory information

The logic there is broken: it basically creates two csrows for
each DIMM and assumes that all DIMM's are dual rank. Only one of
the csrows will contain the entire DIMM size. If single rank
memories are found, they'll be marked with 0 bytes.

The check if the AMB is present were also wrong.

Yet, as the error reports don't use the memory size in order to
credit an error to the right DIMM, that part of the driver seems
to work. That's why probably nobody detected the issue yet.

After this patch, the memory layout is now properly reported,
when debug mode is enabled, and the number of ranks per dimm is
now shown:

calculate_dimm_size: ----------------------------------------------------------
calculate_dimm_size: slot  3       0 MB   |    0 MB   |    0 MB   |    0 MB   |
calculate_dimm_size: slot  2       0 MB   |    0 MB   |    0 MB   |    0 MB   |
calculate_dimm_size: ----------------------------------------------------------
calculate_dimm_size: slot  1       0 MB   |    0 MB   |    0 MB   |    0 MB   |
calculate_dimm_size: slot  0     512 MB 1R|  512 MB 1R|  512 MB 1R|  512 MB 1R|
calculate_dimm_size: ----------------------------------------------------------
calculate_dimm_size:            channel 0 | channel 1 | channel 2 | channel 3 |
calculate_dimm_size:                   branch 0       |        branch 1       |

(1R above means that all memories on my test machine are single-ranked)

Reviewed-by: Aristeu Rozanski <arozansk@redhat.com>
Cc: Doug Thompson <norsk5@yahoo.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
12 years agoRAS: Add a tracepoint for reporting memory controller events
Mauro Carvalho Chehab [Thu, 23 Feb 2012 11:10:34 +0000 (08:10 -0300)]
RAS: Add a tracepoint for reporting memory controller events

Add a new tracepoint-based hardware events report method for
reporting Memory Controller events.

Part of the description bellow is shamelessly copied from Tony
Luck's notes about the Hardware Error BoF during LPC 2010 [1].
Tony, thanks for your notes and discussions to generate the
h/w error reporting requirements.

[1] http://lwn.net/Articles/416669/

    We have several subsystems & methods for reporting hardware errors:

    1) EDAC ("Error Detection and Correction").  In its original form
    this consisted of a platform specific driver that read topology
    information and error counts from chipset registers and reported
    the results via a sysfs interface.

    2) mcelog - x86 specific decoding of machine check bank registers
    reporting in binary form via /dev/mcelog. Recent additions make use
    of the APEI extensions that were documented in version 4.0a of the
    ACPI specification to acquire more information about errors without
    having to rely reading chipset registers directly. A user level
    programs decodes into somewhat human readable format.

    3) drivers/edac/mce_amd.c - this driver hooks into the mcelog path and
    decodes errors reported via machine check bank registers in AMD
    processors to the console log using printk();

    Each of these mechanisms has a band of followers ... and none
    of them appear to meet all the needs of all users.

As part of a RAS subsystem, let's encapsulate the memory error hardware
events into a trace facility.

The tracepoint printk will be displayed like:

mc_event: (Corrected|Uncorrected|Fatal) error:[error msg] on memory stick "[label]" ([location] [edac_mc detail] [driver_detail])

Where:
[error msg] is the driver-specific error message
    (e. g. "memory read", "bus error", ...);
[location] is the location in terms of memory controller and
   branch/channel/slot, channel/slot or csrow/channel;
[label] is the memory stick label;
[edac_mc detail] describes the address location of the error
 and the syndrome;
[driver detail] is driver-specifig error message details,
when needed/provided (e. g. "area:DMA", ...)

For example:

mc_event: Corrected error:memory read on memory stick "DIMM_1A" (mc:0 channel:0 slot:0 page:0x586b6e offset:0xa66 grain:32 syndrome:0x0 area:DMA)

Of course, any userspace tools meant to handle errors should not parse
the above data. They should, instead, use the binary fields provided by
the tracepoint, mapping them directly into their Management Information
Base.

NOTE: The original patch was providing an additional mechanism for
MCA-based trace events that also contained MCA error register data.
However, as no agreement was reached so far for the MCA-based trace
events, for now, let's add events only for memory errors.
A latter patch is planned to change the tracepoint, for those types
of event.

Cc: Aristeu Rozanski <arozansk@redhat.com>
Cc: Doug Thompson <norsk5@yahoo.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
12 years agoi5400_edac: improve debug messages to better represent the filled memory
Mauro Carvalho Chehab [Sun, 12 Feb 2012 20:18:06 +0000 (17:18 -0300)]
i5400_edac: improve debug messages to better represent the filled memory

Improves the debug output message, in order to better represent the
memory controller hierarchy, when outputing the debug messages.

No functional changes when debug is disabled.

Reviewed-by: Aristeu Rozanski <arozansk@redhat.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
12 years agoedac: Cleanup the logs for i7core and sb edac drivers
Mauro Carvalho Chehab [Fri, 11 May 2012 14:41:45 +0000 (11:41 -0300)]
edac: Cleanup the logs for i7core and sb edac drivers

Remove some information that it is duplicated at the MCE log,
and don't have much usage for the error. Those data will be
added again, when creating a trace function that outputs both
memory errors and MCE fields.

Cc: Aristeu Rozanski <arozansk@redhat.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
12 years agoedac: Initialize the dimm label with the known information
Mauro Carvalho Chehab [Thu, 9 Feb 2012 14:05:20 +0000 (11:05 -0300)]
edac: Initialize the dimm label with the known information

While userspace doesn't fill the dimm labels, add there the dimm location,
as described by the used memory model. This could eventually match what
is described at the dmidecode, making easier for people to identify the
memory.

For example, on an Intel motherboard where the DMI table is reliable,
the first memory stick is described as:

Memory Device
Array Handle: 0x0029
Error Information Handle: Not Provided
Total Width: 64 bits
Data Width: 64 bits
Size: 2048 MB
Form Factor: DIMM
Set: 1
Locator: A1_DIMM0
Bank Locator: A1_Node0_Channel0_Dimm0
Type: <OUT OF SPEC>
Type Detail: Synchronous
Speed: 800 MHz
Manufacturer: A1_Manufacturer0
Serial Number: A1_SerNum0
Asset Tag: A1_AssetTagNum0
Part Number: A1_PartNum0

The memory named as "A1_DIMM0" is physically located at the first
memory controller (node 0), at channel 0, dimm slot 0.

After this patch, the memory label will be filled with:
/sys/devices/system/edac/mc/csrow0/ch0_dimm_label:mc#0channel#0slot#0

And (after the new EDAC API patches) as:
/sys/devices/system/edac/mc/mc0/dimm0/dimm_label:mc#0channel#0slot#0

So, even if the memory label is not initialized on userspace, an useful
information with the error location is filled there, expecially since
several systems/motherboards are provided with enough info to map from
channel/slot (or branch/channel/slot) into the DIMM label. So, letting the
EDAC core fill it by default is a good thing.

It should noticed that, as the label filling happens at the
edac_mc_alloc(), drivers can override it to better describe the memories
(and some actually do it).

Cc: Aristeu Rozanski <arozansk@redhat.com>
Cc: Doug Thompson <norsk5@yahoo.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
12 years agoedac: Remove the legacy EDAC ABI
Mauro Carvalho Chehab [Wed, 2 May 2012 17:37:00 +0000 (14:37 -0300)]
edac: Remove the legacy EDAC ABI

Now that all drivers got converted to use the new ABI, we can
drop the old one.

Acked-by: Chris Metcalf <cmetcalf@tilera.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
12 years agox38_edac: convert driver to use the new edac ABI
Mauro Carvalho Chehab [Mon, 16 Apr 2012 18:12:35 +0000 (15:12 -0300)]
x38_edac: convert driver to use the new edac ABI

The legacy edac ABI is going to be removed. Port the driver to use
and benefit from the new API functionality.

Cc: Borislav Petkov <borislav.petkov@amd.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
12 years agotile_edac: convert driver to use the new edac ABI
Mauro Carvalho Chehab [Mon, 16 Apr 2012 18:12:28 +0000 (15:12 -0300)]
tile_edac: convert driver to use the new edac ABI

The legacy edac ABI is going to be removed. Port the driver to use
and benefit from the new API functionality.

Acked-by: Chris Metcalf <cmetcalf@tilera.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
12 years agosb_edac: convert driver to use the new edac ABI
Mauro Carvalho Chehab [Mon, 16 Apr 2012 18:12:22 +0000 (15:12 -0300)]
sb_edac: convert driver to use the new edac ABI

The legacy edac ABI is going to be removed. Port the driver to use
and benefit from the new API functionality.

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
12 years agor82600_edac: convert driver to use the new edac ABI
Mauro Carvalho Chehab [Mon, 16 Apr 2012 18:12:13 +0000 (15:12 -0300)]
r82600_edac: convert driver to use the new edac ABI

The legacy edac ABI is going to be removed. Port the driver to use
and benefit from the new API functionality.

Cc: Tim Small <tim@buttersideup.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
12 years agoppc4xx_edac: convert driver to use the new edac ABI
Mauro Carvalho Chehab [Mon, 16 Apr 2012 18:11:59 +0000 (15:11 -0300)]
ppc4xx_edac: convert driver to use the new edac ABI

The legacy edac ABI is going to be removed. Port the driver to use
and benefit from the new API functionality.

Cc: Josh Boyer <jwboyer@gmail.com>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Borislav Petkov <borislav.petkov@amd.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
12 years agopasemi_edac: convert driver to use the new edac ABI
Mauro Carvalho Chehab [Mon, 16 Apr 2012 18:11:36 +0000 (15:11 -0300)]
pasemi_edac: convert driver to use the new edac ABI

The legacy edac ABI is going to be removed. Port the driver to use
and benefit from the new API functionality.

Cc: Olof Johansson <olof@lixom.net>
Cc: Egor Martovetsky <egor@pasemi.com>
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
12 years agomv64x60_edac: convert driver to use the new edac ABI
Mauro Carvalho Chehab [Mon, 16 Apr 2012 18:11:20 +0000 (15:11 -0300)]
mv64x60_edac: convert driver to use the new edac ABI

The legacy edac ABI is going to be removed. Port the driver to use
and benefit from the new API functionality.

Cc: Borislav Petkov <borislav.petkov@amd.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
12 years agompc85xx_edac: convert driver to use the new edac ABI
Mauro Carvalho Chehab [Mon, 16 Apr 2012 18:11:08 +0000 (15:11 -0300)]
mpc85xx_edac: convert driver to use the new edac ABI

The legacy edac ABI is going to be removed. Port the driver to use
and benefit from the new API functionality.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Shaohui Xie <Shaohui.Xie@freescale.com>
Cc: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
12 years agoi82975x_edac: convert driver to use the new edac ABI
Mauro Carvalho Chehab [Mon, 16 Apr 2012 18:10:55 +0000 (15:10 -0300)]
i82975x_edac: convert driver to use the new edac ABI

The legacy edac ABI is going to be removed. Port the driver to use
and benefit from the new API functionality.

Cc: Ranganathan Desikan <ravi@jetztechnologies.com>
Cc: "Arvind R." <arvino55@gmail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
12 years agoi82875p_edac: convert driver to use the new edac ABI
Mauro Carvalho Chehab [Mon, 16 Apr 2012 18:10:43 +0000 (15:10 -0300)]
i82875p_edac: convert driver to use the new edac ABI

The legacy edac ABI is going to be removed. Port the driver to use
and benefit from the new API functionality.

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
12 years agoi82860_edac: convert driver to use the new edac ABI
Mauro Carvalho Chehab [Mon, 16 Apr 2012 18:10:31 +0000 (15:10 -0300)]
i82860_edac: convert driver to use the new edac ABI

The legacy edac ABI is going to be removed. Port the driver to use
and benefit from the new API functionality.

Cc: Michal Marek <mmarek@suse.cz>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
12 years agoi82443bxgx_edac: convert driver to use the new edac ABI
Mauro Carvalho Chehab [Mon, 16 Apr 2012 18:10:18 +0000 (15:10 -0300)]
i82443bxgx_edac: convert driver to use the new edac ABI

The legacy edac ABI is going to be removed. Port the driver to use
and benefit from the new API functionality.

Cc: Tim Small <tim@buttersideup.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
12 years agoi7core_edac: convert driver to use the new edac ABI
Mauro Carvalho Chehab [Mon, 16 Apr 2012 18:10:12 +0000 (15:10 -0300)]
i7core_edac: convert driver to use the new edac ABI

The legacy edac ABI is going to be removed. Port the driver to use
and benefit from the new API functionality.

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
12 years agoi7300_edac: convert driver to use the new edac ABI
Mauro Carvalho Chehab [Mon, 16 Apr 2012 18:10:05 +0000 (15:10 -0300)]
i7300_edac: convert driver to use the new edac ABI

The legacy edac ABI is going to be removed. Port the driver to use
and benefit from the new API functionality.

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
12 years agoi5400_edac: convert driver to use the new edac ABI
Mauro Carvalho Chehab [Mon, 16 Apr 2012 18:09:58 +0000 (15:09 -0300)]
i5400_edac: convert driver to use the new edac ABI

The legacy edac ABI is going to be removed. Port the driver to use
and benefit from the new API functionality.

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
12 years agoi5100_edac: convert driver to use the new edac ABI
Mauro Carvalho Chehab [Mon, 16 Apr 2012 18:09:52 +0000 (15:09 -0300)]
i5100_edac: convert driver to use the new edac ABI

The legacy edac ABI is going to be removed. Port the driver to use
and benefit from the new API functionality.

Cc: "Niklas Söderlund" <niklas.soderlund@ericsson.com>
Cc: Borislav Petkov <borislav.petkov@amd.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
12 years agoi5000_edac: convert driver to use the new edac ABI
Mauro Carvalho Chehab [Mon, 16 Apr 2012 18:09:46 +0000 (15:09 -0300)]
i5000_edac: convert driver to use the new edac ABI

The legacy edac ABI is going to be removed. Port the driver to use
and benefit from the new API functionality.

Cc: Doug Thompson <norsk5@yahoo.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
12 years agoi3200_edac: convert driver to use the new edac ABI
Mauro Carvalho Chehab [Mon, 16 Apr 2012 18:09:38 +0000 (15:09 -0300)]
i3200_edac: convert driver to use the new edac ABI

The legacy edac ABI is going to be removed. Port the driver to use
and benefit from the new API functionality.

Cc: Hitoshi Mitake <h.mitake@gmail.com>
Cc: Borislav Petkov <borislav.petkov@amd.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
12 years agoi3000_edac: convert driver to use the new edac ABI
Mauro Carvalho Chehab [Mon, 16 Apr 2012 18:09:00 +0000 (15:09 -0300)]
i3000_edac: convert driver to use the new edac ABI

The legacy edac ABI is going to be removed. Port the driver to use
and benefit from the new API functionality.

Cc: Jason Uhlenkott <juhlenko@akamai.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
12 years agoe7xxx_edac: convert driver to use the new edac ABI
Mauro Carvalho Chehab [Mon, 16 Apr 2012 18:07:09 +0000 (15:07 -0300)]
e7xxx_edac: convert driver to use the new edac ABI

The legacy edac ABI is going to be removed. Port the driver to use
and benefit from the new API functionality.

Cc: Doug Thompson <norsk5@yahoo.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
12 years agoe752x_edac: convert driver to use the new edac ABI
Mauro Carvalho Chehab [Mon, 16 Apr 2012 18:06:59 +0000 (15:06 -0300)]
e752x_edac: convert driver to use the new edac ABI

The legacy edac ABI is going to be removed. Port the driver to use
and benefit from the new API functionality.

Cc: Mark Gross <mark.gross@intel.com>
Cc: Doug Thompson <norsk5@yahoo.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
12 years agocpc925_edac: convert driver to use the new edac ABI
Mauro Carvalho Chehab [Mon, 16 Apr 2012 18:06:50 +0000 (15:06 -0300)]
cpc925_edac: convert driver to use the new edac ABI

The legacy edac ABI is going to be removed. Port the driver to use
and benefit from the new API functionality.

Cc: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Michal Marek <mmarek@suse.cz>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
12 years agocell_edac: convert driver to use the new edac ABI
Mauro Carvalho Chehab [Mon, 16 Apr 2012 18:06:25 +0000 (15:06 -0300)]
cell_edac: convert driver to use the new edac ABI

The legacy edac ABI is going to be removed. Port the driver to use
and benefit from the new API functionality.

Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
12 years agoamd76x_edac: convert driver to use the new edac ABI
Mauro Carvalho Chehab [Mon, 16 Apr 2012 18:05:27 +0000 (15:05 -0300)]
amd76x_edac: convert driver to use the new edac ABI

The legacy edac ABI is going to be removed. Port the driver to use
and benefit from the new API functionality.

Cc: Michal Marek <mmarek@suse.cz>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
12 years agoamd64_edac: convert driver to use the new edac ABI
Mauro Carvalho Chehab [Mon, 16 Apr 2012 18:03:50 +0000 (15:03 -0300)]
amd64_edac: convert driver to use the new edac ABI

The legacy edac ABI is going to be removed. Port the driver to use
and benefit from the new API functionality.

Cc: Doug Thompson <norsk5@yahoo.com>
Cc: Borislav Petkov <borislav.petkov@amd.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
12 years agoedac: Change internal representation to work with layers
Mauro Carvalho Chehab [Wed, 18 Apr 2012 18:20:50 +0000 (15:20 -0300)]
edac: Change internal representation to work with layers

Change the EDAC internal representation to work with non-csrow
based memory controllers.

There are lots of those memory controllers nowadays, and more
are coming. So, the EDAC internal representation needs to be
changed, in order to work with those memory controllers, while
preserving backward compatibility with the old ones.

The edac core was written with the idea that memory controllers
are able to directly access csrows.

This is not true for FB-DIMM and RAMBUS memory controllers.

Also, some recent advanced memory controllers don't present a per-csrows
view. Instead, they view memories as DIMMs, instead of ranks.

So, change the allocation and error report routines to allow
them to work with all types of architectures.

This will allow the removal of several hacks with FB-DIMM and RAMBUS
memory controllers.

Also, several tests were done on different platforms using different
x86 drivers.

TODO: a multi-rank DIMMs are currently represented by multiple DIMM
entries in struct dimm_info. That means that changing a label for one
rank won't change the same label for the other ranks at the same DIMM.
This bug is present since the beginning of the EDAC, so it is not a big
deal. However, on several drivers, it is possible to fix this issue, but
it should be a per-driver fix, as the csrow => DIMM arrangement may not
be equal for all. So, don't try to fix it here yet.

I tried to make this patch as short as possible, preceding it with
several other patches that simplified the logic here. Yet, as the
internal API changes, all drivers need changes. The changes are
generally bigger in the drivers for FB-DIMMs.

Cc: Aristeu Rozanski <arozansk@redhat.com>
Cc: Doug Thompson <norsk5@yahoo.com>
Cc: Borislav Petkov <borislav.petkov@amd.com>
Cc: Mark Gross <mark.gross@intel.com>
Cc: Jason Uhlenkott <juhlenko@akamai.com>
Cc: Tim Small <tim@buttersideup.com>
Cc: Ranganathan Desikan <ravi@jetztechnologies.com>
Cc: "Arvind R." <arvino55@gmail.com>
Cc: Olof Johansson <olof@lixom.net>
Cc: Egor Martovetsky <egor@pasemi.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Joe Perches <joe@perches.com>
Cc: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Hitoshi Mitake <h.mitake@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: "Niklas Söderlund" <niklas.soderlund@ericsson.com>
Cc: Shaohui Xie <Shaohui.Xie@freescale.com>
Cc: Josh Boyer <jwboyer@gmail.com>
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
12 years agoedac.h: Add generic layers for describing a memory location
Mauro Carvalho Chehab [Mon, 16 Apr 2012 16:04:46 +0000 (13:04 -0300)]
edac.h: Add generic layers for describing a memory location

The edac core were written with the idea that memory controllers
are able to directly access csrows, and that the channels are
used inside a csrows select.

This is not true for FB-DIMM and RAMBUS memory controllers.

Also, some recent advanced memory controllers don't present a per-csrows
view. Instead, they view memories as DIMMs, instead of ranks, accessed
via csrow/channel.

So, changes are needed in order to allow the EDAC core to
work with all types of architectures.

In preparation for handling non-csrows based memory controllers,
add some memory structs and a macro:

enum hw_event_mc_err_type: describes the type of error
   (corrected, uncorrected, fatal)

To be used by the new edac_mc_handle_error function;

enum edac_mc_layer: describes the type of a given memory
architecture layer (branch, channel, slot, csrow).

struct edac_mc_layer: describes the properties of a memory
      layer (type, size, and if the layer
      will be used on a virtual csrow.

EDAC_DIMM_PTR() - as the number of layers can vary from 1 to 3,
this macro converts from an address with up to 3 layers into
a linear address.

Reviewed-by: Borislav Petkov <bp@amd64.org>
Cc: Doug Thompson <norsk5@yahoo.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
12 years agoedac: rewrite edac_align_ptr()
Mauro Carvalho Chehab [Mon, 16 Apr 2012 13:18:12 +0000 (10:18 -0300)]
edac: rewrite edac_align_ptr()

The edac_align_ptr() function is used to prepare data for a single
memory allocation kzalloc() call. It counts how many bytes are needed
by some data structure.

Using it as-is is not that trivial, as the quantity of memory elements
reserved is not there, but, instead, it is on a next call.

In order to avoid mistakes when using it, move the number of allocated
elements into it, making easier to use it.

Reviewed-by: Borislav Petkov <bp@amd64.org>
Cc: Aristeu Rozanski <arozansk@redhat.com>
Cc: Doug Thompson <norsk5@yahoo.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
12 years agoedac: move nr_pages to dimm struct
Mauro Carvalho Chehab [Sat, 28 Jan 2012 12:09:38 +0000 (09:09 -0300)]
edac: move nr_pages to dimm struct

The number of pages is a dimm property. Move it to the dimm struct.

After this change, it is possible to add sysfs nodes for the DIMM's that
will properly represent the DIMM stick properties, including its size.

A TODO fix here is to properly represent dual-rank/quad-rank DIMMs when
the memory controller represents the memory via chip select rows.

Reviewed-by: Aristeu Rozanski <arozansk@redhat.com>
Acked-by: Borislav Petkov <borislav.petkov@amd.com>
Acked-by: Chris Metcalf <cmetcalf@tilera.com>
Cc: Doug Thompson <norsk5@yahoo.com>
Cc: Mark Gross <mark.gross@intel.com>
Cc: Jason Uhlenkott <juhlenko@akamai.com>
Cc: Tim Small <tim@buttersideup.com>
Cc: Ranganathan Desikan <ravi@jetztechnologies.com>
Cc: "Arvind R." <arvino55@gmail.com>
Cc: Olof Johansson <olof@lixom.net>
Cc: Egor Martovetsky <egor@pasemi.com>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Joe Perches <joe@perches.com>
Cc: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Hitoshi Mitake <h.mitake@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: "Niklas Söderlund" <niklas.soderlund@ericsson.com>
Cc: Shaohui Xie <Shaohui.Xie@freescale.com>
Cc: Josh Boyer <jwboyer@gmail.com>
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
12 years agoedac: Don't initialize csrow's first_page & friends when not needed
Mauro Carvalho Chehab [Sat, 28 Jan 2012 00:20:32 +0000 (21:20 -0300)]
edac: Don't initialize csrow's first_page & friends when not needed

Almost all edac drivers initialize csrow_info->first_page,
csrow_info->last_page and csrow_info->page_mask. Those vars are
used inside the EDAC core, in order to calculate the csrow affected
by an error, by using the routine edac_mc_find_csrow_by_page().

However, very few drivers actually use it:
        e752x_edac.c
        e7xxx_edac.c
        i3000_edac.c
        i82443bxgx_edac.c
        i82860_edac.c
        i82875p_edac.c
        i82975x_edac.c
        r82600_edac.c

There also a few other drivers that have their own calculus
formula internally using those vars.

All the others are just wasting time by initializing those
data.

While initializing data without using them won't cause any troubles, as
those information is stored at the wrong place (at csrows structure), it
is better to remove what is unused, in order to simplify the next patch.

Reviewed-by: Aristeu Rozanski <arozansk@redhat.com>
Acked-by: Borislav Petkov <borislav.petkov@amd.com>
Acked-by: Chris Metcalf <cmetcalf@tilera.com>
Cc: Doug Thompson <norsk5@yahoo.com>
Cc: Hitoshi Mitake <h.mitake@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: "Niklas Söderlund" <niklas.soderlund@ericsson.com>
Cc: Josh Boyer <jwboyer@gmail.com>
Cc: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
12 years agoedac: move dimm properties to struct dimm_info
Mauro Carvalho Chehab [Fri, 27 Jan 2012 21:38:08 +0000 (18:38 -0300)]
edac: move dimm properties to struct dimm_info

On systems based on chip select rows, all channels need to use memories
with the same properties, otherwise the memories on channels A and B
won't be recognized.

However, such assumption is not true for all types of memory
controllers.

Controllers for FB-DIMM's don't have such requirements.

Also, modern Intel controllers seem to be capable of handling such
differences.

So, we need to get rid of storing the DIMM information into a per-csrow
data, storing it, instead at the right place.

The first step is to move grain, mtype, dtype and edac_mode to the
per-dimm struct.

Reviewed-by: Aristeu Rozanski <arozansk@redhat.com>
Reviewed-by: Borislav Petkov <borislav.petkov@amd.com>
Acked-by: Chris Metcalf <cmetcalf@tilera.com>
Cc: Doug Thompson <norsk5@yahoo.com>
Cc: Borislav Petkov <borislav.petkov@amd.com>
Cc: Mark Gross <mark.gross@intel.com>
Cc: Jason Uhlenkott <juhlenko@akamai.com>
Cc: Tim Small <tim@buttersideup.com>
Cc: Ranganathan Desikan <ravi@jetztechnologies.com>
Cc: "Arvind R." <arvino55@gmail.com>
Cc: Olof Johansson <olof@lixom.net>
Cc: Egor Martovetsky <egor@pasemi.com>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Joe Perches <joe@perches.com>
Cc: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Hitoshi Mitake <h.mitake@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: James Bottomley <James.Bottomley@parallels.com>
Cc: "Niklas Söderlund" <niklas.soderlund@ericsson.com>
Cc: Shaohui Xie <Shaohui.Xie@freescale.com>
Cc: Josh Boyer <jwboyer@gmail.com>
Cc: Mike Williams <mike@mikebwilliams.com>
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
12 years agoedac: Create a dimm struct and move the labels into it
Mauro Carvalho Chehab [Fri, 27 Jan 2012 17:12:32 +0000 (14:12 -0300)]
edac: Create a dimm struct and move the labels into it

The way a DIMM is currently represented implies that they're
linked into a per-csrow struct. However, some drivers don't see
csrows, as they're ridden behind some chip like the AMB's
on FBDIMM's, for example.

This forced drivers to fake^Wvirtualize a csrow struct, and to create
a mess under csrow/channel original's concept.

Move the DIMM labels into a per-DIMM struct, and add there
the real location of the socket, in terms of csrow/channel.
Latter patches will modify the location to properly represent the
memory architecture.

All other drivers will use a per-csrow type of location.
Some of those drivers will require a latter conversion, as
they also fake the csrows internally.

TODO: While this patch doesn't change the existing behavior, on
csrows-based memory controllers, a csrow/channel pair points to a memory
rank. There's a known bug at the EDAC core that allows having different
labels for the same DIMM, if it has more than one rank. A latter patch
is need to merge the several ranks for a DIMM into the same dimm_info
struct, in order to avoid having different labels for the same DIMM.

The edac_mc_alloc() will now contain a per-dimm initialization loop that
will be changed by latter patches in order to match other types of
memory architectures.

Reviewed-by: Aristeu Rozanski <arozansk@redhat.com>
Reviewed-by: Borislav Petkov <borislav.petkov@amd.com>
Cc: Doug Thompson <norsk5@yahoo.com>
Cc: Ranganathan Desikan <ravi@jetztechnologies.com>
Cc: "Arvind R." <arvino55@gmail.com>
Cc: "Niklas Söderlund" <niklas.soderlund@ericsson.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
12 years agoLinux 3.4-rc7
Linus Torvalds [Sun, 13 May 2012 01:37:47 +0000 (18:37 -0700)]
Linux 3.4-rc7

.. and this should hopefully be the last -rc before final 3.4 release.

12 years agoMerge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm...
Linus Torvalds [Sun, 13 May 2012 00:27:41 +0000 (17:27 -0700)]
Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc

Pull ARM: SoC fixes from Olof Johansson:
 "I was hoping to be done with fixes for 3.4 but we got two branches
  from subarch maintainers the last couple of days.  So here is one
  last(?) pull request for arm-soc containing 7 patches:

   - Five of them are for shmobile dealing with SMP setup and compile
     failures
   - The remaining two are for regressions on the Samsung platforms"

* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
  ARM: EXYNOS: fix ctrlbit for exynos5_clk_pdma1
  ARM: EXYNOS: use s5p-timer for UniversalC210 board
  ARM / mach-shmobile: Invalidate caches when booting secondary cores
  ARM / mach-shmobile: sh73a0 SMP TWD boot regression fix
  ARM / mach-shmobile: r8a7779 SMP TWD boot regression fix
  ARM: mach-shmobile: convert ag5evm to use the generic MMC GPIO hotplug helper
  ARM: mach-shmobile: convert mackerel to use the generic MMC GPIO hotplug helper

12 years agoMerge tag 'gpio-for-linus' of git://git.secretlab.ca/git/linux-2.6
Linus Torvalds [Sun, 13 May 2012 00:24:29 +0000 (17:24 -0700)]
Merge tag 'gpio-for-linus' of git://git.secretlab.ca/git/linux-2.6

Pull a few more GPIO bug fixes from Grant Likely:
 "Oops, missed a couple.  Here's an updated pull req for GPIO"

A set of PCH bug fixes, and one patch to fix up compile warnings

* tag 'gpio-for-linus' of git://git.secretlab.ca/git/linux-2.6:
  gpio/exynos: Fix compiler warnings when non-exynos machines are selected
  gpio: pch9: Use proper flow type handlers

12 years agoMerge branch 'v3.4-samsung-fixes-5' of git://git.kernel.org/pub/scm/linux/kernel...
Olof Johansson [Sat, 12 May 2012 22:41:22 +0000 (15:41 -0700)]
Merge branch 'v3.4-samsung-fixes-5' of git://git.kernel.org/pub/scm/linux/kernel/git/kgene/linux-samsung into fixes

* 'v3.4-samsung-fixes-5' of git://git.kernel.org/pub/scm/linux/kernel/git/kgene/linux-samsung:
  ARM: EXYNOS: fix ctrlbit for exynos5_clk_pdma1
  ARM: EXYNOS: use s5p-timer for UniversalC210 board

12 years agoARM: EXYNOS: fix ctrlbit for exynos5_clk_pdma1
Kukjin Kim [Sat, 12 May 2012 07:45:47 +0000 (16:45 +0900)]
ARM: EXYNOS: fix ctrlbit for exynos5_clk_pdma1

It should be (1 << 2) for ctrlbit of exynos5_clk_pdma1.

Signed-off-by: Kukjin Kim <kgene.kim@samsung.com>
12 years agoARM: EXYNOS: use s5p-timer for UniversalC210 board
Marek Szyprowski [Fri, 11 May 2012 21:17:59 +0000 (06:17 +0900)]
ARM: EXYNOS: use s5p-timer for UniversalC210 board

Commit 069d4e743 ("ARM: EXYNOS4: Remove clock event timers using
ARM private timers") removed support for local timers and forced
to use MCT as event source. However MCT is not operating properly
on early revision of EXYNOS4 SoCs. All UniversalC210 boards are
based on it, so that commit broke support for it. This patch
provides a workaround that enables UniversalC210 boards to boot
again. s5p-timer is used as an event source, it works only for
non-SMP builds.

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Kukjin Kim <kgene.kim@samsung.com>
12 years agoMerge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/renesas...
Olof Johansson [Sat, 12 May 2012 22:40:56 +0000 (15:40 -0700)]
Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/renesas into fixes

By Guennadi Liakhovetski (2) and others via Rafael J. Wysocki:
"[...] urgent fixes for Renesas ARM-based platforms.  Four of these
commits are fixes of regressions new in 3.4-rc and the last one is
necessary for SMP to work on those systems in general."

* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/renesas:
  ARM / mach-shmobile: Invalidate caches when booting secondary cores
  ARM / mach-shmobile: sh73a0 SMP TWD boot regression fix
  ARM / mach-shmobile: r8a7779 SMP TWD boot regression fix
  ARM: mach-shmobile: convert ag5evm to use the generic MMC GPIO hotplug helper
  ARM: mach-shmobile: convert mackerel to use the generic MMC GPIO hotplug helper

12 years agoARM / mach-shmobile: Invalidate caches when booting secondary cores
Magnus Damm [Wed, 9 May 2012 07:24:59 +0000 (16:24 +0900)]
ARM / mach-shmobile: Invalidate caches when booting secondary cores

Make sure L1 caches are invalidated when booting secondary
cores. Needed to boot all mach-shmobile SMP systems that
are using Cortex-A9 including sh73a0, r8a7779 and EMEV2.

Thanks to imx and tegra guys for actual code.

Signed-off-by: Magnus Damm <damm@opensource.se>
Tested-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
12 years agoARM / mach-shmobile: sh73a0 SMP TWD boot regression fix
Kuninori Morimoto [Thu, 10 May 2012 07:26:58 +0000 (00:26 -0700)]
ARM / mach-shmobile: sh73a0 SMP TWD boot regression fix

Fix SMP TWD boot regression on sh73a0 based platforms caused by:

4200b16 ARM: shmobile: convert to twd_local_timer_register() interface

After the merge of the above commit it has been impossible to boot
sh73a0 based SoCs with SMP enabled and CONFIG_HAVE_ARM_TWD=y. The
kernel crashes at smp_init_cpus() timing which is before the console
has been initialized, so to the user this looks like a kernel lock up
without any particular error message.

This patch fixes the regression on sh73a0 by moving the TWD
registration code from smp_init_cpus() to sys_timer->init() time.

This patch removed shmobile_twd_init() which is no longer needed

Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Signed-off-by: Magnus Damm <damm@opensource.se>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
12 years agoARM / mach-shmobile: r8a7779 SMP TWD boot regression fix
Magnus Damm [Thu, 10 May 2012 05:57:22 +0000 (14:57 +0900)]
ARM / mach-shmobile: r8a7779 SMP TWD boot regression fix

Fix SMP TWD boot regression on r8a7779 based platforms caused by:

4200b16 ARM: shmobile: convert to twd_local_timer_register() interface

After the merge of the above commit it has been impossible to boot
r8a7779 based SoCs with SMP enabled and CONFIG_HAVE_ARM_TWD=y. The
kernel crashes at smp_init_cpus() timing which is before the console
has been initialized, so to the user this looks like a kernel lock up
without any particular error message.

This patch fixes the regression on r8a7779 by moving the TWD
registration code from smp_init_cpus() to sys_timer->init() time.

Signed-off-by: Magnus Damm <damm@opensource.se>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
12 years agoARM: mach-shmobile: convert ag5evm to use the generic MMC GPIO hotplug helper
Guennadi Liakhovetski [Mon, 16 Apr 2012 21:09:19 +0000 (23:09 +0200)]
ARM: mach-shmobile: convert ag5evm to use the generic MMC GPIO hotplug helper

This also fixes the following modular mmc build failure:

arch/arm/mach-shmobile/built-in.o: In function `mackerel_sdhi0_gpio_cd':
pfc-sh7372.c:(.text+0x1138): undefined reference to `mmc_detect_change'

on this platform by eliminating the use of an inline function, which
calls into the mmc core.

Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Reviewed-by: Simon Horman <horms@verge.net.au>
Acked-by: Magnus Damm <damm@opensource.se>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
12 years agoARM: mach-shmobile: convert mackerel to use the generic MMC GPIO hotplug helper
Guennadi Liakhovetski [Mon, 16 Apr 2012 21:09:13 +0000 (23:09 +0200)]
ARM: mach-shmobile: convert mackerel to use the generic MMC GPIO hotplug helper

This also fixes the following modular mmc build failure:

arch/arm/mach-shmobile/built-in.o: In function `ag5evm_sdhi0_gpio_cd':
pfc-sh73a0.c:(.text+0x7c0): undefined reference to `mmc_detect_change'

on this platform by eliminating the use of an inline function, which
calls into the mmc core.

Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Tested-by: Simon Horman <horms@verge.net.au>
Acked-by: Magnus Damm <damm@opensource.se>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
12 years agoMerge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Linus Torvalds [Sat, 12 May 2012 20:02:31 +0000 (13:02 -0700)]
Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
 "This is a set of minor qla and virto fixes plus one major regression
  fix (oops in all legacy host drivers)."

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  [SCSI] virtio_scsi: fix TMF use-after-free
  [SCSI] fix oops in all legacy host adapters caused by 6f381fa
  [SCSI] qla2xxx: Update version number to 8.04.00.03-k.
  [SCSI] qla2xxx: Properly check for current state after the fabric-login request.
  [SCSI] qla2xxx: Proper completion to scsi-ml for scsi status task_set_full and busy.
  [SCSI] qla2xxx: Block flash access from application when device is initialized for ISP82xx.
  [SCSI] qla2xxx: Fix reset time out as qla2xxx not ack to reset request.

12 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Linus Torvalds [Sat, 12 May 2012 19:57:01 +0000 (12:57 -0700)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Pull networking fixes from David S. Miller:

 1) Since we do RCU lookups on ipv4 FIB entries, we have to test if the
    entry is dead before returning it to our caller.

 2) openvswitch locking and packet validation fixes from Ansis Atteka,
    Jesse Gross, and Pravin B Shelar.

 3) Fix PM resume locking in IGB driver, from Benjamin Poirier.

 4) Fix VLAN header handling in vhost-net and macvtap, from Basil Gor.

 5) Revert a bogus network namespace isolation change that was causing
    regressions on S390 networking devices.

 6) If bonding decides to process and handle a LACPDU frame, we
    shouldn't bump the rx_dropped counter.  From Jiri Bohac.

 7) Fix mis-calculation of available TX space in r8169 driver when doing
    TSO, which can lead to crashes and/or hung device.  From Julien
    Ducourthial.

 8) SCTP does not validate cached routes properly in all cases, from
    Nicolas Dichtel.

 9) Link status interrupt needs to be handled in ks8851 driver, from
    Stephen Boyd.

10) Use capable(), not cap_raised(), in connector/userns netlink code.
    From Eric W. Biederman via Andrew Morton.

11) Fix pktgen OOPS on module unload, from Eric Dumazet.

12) iwlwifi under-estimates SKB truesizes, also from Eric Dumazet.

13) Cure division by zero in SFC driver, from Ben Hutchings.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (26 commits)
  ks8851: Update link status during link change interrupt
  macvtap: restore vlan header on user read
  vhost-net: fix handle_rx buffer size
  bonding: don't increase rx_dropped after processing LACPDUs
  connector/userns: replace netlink uses of cap_raised() with capable()
  sctp: check cached dst before using it
  pktgen: fix crash at module unload
  Revert "net: maintain namespace isolation between vlan and real device"
  ehea: fix losing of NEQ events when one event occurred early
  igb: fix rtnl race in PM resume path
  ipv4: Do not use dead fib_info entries.
  r8169: fix unsigned int wraparound with TSO
  sfc: Fix division by zero when using one RX channel and no SR-IOV
  openvswitch: Validation of IPv6 set port action uses IPv4 header
  net: compare_ether_addr[_64bits]() has no ordering
  cdc_ether: Ignore bogus union descriptor for RNDIS devices
  bnx2x: bug fix when loading after SAN boot
  e1000: Silence sparse warnings by correcting type
  igb, ixgbe: netdev_tx_reset_queue incorrectly called from tx init path
  openvswitch: Release rtnl_lock if ovs_vport_cmd_build_info() failed.
  ...

12 years agoMerge tag 'dm-3.4-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-dm
Linus Torvalds [Sat, 12 May 2012 19:56:08 +0000 (12:56 -0700)]
Merge tag 'dm-3.4-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-dm

Pull device-mapper fixes from Alasdair G Kergon:
 "Fix a couple of serious memory leaks in device-mapper thin
  provisioning and tidy its MODULE_DESCRIPTION.

  Mitigate occasional reported hangs associated with multipath scsi_dh
  module loading."

* tag 'dm-3.4-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-dm:
  dm mpath: check if scsi_dh module already loaded before trying to load
  dm thin: correct module description
  dm thin: fix unprotected use of prepared_discards list
  dm thin: reinstate missing mempool_free in cell_release_singleton

12 years agoMAINTAINERS: Add myself as the cpufreq maintainer
Rafael J. Wysocki [Fri, 11 May 2012 19:35:45 +0000 (21:35 +0200)]
MAINTAINERS: Add myself as the cpufreq maintainer

Since cpufreq has no official maintainer at the moment, I'm willing
to maintain it along some other power management core code I've been
maintaining already.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agodm mpath: check if scsi_dh module already loaded before trying to load
Mike Snitzer [Sat, 12 May 2012 00:43:21 +0000 (01:43 +0100)]
dm mpath: check if scsi_dh module already loaded before trying to load

If the requested scsi_dh module is already loaded then skip
request_module().

Multipath table loads can hang in an unnecessary __request_module.

Reported-by: Ben Marzinski <bmarzins@redhat.com>
Cc: stable@kernel.org
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
12 years agodm thin: correct module description
Alasdair G Kergon [Sat, 12 May 2012 00:43:19 +0000 (01:43 +0100)]
dm thin: correct module description

Remove duplicate copy of string "device-mapper" (DM_NAME) from
MODULE_DESCRIPTION.

Signed-off-by: Alasdair G Kergon <agk@redhat.com>
12 years agodm thin: fix unprotected use of prepared_discards list
Mike Snitzer [Sat, 12 May 2012 00:43:16 +0000 (01:43 +0100)]
dm thin: fix unprotected use of prepared_discards list

Fix two places in commit 104655fd4dce ("dm thin: support discards") that
didn't use pool->lock to protect against concurrent changes to the
prepared_discards list.

Without this fix, thin_endio() can race with process_discard(), leading
to concurrent list_add()s that result in the processes locking up with
an error like the following:

WARNING: at lib/list_debug.c:32 __list_add+0x8f/0xa0()
...
list_add corruption. next->prev should be prev (ffff880323b96140), but was ffff8801d2c48440. (next=ffff8801d2c485c0).
...
Pid: 17205, comm: kworker/u:1 Tainted: G        W  O 3.4.0-rc3.snitm+ #1
Call Trace:
 [<ffffffff8103ca1f>] warn_slowpath_common+0x7f/0xc0
 [<ffffffff8103cb16>] warn_slowpath_fmt+0x46/0x50
 [<ffffffffa04f6ce6>] ? bio_detain+0xc6/0x210 [dm_thin_pool]
 [<ffffffff8124ff3f>] __list_add+0x8f/0xa0
 [<ffffffffa04f70d2>] process_discard+0x2a2/0x2d0 [dm_thin_pool]
 [<ffffffffa04f6a78>] ? remap_and_issue+0x38/0x50 [dm_thin_pool]
 [<ffffffffa04f7c3b>] process_deferred_bios+0x7b/0x230 [dm_thin_pool]
 [<ffffffffa04f7df0>] ? process_deferred_bios+0x230/0x230 [dm_thin_pool]
 [<ffffffffa04f7e42>] do_worker+0x52/0x60 [dm_thin_pool]
 [<ffffffff81056fa9>] process_one_work+0x129/0x450
 [<ffffffff81059b9c>] worker_thread+0x17c/0x3c0
 [<ffffffff81059a20>] ? manage_workers+0x120/0x120
 [<ffffffff8105eabe>] kthread+0x9e/0xb0
 [<ffffffff814ceda4>] kernel_thread_helper+0x4/0x10
 [<ffffffff8105ea20>] ? kthread_freezable_should_stop+0x70/0x70
 [<ffffffff814ceda0>] ? gs_change+0x13/0x13
---[ end trace 7e0a523bc5e52692 ]---

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
12 years agodm thin: reinstate missing mempool_free in cell_release_singleton
Mike Snitzer [Sat, 12 May 2012 00:43:12 +0000 (01:43 +0100)]
dm thin: reinstate missing mempool_free in cell_release_singleton

Fix a significant memory leak inadvertently introduced during
simplification of cell_release_singleton() in commit
6f94a4c45a6f744383f9f695dde019998db3df55 ("dm thin: fix stacked bi_next
usage").

A cell's hlist_del() must be accompanied by a mempool_free().
Use __cell_release() to do this, like before.

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
12 years agogpio/exynos: Fix compiler warnings when non-exynos machines are selected
Sachin Kamat [Mon, 30 Apr 2012 06:52:48 +0000 (12:22 +0530)]
gpio/exynos: Fix compiler warnings when non-exynos machines are selected

Fixes the following compiler warnings:

drivers/gpio/gpio-samsung.c: In function ‘samsung_gpiolib_init’:
drivers/gpio/gpio-samsung.c:2980:1: warning: label ‘err_ioremap1’ defined but not used [-Wunused-label]
drivers/gpio/gpio-samsung.c:2978:1: warning: label ‘err_ioremap2’ defined but not used [-Wunused-label]
drivers/gpio/gpio-samsung.c:2976:1: warning: label ‘err_ioremap3’ defined but not used [-Wunused-label]
drivers/gpio/gpio-samsung.c:2974:1: warning: label ‘err_ioremap4’ defined but not used [-Wunused-label]
drivers/gpio/gpio-samsung.c:2722:55: warning: unused variable ‘gpio_base4’ [-Wunused-variable]

drivers/gpio/gpio-samsung.c:455:32: warning: ‘exynos_gpio_cfg’ defined but not used [-Wunused-variable]
drivers/gpio/gpio-samsung.c:2126:33: warning: ‘exynos4_gpios_1’ defined but not used [-Wunused-variable]
drivers/gpio/gpio-samsung.c:2228:33: warning: ‘exynos4_gpios_2’ defined but not used [-Wunused-variable]
drivers/gpio/gpio-samsung.c:2373:33: warning: ‘exynos4_gpios_3’ defined but not used [-Wunused-variable]

Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
12 years agogpio: pch9: Use proper flow type handlers
Thomas Gleixner [Sat, 28 Apr 2012 08:13:45 +0000 (10:13 +0200)]
gpio: pch9: Use proper flow type handlers

Jean-Francois Dagenais reported:

 Configuring a gpio pin with the gpio-pch driver with
 "IRQF_TRIGGER_LOW | IRQF_ONESHOT" generates an interrupt storm for
 threaded ISR until the ISR thread actually gets to physically clear
 the interrupt on the triggering chip!! The immediate observable
 symptom is the high CPU usage for my ISR thread task and the
 interrupt count in /proc/interrupts incrementing radically.

The driver is wrong in several ways:

1) Using handle_simple_irq() does not provide proper flow control
   handling. In the case of oneshot threaded handlers for the
   demultiplexed interrupts this results in an interrupt storm because
   the simple handler does not deal with masking/unmasking.  Even
   without threaded oneshot handlers an interrupt storm for level type
   interrupts can easily be triggered when the interrupt is disabled
   and the interrupt line is activated from the device.

2) Acknowlegding the demultiplexed interrupt before calling the
   handler is wrong for level type interrupts.

3) The set_type function unconditionally enables the interrupt. It's
   supposed to set the type and nothing else. The unmasking is done by
   the core code.

Move the acknowledge code into a separate function and add it to the
demux irqchip callbacks.

Remove the unconditional enabling from the set_type() callback and set
the proper flow handlers depending on the selected type (level/edge).

Reported-and-tested-by: Jean-Francois Dagenais <jeff.dagenais@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
12 years agoMerge tag 'gpio-for-linus' of git://git.secretlab.ca/git/linux-2.6
Linus Torvalds [Fri, 11 May 2012 23:59:07 +0000 (16:59 -0700)]
Merge tag 'gpio-for-linus' of git://git.secretlab.ca/git/linux-2.6

Pull GPIO omap bug fix from Grant Likely.

* tag 'gpio-for-linus' of git://git.secretlab.ca/git/linux-2.6:
  gpio/omap: fix incorrect initialization of omap_gpio_mod_init

12 years agoMerge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
Linus Torvalds [Fri, 11 May 2012 23:58:14 +0000 (16:58 -0700)]
Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc

Pull another powerpc irq fix from Benjamin Herrenschmidt:
 "It looks like my previous fix for the lazy irq masking problem wasn't
  quite enough.  There was another problem related to performance
  monitor interrupts acting as NMIs leaving the flags in an incorrect
  state.  Here's a fix that finally seems to make perf solid again."

* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
  powerpc/irq: Fix another case of lazy IRQ state getting out of sync

12 years agoMerge branch '3.4-urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target...
Linus Torvalds [Fri, 11 May 2012 23:49:09 +0000 (16:49 -0700)]
Merge branch '3.4-urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending

Pull target fix from Nicholas Bellinger:
 "This patch removes some incorrect legacy code to free se_lun_acl
  memory in the NodeACL release path that could potentially trigger an
  OOPS during shutdown once dynamic -> explicit initiator NodeACL
  conversion has occurred.

  That said, we've been able to trigger an OOPS in v4.0 code for this
  special case when the associated MappedLUNs had not also been made
  explicit based on active TPG LUN layout during the conversion, so it
  really makes senses to go ahead and drop this extra cruft to avoid any
  possible issues here.

  This ends up only effecting iscsi-target module code (it's the only
  user) and is CC'ed to stable."

* '3.4-urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending:
  target: Drop incorrect se_lun_acl release for dynamic -> explict ACL conversion

12 years agopowerpc/irq: Fix another case of lazy IRQ state getting out of sync
Benjamin Herrenschmidt [Thu, 10 May 2012 16:12:38 +0000 (16:12 +0000)]
powerpc/irq: Fix another case of lazy IRQ state getting out of sync

So we have another case of paca->irq_happened getting out of
sync with the HW irq state. This can happen when a perfmon
interrupt occurs while soft disabled, as it will return to a
soft disabled but hard enabled context while leaving a stale
PACA_IRQ_HARD_DIS flag set.

This patch fixes it, and also adds a test for the condition
of those flags being out of sync in arch_local_irq_restore()
when CONFIG_TRACE_IRQFLAGS is enabled.

This helps catching those gremlins faster (and so far I
can't seem see any anymore, so that's good news).

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
12 years agoks8851: Update link status during link change interrupt
Stephen Boyd [Thu, 10 May 2012 12:51:30 +0000 (12:51 +0000)]
ks8851: Update link status during link change interrupt

If a link change interrupt comes in we just clear the interrupt
and continue along without notifying the upper networking layers
that the link has changed. Use the mii_check_link() function to
update the link status whenever a link change interrupt occurs.

Cc: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agomacvtap: restore vlan header on user read
Basil Gor [Thu, 3 May 2012 22:55:24 +0000 (22:55 +0000)]
macvtap: restore vlan header on user read

Ethernet vlan header is not on the packet and kept in the skb->vlan_tci
when it comes from lower dev. This patch inserts vlan header in user
buffer during skb copy on user read.

Signed-off-by: Basil Gor <basil.gor@gmail.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agovhost-net: fix handle_rx buffer size
Basil Gor [Thu, 3 May 2012 22:55:23 +0000 (22:55 +0000)]
vhost-net: fix handle_rx buffer size

Take vlan header length into account, when vlan id is stored as
vlan_tci. Otherwise tagged packets coming from macvtap will be
truncated.

Signed-off-by: Basil Gor <basil.gor@gmail.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agotarget: Drop incorrect se_lun_acl release for dynamic -> explict ACL conversion
Nicholas Bellinger [Fri, 11 May 2012 05:05:49 +0000 (22:05 -0700)]
target: Drop incorrect se_lun_acl release for dynamic -> explict ACL conversion

This patch removes some potentially problematic legacy code within
core_clear_initiator_node_from_tpg() that was originally intended to
release left over se_lun_acl setup during dynamic NodeACL+MappedLUN
generate when running with TPG demo-mode operation.

Since we now only ever expect to allocate and release se_lun_acl from
within target_core_fabric_configfs.c:target_fabric_make_mappedlun() and
target_fabric_drop_mappedlun() context respectively, this code for
demo-mode release is incorrect and needs to be removed.

Cc: Christoph Hellwig <hch@lst.de>
Cc: Andy Grover <agrover@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
12 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu
Linus Torvalds [Fri, 11 May 2012 16:28:35 +0000 (09:28 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu

Pull a m68knommu fix from Greg Ungerer:
 "It contains a single fix for including the ColdFire QSPI interface
  setup code when enabled as a module.  This was broken in the
  consolidation of the ColdFire SoC device tables in the 3.4 merge
  window."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu:
  m68knommu: enable qspi support when SPI_COLDFIRE_QSPI = m

12 years agomm: raise MemFree by reverting percpu_pagelist_fraction to 0
Hugh Dickins [Fri, 11 May 2012 08:00:07 +0000 (01:00 -0700)]
mm: raise MemFree by reverting percpu_pagelist_fraction to 0

Why is there less MemFree than there used to be?  It perturbed a test,
so I've just been bisecting linux-next, and now find the offender went
upstream yesterday.

Commit 93278814d359 "mm: fix division by 0 in percpu_pagelist_fraction()"
mistakenly initialized percpu_pagelist_fraction to the sysctl's minimum 8,
which leaves 1/8th of memory on percpu lists (on each cpu??); but most of
us expect it to be left unset at 0 (and it's not then used as a divisor).

  MemTotal: 8061476kB  8061476kB  8061476kB  8061476kB  8061476kB  8061476kB
  Repetitive test with percpu_pagelist_fraction 8:
  MemFree:  6948420kB  6237172kB  6949696kB  6840692kB  6949048kB  6862984kB
  Same test with percpu_pagelist_fraction back to 0:
  MemFree:  7945000kB  7944908kB  7948568kB  7949060kB  7948796kB  7948812kB

Signed-off-by: Hugh Dickins <hughd@google.com>
[ We really should fix the crazy sysctl interface too, but that's a
  separate thing - Linus ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agobonding: don't increase rx_dropped after processing LACPDUs
Jiri Bohac [Wed, 9 May 2012 01:01:40 +0000 (01:01 +0000)]
bonding: don't increase rx_dropped after processing LACPDUs

Since commit 3aba891d, bonding processes LACP frames (802.3ad
mode) with bond_handle_frame(). Currently a copy of the skb is
made and the original is left to be processed by other
rx_handlers and the rest of the network stack by returning
RX_HANDLER_ANOTHER.  As there is no protocol handler for
PKT_TYPE_LACPDU, the frame is dropped and dev->rx_dropped
increased.

Fix this by making bond_handle_frame() return RX_HANDLER_CONSUMED
if bonding has processed the LACP frame.

Signed-off-by: Jiri Bohac <jbohac@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>