Age | Commit message (Collapse) | Author |
|
EDAC MC0: INTERNAL ERROR: channel-b out of range (4 >= 4)
Kernel panic - not syncing: EDAC MC0: Uncorrected Error (XEN) Domain 0 crashed: 'noreboot' set - not rebooting.
This happens because FERR_NF_FBD bit 28 is not updated on i5000. Due to
that, both bits 28 and 29 may be equal to one, returning channel = 3. As
this value is invalid, EDAC core generates the panic.
Addresses http://bugzilla.kernel.org/show_bug.cgi?id=14568
Signed-off-by: Tamas Vincze <tom@vincze.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Allow csrows to properly initialize when the topology only has active
channels on 2 and 3. This new check allows proper detection and
initialization in this topology. Only checking the first mrt that
represented channels 0 and 1 is not sufficient.
I also fixed up the related debug information path. I can submit as a 2nd
patch if needed.
Signed-off-by: Keith Mannthey <kmannth@us.ibm.com>
Acked-by: Aristeu Rozanski <aris@ruivo.org>
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
It turns out that edac_mc_del_mc will kobject_put the last kref on the
mci object.
If the timing is just right, that means that the mci object is freed
before before i5000_remove_one has a chance to free the resources
associated with it, causing a null pointer exceptions when unloading the
driver. Insert a kobject_{get,put} pair so that this doesn't happen.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Cc: Doug Thompson <norsk5@yahoo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Make the Thermal messages (temperature got past Tmid) be displayed only
once because:
1) it's the BIOS job to configure and handle the memory throttling
2) if the BIOS is broken or is aware about the condition, flooding the
system logs won't help anything.
3) According to the specification update for Intel 5000 MCHs, all the
revisions of this MCH have problems on the thermal sensors, making
not automatic (a.k.a. intelligent thermal throttling) impossible.
Signed-off-by: Aristeu Rozanski <aris@redhat.com>
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Update the i5000_edac messages, making everything pass through the EDAC
(so the log controls will work) and being more specific about the errors.
Also, it makes the miscellaneous errors optional and disabled by default.
As I didn't found anywhere information about M23ERR-M26ERR
(FERR_NF_THERMAL) on FERR_NF_FBD, I'm removing them.
Signed-off-by: Aristeu Rozanski <aris@redhat.com>
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
I implemented opstate_init() as a inline function in linux/edac.h.
added calling opstate_init() to:
i82443bxgx_edac.c
i82860_edac.c
i82875p_edac.c
i82975x_edac.c
I wrote a fixed patch of
edac-fix-module-initialization-on-several-modules.patch,
and tested building 2.6.25-rc7 with applying this. It was succeed.
I think the patch is now correct.
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Hitoshi Mitake <h.mitake@gmail.com>
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
|
|
The i5000_edac driver's PCI registration structure has the name
""i5000_edac"" (with extra set of double-quotes) which is probably not
intentional. Get rid of __stringify.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Cc: Doug Thompson <norsk5@yahoo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Refactoring of sysfs code necessitated the refactoring of the edac_mc_alloc()
and edac_mc_add_mc() apis, of moving the index value to the alloc() function.
This patch alters the in tree drivers to utilize this new api signature.
Having the index value performed later created a chicken-and-the-egg issue.
Moving it to the alloc() function allows for creating the necessary sysfs
entries with the proper index number
Cc: Alan Cox alan@lxorguk.ukuu.org.uk
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Various code style conformance patches on the i5000 driver
Signed-off-by: Douglas Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Found a typo in one of the #defines in the driver
MTR_DIM_RANKS --> MTR_DIMM_RANK
Signed-off-by: Marisuz Kozlowski <m.kozlowski@tuxland.pl>
Signed-off-by: Douglas Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
This patch fixes some remnant spaces inserted by the use of Lindent.
Seems Lindent adds some spaces when it shoulded. These have been fixed.
In addition, goto targets have issues, these have been fixed
in this patch.
Signed-off-by: Douglas Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Move x86 drivers to new pci controller setup
Signed-off-by: Dave Jiang <djiang@mvista.com>
Signed-off-by: Douglas Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Ran e752x_edac.c file through Lindent for cleanup
Signed-off-by: Douglas Thompson <dougthompson@xmission.com>
Signed-off-by: Dave Jiang <djiang@mvista.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Move dev_name() macro to a more generic interface since it's not possible
to determine whether a device is pci, platform, or of_device easily.
Now each low level driver sets the name into the control structure, and
the EDAC core references the control structure for the information.
Better abstraction.
Signed-off-by: Dave Jiang <djiang@mvista.com>
Signed-off-by: Douglas Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
In the refactoring of edac_mc.c into several subsystem files,
the header file edac_mc.h became meaningless. A new header file
edac_core.h was created. All the files that previously included
"edac_mc.h" are changed to include "edac_core.h".
Signed-off-by: Douglas Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Provides a way for NMI reported errors on x86 to notify the EDAC
subsystem pending ECC errors by writing to a software state variable.
Here's the reworked patch. I added an EDAC stub to the kernel so we can
have variables that are in the kernel even if EDAC is a module. I also
implemented the idea of using the chip driver to select error detection
mode via module parameter and eliminate the kernel compile option.
Please review/test. Thx!
Also, I only made changes to some of the chipset drivers since I am
unfamiliar with the other ones. We can add similar changes as we go.
Signed-off-by: Dave Jiang <djiang@mvista.com>
Signed-off-by: Douglas Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Eric Wollesen ported the Bluesmoke Memory Controller driver (written by Doug
Thompson) for the Intel 5000X/V/P (Blackford/Greencreek) chipset to the in
kernel EDAC model.
This patch incorporates the module for the 5000X/V/P chipset family
[m.kozlowski@tuxland.pl: edac i5000 parenthesis balance fix]
Signed-off-by: Eric Wollesen <ericw@xmtp.net>
Signed-off-by: Doug Thompson <norsk5@xmission.com>
Signed-off-by: Mariusz Kozlowski <m.kozlowski@tuxland.pl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|