Age | Commit message (Collapse) | Author |
|
|
|
With CONFIG_DEBUG_SLAB=y I see slab corruption messages during boot on
pSeries machines with IPR adapters with any 2.6.12-rc kernel.
The change which seems to have introduced the problem is "SCSI: revamp
target scanning routines" and may be found at:
http://marc.theaimsgroup.com/?l=bk-commits-head&m=111093946426333&w=2
In order to revert that in a 2.6.12-rc1 tree, I had to revert "target
code updates to support scanned targets" first:
http://marc.theaimsgroup.com/?l=bk-commits-head&m=111094132524649&w=2
With both patches reverted, the corruption messages go away.
ipr: IBM Power RAID SCSI Device Driver version: 2.0.13 (February 21,
2005)
ipr 0001:d0:01.0: Found IOA with IRQ: 167
ipr 0001:d0:01.0: Starting IOA initialization sequence.
ipr 0001:d0:01.0: Adapter firmware version: 020A005C
ipr 0001:d0:01.0: IOA initialized.
scsi0 : IBM 570B Storage Adapter
Vendor: IBM Model: VSBPD4E1 U4SCSI Rev: 4770
Type: Enclosure ANSI SCSI revision: 02
Vendor: IBM H0 Model: HUS103036FL3800 Rev: RPQF
Type: Direct-Access ANSI SCSI revision: 04
Vendor: IBM H0 Model: HUS103036FL3800 Rev: RPQF
Type: Direct-Access ANSI SCSI revision: 04
Vendor: IBM H0 Model: HUS103036FL3800 Rev: RPQF
Type: Direct-Access ANSI SCSI revision: 04
Vendor: IBM H0 Model: HUS103036FL3800 Rev: RPQF
Type: Direct-Access ANSI SCSI revision: 04
Vendor: IBM Model: VSBPD4E1 U4SCSI Rev: 4770
Type: Enclosure ANSI SCSI revision: 02
Slab corruption: start=c0000001e8de5268, len=512
Redzone: 0x5a2cf071/0x5a2cf071.
Last user: [<c00000000029c3a0>](.scsi_target_dev_release+0x28/0x50)
080: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6a
Prev obj: start=c0000001e8de5050, len=512
Redzone: 0x5a2cf071/0x5a2cf071.
Last user: [<0000000000000000>](0x0)
000: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
010: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
Next obj: start=c0000001e8de5480, len=512
Redzone: 0x170fc2a5/0x170fc2a5.
Last user: [<c000000000228d7c>](.as_init_queue+0x5c/0x228)
000: c0 00 00 01 e8 83 26 08 00 00 00 00 00 00 00 00
010: 00 00 00 00 00 00 00 00 c0 00 00 01 e8 de 54 98
Slab corruption: start=c0000001e8de5268, len=512
Redzone: 0x5a2cf071/0x5a2cf071.
Last user: [<c00000000029c3a0>](.scsi_target_dev_release+0x28/0x50)
080: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6a
Prev obj: start=c0000001e8de5050, len=512
Redzone: 0x5a2cf071/0x5a2cf071.
Last user: [<0000000000000000>](0x0)
000: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
010: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
Next obj: start=c0000001e8de5480, len=512
Redzone: 0x170fc2a5/0x170fc2a5.
Last user: [<c000000000228d7c>](.as_init_queue+0x5c/0x228)
000: c0 00 00 01 e8 83 26 08 00 00 00 00 00 00 00 00
010: 00 00 00 00 00 00 00 00 c0 00 00 01 e8 de 54 98
...
I did some digging and the problem seems to be a refcounting issue in
__scsi_add_device. The target gets freed in scsi_target_reap, and
then __scsi_add_device tries to do another device_put on it.
Signed-off-by: Nathan Lynch <ntl@pobox.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
|
|
This gives the HBA driver notice when a target is created and
destroyed to allow it to manage its own target based allocations
accordingly.
This is a much reduced verson of the original patch sent in by
James.Smart@Emulex.com
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
|
|
a) TYPE_SDAD renamed to TYPE_RBC and taken to scsi.h
b) in sbp2.c remapping of TYPE_RPB to TYPE_DISK turned off
c) relevant places in midlayer and sd.c taught to accept TYPE_RBC
d) sd.c::sd_read_cache_type() looks into page 6 when dealing with
TYPE_RBC - these guys have writeback cache flag there and are not guaranteed
to have page 8 at all.
e) sd_read_cache_type() got an extra sanity check - it checks that
it got the page it asked for before using its contents. And screams if
mismatch had happened. Rationale: there are broken devices out there that
are "helpful" enough to go for "I don't have a page you've asked for, here,
have another one". For example, PL3507 had been caught doing just that...
f) sbp2 sets sdev->use_10_for_rw and sdev->use_10_for_ms instead
of bothering to remap READ6/WRITE6/MOD_SENSE, so most of the conversions
in there are gone now.
Incidentally, I wonder if USB storage devices that have no
mode page 8 are simply RBC ones. I haven't touched that, but it might
be interesting to check...
Signed-off-by: Al Viro <viro@parcelfarce.linux.theplanet.co.uk>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
|
|
Somebody forgot that | has higher priority than ?:. As the result,
allocation is done with bogus flags - instead of GFP_ATOMIC + possibly
GFP_DMA we always get GFP_DMA and no GFP_ATOMIC.
Signed-off-by: Al Viro <viro@parcelfarce.linux.theplanet.co.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
The current problem seen is that the queue lock is actually in the
SCSI device structure, so when that structure is freed on device
release, we go boom if the queue tries to access the lock again.
The fix here is to move the lock from the scsi_device to the queue.
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
|
|
Initial git repository build. I'm not bothering with the full history,
even though we have it. We can create a separate "historical" git
archive of that later if we want to, and in the meantime it's about
3.2GB when imported into git - space that would just make the early
git days unnecessarily complicated, when we don't have a lot of good
infrastructure for it.
Let it rip!
|