Release Date : Fri May 19 09:31:45 EST 2006 - Seokmann Ju <sju@lsil.com> Current Version : 2.20.4.9 (scsi module), 2.20.2.6 (cmm module) Older Version : 2.20.4.8 (scsi module), 2.20.2.6 (cmm module) 1. Fixed a bug in megaraid_init_mbox(). Customer reported "garbage in file on x86_64 platform". Root Cause: the driver registered controllers as 64-bit DMA capable for those which are not support it. Fix: Made change in the function inserting identification machanism identifying 64-bit DMA capable controllers. > -----Original Message----- > From: Vasily Averin [mailto:vvs@sw.ru] > Sent: Thursday, May 04, 2006 2:49 PM > To: linux-scsi@vger.kernel.org; Kolli, Neela; Mukker, Atul; > Ju, Seokmann; Bagalkote, Sreenivas; > James.Bottomley@SteelEye.com; devel@openvz.org > Subject: megaraid_mbox: garbage in file > > Hello all, > > I've investigated customers claim on the unstable work of > their node and found a > strange effect: reading from some files leads to the > "attempt to access beyond end of device" messages. > > I've checked filesystem, memory on the node, motherboard BIOS > version, but it > does not help and issue still has been reproduced by simple > file reading. > > Reproducer is simple: > > echo 0xffffffff >/proc/sys/dev/scsi/logging_level ; > cat /vz/private/101/root/etc/ld.so.cache >/tmp/ttt ; > echo 0 >/proc/sys/dev/scsi/logging > > It leads to the following messages in dmesg > > sd_init_command: disk=sda, block=871769260, count=26 > sda : block=871769260 > sda : reading 26/26 512 byte blocks. > scsi_add_timer: scmd: f79ed980, time: 7500, (c02b1420) > sd 0:1:0:0: send 0xf79ed980 sd 0:1:0:0: > command: Read (10): 28 00 33 f6 24 ac 00 00 1a 00 > buffer = 0xf7cfb540, bufflen = 13312, done = 0xc0366b40, > queuecommand 0xc0344010 > leaving scsi_dispatch_cmnd() > scsi_delete_timer: scmd: f79ed980, rtn: 1 > sd 0:1:0:0: done 0xf79ed980 SUCCESS 0 sd 0:1:0:0: > command: Read (10): 28 00 33 f6 24 ac 00 00 1a 00 > scsi host busy 1 failed 0 > sd 0:1:0:0: Notifying upper driver of completion (result 0) > sd_rw_intr: sda: res=0x0 > 26 sectors total, 13312 bytes done. > use_sg is 4 > attempt to access beyond end of device > sda6: rw=0, want=1044134458, limit=951401367 > Buffer I/O error on device sda6, logical block 522067228 > attempt to access beyond end of device 2. When INQUIRY with EVPD bit set issued to the MegaRAID controller, system memory gets corrupted. Root Cause: MegaRAID F/W handle the INQUIRY with EVPD bit set incorrectly. Fix: MegaRAID F/W has fixed the problem and being process of release, soon. Meanwhile, driver will filter out the request. 3. One of member in the data structure of the driver leads unaligne issue on 64-bit platform. Customer reporeted "kernel unaligned access addrss" issue when application communicates with MegaRAID HBA driver. Root Cause: in uioc_t structure, one of member had misaligned and it led system to display the error message. Fix: A patch submitted to community from following folk. > -----Original Message----- > From: linux-scsi-owner@vger.kernel.org > [mailto:linux-scsi-owner@vger.kernel.org] On Behalf Of Sakurai Hiroomi > Sent: Wednesday, July 12, 2006 4:20 AM > To: linux-scsi@vger.kernel.org; linux-kernel@vger.kernel.org > Subject: Re: Help: strange messages from kernel on IA64 platform > > Hi, > > I saw same message. > > When GAM(Global Array Manager) is started, The following > message output. > kernel: kernel unaligned access to 0xe0000001fe1080d4, > ip=0xa000000200053371 > > The uioc structure used by ioctl is defined by packed, > the allignment of each member are disturbed. > In a 64 bit structure, the allignment of member doesn't fit 64 bit > boundary. this causes this messages. > In a 32 bit structure, we don't see the message because the allinment > of member fit 32 bit boundary even if packed is specified. > > patch > I Add 32 bit dummy member to fit 64 bit boundary. I tested. > We confirmed this patch fix the problem by IA64 server. > > ************************************************************** > **************** > --- linux-2.6.9/drivers/scsi/megaraid/megaraid_ioctl.h.orig > 2006-04-03 17:13:03.000000000 +0900 > +++ linux-2.6.9/drivers/scsi/megaraid/megaraid_ioctl.h > 2006-04-03 17:14:09.000000000 +0900 > @@ -132,6 +132,10 @@ > /* Driver Data: */ > void __user * user_data; > uint32_t user_data_len; > + > + /* 64bit alignment */ > + uint32_t pad_0xBC; > + > mraid_passthru_t __user *user_pthru; > > mraid_passthru_t *pthru32; > ************************************************************** > **************** Release Date : Mon Apr 11 12:27:22 EST 2006 - Seokmann Ju <sju@lsil.com> Current Version : 2.20.4.8 (scsi module), 2.20.2.6 (cmm module) Older Version : 2.20.4.7 (scsi module), 2.20.2.6 (cmm module) 1. Fixed a bug in megaraid_reset_handler(). Customer reported "Unable to handle kernel NULL pointer dereference at virtual address 00000000" when system goes to reset condition for some reason. It happened randomly. Root Cause: in the megaraid_reset_handler(), there is possibility not returning pending packets in the pend_list if there are multiple pending packets. Fix: Made the change in the driver so that it will return all packets in the pend_list. 2. Added change request. As found in the following URL, rmb() only didn't help the problem. I had to increase the loop counter to 0xFFFFFF. (6 F's) http://marc.theaimsgroup.com/?l=linux-scsi&m=110971060502497&w=2 I attached a patch for your reference, too. Could you check and get this fix in your driver? Best Regards, Jun'ichi Nomura Release Date : Fri Nov 11 12:27:22 EST 2005 - Seokmann Ju <sju@lsil.com> Current Version : 2.20.4.7 (scsi module), 2.20.2.6 (cmm module) Older Version : 2.20.4.6 (scsi module), 2.20.2.6 (cmm module) 1. Sorted out PCI IDs to remove megaraid support overlaps. Based on the patch from Daniel, sorted out PCI IDs along with charactor node name change from 'megadev' to 'megadev_legacy' to avoid conflict. --- Hopefully we'll be getting the build restriction zapped much sooner, but we should also be thinking about totally removing the hardware support overlap in the megaraid drivers. This patch pencils in a date of Feb 06 for this, and performs some printk abuse in hope that existing legacy users might pick up on what's going on. Signed-off-by: Daniel Drake <dsd@gentoo.org> --- 2. Fixed a issue: megaraid always fails to reset handler. --- I found that the megaraid driver always fails to reset the adapter with the following message: megaraid: resetting the host... megaraid mbox: reset sequence completed successfully megaraid: fast sync command timed out megaraid: reservation reset failed when the "Cluster mode" of the adapter BIOS is enabled. So, whenever the reset occurs, the adapter goes to offline and just become unavailable. Jun'ichi Nomura [mailto:jnomura@mtc.biglobe.ne.jp] --- Release Date : Mon Mar 07 12:27:22 EST 2005 - Seokmann Ju <sju@lsil.com> Current Version : 2.20.4.6 (scsi module), 2.20.2.6 (cmm module) Older Version : 2.20.4.5 (scsi module), 2.20.2.5 (cmm module) 1. Added IOCTL backward compatibility. Convert megaraid_mm driver to new compat_ioctl entry points. I don't have easy access to hardware, so only compile tested. - Signed-off-by:Andi Kleen <ak@muc.de> 2. megaraid_mbox fix: wrong order of arguments in memset() That, BTW, shows why cross-builds are useful-the only indication of problem had been a new warning showing up in sparse output on alpha build (number of exceeding 256 got truncated). - Signed-off-by: Al Viro <viro@parcelfarce.linux.theplanet.co.uk> 3. Convert pci_module_init to pci_register_driver Convert from pci_module_init to pci_register_driver (from:http://kerneljanitors.org/TODO) - Signed-off-by: Domen Puncer <domen@coderock.org> 4. Use the pre defined DMA mask constants from dma-mapping.h Use the DMA_{64,32}BIT_MASK constants from dma-mapping.h when calling pci_set_dma_mask() or pci_set_consistend_dma_mask(). See http://marc.theaimsgroup.com/?t=108001993000001&r=1&w=2 for more details. Signed-off-by: Tobias Klauser <tklauser@nuerscht.ch> Signed-off-by: Domen Puncer <domen@coderock.org> 5. Remove SSID checking for Dobson, Lindsay, and Verde based products. Checking the SSVID/SSID for controllers which have Dobson, Lindsay, and Verde is unnecessary because device ID has been assigned by LSI and it is unique value. So, all controllers with these IOPs have to be supported by the driver regardless SSVID/SSID. 6. Date Thu, 27 Jan 2005 04:31:09 +0100 From Herbert Poetzl <> Subject RFC: assert_spin_locked() for 2.6 Greetings! overcautious programming will kill your kernel ;) ever thought about checking a spin_lock or even asserting that it must be held (maybe just for spinlock debugging?) ... there are several checks present in the kernel where somebody does a variation on the following: BUG_ON(!spin_is_locked(&some_lock)); so what's wrong about that? nothing, unless you compile the code with CONFIG_DEBUG_SPINLOCK but without CONFIG_SMP ... in which case the BUG() will kill your kernel ... maybe it's not advised to make such assertions, but here is a solution which works for me ... (compile tested for sh, x86_64 and x86, boot/run tested for x86 only) best, Herbert - Herbert Poetzl <herbert@13thfloor.at>, Thu, 27 Jan 2005 Release Date : Thu Feb 03 12:27:22 EST 2005 - Seokmann Ju <sju@lsil.com> Current Version : 2.20.4.5 (scsi module), 2.20.2.5 (cmm module) Older Version : 2.20.4.4 (scsi module), 2.20.2.4 (cmm module) 1. Modified name of two attributes in scsi_host_template. On Wed, 2005-02-02 at 10:56 -0500, Ju, Seokmann wrote: > + .sdev_attrs = megaraid_device_attrs, > + .shost_attrs = megaraid_class_device_attrs, These are, perhaps, slightly confusing names. The terms device and class_device have well defined meanings in the generic device model, neither of which is what you mean here. Why not simply megaraid_sdev_attrs and megaraid_shost_attrs? Other than this, it looks fine to me too. Release Date : Thu Jan 27 00:01:03 EST 2005 - Atul Mukker <atulm@lsil.com> Current Version : 2.20.4.4 (scsi module), 2.20.2.5 (cmm module) Older Version : 2.20.4.3 (scsi module), 2.20.2.4 (cmm module) 1. Bump up the version of scsi module due to its conflict. Release Date : Thu Jan 21 00:01:03 EST 2005 - Atul Mukker <atulm@lsil.com> Current Version : 2.20.4.3 (scsi module), 2.20.2.5 (cmm module) Older Version : 2.20.4.2 (scsi module), 2.20.2.4 (cmm module) 1. Remove driver ioctl for logical drive to scsi address translation and replace with the sysfs attribute. To remove drives and change capacity, application shall now use the device attribute to get the logical drive number for a scsi device. For adding newly created logical drives, class device attribute would be required to uniquely identify each controller. - Atul Mukker <atulm@lsil.com> "James, I've been thinking about this a little more, and you may be on to something here. Let each driver add files as such:" - Matt Domsch <Matt_Domsch@dell.com>, 12.15.2004 linux-scsi mailing list "Then, if you simply publish your LD number as an extra parameter of the device, you can look through /sys to find it." - James Bottomley <James.Bottomley@SteelEye.com>, 01.03.2005 linux-scsi mailing list "I don't see why not ... it's your driver, you can publish whatever extra information you need as scsi_device attributes; that was one of the designs of the extensible attribute system." - James Bottomley <James.Bottomley@SteelEye.com>, 01.06.2005 linux-scsi mailing list 2. Add AMI megaraid support - Brian King <brking@charter.net> PCI_VENDOR_ID_AMI, PCI_DEVICE_ID_AMI_MEGARAID3, PCI_VENDOR_ID_AMI, PCI_SUBSYS_ID_PERC3_DC, 3. Make some code static - Adrian Bunk <bunk@stusta.de> Date: Mon, 15 Nov 2004 03:14:57 +0100 The patch below makes some needlessly global code static. -wait_queue_head_t wait_q; +static wait_queue_head_t wait_q; Signed-off-by: Adrian Bunk <bunk@stusta.de> 4. Added NEC ROMB support - NEC MegaRAID PCI Express ROMB controller PCI_VENDOR_ID_LSI_LOGIC, PCI_DEVICE_ID_MEGARAID_NEC_ROMB_2E, PCI_SUBSYS_ID_NEC, PCI_SUBSYS_ID_MEGARAID_NEC_ROMB_2E, 5. Fixed Tape drive issue : For any Direct CDB command to physical device including tape, timeout value set by driver was 10 minutes. With this value, most of command will return within timeout. However, for those command like ERASE or FORMAT, it takes more than an hour depends on capacity of the device and the command could be terminated before it completes. To address this issue, the 'timeout' field in the DCDB command will have NO TIMEOUT (i.e., 4) value as its timeout on DCDB command. Release Date : Thu Dec 9 19:10:23 EST 2004 - Sreenivas Bagalkote <sreenib@lsil.com> Current Version : 2.20.4.2 (scsi module), 2.20.2.4 (cmm module) Older Version : 2.20.4.1 (scsi module), 2.20.2.3 (cmm module) i. Introduced driver ioctl that returns scsi address for a given ld. "Why can't the existing sysfs interfaces be used to do this?" - Brian King (brking@us.ibm.com) "I've looked into solving this another way, but I cannot see how to get this driver-private mapping of logical drive number-> HCTL without putting code something like this into the driver." "...and by providing a mapping a function to userspace, the driver is free to change its mapping algorithm in the future if necessary .." - Matt Domsch (Matt_Domsch@dell.com) Release Date : Thu Dec 9 19:02:14 EST 2004 - Sreenivas Bagalkote <sreenib@lsil.com> Current Version : 2.20.4.1 (scsi module), 2.20.2.3 (cmm module) Older Version : 2.20.4.1 (scsi module), 2.20.2.2 (cmm module) i. Fix a bug in kioc's dma buffer deallocation Release Date : Thu Nov 4 18:24:56 EST 2004 - Sreenivas Bagalkote <sreenib@lsil.com> Current Version : 2.20.4.1 (scsi module), 2.20.2.2 (cmm module) Older Version : 2.20.4.0 (scsi module), 2.20.2.1 (cmm module) i. Handle IOCTL cmd timeouts more properly. ii. pci_dma_sync_{sg,single}_for_cpu was introduced into megaraid_mbox incorrectly (instead of _for_device). Changed to appropriate pci_dma_sync_{sg,single}_for_device. Release Date : Wed Oct 06 11:15:29 EDT 2004 - Sreenivas Bagalkote <sreenib@lsil.com> Current Version : 2.20.4.0 (scsi module), 2.20.2.1 (cmm module) Older Version : 2.20.4.0 (scsi module), 2.20.2.0 (cmm module) i. Remove CONFIG_COMPAT around register_ioctl32_conversion Release Date : Mon Sep 27 22:15:07 EDT 2004 - Atul Mukker <atulm@lsil.com> Current Version : 2.20.4.0 (scsi module), 2.20.2.0 (cmm module) Older Version : 2.20.3.1 (scsi module), 2.20.2.0 (cmm module) i. Fix data corruption. Because of a typo in the driver, the IO packets were wrongly shared by the ioctl path. This causes a whole IO command to be replaced by an incoming ioctl command. Release Date : Tue Aug 24 09:43:35 EDT 2004 - Atul Mukker <atulm@lsil.com> Current Version : 2.20.3.1 (scsi module), 2.20.2.0 (cmm module) Older Version : 2.20.3.0 (scsi module), 2.20.2.0 (cmm module) i. Function reordering so that inline functions are defined before they are actually used. It is now mandatory for GCC 3.4.1 (current stable) Declare some heavy-weight functions to be non-inlined, megaraid_mbox_build_cmd, megaraid_mbox_runpendq, megaraid_mbox_prepare_pthru, megaraid_mbox_prepare_epthru, megaraid_busywait_mbox - Andrew Morton <akpm@osdl.org>, 08.19.2004 linux-scsi mailing list "Something else to clean up after inclusion: every instance of an inline function is actually rendered as a full function call, because the function is always used before it is defined. Atul, please re-arrange the code to eliminate the need for most (all) of the function prototypes at the top of each file, and define (not just declare with a prototype) each inline function before its first use" - Matt Domsch <Matt_Domsch@dell.com>, 07.27.2004 linux-scsi mailing list ii. Display elapsed time (countdown) while waiting for FW to boot. iii. Module compilation reorder in Makefile so that unresolved symbols do not occur when driver is compiled non-modular. Patrick J. LoPresti <patl@users.sourceforge.net>, 8.22.2004 linux-scsi mailing list Release Date : Thu Aug 19 09:58:33 EDT 2004 - Atul Mukker <atulm@lsil.com> Current Version : 2.20.3.0 (scsi module), 2.20.2.0 (cmm module) Older Version : 2.20.2.0 (scsi module), 2.20.1.0 (cmm module) i. When copying the mailbox packets, copy only first 14 bytes (for 32-bit mailboxes) and only first 22 bytes (for 64-bit mailboxes). This is to avoid getting the stale values for busy bit. We want to set the busy bit just before issuing command to the FW. ii. In the reset handling, if the reseted command is not owned by the driver, do not (wrongly) print information for the "attached" driver packet. iii. Have extended wait when issuing command in synchronous mode. This is required for the cases where the option ROM is disabled and there is no BIOS to start the controller. The FW starts to boot after receiving the first command from the driver. The current driver has 1 second timeout for the synchronous commands, which is far less than what is actually required. We now wait up to MBOX_RESET_TIME (180 seconds) for FW boot process. iv. In megaraid_mbox_product_info, clear the mailbox contents completely before preparing the command for inquiry3. This is to ensure that the FW does not get junk values in the command. v. Do away with the redundant LSI_CONFIG_COMPAT redefinition for CONFIG_COMPAT. Replace <asm/ioctl32.h> with <linux/ioctl32.h> - James Bottomley <James.Bottomley@SteelEye.com>, 08.17.2004 linux-scsi mailing list vi. Add support for 64-bit applications. Current drivers assume only 32-bit applications, even on 64-bit platforms. Use the "data" and "buffer" fields of the mimd_t structure, instead of embedded 32-bit addresses in application mailbox and passthru structures. vii. Move the function declarations for the management module from megaraid_mm.h to megaraid_mm.c - Andrew Morton <akpm@osdl.org>, 08.19.2004 linux-scsi mailing list viii. Change default values for MEGARAID_NEWGEN, MEGARAID_MM, and MEGARAID_MAILBOX to 'n' in Kconfig.megaraid - Andrew Morton <akpm@osdl.org>, 08.19.2004 linux-scsi mailing list ix. replace udelay with msleep x. Typos corrected in comments and whitespace adjustments, explicit grouping of expressions. Release Date : Fri Jul 23 15:22:07 EDT 2004 - Atul Mukker <atulm@lsil.com> Current Version : 2.20.2.0 (scsi module), 2.20.1.0 (cmm module) Older Version : 2.20.1.0 (scsi module), 2.20.0.0 (cmm module) i. Add PCI ids for Acer ROMB 2E solution ii. Add PCI ids for I4 iii. Typo corrected for subsys id for megaraid sata 300-4x iv. Remove yield() while mailbox handshake in synchronous commands "My other main gripe is things like this: + // wait for maximum 1 second for status to post + for (i = 0; i < 40000; i++) { + if (mbox->numstatus != 0xFF) break; + udelay(25); yield(); + } which litter the driver. Use of yield() in drivers is deprecated." - James Bottomley <James.Bottomley@SteelEye.com>, 07.14.2004 linux-scsi mailing list v. Remove redundant __megaraid_busywait_mbox routine vi. Fix bug in the managment module, which causes a system lockup when the IO module is loaded and then unloaded, followed by executing any management utility. The current version of management module does not handle the adapter unregister properly. Specifically, it still keeps a reference to the unregistered controllers. To avoid this, the static array adapters has been replaced by a dynamic list, which gets updated every time an adapter is added or removed. Also, during unregistration of the IO module, the resources are now released in the exact reverse order of the allocation time sequence. Release Date : Fri Jun 25 18:58:43 EDT 2004 - Atul Mukker <atulm@lsil.com> Current Version : 2.20.1.0 Older Version : megaraid 2.20.0.1 i. Stale list pointer in adapter causes kernel panic when module megaraid_mbox is unloaded Release Date : Thu Jun 24 20:37:11 EDT 2004 - Atul Mukker <atulm@lsil.com> Current Version : 2.20.0.1 Older Version : megaraid 2.20.0.00 i. Modules are not 'y' by default, but depend on current definition of SCSI & PCI. ii. Redundant structure mraid_driver_t removed. iii. Miscellaneous indentation and goto/label fixes. - Christoph Hellwig <hch@infradead.org>, 06.24.2004 linux-scsi iv. scsi_host_put(), do just before completing HBA shutdown. Release Date : Mon Jun 21 19:53:54 EDT 2004 - Atul Mukker <atulm@lsil.com> Current Version : 2.20.0.0 Older Version : megaraid 2.20.0.rc2 and 2.00.3 i. Independent module to interact with userland applications and multiplex command to low level RAID module(s). "Shared code in a third module, a "library module", is an acceptable solution. modprobe automatically loads dependent modules, so users running "modprobe driver1" or "modprobe driver2" would automatically load the shared library module." - Jeff Garzik <jgarzik@pobox.com> 02.25.2004 LKML "As Jeff hinted, if your userspace<->driver API is consistent between your new MPT-based RAID controllers and your existing megaraid driver, then perhaps you need a single small helper module (lsiioctl or some better name), loaded by both mptraid and megaraid automatically, which handles registering the /dev/megaraid node dynamically. In this case, both mptraid and megaraid would register with lsiioctl for each adapter discovered, and lsiioctl would essentially be a switch, redirecting userspace tool ioctls to the appropriate driver." - Matt Domsch <Matt_Domsch@dell.com> 02.25.2004 LKML ii. Remove C99 initializations from pci_device id. "pci_id_table_g would be much more readable when not using C99 initializers. PCI table doesn't change, there's lots of users that prefer the more readable variant. And it's really far less and much easier to grok lines without C99 initializers." - Christoph Hellwig <hch@infradead.org>, 05.28.2004 linux-scsi iii. Many fixes as suggested by Christoph Hellwig <hch@infradead.org> on linux-scsi, 05.28.2004 iv. We now support up to 32 parallel ioctl commands instead of current 1. There is a conscious effort to let memory allocation not fail for ioctl commands. v. Do away with internal memory management. Use pci_pool_(create|alloc) instead. vi. Kill tasklet when unloading the driver. vii. Do not use "host_lock', driver has fine-grain locks now to protect all data structures. viii. Optimize the build scatter-gather list routine. The callers already know the data transfer address and length. ix. Better implementation of error handling and recovery. Driver now performs extended errors recovery for instances like scsi cable pull. x. Disassociate the management commands with an overlaid scsi command. Driver now treats the management packets as special packets and has a dedicated callback routine.