aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2008-05-24md: restart recovery cleanly after device failure.NeilBrown
When we get any IO error during a recovery (rebuilding a spare), we abort the recovery and restart it. For RAID6 (and multi-drive RAID1) it may not be best to restart at the beginning: when multiple failures can be tolerated, the recovery may be able to continue and re-doing all that has already been done doesn't make sense. We already have the infrastructure to record where a recovery is up to and restart from there, but it is not being used properly. This is because: - We sometimes abort with MD_RECOVERY_ERR rather than just MD_RECOVERY_INTR, which causes the recovery not be be checkpointed. - We remove spares and then re-added them which loses important state information. The distinction between MD_RECOVERY_ERR and MD_RECOVERY_INTR really isn't needed. If there is an error, the relevant drive will be marked as Faulty, and that is enough to ensure correct handling of the error. So we first remove MD_RECOVERY_ERR, changing some of the uses of it to MD_RECOVERY_INTR. Then we cause the attempt to remove a non-faulty device from an array to fail (unless recovery is impossible as the array is too degraded). Then when remove_and_add_spares attempts to remove the devices on which recovery can continue, it will fail, they will remain in place, and recovery will continue on them as desired. Issue: If we are halfway through rebuilding a spare and another drive fails, and a new spare is immediately available, do we want to: 1/ complete the current rebuild, then go back and rebuild the new spare or 2/ restart the rebuild from the start and rebuild both devices in parallel. Both options can be argued for. The code currently takes option 2 as a/ this requires least code change b/ this results in a minimally-degraded array in minimal time. Cc: "Eivind Sarto" <ivan@kasenna.com> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-24md: allow parallel resync of md-devices.Bernd Schubert
In some configurations, a raid6 resync can be limited by CPU speed (Calculating P and Q and moving data) rather than by device speed. In these cases there is nothing to be gained byt serialising resync of arrays that share a device, and doing the resync in parallel can provide benefit. So add a sysfs tunable to flag an array as being allowed to resync in parallel with other arrays that use (a different part of) the same device. Signed-off-by: Bernd Schubert <bs@q-leap.de> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-24md: notify userspace on 'stop' eventsDan Williams
This additional notification to 'array_state' is needed to allow the monitor application to learn about stop events via sysfs. The sysfs_notify("sync_action") call that comes at the end of do_md_stop() (via md_new_event) is insufficient since the 'sync_action' attribute has been removed by this point. (Seems like a sysfs-notify-on-removal patch is a better fix. Currently removal updates the event count but does not wake up waiters) Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-24md: notify userspace on 'write-pending' changes to array_stateNeilBrown
When an array enters write pending, 'array_state' changes, so we must be sure to sysfs_notify. Also, when waiting for user-space to acknowledge 'write-pending' by marking the metadata as dirty, we don't want to wait for MD_CHANGE_DEVS to be cleared as that might not happen. So explicity test for the bits that we are really interested in. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-24md: raid1: Fix restoration of bio between failed read and write.NeilBrown
When performing a "recovery" or "check" pass on a RAID1 array, we read from each device and possible, if there is a difference or a read error, write back to some devices. We use the same 'bio' for both read and write, resetting various fields between the two operations. We forgot to reset bv_offset and bv_len however. These are often left unchanged, but in the case where there is an IO error one or two sectors into a page, they are changed. This results in correctable errors not being corrected properly. It does not result in any data corruption. Cc: "Fairbanks, David" <David.Fairbanks@stratus.com> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-24md: md: raid5 rate limit error printkBernd Schubert
Last night we had scsi problems and a hardware raid unit was offlined during heavy i/o. While this happened we got for about 3 minutes a huge number messages like these Apr 12 03:36:07 pfs1n14 kernel: [197510.696595] raid5:md7: read error not correctable (sector 2993096568 on sdj2). I guess the high error rate is responsible for not scheduling other events - during this time the system was not pingable and in the end also other devices run into scsi command timeouts causing problems on these unrelated devices as well. Signed-off-by: Bernd Schubert <bernd-schubert@gmx.de> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-24md: kill file_path wrapperChristoph Hellwig
Kill the trivial and rather pointless file_path wrapper around d_path. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-24md: proper extern for mdp_majorAdrian Bunk
This patch adds a proper extern for mdp_major in include/linux/raid/md.h Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-24md: fix possible oops when removing a bitmap from an active arrayNeilBrown
It is possible to add a write-intent bitmap to an active array, or remove the bitmap that is there. When we do with the 'quiesce' the array, which causes make_request to block in "wait_barrier()". However we are sampling the value of "mddev->bitmap" before the wait_barrier call, and using it afterwards. This can result in using a bitmap structure that has been freed. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-24mm: fix atomic_t overflow in vmAlan Cox
The atomic_t type is 32bit but a 64bit system can have more than 2^32 pages of virtual address space available. Without this we overflow on ludicrously large mappings Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-24types.h: don't expose struct ustat to userspacemaximilian attems
<linux/types.h> can't be used together with <sys/ustat.h> because they both define struct ustat: $ cat test.c #include <sys/ustat.h> #include <linux/types.h> $ gcc -c test.c In file included from test.c:2: /usr/include/linux/types.h:165: error: redefinition of 'struct ustat' has been reported a while ago to debian, but seems to have been lost in cat fighting: http://bugs.debian.org/429064 Signed-off-by: maximilian attems <max@stro.at> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-24serial: support for InstaShield IS-400 four port RS-232 PCI cardIgnacio García Pérez
Add support for the InstaShield IS-400 four port RS-232 PCI card. Signed-off-by: Ignacio García Pérez <iggarpe@t2i.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-24fix parenthesis in include/asm-mips/mach-au1x00/au1000.hMariusz Kozlowski
Parenthesis fix in include/asm-mips/mach-au1x00/au1000.h Signed-off-by: Mariusz Kozlowski <m.kozlowski@tuxland.pl> Cc: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-24fix parenthesis in include/asm-mips/gic.hMariusz Kozlowski
Parenthesis fix in include/asm-mips/gic.h Signed-off-by: Mariusz Kozlowski <m.kozlowski@tuxland.pl> Cc: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-24ibmaem: new driver for power/energy/temp meters in IBM System X hardwareDarrick J. Wong
This driver reads IBM Active Energy Manager energy/temperature/power sensors on IBM System X hardware. [akpm@linux-foundation.org: fix printk warnings] Signed-off-by: Darrick J. Wong <djwong@us.ibm.com> Cc: "Mark M. Hoffman" <mhoffman@lightlink.com> Cc: Corey Minyard <minyard@acm.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-24i5k_amb: support Intel 5400 chipsetDarrick J. Wong
Minor rework to support the Intel 5400 chipset. Signed-off-by: Darrick J. Wong <djwong@us.ibm.com> Cc: "Mark M. Hoffman" <mhoffman@lightlink.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-24hdaps: invert the axes for HDAPS on Lenovo R61i ThinkPadsGabor Czigola
Cc: "Mark M. Hoffman" <mhoffman@lightlink.com> Cc: Dmitry Torokhov <dtor@mail.ru> Cc: Jiri Kosina <jikos@jikos.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-24ntfs: le*_add_cpu conversionMarcin Slusarz
replace all: little_endian_variable = cpu_to_leX(leX_to_cpu(little_endian_variable) + expression_in_cpu_byteorder); with: leX_add_cpu(&little_endian_variable, expression_in_cpu_byteorder); generated with semantic patch Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com> Acked-by: Anton Altaparmakov <aia21@cantab.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-24mm: don't drop a partial page in a zone's memory map sizeJohannes Weiner
In a zone's present pages number, account for all pages occupied by the memory map, including a partial. Signed-off-by: Johannes Weiner <hannes@saeurebad.de> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-24MAINTAINERS: add util-linux-ng packageKarel Zak
(akpm: we often deal with util-linux and I (at least) can never remember where they hang out). Signed-off-by: Karel Zak <kzak@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-24ecryptfs: fix missed mutex_unlockCyrill Gorcunov
Cc: Michael Halcrow <mhalcrow@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-24fuse: fix bdi naming conflictMiklos Szeredi
Fuse allocates a separate bdi for each filesystem, and registers them in sysfs with "MAJOR:MINOR" of sb->s_dev (st_dev). This works fine for anon devices normally used by fuse, but can conflict with an already registered BDI for "fuseblk" filesystems, where sb->s_dev represents a real block device. In particularl this happens if a non-partitioned device is being mounted. Fix by registering with a different name for "fuseblk" filesystems. Thanks to Ioan Ionita for the bug report. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Reported-by: Ioan Ionita <opslynx@gmail.com> Tested-by: Ioan Ionita <opslynx@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-24mm: allow pfnmap ->fault()sNick Piggin
Take out an assertion to allow ->fault handlers to service PFNMAP regions. This is required to reimplement .nopfn handlers with .fault handlers and subsequently remove nopfn. Signed-off-by: Nick Piggin <npiggin@suse.de> Acked-by: Jes Sorensen <jes@sgi.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-23Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: IB/mad: Fix kernel crash when .process_mad() returns SUCCESS|CONSUMED IPoIB: Test for NULL broadcast object in ipiob_mcast_join_finish() MAINTAINERS: Add cxgb3 and iw_cxgb3 NIC and iWARP driver entries IB/mlx4: Fix creation of kernel QP with max number of send s/g entries IB/mthca: Fix max_sge value returned by query_device RDMA/cxgb3: Fix uninitialized variable warning in iwch_post_send() IB/mlx4: Fix uninitialized-var warning in mlx4_ib_post_send() IB/ipath: Fix UC receive completion opcode for RDMA WRITE with immediate IB/ipath: Fix printk format for ipath_sdma_status
2008-05-23IB/mad: Fix kernel crash when .process_mad() returns SUCCESS|CONSUMEDDave Olson
If a low-level driver returns IB_MAD_RESULT_SUCCESS | IB_MAD_RESULT_CONSUMED, handle_outgoing_dr_smp() doesn't clean up properly. The fix is to kfree the local data and break, rather than falling through. This was observed with the ipath driver, but could happen with any driver. This fixes <https://bugs.openfabrics.org/show_bug.cgi?id=1027>. Signed-off-by: Dave Olson <dave.olson@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-05-23Merge branch 'fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq * 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq: [CPUFREQ] clarify license of freq_table.c [CPUFREQ] Remove documentation of removed ondemand tunable. [CPUFREQ] Crusoe: longrun cpufreq module reports false min freq [CPUFREQ] powernow-k8: improve error messages
2008-05-23remove debug printk from DRM suspend pathJesse Barnes
Not sure how this snuck upstream, but it really doesn't belong there. We don't need a KERN_ERR printk in the suspend path to know what's going on (at least not anymore). Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-23Merge branch 'merge' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc * 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc: [POWERPC] iSeries: Remove unused mail address [POWERPC] mpic: Fix use of uninitialized variable [POWERPC] Add kernstart_addr to list of allowed symbols in prom_init [POWERPC] Fix __set_fixmap() for STRICT_MM_TYPECHECKS [POWERPC] PS3: Fix memory hotplug
2008-05-23Merge branch 'for-linus' of git://oss.sgi.com:8090/xfs/xfs-2.6Linus Torvalds
* 'for-linus' of git://oss.sgi.com:8090/xfs/xfs-2.6: [XFS] Fix memory corruption with small buffer reads [XFS] Fix inode list allocation size in writeback. [XFS] Don't allow memory reclaim to wait on the filesystem in inode [XFS] Fix fsync() b0rkage. [XFS] Include linux/random.h in all builds, not just debug builds.
2008-05-23Merge git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linusLinus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus: stop_machine: make stop_machine_run more virtualization friendly doc: add a chapter about trylock functions [Bug 9011] modules: proper cleanup of kobject without CONFIG_SYSFS module loading ELF handling: use SELFMAG instead of numeric constant
2008-05-23fbdev: fix integer as NULL pointer warningHarvey Harrison
drivers/video/aty/atyfb_base.c:3359:26: warning: Using plain integer as NULL pointer drivers/video/aty/radeon_base.c:2280:32: warning: Using plain integer as NULL pointer drivers/video/matrox/matroxfb_base.h:203:25: warning: Using plain integer as NULL pointer drivers/video/matrox/matroxfb_base.h:203:25: warning: Using plain integer as NULL pointer drivers/video/sis/sis_main.c:5790:44: warning: Using plain integer as NULL pointer Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-23scsi: fix integer as NULL pointer warningHarvey Harrison
drivers/scsi/aha152x.c:3585:60: warning: Using plain integer as NULL pointer drivers/scsi/aha152x.c:3845:56: warning: Using plain integer as NULL pointer drivers/scsi/qla1280.c:2814:37: warning: Using plain integer as NULL pointer drivers/scsi/atp870u.c:750:47: warning: Using plain integer as NULL pointer drivers/scsi/3w-9xxx.c:1281:36: warning: Using plain integer as NULL pointer drivers/scsi/3w-9xxx.c:1293:36: warning: Using plain integer as NULL pointer drivers/scsi/3w-9xxx.c:1301:35: warning: Using plain integer as NULL pointer drivers/scsi/hptiop.c:447:10: warning: Using plain integer as NULL pointer drivers/scsi/hptiop.c:457:10: warning: Using plain integer as NULL pointer drivers/scsi/hptiop.c:479:24: warning: Using plain integer as NULL pointer drivers/scsi/hptiop.c:483:22: warning: Using plain integer as NULL pointer drivers/scsi/hptiop.c:1213:23: warning: Using plain integer as NULL pointer drivers/scsi/hptiop.c:1214:23: warning: Using plain integer as NULL pointer Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-23isdn: fix integer as NULL pointer warningHarvey Harrison
drivers/isdn/hysdn/hycapi.c:465:42: warning: Using plain integer as NULL pointer drivers/isdn/hysdn/hycapi.c:467:44: warning: Using plain integer as NULL pointer drivers/isdn/hysdn/hycapi.c:469:42: warning: Using plain integer as NULL pointer Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-23acpi: fix integer as NULL pointer warningHarvey Harrison
drivers/acpi/dispatcher/dsmethod.c:568:50: warning: Using plain integer as NULL pointer drivers/acpi/executer/exmutex.c:329:30: warning: Using plain integer as NULL pointer drivers/acpi/executer/exmutex.c:466:31: warning: Using plain integer as NULL pointer Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-23x86: fix integer as NULL pointer warningHarvey Harrison
arch/x86/boot/printf.c:59:10: warning: Using plain integer as NULL pointer Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-23[XFS] Fix memory corruption with small buffer readsChristoph Hellwig
When we have multiple buffers in a single page for a blocksize == pagesize filesystem we might overwrite the page contents if two callers hit it shortly after each other. To prevent that we need to keep the page locked until I/O is completed and the page marked uptodate. Thanks to Eric Sandeen for triaging this bug and finding a reproducible testcase and Dave Chinner for additional advice. This should fix kernel.org bz #10421. Tested-by: Eric Sandeen <sandeen@sandeen.net> SGI-PV: 981813 SGI-Modid: xfs-linux-melb:xfs-kern:31173a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-05-23[POWERPC] iSeries: Remove unused mail addressStephen Rothwell
I don't use my IBM email address normally and people can find me in CREDITS. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-05-23[POWERPC] mpic: Fix use of uninitialized variableStephen Rothwell
Compiling ppc64_defconfig with gcc 4.3 gives thes warnings: arch/powerpc/sysdev/mpic.c: In function 'mpic_irq_get_priority': arch/powerpc/sysdev/mpic.c:1351: warning: 'is_ipi' may be used uninitialized in this function arch/powerpc/sysdev/mpic.c: In function 'mpic_irq_set_priority': arch/powerpc/sysdev/mpic.c:1328: warning: 'is_ipi' may be used uninitialized in this function It turns out that in the cases where is_ipi is uninitialized, another variable (mpic) will be NULL and it is dereferenced. Protect against this by returning if mpic is NULL in mpic_irq_set_priority, and removing mpic_irq_get_priority completely as it has no in tree callers. This has the nice side effect of making the warning go away. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-05-23[POWERPC] Add kernstart_addr to list of allowed symbols in prom_initMichael Ellerman
Since commit "85xx: Add support for relocatable kernel (and booting at non-zero)" (37dd2badcfcec35f5e21a0926968d77a404f03c3), PHYSICAL_START is #defined as kernstart_addr if RELOCATABLE and FLATMEM is enabled. PHYSICAL_START is used in prom_init.c and so kernstart_addr needs to be added to the list of allowed symbols that prom_init.c can access. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Acked-by: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-05-23[POWERPC] Fix __set_fixmap() for STRICT_MM_TYPECHECKSDavid Gibson
__set_fixmap() in pgtable_32.c currently fails to compile if STRICT_MM_TYPECHECKS is defined. This fixes it. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-05-23[POWERPC] PS3: Fix memory hotplugGeoff Levand
A change was made to walk_memory_resource() in commit 4b119e21d0c66c22e8ca03df05d9de623d0eb50f that added a check of find_lmb(). Add the coresponding lmb_add() call to ps3_mm_add_memory() so that that check will succeed. This fixes the condition where the PS3 boots up with only the 128 MiB of boot memory, and doesn't see the other 128MiB that is available. Signed-off-by: Geoff Levand <geoffrey.levand@am.sony.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-05-23[XFS] Fix inode list allocation size in writeback.David Chinner
We only need to allocate space for the number of inodes in the cluster when writing back inodes, not every byte in the inode cluster. This reduces the amount of memory needing to be allocated to 256 bytes instead of 64k. SGI-PV: 981949 SGI-Modid: xfs-linux-melb:xfs-kern:31182a Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-05-23[XFS] Don't allow memory reclaim to wait on the filesystem in inodeDavid Chinner
writeback If we allow memory reclaim to wait on the pages under writeback in inode cluster writeback we could deadlock because we are currently holding the ILOCK on the initial writeback inode which is needed in data I/O completion to change the file size or do unwritten extent conversion before the pages are taken out of writeback state. SGI-PV: 981091 SGI-Modid: xfs-linux-melb:xfs-kern:31015a Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-05-23[XFS] Fix fsync() b0rkage.David Chinner
xfs_fsync() fails to wait for data I/O completion before checking if the inode is dirty or clean to decide whether to log the inode or not. This misses inode size updates when the data flushed by the fsync() is extending the file. Hence, like fdatasync(), we need to wait for I/o completion first, then check the inode for cleanliness. Doing so makes the behaviour of xfs_fsync() identical for fsync and fdatasync and we *always* use synchronous semantics if the inode is dirty. Therefore also kill the differences and remove the unused flags from the xfs_fsync function and callers. SGI-PV: 981296 SGI-Modid: xfs-linux-melb:xfs-kern:31033a Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-05-23Merge git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6 into ↵Lachlan McIlroy
for-linus
2008-05-23stop_machine: make stop_machine_run more virtualization friendlyChristian Borntraeger
On kvm I have seen some rare hangs in stop_machine when I used more guest cpus than hosts cpus. e.g. 32 guest cpus on 1 host cpu triggered the hang quite often. I could also reproduce the problem on a 4 way z/VM host with a 64 way guest. It turned out that the guest was consuming all available cpus mostly for spinning on scheduler locks like rq->lock. This is expected as the threads are calling yield all the time. The problem is now, that the host scheduling decisings together with the guest scheduling decisions and spinlocks not being fair managed to create an interesting scenario similar to a live lock. (Sometimes the hang resolved itself after some minutes) Changing stop_machine to yield the cpu to the hypervisor when yielding inside the guest fixed the problem for me. While I am not completely happy with this patch, I think it causes no harm and it really improves the situation for me. I used cpu_relax for yielding to the hypervisor, does that work on all architectures? p.s.: If you want to reproduce the problem, cpu hotplug and kprobes use stop_machine_run and both triggered the problem after some retries. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> CC: Ingo Molnar <mingo@elte.hu> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-05-23doc: add a chapter about trylock functions [Bug 9011]Matti Linnanvuori
Add a chapter about trylock functions. http://bugzilla.kernel.org/show_bug.cgi?id=9011 Signed-off-by: Matti Linnanvuori <mattilinnanvuori@yahoo.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (removed down_trylock)
2008-05-23modules: proper cleanup of kobject without CONFIG_SYSFSDenis V. Lunev
kobject: '<NULL>' (ffffffffa0104050): is not initialized, yet kobject_put() is being called. ------------[ cut here ]------------ WARNING: at /home/den/src/linux-netns26/lib/kobject.c:583 kobject_put+0x53/0x55() Modules linked in: ipv6 nfsd lockd nfs_acl auth_rpcgss sunrpc exportfs ide_cd_mod cdrom button [last unloaded: pktgen] comm: rmmod Tainted: G W 2.6.26-rc3 #585 Call Trace: [<ffffffff802359ab>] warn_on_slowpath+0x58/0x7a [<ffffffff80236aca>] ? printk+0x67/0x69 [<ffffffff80236aca>] ? printk+0x67/0x69 [<ffffffff80324289>] kobject_put+0x53/0x55 [<ffffffff8025e2ee>] free_module+0x87/0xfa [<ffffffff8025fee5>] sys_delete_module+0x178/0x1e1 [<ffffffff804b1e70>] ? lockdep_sys_exit_thunk+0x35/0x67 [<ffffffff804b1dff>] ? trace_hardirqs_on_thunk+0x35/0x3a [<ffffffff8020c0bb>] system_call_after_swapgs+0x7b/0x80 ---[ end trace 8f5aafa7f6406cf8 ]--- mod->mkobj.kobj is not initialized without CONFIG_SYSFS. Do not call kobject_put in this case. Signed-off-by: Denis V. Lunev <den@openvz.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-05-23module loading ELF handling: use SELFMAG instead of numeric constantCyrill Gorcunov
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-05-22[CPUFREQ] clarify license of freq_table.cDominik Brodowski
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net> Signed-off-by: Dave Jones <davej@redhat.com>