aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2008-04-06scsi: fix sense_slab/bio swapping livelockHugh Dickins
Since 2.6.25-rc7, I've been seeing an occasional livelock on one x86_64 machine, copying kernel trees to tmpfs, paging out to swap. Signature: 6000 pages under writeback but never getting written; most tasks of interest trying to reclaim, but each get_swap_bio waiting for a bio in mempool_alloc's io_schedule_timeout(5*HZ); every five seconds an atomic page allocation failure report from kblockd failing to allocate a sense_buffer in __scsi_get_command. __scsi_get_command has a (one item) free_list to protect against this, but rc1's [SCSI] use dynamically allocated sense buffer de25deb18016f66dcdede165d07654559bb332bc upset that slightly. When it fails to allocate from the separate sense_slab, instead of giving up, it must fall back to the command free_list, which is sure to have a sense_buffer attached. Either my earlier -rc testing missed this, or there's some recent contributory factor. One very significant factor is SLUB, which merges slab caches when it can, and on 64-bit happens to merge both bio cache and sense_slab cache into kmalloc's 128-byte cache: so that under this swapping load, bios above are liable to gobble up all the slots needed for scsi_cmnd sense_buffers below. That's disturbing behaviour, and I tried a few things to fix it. Adding a no-op constructor to the sense_slab inhibits SLUB from merging it, and stops all the allocation failures I was seeing; but it's rather a hack, and perhaps in different configurations we have other caches on the swapout path which are ill-merged. Another alternative is to revert the separate sense_slab, using cache-line-aligned sense_buffer allocated beyond scsi_cmnd from the one kmem_cache; but that might waste more memory, and is only a way of diverting around the known problem. While I don't like seeing the allocation failures, and hate the idea of all those bios piled up above a scsi host working one by one, it does seem to emerge fairly soon with the livelock fix. So lacking better ideas, stick with that one clear fix for now. Signed-off-by: Hugh Dickins <hugh@veritas.com> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: Jens Axboe <jens.axboe@oracle.com> Cc: Christoph Lameter <clameter@sgi.com> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Peter Zijlstra <a.p.ziljstra@chello.nl> Cc: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-05Revert "ACPI: Ignore _BQC object when registering backlight device"Linus Torvalds
This reverts commit 7c0ea45be4f114d85ee35caeead8e1660699c46f which caused a regression with the backlight being set to off when a laptop doesn't have a _BQC entry to query the actual backlight value. The code blindly then falls back on a value of 0. See http://bugzilla.kernel.org/show_bug.cgi?id=10387 http://lkml.org/lkml/2008/4/2/366 for details. Bisected-and-reported-by: Andrey Borzenkov <arvidjaar@mail.ru> Cc: Zhao Yakui <yakui.zhao@intel.com> Cc: Zhang Rui <rui.zhang@intel.com> Cc: Len Brown <len.brown@intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-04Merge branch 'upstream' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ralf/upstream-linus * 'upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/ralf/upstream-linus: [MIPS] Make KGDB compile on UP [MIPS] Pb1200: Fix header breakage
2008-04-04ipmi: change device node ordering to reflect probe orderCarol Hebert
In 2.6.14 a patch was merged which switching the order of the ipmi device naming from in-order-of-discovery over to reverse-order-of-discovery. So on systems with multiple BMC interfaces, the ipmi device names are being created in reverse order relative to how they are discovered on the system (e.g. on an IBM x3950 multinode server with N nodes, the device name for the BMC in the first node is /dev/ipmiN-1 and the device name for the BMC in the last node is /dev/ipmi0, etc.). The problem is caused by the list handling routines chosen in dmi_scan.c. Using list_add() causes the multiple ipmi devices to be added to the device list using a stack-paradigm and so the ipmi driver subsequently pulls them off during initialization in LIFO order. This patch changes the dmi_save_ipmi_device() list handling paradigm to a queue, thereby allowing the ipmi driver to build the ipmi device names in the order in which they are found on the system. Signed-off-by: Carol Hebert <cah@us.ibm.com> Signed-off-by: Corey Minyard <cminyard@mvista.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-04mtd: fix broken state in CFI driver caused by FL_SHUTDOWNAlexey Korolev
THe CFI driver in 2.6.24 kernel is broken. Not so intensive read/write operations cause incomplete writes which lead to kernel panics in JFFS2. We investigated the issue - it is caused by bug in FL_SHUTDOWN parsing code. Sometimes chip returns -EIO as if it is in FL_SHUTDOWN state when it should wait in FL_PONT (error in order of conditions). The following patch fixes the bug in state parsing code of CFI. Also I've added comments to notify developers if they want to add new case in future. Signed-off-by: Alexey Korolev <akorolev@infradead.org> Reviewed-by: Joern Engel <joern@logfs.org> Cc: David Woodhouse <dwmw2@infradead.org> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-04memory controller: make memory resource control aware of boot optionsBalbir Singh
A boot option for the memory controller was discussed on lkml. It is a good idea to add it, since it saves memory for people who want to turn off the memory controller. By default the option is on for the following two reasons: 1. It provides compatibility with the current scheme where the memory controller turns on if the config option is enabled 2. It allows for wider testing of the memory controller, once the config option is enabled We still allow the create, destroy callbacks to succeed, since they are not aware of boot options. We do not populate the directory will memory resource controller specific files. Signed-off-by: Balbir Singh <balbir@linux.vnet.ibm.com> Cc: Paul Menage <menage@google.com> Cc: Balbir Singh <balbir@linux.vnet.ibm.com> Cc: Pavel Emelyanov <xemul@openvz.org> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Hugh Dickins <hugh@veritas.com> Cc: Sudhir Kumar <skumar@linux.vnet.ibm.com> Cc: YAMAMOTO Takashi <yamamoto@valinux.co.jp> Cc: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-04cgroups: add cgroup support for enabling controllers at boot timePaul Menage
The effects of cgroup_disable=foo are: - foo isn't auto-mounted if you mount all cgroups in a single hierarchy - foo isn't visible as an individually mountable subsystem As a result there will only ever be one call to foo->create(), at init time; all processes will stay in this group, and the group will never be mounted on a visible hierarchy. Any additional effects (e.g. not allocating metadata) are up to the foo subsystem. This doesn't handle early_init subsystems (their "disabled" bit isn't set be, but it could easily be extended to do so if any of the early_init systems wanted it - I think it would just involve some nastier parameter processing since it would occur before the command-line argument parser had been run. Hugh said: Ballpark figures, I'm trying to get this question out rather than processing the exact numbers: CONFIG_CGROUP_MEM_RES_CTLR adds 15% overhead to the affected paths, booting with cgroup_disable=memory cuts that back to 1% overhead (due to slightly bigger struct page). I'm no expert on distros, they may have no interest whatever in CONFIG_CGROUP_MEM_RES_CTLR=y; and the rest of us can easily build with or without it, or apply the cgroup_disable=memory patches. Unix bench's execl test result on x86_64 was == just after boot without mounting any cgroup fs.== mem_cgorup=off : Execl Throughput 43.0 3150.1 732.6 mem_cgroup=on : Execl Throughput 43.0 2932.6 682.0 == [lizf@cn.fujitsu.com: fix boot option parsing] Signed-off-by: Balbir Singh <balbir@linux.vnet.ibm.com> Cc: Paul Menage <menage@google.com> Cc: Balbir Singh <balbir@linux.vnet.ibm.com> Cc: Pavel Emelyanov <xemul@openvz.org> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Hugh Dickins <hugh@veritas.com> Cc: Sudhir Kumar <skumar@linux.vnet.ibm.com> Cc: YAMAMOTO Takashi <yamamoto@valinux.co.jp> Cc: David Rientjes <rientjes@google.com> Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-04[MIPS] Make KGDB compile on UPSergei Shtylyov
Building UP kernel with KGDB enabled produces the following errors and warning (fatal due to -Werror in arch/mips/kernel/Makefile): In file included from arch/mips/kernel/gdb-stub.c:142: include/asm/smp.h:25:1: "raw_smp_processor_id" redefined In file included from include/linux/sched.h:69, from arch/mips/kernel/gdb-stub.c:126: include/linux/smp.h:88:1: this is the location of the previous definition In file included from arch/mips/kernel/gdb-stub.c:142: include/asm/smp.h:62: error: redefinition of 'smp_send_reschedule' include/linux/smp.h:102: error: previous definition of 'smp_send_reschedule' was here include/asm/smp.h: In function `smp_send_reschedule': include/asm/smp.h:65: error: dereferencing pointer to incomplete type arch/mips/kernel/gdb-stub.c: At top level: arch/mips/kernel/gdb-stub.c:660: warning: 'kgdb_wait' defined but not used Fix the errors by not directly including <asm/smp.h> (which is already included by <linux/smp.h>) and the warning by enclosing kgdb_wait() in #ifdef CONFIG_SMP. Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2008-04-04[MIPS] Pb1200: Fix header breakageSergei Shtylyov
Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2008-04-04Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86: x86: revert assign IRQs to hpet timer x86: tsc prevent time going backwards xen: Clear PG_pinned in release_{pt,pd}() xen: Do not pin/unpin PMD pages xen: refactor xen_{alloc,release}_{pt,pd}() x86, agpgart: scary messages are fortunately obsolete xen: fix grant table bug x86: fix breakage of vSMP irq operations x86: print message if nmi_watchdog=2 cannot be enabled x86: fix nmi_watchdog=2 on Pentium-D CPUs
2008-04-04m68k: update defconfigs for 2.6.25Geert Uytterhoeven
Long overdue update of the m68k defconfigs Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-04m68k: use KBUILD_DEFCONFIGAdrian Bunk
The default defconfig should be one from arch/m68k/configs/ arch/m68k/defconfig was not exactly identical to amiga_defconfig but also considering how long they have been without any update that doesn't seem to have been on purpose. Signed-off-by: Adrian Bunk <adrian.bunk@movial.fi> Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-04Merge branch 'upstream-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev: pata_ali: disable ATAPI DMA libata: ATA_12/16 doesn't fall into ATAPI_MISC libata: uninline atapi_cmd_type() libata: fix IDENTIFY order in ata_bus_probe()
2008-04-04Be more careful about marking buffers dirtyLinus Torvalds
Mikulas Patocka noted that the optimization where we check if a buffer was already dirty (and we avoid re-dirtying it) was not really SMP-safe. Since the read of the old status was not synchronized with anything, an aggressive CPU re-ordering of memory accesses might have moved that read up to before the data was even written to the buffer, and another CPU that cleaned it again, causing the newly dirty state to never actually hit the disk. Admittedly this would probably never trigger in practice, but it's still wrong. Mikulas sent a patch that fixed the problem, but I dislike the subtlety of the whole optimization, so this is an alternate fix that is more explicit about the particular SMP ordering for the optimization, and separates out the speculative reads of the buffer state into its own conditional (and makes the memory barrier only happen if we are likely to actually hit the optimized case in the first place). I considered removing the optimization entirely, but Andrew argued for it's continued existence. I'm a push-over. Cc: Mikulas Patocka <mikulas@artax.karlin.mff.cuni.cz> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-04parport_pc: make sure to release IO ports after probing for IT87XXLinus Torvalds
Commit f63fd7e299ee13da071ecfce2b90b58c5e1562b1 ("parport_pc: detection for SuperIO IT87XX POST") only released the IO port region on success, not when the probe for the IT87XX chip failed. That caused not only a reserved region to leak, but also caused an oops when the driver module was unloaded and somebody tried to cat /proc/ioports - because the string that was assigned to the IO port region was a static string in the module virtual address area. Reported-by: Lubos Lunak <l.lunak@suse.cz> Cc: Jan Kara <jack@suse.cz> Cc: Petr Cvek <petr.cvek@tul.cz> Acked-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-04x86: revert assign IRQs to hpet timerThomas Gleixner
The commits: commit 37a47db8d7f0f38dac5acf5a13abbc8f401707fa Author: Balaji Rao <balajirrao@gmail.com> Date: Wed Jan 30 13:30:03 2008 +0100 x86: assign IRQs to HPET timers, fix and commit e3f37a54f690d3e64995ea7ecea08c5ab3070faf Author: Balaji Rao <balajirrao@gmail.com> Date: Wed Jan 30 13:30:03 2008 +0100 x86: assign IRQs to HPET timers have been identified to cause a regression on some platforms due to the assignement of legacy IRQs which makes the legacy devices connected to those IRQs disfunctional. Revert them. This fixes http://bugzilla.kernel.org/show_bug.cgi?id=10382 Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-04x86: tsc prevent time going backwardsThomas Gleixner
We already catch most of the TSC problems by sanity checks, but there is a subtle bug which has been in the code for ever. This can cause time jumps in the range of hours. This was reported in: http://lkml.org/lkml/2007/8/23/96 and http://lkml.org/lkml/2008/3/31/23 I was able to reproduce the problem with a gettimeofday loop test on a dual core and a quad core machine which both have sychronized TSCs. The TSCs seems not to be perfectly in sync though, but the kernel is not able to detect the slight delta in the sync check. Still there exists an extremly small window where this delta can be observed with a real big time jump. So far I was only able to reproduce this with the vsyscall gettimeofday implementation, but in theory this might be observable with the syscall based version as well. CPU 0 updates the clock source variables under xtime/vyscall lock and CPU1, where the TSC is slighty behind CPU0, is reading the time right after the seqlock was unlocked. The clocksource reference data was updated with the TSC from CPU0 and the value which is read from TSC on CPU1 is less than the reference data. This results in a huge delta value due to the unsigned subtraction of the TSC value and the reference value. This algorithm can not be changed due to the support of wrapping clock sources like pm timer. The huge delta is converted to nanoseconds and added to xtime, which is then observable by the caller. The next gettimeofday call on CPU1 will show the correct time again as now the TSC has advanced above the reference value. To prevent this TSC specific wreckage we need to compare the TSC value against the reference value and return the latter when it is larger than the actual TSC value. I pondered to mark the TSC unstable when the readout is smaller than the reference value, but this would render an otherwise good and fast clocksource unusable without a real good reason. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-04xen: Clear PG_pinned in release_{pt,pd}()Mark McLoughlin
Signed-off-by: Mark McLoughlin <markmc@redhat.com> Cc: xen-devel@lists.xensource.com Cc: Mark McLoughlin <markmc@redhat.com> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-04xen: Do not pin/unpin PMD pagesMark McLoughlin
i.e. with this simple test case: int fd = open("/dev/zero", O_RDONLY); munmap(mmap((void *)0x40000000, 0x1000_LEN, PROT_READ, MAP_PRIVATE, fd, 0), 0x1000); close(fd); we currently get: kernel BUG at arch/x86/xen/enlighten.c:678! ... EIP is at xen_release_pt+0x79/0xa9 ... Call Trace: [<c041da25>] ? __pmd_free_tlb+0x1a/0x75 [<c047a192>] ? free_pgd_range+0x1d2/0x2b5 [<c047a2f3>] ? free_pgtables+0x7e/0x93 [<c047b272>] ? unmap_region+0xb9/0xf5 [<c047c1bd>] ? do_munmap+0x193/0x1f5 [<c047c24f>] ? sys_munmap+0x30/0x3f [<c0408cce>] ? syscall_call+0x7/0xb ======================= and xen complains: (XEN) mm.c:2241:d4 Mfn 1cc37 not pinned Further details at: https://bugzilla.redhat.com/436453 Signed-off-by: Mark McLoughlin <markmc@redhat.com> Cc: xen-devel@lists.xensource.com Cc: Mark McLoughlin <markmc@redhat.com> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-04xen: refactor xen_{alloc,release}_{pt,pd}()Mark McLoughlin
Signed-off-by: Mark McLoughlin <markmc@redhat.com> Cc: xen-devel@lists.xensource.com Cc: Mark McLoughlin <markmc@redhat.com> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-04x86, agpgart: scary messages are fortunately obsoletePavel Machek
Fix obsolete printks in aperture-64. We used not to handle missing agpgart, but we handle it okay now. Signed-off-by: Pavel Machek <pavel@suse.cz> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-04xen: fix grant table bugMichael Abd-El-Malek
fix memory corruption and crash due to mis-sized grant table. A PV OS has two grant table data structures: the grant table itself and a free list. The free list is composed of an array of pages, which grow dynamically as the guest OS requires more grants. While the grant table contains 8-byte entries, the free list contains 4-byte entries. So we have half as many pages in the free list than in the grant table. There was a bug in the free list allocation code. The free list was indexed as if it was the same size as the grant table. But it's only half as large. So memory got corrupted, and I was seeing crashes in the slab allocator later on. Taken from: http://xenbits.xensource.com/linux-2.6.18-xen.hg?rev/4018c0da3360 Signed-off-by: Michael Abd-El-Malek <mabdelmalek@cmu.edu> Signed-off-by: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-04x86: fix breakage of vSMP irq operationsRavikiran G Thirumalai
25-rc* stopped working with CONFIG_X86_VSMP on vSMP machines. Looks like the vsmp irq ops got accidentally removed during merge of x86_64 pvops in 2.6.25. -- commit 6abcd98ffafbff81f0bfd7ee1d129e634af13245 removed vsmp irq ops. Tested with both CONFIG_X86_VSMP and without CONFIG_X86_VSMP, on vSMP and non vSMP x86_64 machines. Please apply. Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-04x86: print message if nmi_watchdog=2 cannot be enabledIngo Molnar
right now if there's no CPU support for nmi_watchdog=2 we'll just refuse it silently. print a useful warning. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-04x86: fix nmi_watchdog=2 on Pentium-D CPUsIngo Molnar
implement nmi_watchdog=2 on this class of CPUs: cpu family : 15 model : 6 model name : Intel(R) Pentium(R) D CPU 3.00GHz the watchdog's ->setup() method is safe anyway, so if the CPU cannot support it we'll bail out safely. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-04pata_ali: disable ATAPI DMATejun Heo
ATAPI DMA just doesn't work reliably on pata_ali. The IDE driver can do it but for some mysterious reason, pata_ali can't. This patch disables it by default and makes the driver whine during initialization. "pata_ali.atapi_dma" parameter is added so that user can bypass the workaround. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
2008-04-04libata: ATA_12/16 doesn't fall into ATAPI_MISCTejun Heo
SAT passthrus don't really fit into ATAPI_MISC class. SAT passthru commands always transfer multiple of 512 bytes and variable length response is not allowed. This patch creates a separate category - ATAPI_PASS_THRU - for these. This fixes HSM violation on "hdparm -I". Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
2008-04-04libata: uninline atapi_cmd_type()Tejun Heo
Uninline atapi_cmd_type(). It doesn't really have to be inline and more case will be added which need to access unexported libata variable. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
2008-04-04libata: fix IDENTIFY order in ata_bus_probe()Bartlomiej Zolnierkiewicz
Commit f58229f8060055b08b34008ea08f31de1e2f003c accidentally made ata_bus_probe() not use reverse order probing. Fix it. There currently isn't any PATA driver which uses obsolete ata_bus_probe() path, so this patch is mainly for correctness. Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
2008-04-03Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/selinux-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/selinux-2.6: selinux: prevent rentry into the FS
2008-04-03x86 ptrace: avoid unnecessary wrmsrRoland McGrath
This avoids using wrmsr on MSR_IA32_DEBUGCTLMSR when it's not needed. No wrmsr ever needs to be done if noone has ever used block stepping. Without this change, using ptrace on 2.6.25 on an x86 KVM guest will tickle KVM's missing support for the MSR and crash the guest kernel. Though host KVM is the buggy one, this makes for a regression in the guest behavior from 2.6.24->2.6.25 that we can easily avoid. I also corrected some bad whitespace. Signed-off-by: Roland McGrath <roland@redhat.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-03Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: Input: appletouch - add product IDs for the 4th generation MacBooks
2008-04-03Merge branch 'merge' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc * 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc: [POWERPC] Fix MPC5200 (not B!) device tree so FEC ethernet works [POWERPC] mpc5200: Amalgamated DTS fixes and updates [POWERPC] Fix rtas_flash procfs interface [POWERPC] Fix deadlock with mmu_hash_lock in hash_page_sync [POWERPC] Fix iSeries hard irq enabling regression [POWERPC] Fix CPM2 SCC1 clock initialization. [POWERPC] Fix defconfigs so we dont set both GENRTC and RTCLIB [POWERPC] fsldma: Use compatiable binding as spec [POWERPC] sata_fsl: reduce compatibility to fsl,pq-sata [POWERPC] 83xx: enable usb in 837x rdb and 83xx defconfigs [POWERPC] 83xx: Fix wrong USB phy type in mpc837xrdb dts
2008-04-03rxrpc: remove smp_processor_id() from debug macroSven Schnelle
Signed-off-by: Sven Schnelle <svens@stackframe.org> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-03afs: remove smp_prcessor_id() from debug macroSven Schnelle
Signed-off-by: Sven Schnelle <svens@stackframe.org> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-03splice: use mapping_gfp_maskHugh Dickins
The loop block driver is careful to mask __GFP_IO|__GFP_FS out of its mapping_gfp_mask, to avoid hangs under memory pressure. But nowadays it uses splice, usually going through __generic_file_splice_read. That must use mapping_gfp_mask instead of GFP_KERNEL to avoid those hangs. Signed-off-by: Hugh Dickins <hugh@veritas.com> Cc: Jens Axboe <jens.axboe@oracle.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-04selinux: prevent rentry into the FSJosef Bacik
BUG fix. Keep us from re-entering the fs when we aren't supposed to. See discussion at http://marc.info/?t=120716967100004&r=1&w=2 Signed-off-by: Josef Bacik <jbacik@redhat.com> Acked-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: James Morris <jmorris@namei.org>
2008-04-03[POWERPC] Fix MPC5200 (not B!) device tree so FEC ethernet worksRené Bürgel
This gets the FEC ethernet driver working again on the lite5200 platform. The FEC driver is also compatible with the MPC5200, not only with the MPC5200B, so this adds a suitable entry to the driver's match list. Furthermore this adds the settings for the PHY in the dts file for the Lite5200. Note, that this is not exactly the same as in the Lite5200B, because the PHY is located at f0003000:01 for the 5200, and at :00 for the 5200B. This was tested on a Lite5200 and a Lite5200B, both booted a kernel via tftp and mounted the root via nfs successfully. Signed-off-by: René Bürgel <r.buergel@unicontrol.de> Acked-by: Grant Likely <grant.likely@secretlab.ca> Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-04-03[POWERPC] mpc5200: Amalgamated DTS fixes and updatesBartlomiej Sieka
DTS updates that fix booting problems on mpc5200-based boards: - change to ethernet reg property - addition of mdio and phy nodes - removal of pci node (Motion-Pro board) Other DTS updates: - update i2c device tree nodes - add lpb bus node and flash device (without partitions defined) - add rtc i2c nodes Signed-off-by: Marian Balakowicz <m8@semihalf.com> Acked-by: Grant Likely <grant.likely@secretlab.ca> Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-04-03[POWERPC] Fix rtas_flash procfs interfaceMaxim Shchetynin
Handling of the proc_dir_entry->count was changed in 2.6.24-rc5. After this change, the default value for pde->count is 1 and not 0 as before. Therefore, if we want to check whether our procfs file is already opened (already in use), we have to check if pde->count is greater than 2 rather than 1. Signed-off-by: Maxim Shchetynin <maxim@de.ibm.com> Signed-off-by: Jens Osterkamp <jens@de.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-04-03[POWERPC] Fix deadlock with mmu_hash_lock in hash_page_syncBenjamin Herrenschmidt
hash_page_sync() takes and releases the low level mmu hash lock in order to sync with other processors disposing of page tables. Because that lock can be needed to service hash misses triggered by interrupt handlers, taking it must be done with interrupts off. However, hash_page_sync() appears to be called with interrupts enabled, thus causing occasional deadlocks. We fix it by making sure hash_page_sync() masks interrupts while holding the lock. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-04-03[POWERPC] Fix iSeries hard irq enabling regressionBenjamin Herrenschmidt
A subtle bug sneaked into iSeries recently. On this platform, we must not normally clear MSR:EE (the hardware external interrupt enable) except for short periods of time. Taking an interrupt while soft-disabled doesn't cause us to clear it for example. The iSeries kernel expects to mostly run with MSR:EE enabled at all times except in a few exception entry/exit code paths. Thus local_irq_enable() doesn't check if it needs to hard-enable as it expects this to be unnecessary on iSeries. However, hard_irq_disable() _does_ cause MSR:EE to be cleared, including on iSeries. A call to it was recently added to the context switch code, thus causing interrupts to become disabled for a long periods of time, causing the iSeries watchdog to kick in under some circumstances and other nasty things. This patch fixes it by making local_irq_enable() properly re-enable MSR:EE on iSeries. It basically removes a return statement here to make iSeries use the same code path as everybody else. That does mean that we might occasionally get spurious decrementer interrupts but I don't think that matters. Another option would have been to make hard_irq_disable() a nop on iSeries but I didn't like it much, in case we have good reasons to hard-disable. Part of the patch is fixes to make sure the hard_enabled PACA field is properly set on iSeries as it used not to be before, since it was mostly unused. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-04-03[POWERPC] Fix CPM2 SCC1 clock initialization.Laurent Pinchart
A missing break statement in a switch caused cpm2_clk_setup() to initialize SCC2 instead of SCC1. Signed-off-by: Laurent Pinchart <laurentp@cse-semaphore.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2008-04-02Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6: USB: ohci: fix 2 timers to fire at jiffies + 1s USB: Allow initialization of broken keyspan serial adapters. USB: fix bug in sg initialization in usbtest USB: serial: fix regression in Visor/Palm OS module for kernels >= 2.6.24 USB: cp2101: Add identifiers for the Telegesys ETRX2USB USB: serial: ti_usb_3410_5052: Correct TUSB3410 endpoint requirements. USB: another ehci_iaa_watchdog fix
2008-04-02alpha: get_current(): don't add zero to current_thread_info()->taskAndrew Morton
A nasty compile error: In file included from security/keys/internal.h:16, from security/keys/sysctl.c:14: include/linux/key-ui.h: In function 'key_permission': include/linux/key-ui.h:51: error: invalid use of undefined type 'struct task_struct' apparently the compiler has decided that it needs to know sizeof(task_struct) so that it can add zero to a task_struct* (which is rather dumb of it). Getting task_struct in scope in these deeply-nested headers is scary-looking, so let's just remove the "+ 0". Cc: David Howells <dhowells@redhat.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-02markers: use synchronize_sched()Mathieu Desnoyers
Markers do not mix well with CONFIG_PREEMPT_RCU because it uses preempt_disable/enable() and not rcu_read_lock/unlock for minimal intrusiveness. We would need call_sched and sched_barrier primitives. Currently, the modification (connection and disconnection) of probes from markers requires changes to the data structure done in RCU-style : a new data structure is created, the pointer is changed atomically, a quiescent state is reached and then the old data structure is freed. The quiescent state is reached once all the currently running preempt_disable regions are done running. We use the call_rcu mechanism to execute kfree() after such quiescent state has been reached. However, the new CONFIG_PREEMPT_RCU version of call_rcu and rcu_barrier does not guarantee that all preempt_disable code regions have finished, hence the race. The "proper" way to do this is to use rcu_read_lock/unlock, but we don't want to use it to minimize intrusiveness on the traced system. (we do not want the marker code to call into much of the OS code, because it would quickly restrict what can and cannot be instrumented, such as the scheduler). The temporary fix, until we get call_rcu_sched and rcu_barrier_sched in mainline, is to use synchronize_sched before each call_rcu calls, so we wait for the quiescent state in the system call code path. It will slow down batch marker enable/disable, but will make sure the race is gone. Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-02vmcoreinfo: add the symbol "phys_base"Ken'ichi Ohmichi
Fix the problem that makedumpfile sometimes fails on x86_64 machine. This patch adds the symbol "phys_base" to a vmcoreinfo data. The vmcoreinfo data has the minimum debugging information only for dump filtering. makedumpfile (dump filtering command) gets it to distinguish unnecessary pages, and makedumpfile creates a small dumpfile. On x86_64 kernel which compiled with CONFIG_PHYSICAL_START=0x0 and CONFIG_RELOCATABLE=y, makedumpfile fails like the following: # makedumpfile -d31 /proc/vmcore dumpfile The kernel version is not supported. The created dumpfile may be incomplete. _exclude_free_page: Can't get next online node. makedumpfile Failed. # The cause is the lack of the symbol "phys_base" in a vmcoreinfo data. If the symbol "phys_base" does not exist, makedumpfile considers an x86_64 kernel as non relocatable. As the result, makedumpfile misunderstands the physical address where the kernel is loaded, and it cannot translate a kernel virtual address to physical address correctly. To fix this problem, this patch adds the symbol "phys_base" to a vmcoreinfo data. Signed-off-by: Ken'ichi Ohmichi <oomichi@mxs.nes.nec.co.jp> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: <stable@kernel.org> Acked-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-02efs: update error msg to not refer to deleted read_inode()Robert P. J. Day
Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-02Char: rio, fix sparse warningsJiri Slaby
Add some locks and unlocks to some code paths. Signed-off-by: Jiri Slaby <jirislaby@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-02Char: ip2, fix sparse warningsJiri Slaby
Unlock two grabbed locks on some paths. Signed-off-by: Jiri Slaby <jirislaby@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>