Kernel - My Linux kernel repository

Age	Commit message (Collapse)	Author
2008-07-03	cfq-iosched: add message logging through blktrace	Jens Axboe
	Now that blktrace has the ability to carry arbitrary messages in its stream, use that for some CFQ logging. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-07-03	as-iosched: properly protect ioc_gone and ioc count	Jens Axboe
	If we have multiple tasks freeing io contexts when as-iosched is being unloaded, we could complete() ioc_gone twice. Fix that by protecting ioc_gone complete() and clearing with a spinlock for just that purpose. Doesn't matter from a performance perspective, since it'll only enter that path when ioc_gone != NULL (when as-iosched is being rmmod'ed). Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-07-03	cfq-iosched: properly protect ioc_gone and ioc count	Jens Axboe
	If we have multiple tasks freeing cfq_io_contexts when cfq-iosched is being unloaded, we could complete() ioc_gone twice. Fix that by protecting ioc_gone complete() and clearing with a spinlock for just that purpose. Doesn't matter from a performance perspective, since it'll only enter that path when ioc_gone != NULL (when cfq-iosched is being rmmod'ed). Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-07-02	Merge branch 'for-linus' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: PCI: acpiphp: cleanup notify handler on all root bridges PCI: Limit VPD read/write lengths for Broadcom 5706, 5708, 5709 rev. PCI: Restrict VPD read permission to root
2008-07-02	Merge branch 'i2c-fix' of git://aeryn.fluff.org.uk/bjdooks/linux	Linus Torvalds
	* 'i2c-fix' of git://aeryn.fluff.org.uk/bjdooks/linux: I2C: S3C2410: Add MODULE_ALIAS() for s3c2440 device. I2C: S3C2410: Fixup error codes returned rom a transfer. I2C: S3C2410: Check ACK on byte transmission
2008-07-02	Merge branch 'for-2.6.26' of git://git.kernel.dk/linux-2.6-block	Linus Torvalds
	* 'for-2.6.26' of git://git.kernel.dk/linux-2.6-block: Properly notify block layer of sync writes block: Fix the starving writes bug in the anticipatory IO scheduler
2008-07-02	Merge branch 'release' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6 * 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6: [IA64] export account_system_vtime [IA64] Bugfix for system with 32 cpus
2008-07-02	Merge git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/v4l-dvb	Linus Torvalds
	* git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/v4l-dvb: V4L/DVB (8178): uvc: Fix compilation breakage for the other drivers, if uvc is selected V4L/DVB (8145a): USB Video Class driver
2008-07-02	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6	Linus Torvalds
	* git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6: ide: fix /proc/ide/ide?/mate reporting Revert "BAST: Remove old IDE driver"
2008-07-02	Merge master.kernel.org:/home/rmk/linux-2.6-arm	Linus Torvalds
	* master.kernel.org:/home/rmk/linux-2.6-arm: [ARM] 5131/1: Annotate platform_secondary_init with trace_hardirqs_off [ARM] 5117/1: pxafb: fix __devinit/exit annotations [ARM] Export dma_sync_sg_for_device() [ARM] 5109/1: Mark rtc sa1100 driver as wakeup source before registering it [ARM] 5116/1: pxafb: cleanup and fix order of failure handling [ARM] 5115/1: pxafb: fix ifdef for command line option handling ARM: OMAP: Correcting the gpmc prefetch control register address ARM: OMAP: DMA: Don't mark channel active in omap_enable_channel_irq
2008-07-02	tty: Fix inverted logic in send_break	Alan Cox
	Not sure how this came to get inverted but it appears to have been my mess up. Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-02	Merge branch 'sched-fixes-for-linus' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: sched: fix divide error when trying to configure rt_period to zero
2008-07-02	Merge branch 'i2c-for-linus' of git://jdelvare.pck.nerim.net/jdelvare-2.6	Linus Torvalds
	* 'i2c-for-linus' of git://jdelvare.pck.nerim.net/jdelvare-2.6: i2c: Fix bad hint about irqs in i2c.h i2c: Documentation: fix device matching description
2008-07-02	Merge branch 'core-fixes-for-linus' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: rcu: fix hotplug vs rcu race
2008-07-02	Merge branch 'x86-fixes-for-linus' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86: fix NODES_SHIFT Kconfig range
2008-07-02	Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6	Linus Torvalds
	* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6: [SCSI] esp: tidy up target reference counting [SCSI] esp: Fix OOPS in esp_reset_cleanup(). [SCSI] ses: Fix timeout
2008-07-02	Merge git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-2.6-dm	Linus Torvalds
	* git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-2.6-dm: dm crypt: use cond_resched
2008-07-02	Merge branch 'for-2.6.26' of git://neil.brown.name/md	Linus Torvalds
	* 'for-2.6.26' of git://neil.brown.name/md: Fix error paths if md_probe fails. Don't acknowlege that stripe-expand is complete until it really is. Ensure interrupted recovery completed properly (v1 metadata plus bitmap)
2008-07-02	Merge branch 'merge' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc * 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc: powerpc/mpc5200: Fix lite5200b suspend/resume powerpc/legacy_serial: Bail if reg-offset/shift properties are present powerpc/bootwrapper: update for initrd with simpleImage
2008-07-02	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6	Linus Torvalds
	* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (55 commits) net: fib_rules: fix error code for unsupported families netdevice: Fix wrong string handle in kernel command line parsing net: Tyop of sk_filter() comment netlink: Unneeded local variable net-sched: fix filter destruction in atm/hfsc qdisc destruction net-sched: change tcf_destroy_chain() to clear start of filter list ipv4: fix sysctl documentation of time related values mac80211: don't accept WEP keys other than WEP40 and WEP104 hostap: fix sparse warnings hostap: don't report useless WDS frames by default textsearch: fix Boyer-Moore text search bug netfilter: nf_conntrack_tcp: fixing to check the lower bound of valid ACK ipv6 route: Convert rt6_device_match() to use RT6_LOOKUP_F_xxx flags. netlabel: Fix a problem when dumping the default IPv6 static labels net/inet_lro: remove setting skb->ip_summed when not LRO-able inet fragments: fix race between inet_frag_find and inet_frag_secret_rebuild CONNECTOR: add a proc entry to list connectors netlink: Fix some doc comments in net/netlink/attr.c tcp: /proc/net/tcp rto,ato values not scaled properly (v2) include/linux/netdevice.h: don't export MAX_HEADER to userspace ...
2008-07-02	DRM/i915: only use tiled blits on 965+	Jesse Barnes
	When scheduled swaps occur, we need to blit between front & back buffers. If the buffers are tiled, we need to set the appropriate XY_SRC_COPY tile bit, but only on 965 chips, since it will cause corruption on pre-965 (e.g. 945). Bug reported by and fix tested by Tomas Janousek <tomi@nomi.cz>. Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org> Acked-by: Dave Airlie <airlied@linux.ie> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-02	drivers/input/ff-core.c needs <linux/sched.h>	Geert Uytterhoeven
	Commit 656acd2bbc4ce7f224de499ee255698701396c48 ("Input: fix locking in force-feedback core") causes the following regression on m68k: \| linux/drivers/input/ff-core.c: In function 'input_ff_upload': \| linux/drivers/input/ff-core.c:172: error: dereferencing pointer to incomplete type \| linux/drivers/input/ff-core.c: In function 'erase_effect': \| linux/drivers/input/ff-core.c:197: error: dereferencing pointer to incomplete type \| linux/drivers/input/ff-core.c:204: error: dereferencing pointer to incomplete type \| make[4]: *** [drivers/input/ff-core.o] Error 1 As the incomplete type is `struct task_struct', including <linux/sched.h> fixes it. Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-03	Merge branch 'for-2.6.26' of git://git.secretlab.ca/git/linux-2.6-mpc52xx ↵	Paul Mackerras
	into merge
2008-07-02	PCI: acpiphp: cleanup notify handler on all root bridges	Alex Chiang
	During the development of the physical PCI slot patch series, Gary Hade kept on reporting strange oopses due to interactions between pci_slot and acpiphp. http://lkml.org/lkml/2007/11/28/319 find_root_bridges() unconditionally installs handle_hotplug_event_bridge() as an ACPI_SYSTEM_NOTIFY handler for all root bridges. However, during module cleanup, remove_bridge() will only remove the notify handler iff the root bridge had a hot-pluggable slot directly underneath. That is: root bridge -> hotplug slot But, if the topology looks like either of the following: root bridge -> non-hotplug slot root bridge -> p2p bridge -> hotplug slot Then we currently do not remove the notify handler from that root bridge. This can cause a kernel oops if we modprobe acpiphp later and it gets loaded somewhere else in memory. If the root bridge then receives a hotplug event, it will then attempt to call a stale, non-existent notify handler and we blow up. Much thanks goes to Gary Hade for his persistent debugging efforts. Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Gary Hade <garyhade@us.ibm.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2008-07-02	PCI: Limit VPD read/write lengths for Broadcom 5706, 5708, 5709 rev.	Benjamin Li
	For Broadcom 5706, 5708, 5709 rev. A nics, any read beyond the VPD end tag will hang the device. This problem was initially observed when a vpd entry was created in sysfs ('/sys/bus/pci/devices/<id>/vpd'). A read to this sysfs entry will dump 32k of data. Reading a full 32k will cause an access beyond the VPD end tag causing the device to hang. Once the device is hung, the bnx2 driver will not be able to reset the device. We believe that it is legal to read beyond the end tag and therefore the solution is to limit the read/write length. A majority of this patch is from Matthew Wilcox who gave code for reworking the PCI vpd size information. A PCI quirk added for the Broadcom NIC's to limit the read/write's. Signed-off-by: Benjamin Li <benli@broadcom.com> Signed-off-by: Matthew Wilcox <willy@linux.intel.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2008-07-02	V4L/DVB (8178): uvc: Fix compilation breakage for the other drivers, if uvc ↵	Mauro Carvalho Chehab
	is selected UVC makefile defines obj as: obj-$(CONFIG_USB_VIDEO_CLASS) := uvcvideo.o Instead of: obj-$(CONFIG_USB_VIDEO_CLASS) += uvcvideo.o Due to that, if uvc is selected, all obj-y or obj-m that were added to compilation were forget. This breaks a proper kernel build. Acked-by: Laurent Pinchart <laurent.pinchart@skynet.be> Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2008-07-02	dm crypt: use cond_resched	Milan Broz
	Add cond_resched() to prevent monopolising CPU when processing large bios. dm-crypt processes encryption of bios in sector units. If the bio request is big it can spend a long time in the encryption call. Signed-off-by: Milan Broz <mbroz@redhat.com> Tested-by: Yan Li <elliot.li.tech@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2008-07-01	net: fib_rules: fix error code for unsupported families	Patrick McHardy
	The errno code returned must be negative. Fixes "RTNETLINK answers: Unknown error 18446744073709551519". Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-01	netdevice: Fix wrong string handle in kernel command line parsing	Wang Chen
	v1->v2: Use strlcpy() to ensure s[i].name be null-termination. 1. In netdev_boot_setup_add(), a long name will leak. ex. : dev=21,0x1234,0x1234,0x2345,eth123456789verylongname......... 2. In netdev_boot_setup_check(), mismatch will happen if s[i].name is a substring of dev->name. ex. : dev=...eth1 dev=...eth11 [ With feedback from Ben Hutchings. ] Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-01	net: Tyop of sk_filter() comment	Wang Chen
	Parameter "needlock" no long exists. Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-01	netlink: Unneeded local variable	Wang Chen
	We already have a variable, which has the same capability. Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-01	net-sched: fix filter destruction in atm/hfsc qdisc destruction	Patrick McHardy
	Filters need to be destroyed before beginning to destroy classes since the destination class needs to still be alive to unbind the filter. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-01	net-sched: change tcf_destroy_chain() to clear start of filter list	Patrick McHardy
	Pass double tcf_proto pointers to tcf_destroy_chain() to make it clear the start of the filter list for more consistency. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-01	ipv4: fix sysctl documentation of time related values	Stephen Hemminger
	These sysctl values are time related and all use the same routine (proc_dointvec_jiffies) that internally converts from seconds to jiffies. The code is fine, the documentation is just wrong. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-01	powerpc/mpc5200: Fix lite5200b suspend/resume	Tim Yamin
	Suspend/resume ("echo mem > /sys/power/state") does not work with vanilla kernels -- the system does not suspend correctly and just hangs. This patch fixes this so suspend/resume works: 1) of_iomap does not map the whole 0xC000 of the MPC5200 immr so saving registers does not work. 2) PCI registers need to be saved and restored. Signed-off-by: Tim Yamin <plasm@roo.me.uk> Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2008-07-01	powerpc/legacy_serial: Bail if reg-offset/shift properties are present	John Linn
	The legacy serial driver does not work with an 8250 type UART that is described in the device tree with the reg-offset and reg-shift properties. This change makes legacy_serial ignore these devices. Signed-off-by: John Linn <john.linn@xilinx.com> Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2008-07-01	i2c: Fix bad hint about irqs in i2c.h	Wolfram Sang
	i2c.h mentions -1 as a not-issued irq. This false hint was taken by of_i2c and caused crashes. Don't give any advice as 'no irq' is not consistent across all architectures yet and it is not needed internally by the i2c-core. Signed-off-by: Wolfram Sang <w.sang@pengutronix.de> Signed-off-by: Jean Delvare <khali@linux-fr.org>
2008-07-01	i2c: Documentation: fix device matching description	Ben Dooks
	The matching process described for new style clients in Documentation/i2c/writing-clients is classed as out-of-date as it requires the presence of an .id_table entry in the driver's i2c_driver entry. Signed-off-by: Ben Dooks <ben-linux@fluff.org> Signed-off-by: Jean Delvare <khali@linux-fr.org>
2008-07-01	powerpc/bootwrapper: update for initrd with simpleImage	John Linn
	This change to the makefile corrects the build of a simpleImage with initrd. Signed-off-by: John Linn <john.linn@xilinx> Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2008-07-01	PCI: Restrict VPD read permission to root	Ben Hutchings
	Some PCI devices will lock up if we attempt to read from VPD addresses beyond some device-dependent limit. Until we can identify these devices and adjust the file size accordingly, only let root read VPD through sysfs to prevent a DoS by normal users. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2008-07-01	I2C: S3C2410: Add MODULE_ALIAS() for s3c2440 device.	Ben Dooks
	Add a MODULE_ALIAS() statement for the i2c-s3c2410 controller to ensure that it can be autoloaded on the S3C2440 systems that we support. Signed-off-by: Ben Dooks <ben-linux@fluff.org>
2008-07-01	I2C: S3C2410: Fixup error codes returned rom a transfer.	Ben Dooks
	The driver should be returning -ENXIO for transfers that do not pass the initial address byte stage. Note, also small tidyups to the driver comments in the area. Signed-off-by: Ben Dooks <ben-linux@fluff.org>
2008-07-01	I2C: S3C2410: Check ACK on byte transmission	Ben Dooks
	We should check for the reception of an ACK after transmitting each data byte. The address send has been correctly checking this, but the data write byte state should have also been checking for these failures. As part of the same fix, we remove the ACK checking from the receive path where it should not have been checking for an ACK which our hardware was sending. Signed-off-by: Ben Dooks <ben-linux@fluff.org>
2008-07-01	rcu: fix hotplug vs rcu race	Gautham R Shenoy
	Dhaval Giani reported this warning during cpu hotplug stress-tests: \| On running kernel compiles in parallel with cpu hotplug: \| \| WARNING: at arch/x86/kernel/smp.c:118 \| native_smp_send_reschedule+0x21/0x36() \| Modules linked in: \| Pid: 27483, comm: cc1 Not tainted 2.6.26-rc7 #1 \| [...] \| [<c0110355>] native_smp_send_reschedule+0x21/0x36 \| [<c014fe8f>] force_quiescent_state+0x47/0x57 \| [<c014fef0>] call_rcu+0x51/0x6d \| [<c01713b3>] __fput+0x130/0x158 \| [<c0171231>] fput+0x17/0x19 \| [<c016fd99>] filp_close+0x4d/0x57 \| [<c016fdff>] sys_close+0x5c/0x97 IMHO the warning is a spurious one. cpu_online_map is updated by the _cpu_down() using stop_machine_run(). Since force_quiescent_state is invoked from irqs disabled section, stop_machine_run() won't be executing while a cpu is executing force_quiescent_state(). Hence the cpu_online_map is stable while we're in the irq disabled section. However, a cpu might have been offlined _just_ before we disabled irqs while entering force_quiescent_state(). And rcu subsystem might not yet have handled the CPU_DEAD notification, leading to the offlined cpu's bit being set in the rcp->cpumask. Hence cpumask = (rcp->cpumask & cpu_online_map) to prevent sending smp_reschedule() to an offlined CPU. Here's the timeline: CPU_A CPU_B -------------------------------------------------------------- cpu_down(): . . . . . stop_machine(): /* disables preemption, . * and irqs / . . . . . take_cpu_down(); . . . . . . . cpu_disable(); /this removes cpu . from cpu_online_map . / . . . . . restart_machine(); /* enables irqs / . ------WINDOW DURING WHICH rcp->cpumask is stale --------------- . call_rcu(); . / disables irqs here / . .force_quiescent_state(); .CPU_DEAD: .for_each_cpu(rcp->cpumask) . . smp_send_reschedule(); . . . . WARN_ON() for offlined CPU! . . . rcu_cpu_notify: . -------- WINDOW ENDS ------------------------------------------ rcu_offline_cpu() / Which calls cpu_quiet() * which removes * cpu from rcp->cpumask. */ If a new batch was started just before calling stop_machine_run(), the "tobe-offlined" cpu is still present in rcp-cpumask. During a cpu-offline, from take_cpu_down(), we queue an rt-prio idle task as the next task to be picked by the scheduler. We also call cpu_disable() which will disable any further interrupts and remove the cpu's bit from the cpu_online_map. Once the stop_machine_run() successfully calls take_cpu_down(), it calls schedule(). That's the last time a schedule is called on the offlined cpu, and hence the last time when rdp->passed_quiesc will be set to 1 through rcu_qsctr_inc(). But the cpu_quiet() will be on this cpu will be called only when the next RCU_SOFTIRQ occurs on this CPU. So at this time, the offlined CPU is still set in rcp->cpumask. Now coming back to the idle_task which truely offlines the CPU, it does check for a pending RCU and raises the softirq, since it will find rdp->passed_quiesc to be 0 in this case. However, since the cpu is offline I am not sure if the softirq will trigger on the CPU. Even if it doesn't the rcu_offline_cpu() will find that rcp->completed is not the same as rcp->cur, which means that our cpu could be holding up the grace period progression. Hence we call cpu_quiet() and move ahead. But because of the window explained in the timeline, we could still have a call_rcu() before the RCU subsystem executes it's CPU_DEAD notification, and we send smp_send_reschedule() to offlined cpu while trying to force the quiescent states. The appended patch adds comments and prevents checking for offlined cpu everytime. cpu_online_map is updated by the _cpu_down() using stop_machine_run(). Since force_quiescent_state is invoked from irqs disabled section, stop_machine_run() won't be executing while a cpu is executing force_quiescent_state(). Hence the cpu_online_map is stable while we're in the irq disabled section. Reported-by: Dhaval Giani <dhaval@linux.vnet.ibm.com> Signed-off-by: Gautham R Shenoy <ego@in.ibm.com> Acked-by: Dhaval Giani <dhaval@linux.vnet.ibm.com> Cc: Dipankar Sarma <dipankar@in.ibm.com> Cc: laijs@cn.fujitsu.com Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Rusty Russel <rusty@rustcorp.com.au> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-01	Properly notify block layer of sync writes	Jens Axboe
	fsync_buffers_list() and sync_dirty_buffer() both issue async writes and then immediately wait on them. Conceptually, that makes them sync writes and we should treat them as such so that the IO schedulers can handle them appropriately. This patch fixes a write starvation issue that Lin Ming reported, where xx is stuck for more than 2 minutes because of a large number of synchronous IO in the system: INFO: task kjournald:20558 blocked for more than 120 seconds. "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. kjournald D ffff810010820978 6712 20558 2 ffff81022ddb1d10 0000000000000046 ffff81022e7baa10 ffffffff803ba6f2 ffff81022ecd0000 ffff8101e6dc9160 ffff81022ecd0348 000000008048b6cb 0000000000000086 ffff81022c4e8d30 0000000000000000 ffffffff80247537 Call Trace: [<ffffffff803ba6f2>] kobject_get+0x12/0x17 [<ffffffff80247537>] getnstimeofday+0x2f/0x83 [<ffffffff8029c1ac>] sync_buffer+0x0/0x3f [<ffffffff8066d195>] io_schedule+0x5d/0x9f [<ffffffff8029c1e7>] sync_buffer+0x3b/0x3f [<ffffffff8066d3f0>] __wait_on_bit+0x40/0x6f [<ffffffff8029c1ac>] sync_buffer+0x0/0x3f [<ffffffff8066d48b>] out_of_line_wait_on_bit+0x6c/0x78 [<ffffffff80243909>] wake_bit_function+0x0/0x23 [<ffffffff8029e3ad>] sync_dirty_buffer+0x98/0xcb [<ffffffff8030056b>] journal_commit_transaction+0x97d/0xcb6 [<ffffffff8023a676>] lock_timer_base+0x26/0x4b [<ffffffff8030300a>] kjournald+0xc1/0x1fb [<ffffffff802438db>] autoremove_wake_function+0x0/0x2e [<ffffffff80302f49>] kjournald+0x0/0x1fb [<ffffffff802437bb>] kthread+0x47/0x74 [<ffffffff8022de51>] schedule_tail+0x28/0x5d [<ffffffff8020cac8>] child_rip+0xa/0x12 [<ffffffff80243774>] kthread+0x0/0x74 [<ffffffff8020cabe>] child_rip+0x0/0x12 Lin Ming confirms that this patch fixes the issue. I've run tests with it for the past week and no ill effects have been observed, so I'm proposing it for inclusion into 2.6.26. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-07-01	block: Fix the starving writes bug in the anticipatory IO scheduler	Divyesh Shah
	AS scheduler alternates between issuing read and write batches. It does the batch switch only after all requests from the previous batch are completed. When switching to a write batch, if there is an on-going read request, it waits for its completion and indicates its intention of switching by setting ad->changed_batch and the new direction but does not update the batch_expire_time for the new write batch which it does in the case of no previous pending requests. On completion of the read request, it sees that we were waiting for the switch and schedules work for kblockd right away and resets the ad->changed_data flag. Now when kblockd enters dispatch_request where it is expected to pick up a write request, it in turn ends the write batch because the batch_expire_timer was not updated and shows the expire timestamp for the previous batch. This results in the write starvation for all the cases where there is the intention for switching to a write batch, but there is a previous in-flight read request and the batch gets reverted to a read_batch right away. This also holds true in the reverse case (switching from a write batch to a read batch with an in-flight write request). I've checked that this bug exists on 2.6.11, 2.6.18, 2.6.24 and linux-2.6-block git HEAD. I've tested the fix on x86 platforms with SCSI drives where the driver asks for the next request while a current request is in-flight. This patch is based off linux-2.6-block git HEAD. Bug reproduction: A simple scenario which reproduces this bug is: - dd if=/dev/hda3 of=/dev/null & - lilo The lilo takes forever to complete. This can also be reproduced fairly easily with the earlier dd and another test program doing msync(). The example test program below should print out a message after every iteration but it simply hangs forever. With this bugfix it makes forward progress. ==== Example test program using msync() (thanks to suleiman AT google DOT com) inline uint64_t rdtsc(void) { int64_t tsc; __asm __volatile("rdtsc" : "=A" (tsc)); return (tsc); } int main(int argc, char *argv) { struct stat st; uint64_t e, s, t; char p, q; long i; int fd; if (argc < 2) { printf("Usage: %s <file>\n", argv[0]); return (1); } if ((fd = open(argv[1], O_RDWR \| O_NOATIME)) < 0) err(1, "open"); if (fstat(fd, &st) < 0) err(1, "fstat"); p = mmap(NULL, st.st_size, PROT_READ \| PROT_WRITE, MAP_SHARED, fd, 0); t = 0; for (i = 0; i < 1000; i++) { p = 0; msync(p, 4096, MS_SYNC); s = rdtsc(); p = 0; __asm __volatile(""::: "memory"); e = rdtsc(); if (argc > 2) printf("%d: %lld cycles %jd %jd\n", i, e - s, (intmax_t)s, (intmax_t)e); t += e - s; } printf("average time: %lld cycles\n", t / 1000); return (0); } Cc: <stable@kernel.org> Acked-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-07-01	x86: fix NODES_SHIFT Kconfig range	Thomas Gleixner
	commit 4323838215184f5a2f081e0d17b8d60731b03164 x86: change size of node ids from u8 to s16 set the range for NODES_SHIFT to 1..15. The possible range is 1..9 Fixes Bugzilla #10726 Reported-by: Dave Jones <davej@codemonkey.org.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-07-01	sched: fix divide error when trying to configure rt_period to zero	Raistlin
	Here it is another little Oops we found while configuring invalid values via cgroups: echo 0 > /dev/cgroups/0/cpu.rt_period_us or echo 4294967296 > /dev/cgroups/0/cpu.rt_period_us [ 205.509825] divide error: 0000 [#1] [ 205.510151] Modules linked in: [ 205.510151] [ 205.510151] Pid: 2339, comm: bash Not tainted (2.6.26-rc8 #33) [ 205.510151] EIP: 0060:[<c030c6ef>] EFLAGS: 00000293 CPU: 0 [ 205.510151] EIP is at div64_u64+0x5f/0x70 [ 205.510151] EAX: 0000389f EBX: 00000000 ECX: 00000000 EDX: 00000000 [ 205.510151] ESI: d9800000 EDI: 00000000 EBP: c6cede60 ESP: c6cede50 [ 205.510151] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068 [ 205.510151] Process bash (pid: 2339, ti=c6cec000 task=c79be370 task.ti=c6cec000) [ 205.510151] Stack: d9800000 0000389f c05971a0 d9800000 c6cedeb4 c0214dbd 00000000 00000000 [ 205.510151] c6cede88 c0242bd8 c05377c0 c7a41b40 00000000 00000000 00000000 c05971a0 [ 205.510151] c780ed20 c7508494 c7a41b40 00000000 00000002 c6cedebc c05971a0 ffffffea [ 205.510151] Call Trace: [ 205.510151] [<c0214dbd>] ? __rt_schedulable+0x1cd/0x240 [ 205.510151] [<c0242bd8>] ? cgroup_file_open+0x18/0xe0 [ 205.510151] [<c0214fe4>] ? tg_set_bandwidth+0xa4/0xf0 [ 205.510151] [<c0215066>] ? sched_group_set_rt_period+0x36/0x50 [ 205.510151] [<c021508e>] ? cpu_rt_period_write_uint+0xe/0x10 [ 205.510151] [<c0242dc5>] ? cgroup_file_write+0x125/0x160 [ 205.510151] [<c0232c15>] ? hrtimer_interrupt+0x155/0x190 [ 205.510151] [<c02f047f>] ? security_file_permission+0xf/0x20 [ 205.510151] [<c0277ad8>] ? rw_verify_area+0x48/0xc0 [ 205.510151] [<c0283744>] ? dupfd+0x104/0x130 [ 205.510151] [<c027838c>] ? vfs_write+0x9c/0x160 [ 205.510151] [<c0242ca0>] ? cgroup_file_write+0x0/0x160 [ 205.510151] [<c027850d>] ? sys_write+0x3d/0x70 [ 205.510151] [<c0203019>] ? sysenter_past_esp+0x6a/0x91 [ 205.510151] ======================= [ 205.510151] Code: 0f 45 de 31 f6 0f ad d0 d3 ea f6 c1 20 0f 45 c2 0f 45 d6 89 45 f0 89 55 f4 8b 55 f4 31 c9 8b 45 f0 39 d3 89 c6 77 08 89 d0 31 d2 <f7> f3 89 c1 83 c4 08 89 f0 f7 f3 89 ca 5b 5e 5d c3 55 89 e5 56 [ 205.510151] EIP: [<c030c6ef>] div64_u64+0x5f/0x70 SS:ESP 0068:c6cede50 The attached patch solves the issue for me. I'm checking as soon as possible for the period not being zero since, if it is, going ahead is useless. This way we also save a mutex_lock() and a read_lock() wrt doing it inside tg_set_bandwidth() or __rt_schedulable(). Signed-off-by: Dario Faggioli <raistlin@linux.it> Signed-off-by: Michael Trimarchi <trimarchimichael@yahoo.it> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-06-30	[IA64] export account_system_vtime	Doug Chapman
	The symbol account_system_vtime is used by the kvm module but not exported. This breaks building with CONFIG_VIRT_CPU_ACCOUNTING and CONFIG_KVM=m. Signed-off-by: Doug Chapman <doug.chapman@hp.com> Acked-by: Hidetosho Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
2008-06-30	[IA64] Bugfix for system with 32 cpus	Tony Luck
	On a system where there are no hot pluggable cpus "additional_cpus" is still set to -1 at the point where we call per_cpu_scan_finalize(). If we didn't find an SRAT table and so pick the default "32" for the number of cpus, when we get to: high_cpu = min(high_cpu + reserve_cpus, NR_CPUS); we will end up initializing for just 31 cpus ... and so we will die horribly when bringing up cpu#32. Problem introduced by: 2c6e6db41f01b6b4eb98809350827c9678996698 "Minimize per_cpu reservations." Acked-by: Robin Holt <holt@sgi.com> Signed-off-by: Tony Luck <tony.luck@intel.com>