aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2008-05-10[Blackfin] arch: remove useless IRQ_SW_INT definesMichael Hennerich
IRQ_SW_INT1 and IRQ_SW_INT2 obsolete: Remove useless defines Fix SYS_IRQS Keep numbering scheme, so we don't break existing configurations. Signed-off-by: Michael Hennerich <michael.hennerich@analog.com> Signed-off-by: Bryan Wu <cooloney@kernel.org>
2008-05-10[Blackfin] arch: protect linux/usb/musb.h include until the driver gets ↵Mike Frysinger
mainlined Signed-off-by: Mike Frysinger <vapier.adi@gmail.com> Signed-off-by: Bryan Wu <cooloney@kernel.org>
2008-05-10[Blackfin] arch: protect linux/usb/isp1362.h include until the driver gets ↵Mike Frysinger
mainlined Signed-off-by: Mike Frysinger <vapier.adi@gmail.com> Signed-off-by: Bryan Wu <cooloney@kernel.org>
2008-05-07[Blackfin] arch: add EBIU supporting for BF54x EZKIT SMSC LAN911x/LAN921x ↵Michael Hennerich
families embedded ethernet driver Signed-off-by: Michael Hennerich <michael.hennerich@analog.com> Signed-off-by: Bryan Wu <cooloney@kernel.org>
2008-05-07[Blackfin] arch: Set spi flash partition on bf527 as like bf548.Grace Pan
Signed-off-by: Grace Pan <grace.pan@analog.com> Signed-off-by: Bryan Wu <cooloney@kernel.org>
2008-05-07[Blackfin] arch: fix bug - Remove module will not free L1 memory usedMeihui Fan
Remove module will not free L1 memory used which caused by memory access after free. This patch fixes it. Signed-off-by: Meihui Fan <mhfan@hhcn.com> Signed-off-by: Bryan Wu <cooloney@kernel.org>
2008-05-07[Blackfin] arch: fix wrong header name in commentMike Frysinger
Signed-off-by: Mike Frysinger <vapier.adi@gmail.com> Signed-off-by: Bryan Wu <cooloney@kernel.org>
2008-05-07[Blackfin] arch: Fix BUG - spi flash on bf527 ezkit would fail at mountMichael Hennerich
BF527-EZKit features 16MBit M25P16 flash Signed-off-by: Michael Hennerich <michael.hennerich@analog.com> Signed-off-by: Bryan Wu <cooloney@kernel.org>
2008-05-07[Blackfin] arch: add twi_lcd and twi_keypad i2c board info to bf527-ezkitBryan Wu
- JP3 should be installed for STAMP enable - IRQ for twi_keypad driver is IRQ_PF8 Signed-off-by: Bryan Wu <cooloney@kernel.org>
2008-05-07[Blackfin] arch: Add physmap partition for BF527-EZkitMichael Hennerich
Signed-off-by: Michael Hennerich <michael.hennerich@analog.com> Signed-off-by: Bryan Wu <cooloney@kernel.org>
2008-05-07[Blackfin] arch: fix gdb testing regressionBernd Schmidt
When transferring to IRQ5 from an exception, save SYSCFG in memory across the transfer and clear the trace bit. When we get a single step exception, check whether we can safely clear the trace bit in SYSCFG. We can (and should) clear it after the first instruction of the interrupt handler; the first insn saves SYSCFG to the stack in all handlers. Signed-off-by: Bernd Schmidt <bernds_cb1@t-online.de> Signed-off-by: Bryan Wu <cooloney@kernel.org>
2008-05-07[Blackfin] arch: disable single stepping when delivering a signalBernd Schmidt
When delivering a signal, disable single stepping but call ptrace_notify if it was enabled before. The idea was taken from the x86 port. Signed-off-by: Bernd Schmidt <bernds_cb1@t-online.de> Signed-off-by: Bryan Wu <cooloney@kernel.org>
2008-05-07[Blackfin] arch: Delete unused (copied from m68k) entries in asm-offsets.c.Bernd Schmidt
Fix some really ancient code that was correct only for the m68k port. Delete unused (i.e. copied from m68k) entries in asm-offsets.c. Signed-off-by: Bernd Schmidt <bernds_cb1@t-online.de> Signed-off-by: Bryan Wu <cooloney@kernel.org>
2008-05-07[Blackfin] arch: In the double fault handler, set up the PT_RETI slotBernd Schmidt
In the double fault handler, set up the PT_RETI slot so that we print out the correct return address in the dumping code. Signed-off-by: Bernd Schmidt <bernds_cb1@t-online.de> Signed-off-by: Bryan Wu <cooloney@kernel.org>
2008-05-07[Blackfin] arch: Support for CPU_FREQ and NOHZVitja Makarov
Singed-off-by: Vitja Makarov <vitja.makarov@gmail.com>
2008-05-07[Blackfin] arch: Functional power management support: Add CPU and platform ↵Michael Hennerich
voltage scaling support Signed-off-by: Michael Hennerich <michael.hennerich@analog.com> Signed-off-by: Bryan Wu <cooloney@kernel.org>
2008-05-07[Blackfin] arch: fix bug - breaking the atomic sections code.Bernd Schmidt
The following cleanup patch: add __user markings to a few userspace system functions mysteriously added a "&" operator that doesn't belong in there, breaking the atomic sections code. Signed-off-by: Bernd Schmidt <bernds_cb1@t-online.de> Signed-off-by: Bryan Wu <cooloney@kernel.org>
2008-05-07[Blackfin] arch: Equalize include files: Add VR_CTL masksMichael Hennerich
Signed-off-by: Michael Hennerich <michael.hennerich@analog.com> Signed-off-by: Bryan Wu <cooloney@kernel.org>
2008-05-07[Blackfin] arch: Cleanup Kconfig, fix comment and make sure we exclude ↵Michael Hennerich
CCLK=SCLK for some configurations Signed-off-by: Michael Hennerich <michael.hennerich@analog.com> Signed-off-by: Bryan Wu <cooloney@kernel.org>
2008-05-08Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (32 commits) net: Added ASSERT_RTNL() to dev_open() and dev_close(). can: Fix can_send() handling on dev_queue_xmit() failures netns: Fix arbitrary net_device-s corruptions on net_ns stop. netfilter: Kconfig: default DCCP/SCTP conntrack support to the protocol config values netfilter: nf_conntrack_sip: restrict RTP expect flushing on error to last request macvlan: Fix memleak on device removal/crash on module removal net/ipv4: correct RFC 1122 section reference in comment tcp FRTO: SACK variant is errorneously used with NewReno e1000e: don't return half-read eeprom on error ucc_geth: Don't use RX clock as TX clock. cxgb3: Use CAP_SYS_RAWIO for firmware pcnet32: delete non NAPI code from driver. fs_enet: Fix a memory leak in fs_enet_mdio_probe [netdrvr] eexpress: IPv6 fails - multicast problems 3c59x: use netstats in net_device structure 3c980-TX needs EXTRA_PREAMBLE fix warning in drivers/net/appletalk/cops.c e1000e: Add support for BM PHYs on ICH9 uli526x: fix endianness issues in the setup frame uli526x: initialize the hardware prior to requesting interrupts ...
2008-05-08Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6: sparc: Fix SA_ONSTACK signal handling.
2008-05-08Revert "PCI: remove default PCI expansion ROM memory allocation"Linus Torvalds
This reverts commit 9f8daccaa05c14e5643bdd4faf5aed9cc8e6f11e, which was reported to break X startup (xf86-video-ati-6.8.0). See http://bugs.freedesktop.org/show_bug.cgi?id=15523 for details. Reported-by: Laurence Withers <l@lwithers.me.uk> Cc: Gary Hade <garyhade@us.ibm.com> Cc: Greg KH <greg@kroah.com> Cc: Jan Beulich <jbeulich@novell.com> Cc: "Jun'ichi Nomura" <j-nomura@ce.jp.nec.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-08Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched-fixes * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched-fixes: sched: fix weight calculations semaphore: fix
2008-05-08Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6: [ALSA] soc at91 minor bug fixes [ALSA] soc - at91-pcm - Fix line wrapping pcspkr: fix dependancies
2008-05-08Remove duplicated include in net/sunrpc/svc.cHuang Weiyi
<linux/sched.h> we included twice. Signed-off-by: Huang Weiyi <weiyi.huang@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-08fs/proc/task_mmu.c: remove duplicated include filesHuang Weiyi
Removed duplicated include files <linux/ptrace.h> and <linux/seq_file.h> in fs/proc/task_mmu.c. Signed-off-by: Huang Weiyi <weiyi.huang@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-08Fix drivers/media build for modular buildsIngo Molnar
Fix allmodconfig build bug introduced in latest -git by commit 7c91f0624a9 ("V4L/DVB(7767): Move tuners to common/tuners"): LD kernel/built-in.o LD drivers/built-in.o ld: drivers/media/built-in.o: No such file: No such file or directory which happens if all media drivers are modular: http://redhat.com/~mingo/misc/config-Wed_Apr_30_09_24_48_CEST_2008.bad In that case there's no obj-y rule connecting all the built-in.o files and the link tree breaks. The fix is to add a guaranteed obj-y rule for the core vmlinux to build. (which results in an empty object file if all media drivers are modular) Signed-off-by: Ingo Molnar <mingo@elte.hu> Acked-by: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-08Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: IB/ehca: Wait for async events to finish before destroying QP IB/ipath: Fix SDMA error recovery in absence of link status change IB/ipath: Need to always request and handle PIO avail interrupts IB/ipath: Fix count of packets received by kernel IB/ipath: Return the correct opcode for RDMA WRITE with immediate IB/ipath: Fix bug that can leave sends disabled after freeze recovery IB/ipath: Only increment SSN if WQE is put on send queue IB/ipath: Only warn about prototype chip during init RDMA/cxgb3: Fix severe limit on userspace memory registration size RDMA/cxgb3: Don't add PBL memory to gen_pool in chunks
2008-05-08MN10300: Make cpu_relax() invoke barrier()David Howells
Make cpu_relax() invoke barrier() to be the same as other arches. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-08Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-blockLinus Torvalds
* 'for-linus' of git://git.kernel.dk/linux-2.6-block: Revert "relay: fix splice problem" docbook: fix bio missing parameter block: use unitialized_var() in bio_alloc_bioset() block: avoid duplicate calls to get_part() in disk stat code cfq-iosched: make io priorities inherit CPU scheduling class as well as nice block: optimize generic_unplug_device() block: get rid of likely/unlikely predictions in merge logic vfs: splice remove_suid() cleanup cfq-iosched: fix RCU race in the cfq io_context destructor handling block: adjust tagging function queue bit locking block: sysfs store function needs to grab queue_lock and use queue_flag_*()
2008-05-08Merge branch 'for_linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-udf-2.6 * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-udf-2.6: udf: Fix memory corruption when fs mounted with noadinicb option udf: Make udf exportable udf: fs/udf/partition.c:udf_get_pblock() mustn't be inline
2008-05-08Merge branch 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6Linus Torvalds
* 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6: [S390] guest page hinting light [S390] tty3270: fix put_char fail/success conversion. [S390] compat ptrace cleanup [S390] s390mach compile warning [S390] cio: Fix parsing mechanism for blacklisted devices. [S390] cio: Remove cio_msg kernel parameter. [S390] s390-kvm: leave sie context on work. Removes preemption requirement [S390] s390: Optimize user and work TIF check
2008-05-08drivers/scsi/dpt_i2o.c: fix build on alphaAndrew Morton
alpha: drivers/scsi/dpt_i2o.c:1997: error: implicit declaration of function 'adpt_alpha_info' drivers/scsi/dpt_i2o.c: At top level: drivers/scsi/dpt_i2o.c:2032: warning: conflicting types for 'adpt_alpha_info' drivers/scsi/dpt_i2o.c:2032: error: static declaration of 'adpt_alpha_info' follows non-static declaration drivers/scsi/dpt_i2o.c:1997: error: previous implicit declaration of 'adpt_alpha_info' was here Due to a copy-n-paste error in drivers/scsi/dpti.h. Fix that up and remove some of the many daft static-declarations-in-a-header which this driver enjoys. Cc: Miquel van Smoorenburg <miquels@cistron.nl> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-08Fix cpuset sched_relax_domain_level control filePaul Menage
Due to a merge conflict, the sched_relax_domain_level control file was marked as being handled by cpuset_read/write_u64, but the code to handle it was actually in cpuset_common_file_read/write. Since the value being written/read is in fact a signed integer, it should be treated as such; this patch adds cpuset_read/write_s64 functions, and uses them to handle the sched_relax_domain_level file. With this patch, the sched_relax_domain_level can be read and written, and the correct contents seen/updated. Signed-off-by: Paul Menage <menage@google.com> Cc: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Cc: Paul Jackson <pj@sgi.com> Cc: Ingo Molnar <mingo@elte.hu> Reviewed-by: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-08slub: fix atomic usage in any_slab_objects()Benjamin Herrenschmidt
any_slab_objects() does an atomic_read on an atomic_long_t, this fixes it to use atomic_long_read instead. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Christoph Lameter <clameter@sgi.com> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-08sys_pipe(): fix file descriptor leaksUlrich Drepper
Remember to close the files if copy_to_user() failed. Spotted by dm.n9107@gmail.com. Signed-off-by: Ulrich Drepper <drepper@redhat.com> Cc: DM <dm.n9107@gmail.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-08vt: fix canonical input in UTF-8 modeSamuel Thibault
For e.g. proper TTY canonical support, IUTF8 termios flag has to be set as appropriate. Linux used to not care about setting that flag for VT TTYs. This patch fixes that by activating it according to the current mode of the VT, and sets the default value according to the vt.default_utf8 parameter. Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org> Cc: Willy Tarreau <w@1wt.eu> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-08usb/asix: add Buffalo LUA-U2-GT 10/100/1000Mattia Dongili
The USB net adapter Buffalo LUA-U2-GT (0411:006e) carries a AX88178 chip. Tested on the above HW. Signed-off-by: Mattia Dongili <malattia@linux.it> Acked-off-by: David Hollis <dhollis@davehollis.com> Cc: Greg KH <greg@kroah.com> Acked-by: Jeff Garzik <jeff@garzik.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-08sx.c: fix printk warnings on sparc32Andrew Morton
drivers/char/sx.c: In function 'sx_set_real_termios': drivers/char/sx.c:973: warning: format '%u' expects type 'unsigned int', but argument 2 has type 'long unsigned int' drivers/char/sx.c:999: warning: format '%x' expects type 'unsigned int', but argument 2 has type 'tcflag_t' drivers/char/sx.c:1012: warning: format '%x' expects type 'unsigned int', but argument 2 has type 'tcflag_t' sparc32 seems to use weird types for its tty things. [ Fine by me but this is ancient debug and most of the debug in sx just wants deleting eventually. - Alan ] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Alan Cox <alan@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-08uml: fix inconsistence due to tty_operation changeWANG Cong
'put_char' of 'struct tty_operations' has changed from 'void' into 'int'. This can also shut up compiler warnings. Cc: Jeff Dike <jdike@addtoit.com> Signed-off-by: WANG Cong <wangcong@zeuux.org> Acked-by: Alan Cox <alan@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-08misc: fix integer as NULL pointer warningsHarvey Harrison
drivers/md/raid10.c:889:17: warning: Using plain integer as NULL pointer drivers/media/video/cx18/cx18-driver.c:616:12: warning: Using plain integer as NULL pointer sound/oss/kahlua.c:70:12: warning: Using plain integer as NULL pointer Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Cc: Neil Brown <neilb@suse.de> Cc: Mauro Carvalho Chehab <mchehab@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-08fix irq flags for iuu_phoenix.cSteven Rostedt
The file drivers/usb/serial/iuu_phoenix.c uses "int" for flags. This can cause hard to find bugs on some architectures. This patch converts the flags to use "long" instead. This bug was discovered by doing an allyesconfig make on the -rt kernel where checks are done to ensure all flags are of size sizeof(long). Signed-off-by: Steven Rostedt <srostedt@redhat.com> Cc: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-08fix irq flags in rtc-ds1511Steven Rostedt
The file in drivers/rtc/rtc-ds1551.c uses "int" for flags. This can cause hard to find bugs on some architectures. This patch converts the flags to use "long" instead. This bug was discovered by doing an allyesconfig make on the -rt kernel where checks are done to ensure all flags are of size sizeof(long). Signed-off-by: Steven Rostedt <srostedt@redhat.com> Cc: Alessandro Zummo <a.zummo@towertech.it> Cc: David Brownell <david-b@pacbell.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-08fix irq flags in saa7134Steven Rostedt
Some files in the drivers/media/video/saa7134 directory uses "int" for flags. This can cause hard to find bugs on some architectures. This patch converts the flags to use "long" instead. This bug was discovered by doing an allyesconfig make on the -rt kernel where checks are done to ensure all flags are of size sizeof(long). Signed-off-by: Steven Rostedt <srostedt@redhat.com> Acked-by: Mauro Carvalho Chehab <mchehab@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-08fix irq flags in mac80211 codeSteven Rostedt
A file in the net/mac80211 directory uses "int" for flags. This can cause hard to find bugs on some architectures. This patch converts the flags to use "long" instead. This bug was discovered by doing an allyesconfig make on the -rt kernel where checks are done to ensure all flags are of size sizeof(long). Signed-off-by: Steven Rostedt <srostedt@redhat.com> Cc: "John W. Linville" <linville@tuxdriver.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-08serial: access after NULL check in uart_flush_buffer()Tetsuo Handa
I noticed that static void uart_flush_buffer(struct tty_struct *tty) { struct uart_state *state = tty->driver_data; struct uart_port *port = state->port; unsigned long flags; /* * This means you called this function _after_ the port was * closed. No cookie for you. */ if (!state || !state->info) { WARN_ON(1); return; } is too late for checking state != NULL. Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-08Kconfig: improved help for CONFIG_ACCESSIBILITYSamuel Thibault
Add a small explanation of what accessibility is. Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-08sched: fix weight calculationsMike Galbraith
The conversion between virtual and real time is as follows: dvt = rw/w * dt <=> dt = w/rw * dvt Since we want the fair sleeper granularity to be in real time, we actually need to do: dvt = - rw/w * l This bug could be related to the regression reported by Yanmin Zhang: | Comparing with kernel 2.6.25, sysbench+mysql(oltp, readonly) has lots | of regressions with 2.6.26-rc1: | | 1) 8-core stoakley: 28%; | 2) 16-core tigerton: 20%; | 3) Itanium Montvale: 50%. Reported-by: "Zhang, Yanmin" <yanmin_zhang@linux.intel.com> Signed-off-by: Mike Galbraith <efault@gmx.de> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-05-08semaphore: fixIngo Molnar
Yanmin Zhang reported: | Comparing with kernel 2.6.25, AIM7 (use tmpfs) has more th | regression under 2.6.26-rc1 on my 8-core stoakley, 16-core tigerton, | and Itanium Montecito. Bisect located the patch below: | | 64ac24e738823161693bf791f87adc802cf529ff is first bad commit | commit 64ac24e738823161693bf791f87adc802cf529ff | Author: Matthew Wilcox <matthew@wil.cx> | Date: Fri Mar 7 21:55:58 2008 -0500 | | Generic semaphore implementation | | After I manually reverted the patch against 2.6.26-rc1 while fixing | lots of conflicts/errors, aim7 regression became less than 2%. i reproduced the AIM7 workload and can confirm Yanmin's findings that -.26-rc1 regresses over .25 - by over 67% here. Looking at the workload i found and fixed what i believe to be the real bug causing the AIM7 regression: it was inefficient wakeup / scheduling / locking behavior of the new generic semaphore code, causing suboptimal performance. The problem comes from the following code. The new semaphore code does this on down(): spin_lock_irqsave(&sem->lock, flags); if (likely(sem->count > 0)) sem->count--; else __down(sem); spin_unlock_irqrestore(&sem->lock, flags); and this on up(): spin_lock_irqsave(&sem->lock, flags); if (likely(list_empty(&sem->wait_list))) sem->count++; else __up(sem); spin_unlock_irqrestore(&sem->lock, flags); where __up() does: list_del(&waiter->list); waiter->up = 1; wake_up_process(waiter->task); and where __down() does this in essence: list_add_tail(&waiter.list, &sem->wait_list); waiter.task = task; waiter.up = 0; for (;;) { [...] spin_unlock_irq(&sem->lock); timeout = schedule_timeout(timeout); spin_lock_irq(&sem->lock); if (waiter.up) return 0; } the fastpath looks good and obvious, but note the following property of the contended path: if there's a task on the ->wait_list, the up() of the current owner will "pass over" ownership to that waiting task, in a wake-one manner, via the waiter->up flag and by removing the waiter from the wait list. That is all and fine in principle, but as implemented in kernel/semaphore.c it also creates a nasty, hidden source of contention! The contention comes from the following property of the new semaphore code: the new owner owns the semaphore exclusively, even if it is not running yet. So if the old owner, even if just a few instructions later, does a down() [lock_kernel()] again, it will be blocked and will have to wait on the new owner to eventually be scheduled (possibly on another CPU)! Or if another task gets to lock_kernel() sooner than the "new owner" scheduled, it will be blocked unnecessarily and for a very long time when there are 2000 tasks running. I.e. the implementation of the new semaphores code does wake-one and lock ownership in a very restrictive way - it does not allow opportunistic re-locking of the lock at all and keeps the scheduler from picking task order intelligently. This kind of scheduling, with 2000 AIM7 processes running, creates awful cross-scheduling between those 2000 tasks, causes reduced parallelism, a throttled runqueue length and a lot of idle time. With increasing number of CPUs it causes an exponentially worse behavior in AIM7, as the chance for a newly woken new-owner task to actually run anytime soon is less and less likely. Note that it takes just a tiny bit of contention for the 'new-semaphore catastrophy' to happen: the wakeup latencies get added to whatever small contention there is, and quickly snowball out of control! I believe Yanmin's findings and numbers support this analysis too. The best fix for this problem is to use the same scheduling logic that the kernel/mutex.c code uses: keep the wake-one behavior (that is OK and wanted because we do not want to over-schedule), but also allow opportunistic locking of the lock even if a wakee is already "in flight". The patch below implements this new logic. With this patch applied the AIM7 regression is largely fixed on my quad testbox: # v2.6.25 vanilla: .................. Tasks Jobs/Min JTI Real CPU Jobs/sec/task 2000 56096.4 91 207.5 789.7 0.4675 2000 55894.4 94 208.2 792.7 0.4658 # v2.6.26-rc1-166-gc0a1811 vanilla: ................................... Tasks Jobs/Min JTI Real CPU Jobs/sec/task 2000 33230.6 83 350.3 784.5 0.2769 2000 31778.1 86 366.3 783.6 0.2648 # v2.6.26-rc1-166-gc0a1811 + semaphore-speedup: ............................................... Tasks Jobs/Min JTI Real CPU Jobs/sec/task 2000 55707.1 92 209.0 795.6 0.4642 2000 55704.4 96 209.0 796.0 0.4642 i.e. a 67% speedup. We are now back to within 1% of the v2.6.25 performance levels and have zero idle time during the test, as expected. Btw., interactivity also improved dramatically with the fix - for example console-switching became almost instantaneous during this workload (which after all is running 2000 tasks at once!), without the patch it was stuck for a minute at times. There's another nice side-effect of this speedup patch, the new generic semaphore code got even smaller: text data bss dec hex filename 1241 0 0 1241 4d9 semaphore.o.before 1207 0 0 1207 4b7 semaphore.o.after (because the waiter.up complication got removed.) Longer-term we should look into using the mutex code for the generic semaphore code as well - but i's not easy due to legacies and it's outside of the scope of v2.6.26 and outside the scope of this patch as well. Bisected-by: "Zhang, Yanmin" <yanmin_zhang@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-05-08Revert "relay: fix splice problem"Jens Axboe
This reverts commit c3270e577c18b3d0e984c3371493205a4807db9d.