Kernel - My Linux kernel repository

Age	Commit message (Collapse)	Author
2005-10-28	Auto-update from upstream	Kyle McMartin

2005-10-28	[PATCH] gfp_t: dma-mapping (parisc)	Al Viro
	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-21	[PARISC] Specify level to fix binutils level promotion bug	Grant Grundler
	fixup.S needs to specify .level and use correct LDREG macro. New binutils has a bug where it doesn't "promote" from PA1.0 to PA1.1 correctly when using ",s" completer. remove use of __LP64__ in assembly.h and add some white space. Signed-off-by: Grant Grundler <grundler@parisc-linux.org> Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
2005-10-21	[PARISC] Define pgprot_noncached macro in pgtable.h	Grant Grundler
	drivers/infiniband depends on definition of pgprot_noncached() macro. Someone else will have to fix it's wrong. Signed-off-by: Grant Grundler <grundler@parisc-linux.org Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
2005-10-21	[PARISC] Add other CRT_ID for newer cards to grfioctl.h	Kyle McMartin
	Add IDs for some other STI graphics cards found on HP PA-RISC machines. Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
2005-10-21	[PARISC] Add ECANCELED to errno.h	Grant Grundler
	add ECANCELED - SuSv3 wants one L. IB/SDP actually returns this error. Signed-off-by: Grant Grundler <grundler@parisc-linux.org> Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
2005-10-21	[PARISC] Fix the alloc_slabmgmt panic	James Bottomley
	Fix the alloc_slabmgmt panic Hopefully this should also fix a lot of other intermittent kernel bugs. The problem has been around since 2.6.9-rc2-pa6 when we allowed floating point registers to be used in kernel code. The essence of the problem is that gcc prefers to use floating point for integer divides and multiples. Further, it can rely on the values in the no clobber fp regs being correct across a function call. Unfortunately, our task switch function only saves the integer no clobber registers, not the fp ones, so if gcc makes a function call to any function in the kernel which could sleep, the values it is relying on in any no clobber floating point register may be lost. In the case of alloc_slabmgmt, the value of the page offset is being stored in %fr12 across a call to kmem_getpages(), which sleeps if no pages are available. Thus, the offset can be trashed and the slab code can end up with a completely bogus address leading to corruption. Kudos to Randolph who came up with the program to trip this problem at will and thus allowed it to be tracked and fixed. Signed-off-by: James Bottomley <jejb@parisc-linux.org> Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
2005-10-21	[PARISC] Fix compile warning in pci.h	Matthew Wilcox
	Signed-off-by: Matthew Wilcox <willy@parisc-linux.org> Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
2005-10-21	[PARISC] Use work queue in LED/LCD driver instead of tasklet.	Grant Grundler
	2.6.12-rc1-pa6 use work queue in LED/LCD driver instead of tasklet. Main advantage is it allows use of msleep() in the led_LCD_driver to "atomically" perform two MMIO writes (CMD, then DATA). Lead to nice cleanup of the main led_work_func() and led_LCD_driver(). Kudos to David for being persistent. From: David Pye <dmp@davidmpye.dyndns.org> Signed-off-by: Grant Grundler <grundler@parisc-linux.org> Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
2005-10-21	[PARISC] Add new ioprio_{set,get} syscalls	Jens Axboe
	add syscall entries for ioprio_set/get as per Jens Axboe. Signed-off-by: Jens Axboe <axboe@suse.de> Signed-off-by: Grant Grundler <grundler@parisc-linux.org> Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
2005-10-21	[PARISC] Update bitops from parisc tree	Grant Grundler
	Optimize ext2_find_next_zero_bit. Gives about 25% perf improvement with a rsync test with ext3. Signed-off-by: Randolph Chung <tausq@parisc-linux.org> fix ext3 performance - ext2_find_next_zero() was culprit. Kudos to jejb for pointing out the the possibility that ext2_test_bit and ext2_find_next_zero() may in fact not be enumerating bits in the bitmap because of endianess. Took sparc64 implementation and adapted it to our tree. I suspect the real problem is ffz() wants an unsigned long and was getting garbage in the top half of the unsigned int. Not confirmed but that's what I suspect. Signed-off-by: Grant Grundler <grundler@parisc-linux.org> Fix find_next_bit for 32-bit Make masking consistent for bitops From: Joel Soete <soete.joel@tiscali.be> Signed-off-by: Randolph Chung <tausq@parisc-linux.org> Add back incorrectly removed ext2_find_first_zero_bit definition Signed-off-by: James Bottomley <jejb@parisc-linux.org> Fixup bitops.h to use volatile for _bit() ops Based on this email thread: http://marc.theaimsgroup.com/?t=108826637900003 In a nutshell: _bit() want use of volatile. ___bit() are "relaxed" and don't use spinlock or volatile. other minor changes: o replaces hweight64() macro with alias to generic_hweight64() (Joel Soete) o cleanup ext2 macros so (a) it's obvious what the XOR magic is about and (b) one version that works for both 32/64-bit. o replace 2 uses of CONFIG_64BIT with __LP64__. bitops.h used both. I think header files that might go to user space should use something userspace will know about (__LP64__). Signed-off-by: Grant Grundler <grundler@parisc-linux.org> Move SHIFT_PER_LONG to standard location for BITS_PER_LONG (asm/types.h) and ditch the second definition of BITS_PER_LONG in bitops.h Signed-off-by: Grant Grundler <grundler@parisc-linux.org> Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
2005-10-21	[PARISC] Add ability for prctl to change unaligned trap behaviour	Kyle McMartin
	Add support for changing unaligned trap behaviour on a per-thread basis. Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
2005-10-21	[PARISC] Take into account nullified insn and lock functions for profiling	Randolph Chung
	export profile_pc() symbol - oprofile needs it when built as a module. Signed-off-by: Grant Grundler <grundler@parisc-linux.org> Take into account nullified insn and lock functions for profiling This is needed at the end of functions; it is typical that the return branch nullifies the next insn, which is in the next function. This causes profiling data to show up against the "wrong" function. We also count lock times against the locker. This is consistent with other architectures. Signed-off-by: Randolph Chung <tausq@parisc-linux.org> Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
2005-10-21	[PARISC] Update spinlocks from parisc tree	Matthew Wilcox
	Neaten up the CONFIG_PA20 ifdefs More merge fixes, this time for SMP Signed-off-by: Matthew Wilcox <willy@parisc-linux.org> Prettify the CONFIG_DEBUG_SPINLOCK __SPIN_LOCK_UNLOCKED initializers. Clean up some warnings with CONFIG_DEBUG_SPINLOCK enabled. Fix build with spinlock debugging turned on. Patch is cleaner like this, too. Remove mandatory 16-byte alignment requirement on PA2.0 processors by using the ldcw,CO completer. Provides a nice insn savings. Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
2005-10-21	[PARISC] Move pa_tlb_lock to tlb_flush.h	Grant Grundler
	move pa_tlb_lock and it's primary consumers to tlb_flush.h Future step will be to move spinlock_t definition out of system.h. Signed-off-by: Grant Grundler <grundler@parisc-linux.org> Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
2005-10-21	[PARISC] Make sure use of RFI conforms to PA 2.0 and 1.1 arch docs	Grant Grundler
	2.6.12-rc4-pa3 : first pass at making sure use of RFI conforms to PA 2.0 arch pages F-4 and F-5, PA 1.1 Arch page 3-19 and 3-20. The discussion revolves around all the rules for clearing PSW Q-bit. The hard part is meeting all the rules for "relied upon translation". .align directive is used to guarantee the critical sequence ends more than 8 instructions (32 bytes) from the end of page. Signed-off-by: Grant Grundler <grundler@parisc-linux.org> Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
2005-10-21	[PARISC] Convert parisc_device to use struct resource for hpa	Matthew Wilcox
	Convert pa_dev->hpa from an unsigned long to a struct resource. Signed-off-by: Matthew Wilcox <willy@parisc-linux.org> Fix up users of ->hpa to use ->hpa.start instead. Signed-off-by: Matthew Wilcox <willy@parisc-linux.org> Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
2005-10-21	[PARISC] Convert parisc_device tree to use struct device klists	Matthew Wilcox
	Fix parse_tree_node. much more needs to be done to fix this file. Signed-off-by: Matthew Wilcox <willy@parisc-linux.org> Make drivers.c compile based on a patch from Pat Mochel. From: Patrick Mochel <mochel@digitalimplant.org> Signed-off-by: Kyle McMartin <kyle@parisc-linux.org> Fix drivers.c to create new device tree nodes when no match is found. Signed-off-by: Richard Hirst <rhirst@parisc-linux.org> Do a proper depth-first search returning parents before children, using the new klist infrastructure. Signed-off-by: Richard Hirst <rhirst@parisc-linux.org> Fixed parisc_device traversal so that pdc_stable works again Fixed check_dev so it doesn't dereference a parisc_device until it has verified the bus type Signed-off-by: Randolph Chung <tausq@parisc-linux.org> Convert pa_dev->hpa from an unsigned long to a struct resource. Use insert_resource() instead of request_mem_region(). Request resources at bus walk time instead of driver probe time. Don't release the resources as we don't have any hotplug parisc_device support yet. Add parisc_pathname() to conveniently get the textual representation of the hwpath used in sysfs. Inline the remnants of claim_device() into its caller. Signed-off-by: Matthew Wilcox <willy@parisc-linux.org> I noticed that some of the STI regions weren't showing up in iomem. Reading the STI spec indicated that all STI devices occupy at least 32MB. So check for STI HPAs and give them 32MB instead of 4kB. Signed-off-by: Matthew Wilcox <willy@parisc-linux.org> Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
2005-09-21	[PATCH] Remove unused var from asm/futex.h	Paolo 'Blaisorblade' Giarrusso
	As recently done by Russell King for ARM, commit 4732efbeb997189d9f9b04708dc26bf8613ed721 introduces a generic asm/futex.h copied along most arches, which includes a "-ENOSYS support" to be changed if needed. However, it includes an unused var (taken from the "real" version) which GCC warns about. Remove it from all arches having that file version (i.e. same GIT id). $ git-diff-tree -r HEAD and $ git-ls-tree -r HEAD include/\|grep 9feff4ce1424bc390608326240be369eb13aa648 may be more interesting than looking at the patch itself, to make sure I've just copied the arm header to all other archs having the original dummy version of this file. Cc: Jakub Jelinek <jakub@redhat.com> Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-13	[PATCH] feature removal of io_remap_page_range()	Randy Dunlap
	As written in Documentation/feature-removal-schedule.txt, remove the io_remap_page_range() kernel API. Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-10	[PATCH] spinlock consolidation	Ingo Molnar
	This patch (written by me and also containing many suggestions of Arjan van de Ven) does a major cleanup of the spinlock code. It does the following things: - consolidates and enhances the spinlock/rwlock debugging code - simplifies the asm/spinlock.h files - encapsulates the raw spinlock type and moves generic spinlock features (such as ->break_lock) into the generic code. - cleans up the spinlock code hierarchy to get rid of the spaghetti. Most notably there's now only a single variant of the debugging code, located in lib/spinlock_debug.c. (previously we had one SMP debugging variant per architecture, plus a separate generic one for UP builds) Also, i've enhanced the rwlock debugging facility, it will now track write-owners. There is new spinlock-owner/CPU-tracking on SMP builds too. All locks have lockup detection now, which will work for both soft and hard spin/rwlock lockups. The arch-level include files now only contain the minimally necessary subset of the spinlock code - all the rest that can be generalized now lives in the generic headers: include/asm-i386/spinlock_types.h \| 16 include/asm-x86_64/spinlock_types.h \| 16 I have also split up the various spinlock variants into separate files, making it easier to see which does what. The new layout is: SMP \| UP ----------------------------\|----------------------------------- asm/spinlock_types_smp.h \| linux/spinlock_types_up.h linux/spinlock_types.h \| linux/spinlock_types.h asm/spinlock_smp.h \| linux/spinlock_up.h linux/spinlock_api_smp.h \| linux/spinlock_api_up.h linux/spinlock.h \| linux/spinlock.h /* * here's the role of the various spinlock/rwlock related include files: * * on SMP builds: * * asm/spinlock_types.h: contains the raw_spinlock_t/raw_rwlock_t and the * initializers * * linux/spinlock_types.h: * defines the generic type and initializers * * asm/spinlock.h: contains the __raw_spin_()/etc. lowlevel implementations, mostly inline assembly code * * (also included on UP-debug builds:) * * linux/spinlock_api_smp.h: * contains the prototypes for the _spin_() APIs. * linux/spinlock.h: builds the final spin_() APIs. * on UP builds: * * linux/spinlock_type_up.h: * contains the generic, simplified UP spinlock type. * (which is an empty structure on non-debug builds) * * linux/spinlock_types.h: * defines the generic type and initializers * * linux/spinlock_up.h: * contains the __raw_spin_()/etc. version of UP builds. (which are NOPs on non-debug, non-preempt * builds) * * (included on UP-non-debug builds:) * * linux/spinlock_api_up.h: * builds the _spin_() APIs. * linux/spinlock.h: builds the final spin_() APIs. / All SMP and UP architectures are converted by this patch. arm, i386, ia64, ppc, ppc64, s390/s390x, x64 was build-tested via crosscompilers. m32r, mips, sh, sparc, have not been tested yet, but should be mostly fine. From: Grant Grundler <grundler@parisc-linux.org> Booted and lightly tested on a500-44 (64-bit, SMP kernel, dual CPU). Builds 32-bit SMP kernel (not booted or tested). I did not try to build non-SMP kernels. That should be trivial to fix up later if necessary. I converted bit ops atomic_hash lock to raw_spinlock_t. Doing so avoids some ugly nesting of linux/.h and asm/.h files. Those particular locks are well tested and contained entirely inside arch specific code. I do NOT expect any new issues to arise with them. If someone does ever need to use debug/metrics with them, then they will need to unravel this hairball between spinlocks, atomic ops, and bit ops that exist only because parisc has exactly one atomic instruction: LDCW (load and clear word). From: "Luck, Tony" <tony.luck@intel.com> ia64 fix Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Arjan van de Ven <arjanv@infradead.org> Signed-off-by: Grant Grundler <grundler@parisc-linux.org> Cc: Matthew Wilcox <willy@debian.org> Signed-off-by: Hirokazu Takata <takata@linux-m32r.org> Signed-off-by: Mikael Pettersson <mikpe@csd.uu.se> Signed-off-by: Benoit Boissinot <benoit.boissinot@ens-lyon.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09	Merge master.kernel.org:/pub/scm/linux/kernel/git/sam/kbuild	Linus Torvalds

2005-09-09	kbuild: m68k,parisc,ppc,ppc64,s390,xtensa use generic asm-offsets.h support	Sam Ravnborg
	Delete obsoleted parts form arch makefiles and rename to asm-offsets.h Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
2005-09-08	[PATCH] Make sparc64 use setup-res.c	David S. Miller
	There were three changes necessary in order to allow sparc64 to use setup-res.c: 1) Sparc64 roots the PCI I/O and MEM address space using parent resources contained in the PCI controller structure. I'm actually surprised no other platforms do this, especially ones like Alpha and PPC{,64}. These resources get linked into the iomem/ioport tree when PCI controllers are probed. So the hierarchy looks like this: iomem --\| PCI controller 1 MEM space --\| device 1 device 2 etc. PCI controller 2 MEM space --\| ... ioport --\| PCI controller 1 IO space --\| ... PCI controller 2 IO space --\| ... You get the idea. The drivers/pci/setup-res.c code allocates using plain iomem_space and ioport_space as the root, so that wouldn't work with the above setup. So I added a pcibios_select_root() that is used to handle this. It uses the PCI controller struct's io_space and mem_space on sparc64, and io{port,mem}_resource on every other platform to keep current behavior. 2) quirk_io_region() is buggy. It takes in raw BUS view addresses and tries to use them as a PCI resource. pci_claim_resource() expects the resource to be fully formed when it gets called. The sparc64 implementation would do the translation but that's absolutely wrong, because if the same resource gets released then re-claimed we'll adjust things twice. So I fixed up quirk_io_region() to do the proper pcibios_bus_to_resource() conversion before passing it on to pci_claim_resource(). 3) I was mistakedly __init'ing the function methods the PCI controller drivers provide on sparc64 to implement some parts of these routines. This was, of course, easy to fix. So we end up with the following, and that nasty SPARC64 makefile ifdef in drivers/pci/Makefile is finally zapped. Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2005-09-07	[PATCH] Clean up struct flock64 definitions	Stephen Rothwell
	This patch gathers all the struct flock64 definitions (and the operations), puts them under !CONFIG_64BIT and cleans up the arch files. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-07	[PATCH] Clean up struct flock definitions	Stephen Rothwell
	This patch just gathers together all the struct flock definitions except xtensa into asm-generic/fcntl.h. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-07	[PATCH] Clean up the fcntl operations	Stephen Rothwell
	This patch puts the most popular of each fcntl operation/flag into asm-generic/fcntl.h and cleans up the arch files. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-07	[PATCH] Clean up the open flags	Stephen Rothwell
	This patch puts the most popular of each open flag into asm-generic/fcntl.h and cleans up the arch files. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-07	[PATCH] Create asm-generic/fcntl.h	Stephen Rothwell
	This set of patches creates asm-generic/fcntl.h and consolidates as much as possible from the asm-/fcntl.h files into it. This patch just gathers all the identical bits of the asm-/fcntl.h files into asm-generic/fcntl.h. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Yoichi Yuasa <yuasa@hh.iij4u.or.jp> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-07	[PATCH] remove verify_area(): remove verify_area() from various uaccess.h ↵	Jesper Juhl
	headers Remove the deprecated (and unused) verify_area() from various uaccess.h headers. Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-07	[PATCH] remove asm-*/hdreg.h	Christoph Hellwig
	unused and useless.. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-07	[PATCH] CHECK_IRQ_PER_CPU() to avoid dead code in __do_IRQ()	Karsten Wiese
	IRQ_PER_CPU is not used by all architectures. This patch introduces the macros ARCH_HAS_IRQ_PER_CPU and CHECK_IRQ_PER_CPU() to avoid the generation of dead code in __do_IRQ(). ARCH_HAS_IRQ_PER_CPU is defined by architectures using IRQ_PER_CPU in their include/asm_ARCH/irq.h file. Through grepping the tree I found the following architectures currently use IRQ_PER_CPU: cris, ia64, ppc, ppc64 and parisc. Signed-off-by: Karsten Wiese <annabellesgarden@yahoo.de> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-07	[PATCH] auxiliary vector cleanups	H. J. Lu
	The size of auxiliary vector is fixed at 42 in linux/sched.h. But it isn't very obvious when looking at linux/elf.h. This patch adds AT_VECTOR_SIZE so that we can change it if necessary when a new vector is added. Because of include file ordering problems, doing this necessitated the extraction of the AT_* symbols into a standalone header file. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-07	[PATCH] compat: be more consistent about [ug]id_t	Stephen Rothwell
	When I first wrote the compat layer patches, I was somewhat cavalier about the definition of compat_uid_t and compat_gid_t (or maybe I just misunderstood :-)). This patch makes the compat types much more consistent with the types we are being compatible with and hopefully will fix a few bugs along the way. compat type type in compat arch __compat_[ug]id_t __kernel_[ug]id_t __compat_[ug]id32_t __kernel_[ug]id32_t compat_[ug]id_t [ug]id_t The difference is that compat_uid_t is always 32 bits (for the archs we care about) but __compat_uid_t may be 16 bits on some. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-07	[PATCH] FUTEX_WAKE_OP: pthread_cond_signal() speedup	Jakub Jelinek
	ATM pthread_cond_signal is unnecessarily slow, because it wakes one waiter (which at least on UP usually means an immediate context switch to one of the waiter threads). This waiter wakes up and after a few instructions it attempts to acquire the cv internal lock, but that lock is still held by the thread calling pthread_cond_signal. So it goes to sleep and eventually the signalling thread is scheduled in, unlocks the internal lock and wakes the waiter again. Now, before 2003-09-21 NPTL was using FUTEX_REQUEUE in pthread_cond_signal to avoid this performance issue, but it was removed when locks were redesigned to the 3 state scheme (unlocked, locked uncontended, locked contended). Following scenario shows why simply using FUTEX_REQUEUE in pthread_cond_signal together with using lll_mutex_unlock_force in place of lll_mutex_unlock is not enough and probably why it has been disabled at that time: The number is value in cv->__data.__lock. thr1 thr2 thr3 0 pthread_cond_wait 1 lll_mutex_lock (cv->__data.__lock) 0 lll_mutex_unlock (cv->__data.__lock) 0 lll_futex_wait (&cv->__data.__futex, futexval) 0 pthread_cond_signal 1 lll_mutex_lock (cv->__data.__lock) 1 pthread_cond_signal 2 lll_mutex_lock (cv->__data.__lock) 2 lll_futex_wait (&cv->__data.__lock, 2) 2 lll_futex_requeue (&cv->__data.__futex, 0, 1, &cv->__data.__lock) # FUTEX_REQUEUE, not FUTEX_CMP_REQUEUE 2 lll_mutex_unlock_force (cv->__data.__lock) 0 cv->__data.__lock = 0 0 lll_futex_wake (&cv->__data.__lock, 1) 1 lll_mutex_lock (cv->__data.__lock) 0 lll_mutex_unlock (cv->__data.__lock) # Here, lll_mutex_unlock doesn't know there are threads waiting # on the internal cv's lock Now, I believe it is possible to use FUTEX_REQUEUE in pthread_cond_signal, but it will cost us not one, but 2 extra syscalls and, what's worse, one of these extra syscalls will be done for every single waiting loop in pthread_cond_wait. We would need to use lll_mutex_unlock_force in pthread_cond_signal after requeue and lll_mutex_cond_lock in pthread_cond_wait after lll_futex_wait. Another alternative is to do the unlocking pthread_cond_signal needs to do (the lock can't be unlocked before lll_futex_wake, as that is racy) in the kernel. I have implemented both variants, futex-requeue-glibc.patch is the first one and futex-wake_op{,-glibc}.patch is the unlocking inside of the kernel. The kernel interface allows userland to specify how exactly an unlocking operation should look like (some atomic arithmetic operation with optional constant argument and comparison of the previous futex value with another constant). It has been implemented just for ppc*, x86_64 and i?86, for other architectures I'm including just a stub header which can be used as a starting point by maintainers to write support for their arches and ATM will just return -ENOSYS for FUTEX_WAKE_OP. The requeue patch has been (lightly) tested just on x86_64, the wake_op patch on ppc64 kernel running 32-bit and 64-bit NPTL and x86_64 kernel running 32-bit and 64-bit NPTL. With the following benchmark on UP x86-64 I get: for i in nptl-orig nptl-requeue nptl-wake_op; do echo time elf/ld.so --library-path .:$i /tmp/bench; \ for j in 1 2; do echo ( time elf/ld.so --library-path .:$i /tmp/bench ) 2>&1; done; done time elf/ld.so --library-path .:nptl-orig /tmp/bench real 0m0.655s user 0m0.253s sys 0m0.403s real 0m0.657s user 0m0.269s sys 0m0.388s time elf/ld.so --library-path .:nptl-requeue /tmp/bench real 0m0.496s user 0m0.225s sys 0m0.271s real 0m0.531s user 0m0.242s sys 0m0.288s time elf/ld.so --library-path .:nptl-wake_op /tmp/bench real 0m0.380s user 0m0.176s sys 0m0.204s real 0m0.382s user 0m0.175s sys 0m0.207s The benchmark is at: http://sourceware.org/ml/libc-alpha/2005-03/txt00001.txt Older futex-requeue-glibc.patch version is at: http://sourceware.org/ml/libc-alpha/2005-03/txt00002.txt Older futex-wake_op-glibc.patch version is at: http://sourceware.org/ml/libc-alpha/2005-03/txt00003.txt Will post a new version (just x86-64 fixes so that the patch applies against pthread_cond_signal.S) to libc-hacker ml soon. Attached is the kernel FUTEX_WAKE_OP patch as well as a simple-minded testcase that will not test the atomicity of the operation, but at least check if the threads that should have been woken up are woken up and whether the arithmetic operation in the kernel gave the expected results. Acked-by: Ingo Molnar <mingo@redhat.com> Cc: Ulrich Drepper <drepper@redhat.com> Cc: Jamie Lokier <jamie@shareable.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Yoichi Yuasa <yuasa@hh.iij4u.or.jp> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05	[PATCH] sab: consolidate kmem_bufctl_t	Kyle Moffett
	This is used only in slab.c and each architecture gets to define whcih underlying type is to be used. Seems a bit silly - move it to slab.c and use the same type for all architectures: unsigned int. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05	[PATCH] mm: consolidate get_order	Stephen Rothwell
	Someone mentioned that almost all the architectures used basically the same implementation of get_order. This patch consolidates them into asm-generic/page.h and includes that in the appropriate places. The exceptions are ia64 and ppc which have their own (presumably optimised) versions. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-08-29	[NET]: Introduce SO_{SND,RCV}BUFFORCE socket options	Patrick McHardy
	Allows overriding of sysctl_{wmem,rmrm}_max Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-04	[PATCH] pci and yenta: pcibios_bus_to_resource	Dominik Brodowski
	In yenta_socket, we default to using the resource setting of the CardBus bridge. However, this is a PCI-bus-centric view of resources and thus needs to be converted to generic resources first. Therefore, add a call to pcibios_bus_to_resource() call in between. This function is a mere wrapper on x86 and friends, however on some others it already exists, is added in this patch (alpha, arm, ppc, ppc64) or still needs to be provided (parisc -- where is its pcibios_resource_to_bus() ?). Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-26	[PATCH] Add emergency_restart()	Eric W. Biederman
	When the kernel is working well and we want to restart cleanly kernel_restart is the function to use. But in many instances the kernel wants to reboot when thing are expected to be working very badly such as from panic or a software watchdog handler. This patch adds the function emergency_restart() so that callers can be clear what semantics they expect when calling restart. emergency_restart() is expected to be callable from interrupt context and possibly reliable in even more trying circumstances. This is an initial generic implementation for all architectures. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-29	[PATCH] Serial: Split 8250 port table (part 2)	Russell King
	Remove legacy ISA serial ports for Accent, Boca, Fourport, Hub6 and MCA from the architecture specific serial.h include. The only ports which remain in asm-*/serial.h are the platform specific entries. These should really be converted by platform maintainers to use a platform device, such as can be found in arch/arm/mach-footbridge/isa.c Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2005-06-27	[PATCH] PCI: fix up errors after dma bursting patch and CONFIG_PCI=n	Andrew Morton
	With CONFIG_PCI=n: In file included from include/linux/pci.h:917, from lib/iomap.c:6: include/asm/pci.h:104: warning: `enum pci_dma_burst_strategy' declared inside parameter list include/asm/pci.h:104: warning: its scope is only this definition or declaration, which is probably not what you want. include/asm/pci.h: In function `pci_dma_burst_advice': include/asm/pci.h:106: dereferencing pointer to incomplete type include/asm/pci.h:106: `PCI_DMA_BURST_INFINITY' undeclared (first use in this function) include/asm/pci.h:106: (Each undeclared identifier is reported only once include/asm/pci.h:106: for each function it appears in.) make[1]: *** [lib/iomap.o] Error 1 Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2005-06-27	[PATCH] PCI: DMA bursting advice	David S. Miller
	After seeing, at best, "guesses" as to the following kind of information in several drivers, I decided that we really need a way for platforms to specifically give advice in this area for what works best with their PCI controller implementation. Basically, this new interface gives DMA bursting advice on PCI. There are three forms of the advice: 1) Burst as much as possible, it is not necessary to end bursts on some particular boundary for best performance. 2) Burst on some byte count multiple. A DMA burst to some multiple of number of bytes may be done, but it is important to end the burst on an exact multiple for best performance. The best example of this I am aware of are the PPC64 PCI controllers, where if you end a burst mid-cacheline then chip has to refetch the data and the IOMMU translations which hurts performance a lot. 3) Burst on a single byte count multiple. Bursts shall end exactly on the next multiple boundary for best performance. Sparc64 and Alpha's PCI controllers operate this way. They disconnect any device which tries to burst across a cacheline boundary. Actually, newer sparc64 PCI controllers do not have this behavior. That is why the "pdev" is passed into the interface, so I can add code later to check which PCI controller the system is using and give advice accordingly. Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2005-06-23	[PATCH] compat: introduce compat_time_t	Stephen Rothwell
	This patch is based on work by Carlos O'Donell and Matthew Wilcox. It introduces/updates the compat_time_t type and uses it for compat siginfo structures. I have built this on ppc64 and x86_64. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23	[PATCH] streamline preempt_count type across archs	Jesper Juhl
	The preempt_count member of struct thread_info is currently either defined as int, unsigned int or __s32 depending on arch. This patch makes the type of preempt_count an int on all archs. Having preempt_count be an unsigned type prevents the catching of preempt_count < 0 bugs, and using int on some archs and __s32 on others is not exactely "neat" - much nicer when it's just int all over. A previous version of this patch was already ACK'ed by Robert Love, and the only change in this version of the patch compared to the one he ACK'ed is that this one also makes sure the preempt_count member is consistently commented. Signed-off-by: Jesper Juhl <juhl-lkml@dif.dk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23	[PATCH] remove non-DISCONTIG use of pgdat->node_mem_map	Dave Hansen
	This patch effectively eliminates direct use of pgdat->node_mem_map outside of the DISCONTIG code. On a flat memory system, these fields aren't currently used, neither are they on a sparsemem system. There was also a node_mem_map(nid) macro on many architectures. Its use along with the use of ->node_mem_map itself was not consistent. It has been removed in favor of two new, more explicit, arch-independent macros: pgdat_page_nr(pgdat, pagenr) nid_page_nr(nid, pagenr) I called them "pgdat" and "nid" because we overload the term "node" to mean "NUMA node", "DISCONTIG node" or "pg_data_t" in very confusing ways. I believe the newer names are much clearer. These macros can be overridden in the sparsemem case with a theoretically slower operation using node_start_pfn and pfn_to_page(), instead. We could make this the only behavior if people want, but I don't want to change too much at once. One thing at a time. This patch removes more code than it adds. Compile tested on alpha, alpha discontig, arm, arm-discontig, i386, i386 generic, NUMAQ, Summit, ppc64, ppc64 discontig, and x86_64. Full list here: http://sr71.net/patches/2.6.12/2.6.12-rc1-mhp2/configs/ Boot tested on NUMAQ, x86 SMP and ppc64 power4/5 LPARs. Signed-off-by: Dave Hansen <haveblue@us.ibm.com> Signed-off-by: Martin J. Bligh <mbligh@aracnet.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-21	[PATCH] smp_processor_id() cleanup	Ingo Molnar
	This patch implements a number of smp_processor_id() cleanup ideas that Arjan van de Ven and I came up with. The previous __smp_processor_id/_smp_processor_id/smp_processor_id API spaghetti was hard to follow both on the implementational and on the usage side. Some of the complexity arose from picking wrong names, some of the complexity comes from the fact that not all architectures defined __smp_processor_id. In the new code, there are two externally visible symbols: - smp_processor_id(): debug variant. - raw_smp_processor_id(): nondebug variant. Replaces all existing uses of _smp_processor_id() and __smp_processor_id(). Defined by every SMP architecture in include/asm-*/smp.h. There is one new internal symbol, dependent on DEBUG_PREEMPT: - debug_smp_processor_id(): internal debug variant, mapped to smp_processor_id(). Also, i moved debug_smp_processor_id() from lib/kernel_lock.c into a new lib/smp_processor_id.c file. All related comments got updated and/or clarified. I have build/boot tested the following 8 .config combinations on x86: {SMP,UP} x {PREEMPT,!PREEMPT} x {DEBUG_PREEMPT,!DEBUG_PREEMPT} I have also build/boot tested x64 on UP/PREEMPT/DEBUG_PREEMPT. (Other architectures are untested, but should work just fine.) Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-05	[PATCH] make some things static	Adrian Bunk
	This patch makes some needlessly global identifiers static. Signed-off-by: Adrian Bunk <bunk@stusta.de> Acked-by: Arjan van de Ven <arjanv@infradead.org> Acked-by: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-01	[PATCH] misc verify_area cleanups	Jesper Juhl
	There were still a few comments left refering to verify_area, and two functions, verify_area_skas & verify_area_tt that just wrap corresponding access_ok_skas & access_ok_tt functions, just like verify_area does for access_ok - deprecate those. There was also a few places that still used verify_area in commented-out code, fix those up to use access_ok. After applying this one there should not be anything left but finally removing verify_area completely, which will happen after a kernel release or two. Signed-off-by: Jesper Juhl <juhl-lkml@dif.dk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-01	[PATCH] add EOWNERDEAD and ENOTRECOVERABLE version 2	Joe Korty
	Add EOWNERDEAD and ENOTRECOVERABLE to all architectures. This is to support the upcoming patches for robust mutexes. We normally don't reserve parts of the name/number space for external patches, but robust mutexes are sufficiently popular and important to justify it in this case. Signed-off-by: Joe Korty <joe.korty@ccur.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>