aboutsummaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)Author
2007-07-16FRV: Connect up new syscallsDavid Howells
Connect up new system calls. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16split mmapMiklos Szeredi
This is a straightforward split of do_mmap_pgoff() into two functions: - do_mmap_pgoff() checks the parameters, and calculates the vma flags. Then it calls - mmap_region(), which does the actual mapping Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16slob: initial NUMA supportPaul Mundt
This adds preliminary NUMA support to SLOB, primarily aimed at systems with small nodes (tested all the way down to a 128kB SRAM block), whether asymmetric or otherwise. We follow the same conventions as SLAB/SLUB, preferring current node placement for new pages, or with explicit placement, if a node has been specified. Presently on UP NUMA this has the side-effect of preferring node#0 allocations (since numa_node_id() == 0, though this could be reworked if we could hand off a pfn to determine node placement), so single-CPU NUMA systems will want to place smaller nodes further out in terms of node id. Once a page has been bound to a node (via explicit node id typing), we only do block allocations from partial free pages that have a matching node id in the page flags. The current implementation does have some scalability problems, in that all partial free pages are tracked in the global freelist (with contention due to the single spinlock). However, these are things that are being reworked for SMP scalability first, while things like per-node freelists can easily be built on top of this sort of functionality once it's been added. More background can be found in: http://marc.info/?l=linux-mm&m=118117916022379&w=2 http://marc.info/?l=linux-mm&m=118170446306199&w=2 http://marc.info/?l=linux-mm&m=118187859420048&w=2 and subsequent threads. Acked-by: Christoph Lameter <clameter@sgi.com> Acked-by: Matt Mackall <mpm@selenic.com> Signed-off-by: Paul Mundt <lethal@linux-sh.org> Acked-by: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16kill vmalloc_earlyreserveJan Beulich
This symbol got orphaned quite a while ago. Signed-off-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16page table handling cleanupJan Beulich
Kill pte_rdprotect(), pte_exprotect(), pte_mkread(), pte_mkexec(), pte_read(), pte_exec(), and pte_user() except where arch-specific code is making use of them. Signed-off-by: Jan Beulich <jbeulich@novell.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Christoph Hellwig <hch@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16invalidate_mapping_pages(): add cond_reschedAndrew Morton
invalidate_mapping_pages() can sometimes take a long time (millions of pages to free). Long enough for the softlockup detector to trigger. We used to have a cond_resched() in there but I took it out because the drop_caches code calls invalidate_mapping_pages() under inode_lock. The patch adds a nasty flag and puts the cond_resched() back. Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16Remove the deprecated "kmem_cache_t" typedef from slab.h.Robert P. J. Day
Given that there is no remaining usage of the deprecated kmem_cache_t typedef anywhere in the tree, remove that typedef. Signed-off-by: Robert P. J. Day <rpjday@mindspring.com> Acked-by: Pekka Enberg <penberg@cs.helsinki.fi> Acked-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16change zonelist order: zonelist order selection logicKAMEZAWA Hiroyuki
Make zonelist creation policy selectable from sysctl/boot option v6. This patch makes NUMA's zonelist (of pgdat) order selectable. Available order are Default(automatic)/ Node-based / Zone-based. [Default Order] The kernel selects Node-based or Zone-based order automatically. [Node-based Order] This policy treats the locality of memory as the most important parameter. Zonelist order is created by each zone's locality. This means lower zones (ex. ZONE_DMA) can be used before higher zone (ex. ZONE_NORMAL) exhausion. IOW. ZONE_DMA will be in the middle of zonelist. current 2.6.21 kernel uses this. Pros. * A user can expect local memory as much as possible. Cons. * lower zone will be exhansted before higher zone. This may cause OOM_KILL. Maybe suitable if ZONE_DMA is relatively big and you never see OOM_KILL because of ZONE_DMA exhaution and you need the best locality. (example) assume 2 node NUMA. node(0) has ZONE_DMA/ZONE_NORMAL, node(1) has ZONE_NORMAL. *node(0)'s memory allocation order: node(0)'s NORMAL -> node(0)'s DMA -> node(1)'s NORMAL. *node(1)'s memory allocation order: node(1)'s NORMAL -> node(0)'s NORMAL -> node(0)'s DMA. [Zone-based order] This policy treats the zone type as the most important parameter. Zonelist order is created by zone-type order. This means lower zone never be used bofere higher zone exhaustion. IOW. ZONE_DMA will be always at the tail of zonelist. Pros. * OOM_KILL(bacause of lower zone) occurs only if the whole zones are exhausted. Cons. * memory locality may not be best. (example) assume 2 node NUMA. node(0) has ZONE_DMA/ZONE_NORMAL, node(1) has ZONE_NORMAL. *node(0)'s memory allocation order: node(0)'s NORMAL -> node(1)'s NORMAL -> node(0)'s DMA. *node(1)'s memory allocation order: node(1)'s NORMAL -> node(0)'s NORMAL -> node(0)'s DMA. bootoption "numa_zonelist_order=" and proc/sysctl is supporetd. command: %echo N > /proc/sys/vm/numa_zonelist_order Will rebuild zonelist in Node-based order. command: %echo Z > /proc/sys/vm/numa_zonelist_order Will rebuild zonelist in Zone-based order. Thanks to Lee Schermerhorn, he gives me much help and codes. [Lee.Schermerhorn@hp.com: add check_highest_zone to build_zonelists_in_zone_order] [akpm@linux-foundation.org: build fix] Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Lee Schermerhorn <lee.schermerhorn@hp.com> Cc: Christoph Lameter <clameter@sgi.com> Cc: Andi Kleen <ak@suse.de> Cc: "jesse.barnes@intel.com" <jesse.barnes@intel.com> Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16serial: convert early_uart to earlycon for 8250Yinghai Lu
Beacuse SERIAL_PORT_DFNS is removed from include/asm-i386/serial.h and include/asm-x86_64/serial.h. the serial8250_ports need to be probed late in serial initializing stage. the console_init=>serial8250_console_init=> register_console=>serial8250_console_setup will return -ENDEV, and console ttyS0 can not be enabled at that time. need to wait till uart_add_one_port in drivers/serial/serial_core.c to call register_console to get console ttyS0. that is too late. Make early_uart to use early_param, so uart console can be used earlier. Make it to be bootconsole with CON_BOOT flag, so can use console handover feature. and it will switch to corresponding normal serial console automatically. new command line will be: console=uart8250,io,0x3f8,9600n8 console=uart8250,mmio,0xff5e0000,115200n8 or earlycon=uart8250,io,0x3f8,9600n8 earlycon=uart8250,mmio,0xff5e0000,115200n8 it will print in very early stage: Early serial console at I/O port 0x3f8 (options '9600n8') console [uart0] enabled later for console it will print: console handover: boot [uart0] -> real [ttyS0] Signed-off-by: <yinghai.lu@sun.com> Cc: Andi Kleen <ak@suse.de> Cc: Bjorn Helgaas <bjorn.helgaas@hp.com> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Gerd Hoffmann <kraxel@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16lib: add idr_remove_allKristian Hoegsberg
Remove all ids from the given idr tree. idr_destroy() only frees up unused, cached idp_layers, but this function will remove all id mappings and leave all idp_layers unused. A typical clean-up sequence for objects stored in an idr tree, will use idr_for_each() to free all objects, if necessay, then idr_remove_all() to remove all ids, and idr_destroy() to free up the cached idr_layers. Signed-off-by: Kristian Hoegsberg <krh@redhat.com> Cc: Tejun Heo <htejun@gmail.com> Cc: Dave Airlie <airlied@linux.ie> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16lib: add idr_for_each()Kristian Hoegsberg
This patch adds an iterator function for the idr data structure. Compared to just iterating through the idr with an integer and idr_find, this iterator is (almost, but not quite) linear in the number of elements, as opposed to the number of integers in the range covered by the idr. This makes a difference for sparse idrs, but more importantly, it's a nicer way to iterate through the elements. The drm subsystem is moving to idr for tracking contexts and drawables, and with this change, we can use the idr exclusively for tracking these resources. [akpm@linux-foundation.org: fix comment] Signed-off-by: Kristian Hoegsberg <krh@redhat.com> Cc: Tejun Heo <htejun@gmail.com> Cc: Dave Airlie <airlied@linux.ie> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16LZO1X: fix lzo1x_worst_compressNitin Gupta
This is a correction for a macro which gives worst case compressed data size by LZO1X. This patch was provided by the LZO author (Markus Oberhumer). Signed-off-by: Nitin Gupta <nitingupta910@gmail.com> Cc: "Markus F.X.J. Oberhumer" <markus@oberhumer.com> Cc: "Richard Purdie" <rpurdie@openedhand.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-15Merge master.kernel.org:/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6Linus Torvalds
* master.kernel.org:/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (166 commits) [SCSI] ibmvscsi: convert to use the data buffer accessors [SCSI] dc395x: convert to use the data buffer accessors [SCSI] ncr53c8xx: convert to use the data buffer accessors [SCSI] sym53c8xx: convert to use the data buffer accessors [SCSI] ppa: coding police and printk levels [SCSI] aic7xxx_old: remove redundant GFP_ATOMIC from kmalloc [SCSI] i2o: remove redundant GFP_ATOMIC from kmalloc from device.c [SCSI] remove the dead CYBERSTORMIII_SCSI option [SCSI] don't build scsi_dma_{map,unmap} for !HAS_DMA [SCSI] Clean up scsi_add_lun a bit [SCSI] 53c700: Remove printk, which triggers because of low scsi clock on SNI RMs [SCSI] sni_53c710: Cleanup [SCSI] qla4xxx: Fix underrun/overrun conditions [SCSI] megaraid_mbox: use mutex instead of semaphore [SCSI] aacraid: add 51245, 51645 and 52245 adapters to documentation. [SCSI] qla2xxx: update version to 8.02.00-k1. [SCSI] qla2xxx: add support for NPIV [SCSI] stex: use resid for xfer len information [SCSI] Add Brownie 1200U3P to blacklist [SCSI] scsi.c: convert to use the data buffer accessors ...
2007-07-15Merge branch 'master' of ↵Linus Torvalds
master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 * 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: (53 commits) [TCP]: Verify the presence of RETRANS bit when leaving FRTO [IPV6]: Call inet6addr_chain notifiers on link down [NET_SCHED]: Kill CONFIG_NET_CLS_POLICE [NET_SCHED]: act_api: qdisc internal reclassify support [NET_SCHED]: sch_dsmark: act_api support [NET_SCHED]: sch_atm: act_api support [NET_SCHED]: sch_atm: Lindent [IPV6]: MSG_ERRQUEUE messages do not pass to connected raw sockets [IPV4]: Cleanup call to __neigh_lookup() [NET_SCHED]: Revert "avoid transmit softirq on watchdog wakeup" optimization [NETFILTER]: nf_conntrack: UDPLITE support [NETFILTER]: nf_conntrack: mark protocols __read_mostly [NETFILTER]: x_tables: add connlimit match [NETFILTER]: Lower *tables printk severity [NETFILTER]: nf_conntrack: Don't track locally generated special ICMP error [NETFILTER]: nf_conntrack: Introduces nf_ct_get_tuplepr and uses it [NETFILTER]: nf_conntrack: make l3proto->prepare() generic and renames it [NETFILTER]: nf_conntrack: Increment error count on parsing IPv4 header [NET]: Add ethtool support for NETIF_F_IPV6_CSUM devices. [AF_IUCV]: Add lock when updating accept_q ...
2007-07-15Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs: 9p: fix a race condition bug in umount which caused a segfault 9p: re-enable mount time debug option 9p: cache meta-data when cache=loose net/9p: set error to EREMOTEIO if trans->write returns zero net/9p: change net/9p module name to 9pnet 9p: Reorganization of 9p file system code
2007-07-15frv: missing __clear_user()Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-15fix return type of skb_checksum_complete()Al Viro
It returns __sum16, not unsigned int Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-15[NET_SCHED]: Kill CONFIG_NET_CLS_POLICEPatrick McHardy
The NET_CLS_ACT option is now a full replacement for NET_CLS_POLICE, remove the old code. The config option will be kept around to select the equivalent NET_CLS_ACT options for a short time to allow easier upgrades. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-15[NET_SCHED]: act_api: qdisc internal reclassify supportPatrick McHardy
The behaviour of NET_CLS_POLICE for TC_POLICE_RECLASSIFY was to return it to the qdisc, which could handle it internally or ignore it. With NET_CLS_ACT however, tc_classify starts over at the first classifier and never returns it to the qdisc. This makes it impossible to support qdisc-internal reclassification, which in turn makes it impossible to remove the old NET_CLS_POLICE code without breaking compatibility since we have two qdiscs (CBQ and ATM) that support this. This patch adds a tc_classify_compat function that handles reclassification the old way and changes CBQ and ATM to use it. This again is of course not fully backwards compatible with the previous NET_CLS_ACT behaviour. Unfortunately there is no way to fully maintain compatibility *and* support qdisc internal reclassification with NET_CLS_ACT, but this seems like the better choice over keeping the two incompatible options around forever. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-14Merge master.kernel.org:/pub/scm/linux/kernel/git/herbert/crypto-2.6David S. Miller
Conflicts: crypto/Kconfig
2007-07-14[NETFILTER]: nf_conntrack: mark protocols __read_mostlyPatrick McHardy
Also remove two unnecessary EXPORT_SYMBOLs and move the nf_conntrack_l3proto_ipv4 declaration to the correct file. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-14[NETFILTER]: x_tables: add connlimit matchJan Engelhardt
ipt_connlimit has been sitting in POM-NG for a long time. Here is a new shiny xt_connlimit with: * xtables'ified * will request the layer3 module (previously it hotdropped every packet when it was not loaded) * fixed: there was a deadlock in case of an OOM condition * support for any layer4 protocol (e.g. UDP/SCTP) * using jhash, as suggested by Eric Dumazet * ipv6 support Signed-off-by: Jan Engelhardt <jengelh@gmx.de> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-14[NETFILTER]: nf_conntrack: Introduces nf_ct_get_tuplepr and uses itYasuyuki Kozakai
nf_ct_get_tuple() requires the offset to transport header and that bothers callers such as icmp[v6] l4proto modules. This introduces new function to simplify them. Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-14[NETFILTER]: nf_conntrack: make l3proto->prepare() generic and renames itYasuyuki Kozakai
The icmp[v6] l4proto modules parse headers in ICMP[v6] error to get tuple. But they have to find the offset to transport protocol header before that. Their processings are almost same as prepare() of l3proto modules. This makes prepare() more generic to simplify icmp[v6] l4proto module later. Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-14[NET]: Add ethtool support for NETIF_F_IPV6_CSUM devices.Michael Chan
Add ethtool utility function to set or clear IPV6_CSUM feature flag. Modify tg3.c and bnx2.c to use this function when doing ethtool -K to change tx checksum. Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-14[AF_IUCV]: Add lock when updating accept_qUrsula Braun
The accept_queue of an af_iucv socket will be corrupted, if adding and deleting of entries in this queue occurs at the same time (connect request from one client, while accept call is processed for another client). Solution: add locking when updating accept_q Signed-off-by: Ursula Braun <braunu@de.ibm.com> Acked-by: Frank Pavlic <fpavlic@de.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-14[INET_SOCK]: make net/ipv4/inet_timewait_sock.c:__inet_twsk_kill() staticAdrian Bunk
This patch makes the needlessly global __inet_twsk_kill() static. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-14Merge branch 'upstream-davem' of ↵David S. Miller
master.kernel.org:/pub/scm/linux/kernel/git/linville/wireless-2.6
2007-07-14[NET]: Add macvlan driverPatrick McHardy
Add macvlan driver, which allows to create virtual ethernet devices based on MAC address. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-14[VLAN]: Use multicast list synchronization helpersPatrick McHardy
Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-14[VLAN]: Fix promiscous/allmulti synchronization racesPatrick McHardy
The set_multicast_list function may be called without holding the rtnl mutex, resulting in races when changing the underlying device's promiscous and allmulti state. Use the change_rx_mode hook, which is always invoked under the rtnl. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-14[NET]: dev_mcast: add multicast list synchronization helpersPatrick McHardy
The method drivers currently use to synchronize multicast lists is not very pretty: - walk the multicast list - search each entry on a copy of the previous list - if new add to lower device - walk the copy of the previous list - search each entry on the current list - if removed delete from lower device - copy entire list This patch adds a new field to struct dev_addr_list to store the synchronization state and adds two helper functions for synchronization and cleanup. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-14[NET]: Add net_device change_rx_mode callbackPatrick McHardy
Currently the set_multicast_list (and set_rx_mode) callbacks are responsible for configuring the device according to the IFF_PROMISC, IFF_MULTICAST and IFF_ALLMULTI flags and the mc_list (and uc_list in case of set_rx_mode). These callbacks can be invoked from BH context without the rtnl_mutex by dev_mc_add/dev_mc_delete, which makes reading the device flags and promiscous/allmulti count racy. For real hardware drivers that just commit all changes to the hardware this is not a real problem since the stack guarantees to call them for every change, so at least the final call will not race and commit the correct configuration to the hardware. For software devices that want to synchronize promiscous and multicast state to an underlying device however this can cause corruption of the underlying device's flags or promisc/allmulti counts. When the software device is concurrently put in promiscous or allmulti mode while set_multicast_list is invoked from bottem half context, the device might synchronize the change to the underlying device without holding the rtnl_mutex, which races with concurrent changes to the underlying device. Add a dev->change_rx_flags hook that is invoked when any of the flags that affect rx filtering change (under the rtnl_mutex), which allows drivers to perform synchronization immediately and only synchronize the address lists in set_multicast_list/set_rx_mode. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-14[SCSI] Remove unused method scsi_device_cancelPriyanka Gupta
Removes an obsolete method scsi_device_cancel which isn't being used anywhere in the kernel. Signed-off-by: Priyanka Gupta <priyankag@google.com> Acked-by: Grant Grundler <grundler@parisc-linux.org> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2007-07-149p: Reorganization of 9p file system codeLatchesar Ionkov
This patchset moves non-filesystem interfaces of v9fs from fs/9p to net/9p. It moves the transport, packet marshalling and connection layers to net/9p leaving only the VFS related files in fs/9p. This work is being done in preparation for in-kernel 9p servers as well as alternate 9p clients (other than VFS). Signed-off-by: Latchesar Ionkov <lucho@ionkov.net> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2007-07-13Merge git://git.linux-nfs.org/pub/linux/nfs-2.6Linus Torvalds
* git://git.linux-nfs.org/pub/linux/nfs-2.6: (122 commits) sunrpc: drop BKL around wrap and unwrap NFSv4: Make sure unlock is really an unlock when cancelling a lock NLM: fix source address of callback to client SUNRPC client: add interface for binding to a local address SUNRPC server: record the destination address of a request SUNRPC: cleanup transport creation argument passing NFSv4: Make the NFS state model work with the nosharedcache mount option NFS: Error when mounting the same filesystem with different options NFS: Add the mount option "nosharecache" NFS: Add support for mounting NFSv4 file systems with string options NFS: Add final pieces to support in-kernel mount option parsing NFS: Introduce generic mount client API NFS: Add enums and match tables for mount option parsing NFS: Improve debugging output in NFS in-kernel mount client NFS: Clean up in-kernel NFS mount NFS: Remake nfsroot_mount as a permanent part of NFS client SUNRPC: Add a convenient default for the hostname when calling rpc_create() SUNRPC: Rename rpcb_getport to be consistent with new rpcb_getport_sync name SUNRPC: Rename rpcb_getport_external routine SUNRPC: Allow rpcbind requests to be interrupted by a signal. ...
2007-07-13Merge branch 'ioat-md-accel-for-linus' of ↵Linus Torvalds
git://lost.foo-projects.org/~dwillia2/git/iop * 'ioat-md-accel-for-linus' of git://lost.foo-projects.org/~dwillia2/git/iop: (28 commits) ioatdma: add the unisys "i/oat" pci vendor/device id ARM: Add drivers/dma to arch/arm/Kconfig iop3xx: surface the iop3xx DMA and AAU units to the iop-adma driver iop13xx: surface the iop13xx adma units to the iop-adma driver dmaengine: driver for the iop32x, iop33x, and iop13xx raid engines md: remove raid5 compute_block and compute_parity5 md: handle_stripe5 - request io processing in raid5_run_ops md: handle_stripe5 - add request/completion logic for async expand ops md: handle_stripe5 - add request/completion logic for async read ops md: handle_stripe5 - add request/completion logic for async check ops md: handle_stripe5 - add request/completion logic for async compute ops md: handle_stripe5 - add request/completion logic for async write ops md: common infrastructure for running operations with raid5_run_ops md: raid5_run_ops - run stripe operations outside sh->lock raid5: replace custom debug PRINTKs with standard pr_debug raid5: refactor handle_stripe5 and handle_stripe6 (v3) async_tx: add the async_tx api xor: make 'xor_blocks' a library routine for use with async_tx dmaengine: make clients responsible for managing channels dmaengine: refactor dmaengine around dma_async_tx_descriptor ...
2007-07-13Merge branch 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linusLinus Torvalds
* 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus: [MIPS] Workaround for a sparse warning in include/asm-mips/mach-tx4927/ioremap.h [MIPS] Make show_code static and add __user tag [MIPS] Workaround for a sparse warning in include/asm-mips/compat.h [MIPS] Add some __user tags [MIPS] math-emu minor cleanup [MIPS] Kill CONFIG_TX4927BUG_WORKAROUND [MIPS] Alchemy: Remove code wrapped by dead symbol CONFIG_FB_XPERT98 [MIPS] Alchemy: Remove code wrapped by dead symbol CONFIG_AU1000_SRC_CLK [MIPS] Alchemy: Remove code wrapped by dead symbol CONFIG_AU1000_USE32K [MIPS] Alchemy: Remove code wrapped by dead symbol CONFIG_AU1XXX_PSC_SPI [CHAR] Delete leftovers of old Alchemy UART driver
2007-07-13Merge git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-schedLinus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched: [PATCH] sched: small topology.h cleanup [PATCH] sched: fix show_task()/show_tasks() output [PATCH] sched: remove stale version info from kernel/sched_debug.c [PATCH] sched: allow larger granularity [PATCH] sched: fix prio_to_wmult[] for nice 1 [ I re-did the commits to get rid of some bogus merge commit that Ingo had. - Linus ] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-13[PATCH] sched: small topology.h cleanupIngo Molnar
trivial cleanup: LOCAL_DISTANCE and REMOTE_DISTANCE are only used in topology.h and inside an #ifndef section - limit their existence to that #ifndef. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-13[MIPS] Workaround for a sparse warning in include/asm-mips/mach-tx4927/ioremap.hAtsushi Nemoto
include2/asm/mach-tx49xx/ioremap.h:39:52: warning: cast truncates bits from constant value (fff000000 becomes ff000000) Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp> Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2007-07-13[MIPS] Workaround for a sparse warning in include/asm-mips/compat.hAtsushi Nemoto
Cast to a __user pointer via "unsigned long" to get rid of this warning: include2/asm/compat.h:135:10: warning: cast adds address space to expression (<asn:1>) Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp> Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2007-07-13ioatdma: add the unisys "i/oat" pci vendor/device idDan Williams
Cc: John Magolan <john.magolan@unisys.com> Signed-off-by: Shannon Nelson <shannon.nelson@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2007-07-13iop3xx: surface the iop3xx DMA and AAU units to the iop-adma driverDan Williams
Adds the platform device definitions and the architecture specific support routines (i.e. register initialization and descriptor formats) for the iop-adma driver. Changelog: * add support for > 1k zero sum buffer sizes * added dma/aau platform devices to iq80321 and iq80332 setup * fixed the calculation in iop_desc_is_aligned * support xor buffer sizes larger than 16MB * fix places where software descriptors are assumed to be contiguous, only hardware descriptors are contiguous for up to a PAGE_SIZE buffer size * convert to async_tx * add interrupt support * add platform devices for 80219 boards * do not call platform register macros in driver code * remove switch() statements for compatible register offsets/layouts * change over to bitmap based capabilities * remove unnecessary ARM assembly statement * checkpatch.pl fixes * gpl v2 only correction * phys move to dma_async_tx_descriptor Cc: Russell King <rmk@arm.linux.org.uk> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2007-07-13iop13xx: surface the iop13xx adma units to the iop-adma driverDan Williams
Adds the platform device definitions and the architecture specific support routines (i.e. register initialization and descriptor formats) for the iop-adma driver. Changelog: * added 'descriptor pool size' to the platform data * add base support for buffer sizes larger than 16MB (hw max) * build error fix from Kirill A. Shutemov * rebase for async_tx changes * add interrupt support * do not call platform register macros in driver code * remove unnecessary ARM assembly statement * checkpatch.pl fixes * gpl v2 only correction Cc: Russell King <rmk@arm.linux.org.uk> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2007-07-13dmaengine: driver for the iop32x, iop33x, and iop13xx raid enginesDan Williams
The Intel(R) IOP series of i/o processors integrate an Xscale core with raid acceleration engines. The capabilities per platform are: iop219: (2) copy engines iop321: (2) copy engines (1) xor and block fill engine iop33x: (2) copy and crc32c engines (1) xor, xor zero sum, pq, pq zero sum, and block fill engine iop34x (iop13xx): (2) copy, crc32c, xor, xor zero sum, and block fill engines (1) copy, crc32c, xor, xor zero sum, pq, pq zero sum, and block fill engine The driver supports the features of the async_tx api: * asynchronous notification of operation completion * implicit (interupt triggered) handling of inter-channel transaction dependencies The driver adapts to the platform it is running by two methods. 1/ #include <asm/arch/adma.h> which defines the hardware specific iop_chan_* and iop_desc_* routines as a series of static inline functions 2/ The private platform data attached to the platform_device defines the capabilities of the channels 20070626: Callbacks are run in a tasklet. Given the recent discussion on LKML about killing tasklets in favor of workqueues I did a quick conversion of the driver. Raid5 resync performance dropped from 50MB/s to 30MB/s, so the tasklet implementation remains until a generic softirq interface is available. Changelog: * fixed a slot allocation bug in do_iop13xx_adma_xor that caused too few slots to be requested eventually leading to data corruption * enabled the slot allocation routine to attempt to free slots before returning -ENOMEM * switched the cleanup routine to solely use the software chain and the status register to determine if a descriptor is complete. This is necessary to support other IOP engines that do not have status writeback capability * make the driver iop generic * modified the allocation routines to understand allocating a group of slots for a single operation * added a null xor initialization operation for the xor only channel on iop3xx * support xor operations on buffers larger than the hardware maximum * split the do_* routines into separate prep, src/dest set, submit stages * added async_tx support (dependent operations initiation at cleanup time) * simplified group handling * added interrupt support (callbacks via tasklets) * brought the pending depth inline with ioat (i.e. 4 descriptors) * drop dma mapping methods, suggested by Chris Leech * don't use inline in C files, Adrian Bunk * remove static tasklet declarations * make iop_adma_alloc_slots easier to read and remove chances for a corrupted descriptor chain * fix locking bug in iop_adma_alloc_chan_resources, Benjamin Herrenschmidt * convert capabilities over to dma_cap_mask_t * fixup sparse warnings * add descriptor flush before iop_chan_enable * checkpatch.pl fixes * gpl v2 only correction * move set_src, set_dest, submit to async_tx methods * move group_list and phys to async_tx Cc: Russell King <rmk@arm.linux.org.uk> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2007-07-13md: handle_stripe5 - add request/completion logic for async read opsDan Williams
When a read bio is attached to the stripe and the corresponding block is marked R5_UPTODATE, then a read (biofill) operation is scheduled to copy the data from the stripe cache to the bio buffer. handle_stripe flags the blocks to be operated on with the R5_Wantfill flag. If new read requests arrive while raid5_run_ops is running they will not be handled until handle_stripe is scheduled to run again. Changelog: * cleanup to_read and to_fill accounting * do not fail reads that have reached the cache Signed-off-by: Dan Williams <dan.j.williams@intel.com> Acked-By: NeilBrown <neilb@suse.de>
2007-07-13md: handle_stripe5 - add request/completion logic for async compute opsDan Williams
handle_stripe will compute a block when a backing disk has failed, or when it determines it can save a disk read by computing the block from all the other up-to-date blocks. Previously a block would be computed under the lock and subsequent logic in handle_stripe could use the newly up-to-date block. With the raid5_run_ops implementation the compute operation is carried out a later time outside the lock. To preserve the old functionality we take advantage of the dependency chain feature of async_tx to flag the block as R5_Wantcompute and then let other parts of handle_stripe operate on the block as if it were up-to-date. raid5_run_ops guarantees that the block will be ready before it is used in another operation. However, this only works in cases where the compute and the dependent operation are scheduled at the same time. If a previous call to handle_stripe sets the R5_Wantcompute flag there is no facility to pass the async_tx dependency chain across successive calls to raid5_run_ops. The req_compute variable protects against this case. Changelog: * remove the req_compute BUG_ON Signed-off-by: Dan Williams <dan.j.williams@intel.com> Acked-By: NeilBrown <neilb@suse.de>
2007-07-13md: raid5_run_ops - run stripe operations outside sh->lockDan Williams
When the raid acceleration work was proposed, Neil laid out the following attack plan: 1/ move the xor and copy operations outside spin_lock(&sh->lock) 2/ find/implement an asynchronous offload api The raid5_run_ops routine uses the asynchronous offload api (async_tx) and the stripe_operations member of a stripe_head to carry out xor+copy operations asynchronously, outside the lock. To perform operations outside the lock a new set of state flags is needed to track new requests, in-flight requests, and completed requests. In this new model handle_stripe is tasked with scanning the stripe_head for work, updating the stripe_operations structure, and finally dropping the lock and calling raid5_run_ops for processing. The following flags outline the requests that handle_stripe can make of raid5_run_ops: STRIPE_OP_BIOFILL - copy data into request buffers to satisfy a read request STRIPE_OP_COMPUTE_BLK - generate a missing block in the cache from the other blocks STRIPE_OP_PREXOR - subtract existing data as part of the read-modify-write process STRIPE_OP_BIODRAIN - copy data out of request buffers to satisfy a write request STRIPE_OP_POSTXOR - recalculate parity for new data that has entered the cache STRIPE_OP_CHECK - verify that the parity is correct STRIPE_OP_IO - submit i/o to the member disks (note this was already performed outside the stripe lock, but it made sense to add it as an operation type The flow is: 1/ handle_stripe sets STRIPE_OP_* in sh->ops.pending 2/ raid5_run_ops reads sh->ops.pending, sets sh->ops.ack, and submits the operation to the async_tx api 3/ async_tx triggers the completion callback routine to set sh->ops.complete and release the stripe 4/ handle_stripe runs again to finish the operation and optionally submit new operations that were previously blocked Note this patch just defines raid5_run_ops, subsequent commits (one per major operation type) modify handle_stripe to take advantage of this routine. Changelog: * removed ops_complete_biodrain in favor of ops_complete_postxor and ops_complete_write. * removed the raid5_run_ops workqueue * call bi_end_io for reads in ops_complete_biofill, saves a call to handle_stripe * explicitly handle the 2-disk raid5 case (xor becomes memcpy), Neil Brown * fix race between async engines and bi_end_io call for reads, Neil Brown * remove unnecessary spin_lock from ops_complete_biofill * remove test_and_set/test_and_clear BUG_ONs, Neil Brown * remove explicit interrupt handling for channel switching, this feature was absorbed (i.e. it is now implicit) by the async_tx api * use return_io in ops_complete_biofill Signed-off-by: Dan Williams <dan.j.williams@intel.com> Acked-By: NeilBrown <neilb@suse.de>
2007-07-13raid5: refactor handle_stripe5 and handle_stripe6 (v3)Dan Williams
handle_stripe5 and handle_stripe6 have very deep logic paths handling the various states of a stripe_head. By introducing the 'stripe_head_state' and 'r6_state' objects, large portions of the logic can be moved to sub-routines. 'struct stripe_head_state' consumes all of the automatic variables that previously stood alone in handle_stripe5,6. 'struct r6_state' contains the handle_stripe6 specific variables like p_failed and q_failed. One of the nice side effects of the 'stripe_head_state' change is that it allows for further reductions in code duplication between raid5 and raid6. The following new routines are shared between raid5 and raid6: handle_completed_write_requests handle_requests_to_failed_array handle_stripe_expansion Changes: * v2: fixed 'conf->raid_disk-1' for the raid6 'handle_stripe_expansion' path * v3: removed the unused 'dirty' field from struct stripe_head_state * v3: coalesced open coded bi_end_io routines into return_io() Signed-off-by: Dan Williams <dan.j.williams@intel.com> Acked-By: NeilBrown <neilb@suse.de>