aboutsummaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)Author
2009-11-24locking: Remove unused prototypeThomas Gleixner
commit 910067d1(remove generic__raw_read_trylock()) removed the implementation but left the prototype around. Remove it. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-11-24remove CONFIG_SECURITY_FILE_CAPABILITIES compile optionSerge E. Hallyn
As far as I know, all distros currently ship kernels with default CONFIG_SECURITY_FILE_CAPABILITIES=y. Since having the option on leaves a 'no_file_caps' option to boot without file capabilities, the main reason to keep the option is that turning it off saves you (on my s390x partition) 5k. In particular, vmlinux sizes came to: without patch fscaps=n: 53598392 without patch fscaps=y: 53603406 with this patch applied: 53603342 with the security-next tree. Against this we must weigh the fact that there is no simple way for userspace to figure out whether file capabilities are supported, while things like per-process securebits, capability bounding sets, and adding bits to pI if CAP_SETPCAP is in pE are not supported with SECURITY_FILE_CAPABILITIES=n, leaving a bit of a problem for applications wanting to know whether they can use them and/or why something failed. It also adds another subtly different set of semantics which we must maintain at the risk of severe security regressions. So this patch removes the SECURITY_FILE_CAPABILITIES compile option. It drops the kernel size by about 50k over the stock SECURITY_FILE_CAPABILITIES=y kernel, by removing the cap_limit_ptraced_target() function. Changelog: Nov 20: remove cap_limit_ptraced_target() as it's logic was ifndef'ed. Signed-off-by: Serge E. Hallyn <serue@us.ibm.com> Acked-by: Andrew G. Morgan" <morgan@kernel.org> Signed-off-by: James Morris <jmorris@namei.org>
2009-11-22rcu: Re-arrange code to reduce #ifdef painPaul E. McKenney
Remove #ifdefs from kernel/rcupdate.c and include/linux/rcupdate.h by moving code to include/linux/rcutiny.h, include/linux/rcutree.h, and kernel/rcutree.c. Also remove some definitions that are no longer used. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: laijs@cn.fujitsu.com Cc: dipankar@in.ibm.com Cc: mathieu.desnoyers@polymtl.ca Cc: josh@joshtriplett.org Cc: dvhltc@us.ibm.com Cc: niv@us.ibm.com Cc: peterz@infradead.org Cc: rostedt@goodmis.org Cc: Valdis.Kletnieks@vt.edu Cc: dhowells@redhat.com LKML-Reference: <1258908830885-git-send-email-> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-22rcu: Eliminate unneeded function wrappingPaul E. McKenney
The functions rcu_init() is a wrapper for __rcu_init(), and also sets up the CPU-hotplug notifier for rcu_barrier_cpu_hotplug(). But TINY_RCU doesn't need CPU-hotplug notification, and the rcu_barrier_cpu_hotplug() is a simple wrapper for rcu_cpu_notify(). So push rcu_init() out to kernel/rcutree.c and kernel/rcutiny.c and get rid of the wrapper function rcu_barrier_cpu_hotplug(). Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: laijs@cn.fujitsu.com Cc: dipankar@in.ibm.com Cc: mathieu.desnoyers@polymtl.ca Cc: josh@joshtriplett.org Cc: dvhltc@us.ibm.com Cc: niv@us.ibm.com Cc: peterz@infradead.org Cc: rostedt@goodmis.org Cc: Valdis.Kletnieks@vt.edu Cc: dhowells@redhat.com LKML-Reference: <12589088302320-git-send-email-> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-20i2c: i2c-pnx: Made buf type unsigned to prevent sign extensionKevin Wells
Made buf type unsigned to prevent sign extension Signed-off-by: Kevin Wells <kevin.wells@nxp.com> Signed-off-by: Ben Dooks <ben-linux@fluff.org>
2009-11-19vt: Fix use of "new" in a struct fieldAlan Cox
As this struct is exposed to user space and the API was added for this release it's a bit of a pain for the C++ world and we still have time to fix it. Rename the fields before we end up with that pain in an actual release. Signed-off-by: Alan Cox <alan@linux.intel.com> Reported-by: Olivier Goffart Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-11-19CacheFiles: Catch an overly long wait for an old active objectDavid Howells
Catch an overly long wait for an old, dying active object when we want to replace it with a new one. The probability is that all the slow-work threads are hogged, and the delete can't get a look in. What we do instead is: (1) if there's nothing in the slow work queue, we sleep until either the dying object has finished dying or there is something in the slow work queue behind which we can queue our object. (2) if there is something in the slow work queue, we return ETIMEDOUT to fscache_lookup_object(), which then puts us back on the slow work queue, presumably behind the deletion that we're blocked by. We are then deferred for a while until we work our way back through the queue - without blocking a slow-work thread unnecessarily. A backtrace similar to the following may appear in the log without this patch: INFO: task kslowd004:5711 blocked for more than 120 seconds. "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. kslowd004 D 0000000000000000 0 5711 2 0x00000080 ffff88000340bb80 0000000000000046 ffff88002550d000 0000000000000000 ffff88002550d000 0000000000000007 ffff88000340bfd8 ffff88002550d2a8 000000000000ddf0 00000000000118c0 00000000000118c0 ffff88002550d2a8 Call Trace: [<ffffffff81058e21>] ? trace_hardirqs_on+0xd/0xf [<ffffffffa011c4d8>] ? cachefiles_wait_bit+0x0/0xd [cachefiles] [<ffffffffa011c4e1>] cachefiles_wait_bit+0x9/0xd [cachefiles] [<ffffffff81353153>] __wait_on_bit+0x43/0x76 [<ffffffff8111ae39>] ? ext3_xattr_get+0x1ec/0x270 [<ffffffff813531ef>] out_of_line_wait_on_bit+0x69/0x74 [<ffffffffa011c4d8>] ? cachefiles_wait_bit+0x0/0xd [cachefiles] [<ffffffff8104c125>] ? wake_bit_function+0x0/0x2e [<ffffffffa011bc79>] cachefiles_mark_object_active+0x203/0x23b [cachefiles] [<ffffffffa011c209>] cachefiles_walk_to_object+0x558/0x827 [cachefiles] [<ffffffffa011a429>] cachefiles_lookup_object+0xac/0x12a [cachefiles] [<ffffffffa00aa1e9>] fscache_lookup_object+0x1c7/0x214 [fscache] [<ffffffffa00aafc5>] fscache_object_state_machine+0xa5/0x52d [fscache] [<ffffffffa00ab4ac>] fscache_object_slow_work_execute+0x5f/0xa0 [fscache] [<ffffffff81082093>] slow_work_execute+0x18f/0x2d1 [<ffffffff8108239a>] slow_work_thread+0x1c5/0x308 [<ffffffff8104c0f1>] ? autoremove_wake_function+0x0/0x34 [<ffffffff810821d5>] ? slow_work_thread+0x0/0x308 [<ffffffff8104be91>] kthread+0x7a/0x82 [<ffffffff8100beda>] child_rip+0xa/0x20 [<ffffffff8100b87c>] ? restore_args+0x0/0x30 [<ffffffff8104be17>] ? kthread+0x0/0x82 [<ffffffff8100bed0>] ? child_rip+0x0/0x20 1 lock held by kslowd004/5711: #0: (&sb->s_type->i_mutex_key#7/1){+.+.+.}, at: [<ffffffffa011be64>] cachefiles_walk_to_object+0x1b3/0x827 [cachefiles] Signed-off-by: David Howells <dhowells@redhat.com>
2009-11-19CacheFiles: Don't write a full page if there's only a partial page to cacheDavid Howells
cachefiles_write_page() writes a full page to the backing file for the last page of the netfs file, even if the netfs file's last page is only a partial page. This causes the EOF on the backing file to be extended beyond the EOF of the netfs, and thus the backing file will be truncated by cachefiles_attr_changed() called from cachefiles_lookup_object(). So we need to limit the write we make to the backing file on that last page such that it doesn't push the EOF too far. Also, if a backing file that has a partial page at the end is expanded, we discard the partial page and refetch it on the basis that we then have a hole in the file with invalid data, and should the power go out... A better way to deal with this could be to record a note that the partial page contains invalid data until the correct data is written into it. This isn't a problem for netfs's that discard the whole backing file if the file size changes (such as NFS). Signed-off-by: David Howells <dhowells@redhat.com>
2009-11-19FS-Cache: Start processing an object's operations on that object's deathDavid Howells
Start processing an object's operations when that object moves into the DYING state as the object cannot be destroyed until all its outstanding operations have completed. Furthermore, make sure that read and allocation operations handle being woken up on a dead object. Such events are recorded in the Allocs.abt and Retrvls.abt statistics as viewable through /proc/fs/fscache/stats. The code for waiting for object activation for the read and allocation operations is also extracted into its own function as it is much the same in all cases, differing only in the stats incremented. Signed-off-by: David Howells <dhowells@redhat.com>
2009-11-19FS-Cache: Handle pages pending storage that get evicted under OOM conditionsDavid Howells
Handle netfs pages that the vmscan algorithm wants to evict from the pagecache under OOM conditions, but that are waiting for write to the cache. Under these conditions, vmscan calls the releasepage() function of the netfs, asking if a page can be discarded. The problem is typified by the following trace of a stuck process: kslowd005 D 0000000000000000 0 4253 2 0x00000080 ffff88001b14f370 0000000000000046 ffff880020d0d000 0000000000000007 0000000000000006 0000000000000001 ffff88001b14ffd8 ffff880020d0d2a8 000000000000ddf0 00000000000118c0 00000000000118c0 ffff880020d0d2a8 Call Trace: [<ffffffffa00782d8>] __fscache_wait_on_page_write+0x8b/0xa7 [fscache] [<ffffffff8104c0f1>] ? autoremove_wake_function+0x0/0x34 [<ffffffffa0078240>] ? __fscache_check_page_write+0x63/0x70 [fscache] [<ffffffffa00b671d>] nfs_fscache_release_page+0x4e/0xc4 [nfs] [<ffffffffa00927f0>] nfs_release_page+0x3c/0x41 [nfs] [<ffffffff810885d3>] try_to_release_page+0x32/0x3b [<ffffffff81093203>] shrink_page_list+0x316/0x4ac [<ffffffff8109372b>] shrink_inactive_list+0x392/0x67c [<ffffffff813532fa>] ? __mutex_unlock_slowpath+0x100/0x10b [<ffffffff81058df0>] ? trace_hardirqs_on_caller+0x10c/0x130 [<ffffffff8135330e>] ? mutex_unlock+0x9/0xb [<ffffffff81093aa2>] shrink_list+0x8d/0x8f [<ffffffff81093d1c>] shrink_zone+0x278/0x33c [<ffffffff81052d6c>] ? ktime_get_ts+0xad/0xba [<ffffffff81094b13>] try_to_free_pages+0x22e/0x392 [<ffffffff81091e24>] ? isolate_pages_global+0x0/0x212 [<ffffffff8108e743>] __alloc_pages_nodemask+0x3dc/0x5cf [<ffffffff81089529>] grab_cache_page_write_begin+0x65/0xaa [<ffffffff8110f8c0>] ext3_write_begin+0x78/0x1eb [<ffffffff81089ec5>] generic_file_buffered_write+0x109/0x28c [<ffffffff8103cb69>] ? current_fs_time+0x22/0x29 [<ffffffff8108a509>] __generic_file_aio_write+0x350/0x385 [<ffffffff8108a588>] ? generic_file_aio_write+0x4a/0xae [<ffffffff8108a59e>] generic_file_aio_write+0x60/0xae [<ffffffff810b2e82>] do_sync_write+0xe3/0x120 [<ffffffff8104c0f1>] ? autoremove_wake_function+0x0/0x34 [<ffffffff810b18e1>] ? __dentry_open+0x1a5/0x2b8 [<ffffffff810b1a76>] ? dentry_open+0x82/0x89 [<ffffffffa00e693c>] cachefiles_write_page+0x298/0x335 [cachefiles] [<ffffffffa0077147>] fscache_write_op+0x178/0x2c2 [fscache] [<ffffffffa0075656>] fscache_op_execute+0x7a/0xd1 [fscache] [<ffffffff81082093>] slow_work_execute+0x18f/0x2d1 [<ffffffff8108239a>] slow_work_thread+0x1c5/0x308 [<ffffffff8104c0f1>] ? autoremove_wake_function+0x0/0x34 [<ffffffff810821d5>] ? slow_work_thread+0x0/0x308 [<ffffffff8104be91>] kthread+0x7a/0x82 [<ffffffff8100beda>] child_rip+0xa/0x20 [<ffffffff8100b87c>] ? restore_args+0x0/0x30 [<ffffffff8102ef83>] ? tg_shares_up+0x171/0x227 [<ffffffff8104be17>] ? kthread+0x0/0x82 [<ffffffff8100bed0>] ? child_rip+0x0/0x20 In the above backtrace, the following is happening: (1) A page storage operation is being executed by a slow-work thread (fscache_write_op()). (2) FS-Cache farms the operation out to the cache to perform (cachefiles_write_page()). (3) CacheFiles is then calling Ext3 to perform the actual write, using Ext3's standard write (do_sync_write()) under KERNEL_DS directly from the netfs page. (4) However, for Ext3 to perform the write, it must allocate some memory, in particular, it must allocate at least one page cache page into which it can copy the data from the netfs page. (5) Under OOM conditions, the memory allocator can't immediately come up with a page, so it uses vmscan to find something to discard (try_to_free_pages()). (6) vmscan finds a clean netfs page it might be able to discard (possibly the one it's trying to write out). (7) The netfs is called to throw the page away (nfs_release_page()) - but it's called with __GFP_WAIT, so the netfs decides to wait for the store to complete (__fscache_wait_on_page_write()). (8) This blocks a slow-work processing thread - possibly against itself. The system ends up stuck because it can't write out any netfs pages to the cache without allocating more memory. To avoid this, we make FS-Cache cancel some writes that aren't in the middle of actually being performed. This means that some data won't make it into the cache this time. To support this, a new FS-Cache function is added fscache_maybe_release_page() that replaces what the netfs releasepage() functions used to do with respect to the cache. The decisions fscache_maybe_release_page() makes are counted and displayed through /proc/fs/fscache/stats on a line labelled "VmScan". There are four counters provided: "nos=N" - pages that weren't pending storage; "gon=N" - pages that were pending storage when we first looked, but weren't by the time we got the object lock; "bsy=N" - pages that we ignored as they were actively being written when we looked; and "can=N" - pages that we cancelled the storage of. What I'd really like to do is alter the behaviour of the cancellation heuristics, depending on how necessary it is to expel pages. If there are plenty of other pages that aren't waiting to be written to the cache that could be ejected first, then it would be nice to hold up on immediate cancellation of cache writes - but I don't see a way of doing that. Signed-off-by: David Howells <dhowells@redhat.com>
2009-11-19FS-Cache: Fix lock misorder in fscache_write_op()David Howells
FS-Cache has two structs internally for keeping track of the internal state of a cached file: the fscache_cookie struct, which represents the netfs's state, and fscache_object struct, which represents the cache's state. Each has a pointer that points to the other (when both are in existence), and each has a spinlock for pointer maintenance. Since netfs operations approach these structures from the cookie side, they get the cookie lock first, then the object lock. Cache operations, on the other hand, approach from the object side, and get the object lock first. It is not then permitted for a cache operation to get the cookie lock whilst it is holding the object lock lest deadlock occur; instead, it must do one of two things: (1) increment the cookie usage counter, drop the object lock and then get both locks in order, or (2) simply hold the object lock as certain parts of the cookie may not be altered whilst the object lock is held. It is also not permitted to follow either pointer without holding the lock at the end you start with. To break the pointers between the cookie and the object, both locks must be held. fscache_write_op(), however, violates the locking rules: It attempts to get the cookie lock without (a) checking that the cookie pointer is a valid pointer, and (b) holding the object lock to protect the cookie pointer whilst it follows it. This is so that it can access the pending page store tree without interference from __fscache_write_page(). This is fixed by splitting the cookie lock, such that the page store tracking tree is protected by its own lock, and checking that the cookie pointer is non-NULL before we attempt to follow it whilst holding the object lock. The new lock is subordinate to both the cookie lock and the object lock, and so should be taken after those. Signed-off-by: David Howells <dhowells@redhat.com>
2009-11-19FS-Cache: Allow the current state of all objects to be dumpedDavid Howells
Allow the current state of all fscache objects to be dumped by doing: cat /proc/fs/fscache/objects By default, all objects and all fields will be shown. This can be restricted by adding a suitable key to one of the caller's keyrings (such as the session keyring): keyctl add user fscache:objlist "<restrictions>" @s The <restrictions> are: K Show hexdump of object key (don't show if not given) A Show hexdump of object aux data (don't show if not given) And paired restrictions: C Show objects that have a cookie c Show objects that don't have a cookie B Show objects that are busy b Show objects that aren't busy W Show objects that have pending writes w Show objects that don't have pending writes R Show objects that have outstanding reads r Show objects that don't have outstanding reads S Show objects that have slow work queued s Show objects that don't have slow work queued If neither side of a restriction pair is given, then both are implied. For example: keyctl add user fscache:objlist KB @s shows objects that are busy, and lists their object keys, but does not dump their auxiliary data. It also implies "CcWwRrSs", but as 'B' is given, 'b' is not implied. Signed-off-by: David Howells <dhowells@redhat.com>
2009-11-19FS-Cache: Annotate slow-work runqueue proc lines for FS-Cache work itemsDavid Howells
Annotate slow-work runqueue proc lines for FS-Cache work items. Objects include the object ID and the state. Operations include the object ID, the operation ID and the operation type and state. Signed-off-by: David Howells <dhowells@redhat.com>
2009-11-19SLOW_WORK: Allow a requeueable work item to sleep till the thread is neededDavid Howells
Add a function to allow a requeueable work item to sleep till the thread processing it is needed by the slow-work facility to perform other work. Sometimes a work item can't progress immediately, but must wait for the completion of another work item that's currently being processed by another slow-work thread. In some circumstances, the waiting item could instead - theoretically - put itself back on the queue and yield its thread back to the slow-work facility, thus waiting till it gets processing time again before attempting to progress. This would allow other work items processing time on that thread. However, this only works if there is something on the queue for it to queue behind - otherwise it will just get a thread again immediately, and will end up cycling between the queue and the thread, eating up valuable CPU time. So, slow_work_sleep_till_thread_needed() is provided such that an item can put itself on a wait queue that will wake it up when the event it is actually interested in occurs, then call this function in lieu of calling schedule(). This function will then sleep until either the item's event occurs or another work item appears on the queue. If another work item is queued, but the item's event hasn't occurred, then the work item should requeue itself and yield the thread back to the slow-work facility by returning. This can be used by CacheFiles for an object that is being created on one thread to wait for an object being deleted on another thread where there is nothing on the queue for the creation to go and wait behind. As soon as an item appears on the queue that could be given thread time instead, CacheFiles can stick the creating object back on the queue and return to the slow-work facility - assuming the object deletion didn't also complete. Signed-off-by: David Howells <dhowells@redhat.com>
2009-11-19SLOW_WORK: Allow the owner of a work item to determine if it is queued or notDavid Howells
Add a function (slow_work_is_queued()) to permit the owner of a work item to determine if the item is queued or not. The work item is counted as being queued if it is actually on the queue, not just if it is pending. If it is executing and pending, then it is not on the queue, but will rather be put back on the queue when execution finishes. This permits a caller to quickly work out if it may be able to put another, dependent work item on the queue behind it, or whether it will have to wait till that is finished. This can be used by CacheFiles to work out whether the creation a new object can be immediately deferred when it has to wait for an old object to be deleted, or whether a wait must take place. If a wait is necessary, then the slow-work thread can otherwise get blocked, preventing the deletion from taking place. Signed-off-by: David Howells <dhowells@redhat.com>
2009-11-19SLOW_WORK: Allow the work items to be viewed through a /proc fileDavid Howells
Allow the executing and queued work items to be viewed through a /proc file for debugging purposes. The contents look something like the following: THR PID ITEM ADDR FL MARK DESC === ===== ================ == ===== ========== 0 3005 ffff880023f52348 a 952ms FSC: OBJ17d3: LOOK 1 3006 ffff880024e33668 2 160ms FSC: OBJ17e5 OP60d3b: Write1/Store fl=2 2 3165 ffff8800296dd180 a 424ms FSC: OBJ17e4: LOOK 3 4089 ffff8800262c8d78 a 212ms FSC: OBJ17ea: CRTN 4 4090 ffff88002792bed8 2 388ms FSC: OBJ17e8 OP60d36: Write1/Store fl=2 5 4092 ffff88002a0ef308 2 388ms FSC: OBJ17e7 OP60d2e: Write1/Store fl=2 6 4094 ffff88002abaf4b8 2 132ms FSC: OBJ17e2 OP60d4e: Write1/Store fl=2 7 4095 ffff88002bb188e0 a 388ms FSC: OBJ17e9: CRTN vsq - ffff880023d99668 1 308ms FSC: OBJ17e0 OP60f91: Write1/EnQ fl=2 vsq - ffff8800295d1740 1 212ms FSC: OBJ16be OP4d4b6: Write1/EnQ fl=2 vsq - ffff880025ba3308 1 160ms FSC: OBJ179a OP58dec: Write1/EnQ fl=2 vsq - ffff880024ec83e0 1 160ms FSC: OBJ17ae OP599f2: Write1/EnQ fl=2 vsq - ffff880026618e00 1 160ms FSC: OBJ17e6 OP60d33: Write1/EnQ fl=2 vsq - ffff880025a2a4b8 1 132ms FSC: OBJ16a2 OP4d583: Write1/EnQ fl=2 vsq - ffff880023cbe6d8 9 212ms FSC: OBJ17eb: LOOK vsq - ffff880024d37590 9 212ms FSC: OBJ17ec: LOOK vsq - ffff880027746cb0 9 212ms FSC: OBJ17ed: LOOK vsq - ffff880024d37ae8 9 212ms FSC: OBJ17ee: LOOK vsq - ffff880024d37cb0 9 212ms FSC: OBJ17ef: LOOK vsq - ffff880025036550 9 212ms FSC: OBJ17f0: LOOK vsq - ffff8800250368e0 9 212ms FSC: OBJ17f1: LOOK vsq - ffff880025036aa8 9 212ms FSC: OBJ17f2: LOOK In the 'THR' column, executing items show the thread they're occupying and queued threads indicate which queue they're on. 'PID' shows the process ID of a slow-work thread that's executing something. 'FL' shows the work item flags. 'MARK' indicates how long since an item was queued or began executing. Lastly, the 'DESC' column permits the owner of an item to give some information. Signed-off-by: David Howells <dhowells@redhat.com>
2009-11-19SLOW_WORK: Add delayed_slow_work supportJens Axboe
This adds support for starting slow work with a delay, similar to the functionality we have for workqueues. Signed-off-by: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: David Howells <dhowells@redhat.com>
2009-11-19SLOW_WORK: Add support for cancellation of slow workJens Axboe
Add support for cancellation of queued slow work and delayed slow work items. The cancellation functions will wait for items that are pending or undergoing execution to be discarded by the slow work facility. Attempting to enqueue work that is in the process of being cancelled will result in ECANCELED. Signed-off-by: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: David Howells <dhowells@redhat.com>
2009-11-19SLOW_WORK: Wait for outstanding work items belonging to a module to clearDavid Howells
Wait for outstanding slow work items belonging to a module to clear when unregistering that module as a user of the facility. This prevents the put_ref code of a work item from being taken away before it returns. Signed-off-by: David Howells <dhowells@redhat.com>
2009-11-19tracing: Remove the stale include/trace/power.hJosh Stone
Commit 6161352 moved the power tracing to include/trace/events/, but left the old header behind. No one is using the old header, and its declarations are now incorrect, so it should be removed. Signed-off-by: Josh Stone <jistone@redhat.com> Acked-by: Arjan van de Ven <arjan@linux.intel.com> Cc: Frank Ch. Eigler <fche@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <1258578415-14752-1-git-send-email-jistone@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-18Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (42 commits) cxgb3: fix premature page unmap ibm_newemac: Fix EMACx_TRTR[TRT] bit shifts vlan: Fix register_vlan_dev() error path gro: Fix illegal merging of trailer trash sungem: Fix Serdes detection. net: fix mdio section mismatch warning ppp: fix BUG on non-linear SKB (multilink receive) ixgbe: Fixing EEH handler to handle more than one error net: Fix the rollback test in dev_change_name() Revert "isdn: isdn_ppp: Use SKB list facilities instead of home-grown implementation." TI Davinci EMAC : Fix Console Hang when bringing the interface down smsc911x: Fix Console Hang when bringing the interface down. mISDN: fix error return in HFCmulti_init() forcedeth: mac address fix r6040: fix version printing Bluetooth: Fix regression with L2CAP configuration in Basic Mode Bluetooth: Select Basic Mode as default for SOCK_SEQPACKET Bluetooth: Set general bonding security for ACL by default r8169: Fix receive buffer length when MTU is between 1515 and 1536 can: add the missing netlink get_xstats_size callback ...
2009-11-18generic-ipi: Add smp_call_function_any()Rusty Russell
Andrew points out that acpi-cpufreq uses cpumask_any, when it really would prefer to use the same CPU if possible (to avoid an IPI). In general, this seems a good idea to offer. [ tglx: Documented selection preference and Inlined the UP case to avoid the copy of smp_call_function_single() and the extra EXPORT ] Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Cc: Ingo Molnar <mingo@elte.hu> Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Cc: Len Brown <len.brown@intel.com> Cc: Zhao Yakui <yakui.zhao@intel.com> Cc: Dave Jones <davej@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Mike Galbraith <efault@gmx.de> Cc: "Zhang, Yanmin" <yanmin_zhang@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-11-17fcntl: rename F_OWNER_GID to F_OWNER_PGRPPeter Zijlstra
This is for consistency with various ioctl() operations that include the suffix "PGRP" in their names, and also for consistency with PRIO_PGRP, used with setpriority() and getpriority(). Also, using PGRP instead of GID avoids confusion with the common abbreviation of "group ID". I'm fine with anything that makes it more consistent, and if PGRP is what is the predominant abbreviation then I see no need to further confuse matters by adding a third one. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Michael Kerrisk <mtk.manpages@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-11-17mm: allow memory hotplug and hibernation in the same kernelAndi Kleen
Allow memory hotplug and hibernation in the same kernel Memory hotplug and hibernation were exclusive in Kconfig. This is obviously a problem for distribution kernels who want to support both in the same image. After some discussions with Rafael and others the only problem is with parallel memory hotadd or removal while a hibernation operation is in process. It was also working for s390 before. This patch removes the Kconfig level exclusion, and simply makes the memory add / remove functions grab the pm_mutex to exclude against hibernation. Fixes a regression - old kernels didn't exclude memory hotadd and hibernation. Signed-off-by: Andi Kleen <ak@linux.intel.com> Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Yasunori Goto <y-goto@jp.fujitsu.com> Acked-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-11-17Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6: [SCSI] bfa: declare MODULE_FIRMWARE [SCSI] gdth: Prevent negative offsets in ioctl CVE-2009-3080 [SCSI] libsas: do not set res = 0 in sas_ex_discover_dev() [SCSI] Fix incorrect reporting of host protection capabilities [SCSI] pmcraid: Fix ppc64 driver build for using cpu_to_le32 on U8 data type [SCSI] ipr: add workaround for MSI interrupts on P7 [SCSI] scsi_transport_fc: Fix WARN message for FC passthru failure paths [SCSI] bfa: fix test in bfad_os_fc_host_init()
2009-11-17Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ryusuke/nilfs2 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ryusuke/nilfs2: nilfs2: deleted inconsistent comment in nilfs_load_inode_block() nilfs2: deleted struct nilfs_dat_group_desc nilfs2: fix lock order reversal in chcp operation
2009-11-17Merge commit 'v2.6.32-rc7' into core/iommuIngo Molnar
Merge reason: Add fixes we'll depend on. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-15Revert "isdn: isdn_ppp: Use SKB list facilities instead of home-grown ↵David S. Miller
implementation." This reverts commit 38783e671399b5405f1fd177d602c400a9577ae6. It causes kernel bugzilla #14594 Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-15nilfs2: deleted struct nilfs_dat_group_descJiro SEKIBA
struct nilfs_dat_group_desc is not used both in kernel and user spaces. struct nilfs_palloc_group_desc is used instead. Signed-off-by: Jiro SEKIBA <jir@unicus.jp> Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
2009-11-15swiotlb: Remove duplicate swiotlb_force extern declarationsFUJITA Tomonori
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: tony.luck@intel.com LKML-Reference: <1258199198-16657-4-git-send-email-fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-14Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: Input: psmouse - remove unneeded '\n' from psmouse.proto parameter Input: atkbd - restore LED state at reconnect Input: force LED reset on resume Input: fix locking in memoryless force-feedback devices
2009-11-13sctp: Set source addresses on the association before adding transportsVlad Yasevich
Recent commit 8da645e101a8c20c6073efda3c7cc74eec01b87f sctp: Get rid of an extra routing lookup when adding a transport introduced a regression in the connection setup. The behavior was different between IPv4 and IPv6. IPv4 case ended up working because the route lookup routing returned a NULL route, which triggered another route lookup later in the output patch that succeeded. In the IPv6 case, a valid route was returned for first call, but we could not find a valid source address at the time since the source addresses were not set on the association yet. Thus resulted in a hung connection. The solution is to set the source addresses on the association prior to adding peers. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-13tracing: Fix event format exportJohannes Berg
For some reason the export of the event print format to userspace uses '#fmt' which breaks if the format string is anything but a plain string, for example if it is built with macros then the macro names are exported instead of their contents. Use "\"%s\"", fmt instead of "%s", #fmt to export the string and not the way it is built. For example, in net/mac80211/driver-trace.h for the trace event drv_start there is: TP_printk( LOCAL_PR_FMT, LOCAL_PR_ARG ) Which use to produce: print fmt: LOCAL_PR_FMT, REC->wiphy_name Now produces: print fmt: "%s", REC->wiphy_name Signed-off-by: Johannes Berg <johannes@sipsolutions.net> LKML-Reference: <20091113224009.GB23942@elte.hu> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2009-11-13locking: Make inlining decision Kconfig basedThomas Gleixner
commit 892a7c67 (locking: Allow arch-inlined spinlocks) implements the selection of which lock functions are inlined based on defines in arch/.../spinlock.h: #define __always_inline__LOCK_FUNCTION Despite of the name __always_inline__* the lock functions can be built out of line depending on config options. Also if the arch does not set some inline defines the generic code might set them; again depending on config options. This makes it unnecessary hard to figure out when and which lock functions are inlined. Aside of that it makes it way harder and messier for -rt to manipulate the lock functions. Convert the inlining decision to CONFIG switches. Each lock function is inlined depending on CONFIG_INLINE_*. The configs implement the existing dependencies. The architecture code can select ARCH_INLINE_* to signal that it wants the corresponding lock function inlined. ARCH_INLINE_* is necessary as Kconfig ignores "depends on" restrictions when a config element is selected. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> LKML-Reference: <20091109151428.504477141@linutronix.de> Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> Reviewed-by: Ingo Molnar <mingo@elte.hu> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
2009-11-12serial: add support for the Lava Quattro PCI quad-port 16550A cardLennert Buytenhek
This seems to be a different model (with a different PCI ID) than the "Quatro" card that is also in the list. Signed-off-by: Lennert Buytenhek <buytenh@wantstofly.org> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-11-12alpha: fix F_SETOWN_EX and F_GETLK64 conflictPeter Zijlstra
Fix a bug in commit ba0a6c9f6fceed11c6a99e8326f0477fe383e6b5 Author: Peter Zijlstra <a.p.zijlstra@chello.nl> AuthorDate: Wed Sep 23 15:57:03 2009 -0700 Commit: Linus Torvalds <torvalds@linux-foundation.org> CommitDate: Thu Sep 24 07:21:01 2009 -0700 fcntl: add F_[SG]ETOWN_EX In asm-generic/fcntl.h, F_SETOWN_EX and F_GETLK64 both have value 12, and F_GETOWN_EX and F_SETLK64 both have value 13. Reported-by: "Joseph S. Myers" <joseph@codesourcery.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Andreas Schwab <schwab@redhat.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Ulrich Drepper <drepper@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-11-12fb: remove fb_save_state() and fb_restore_state operationsKrzysztof Helt
Remove fb_save_state() and fb_restore_state operations from frame buffer layer. They are used only in two drivers: 1. savagefb - and cause bug #11248 2. uvesafb Usage of these operations is misunderstood in both drivers so kill these operations, fix the bug #11248 and avoid confusion in the future. Tested on Savage 3D/MV card and the patch fixes the bug #11248. The frame buffer layer uses these funtions during switch between graphics and text mode of the console, but these drivers saves state before switching of the frame buffer (in the fb_open) and after releasing it (in the fb_release). This defeats the purpose of these operations. Addresses http://bugzilla.kernel.org/show_bug.cgi?id=11248 Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl> Reported-by: Jochen Hein <jochen@jochen.org> Tested-by: Jochen Hein <jochen@jochen.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Michal Januszewski <spock@gentoo.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-11-11ext3: Wait for proper transaction commit on fsyncJan Kara
We cannot rely on buffer dirty bits during fsync because pdflush can come before fsync is called and clear dirty bits without forcing a transaction commit. What we do is that we track which transaction has last changed the inode and which transaction last changed allocation and force it to disk on fsync. Signed-off-by: Jan Kara <jack@suse.cz> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
2009-11-10Input: fix locking in memoryless force-feedback devicesDmitry Torokhov
Now that input core acquires dev->event_lock spinlock and disables interrupts when propagating input events, using spin_lock_bh() in ff-memless driver is not allowed. Actually, the timer_lock itself is not needed anymore, we should simply use dev->event_lock as well. Also do a small cleanup in force-feedback core. Reported-by: kerneloops.org Reported-by: http://www.kerneloops.org/searchweek.php?search=ml_ff_set_gain Reported-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
2009-11-10swiotlb: Defer swiotlb init printing, export swiotlb_print_info()FUJITA Tomonori
This enables us to avoid printing swiotlb memory info when we initialize swiotlb. After swiotlb initialization, we could find that we don't need swiotlb. This patch removes the code to print swiotlb memory info in swiotlb_init() and exports the function to do that. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: chrisw@sous-sol.org Cc: dwmw2@infradead.org Cc: joerg.roedel@amd.com Cc: muli@il.ibm.com Cc: tony.luck@intel.com Cc: benh@kernel.crashing.org LKML-Reference: <1257849980-22640-9-git-send-email-fujita.tomonori@lab.ntt.co.jp> [ -v2: merge up conflict ] Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-10swiotlb: Add swiotlb_free() functionFUJITA Tomonori
swiotlb_free() function frees all allocated memory for swiotlb. We need to initialize swiotlb before IOMMU initialization (x86 and powerpc needs to allocate memory from bootmem allocator). If IOMMU initialization is successful, we need to free swiotlb resource (don't want to waste 64MB). Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: chrisw@sous-sol.org Cc: dwmw2@infradead.org Cc: joerg.roedel@amd.com Cc: muli@il.ibm.com LKML-Reference: <1257849980-22640-8-git-send-email-fujita.tomonori@lab.ntt.co.jp> [ -v2: build fix for the !CONFIG_SWIOTLB case ] Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-10bootmem: Add free_bootmem_late()FUJITA Tomonori
Add a new function for freeing bootmem after the bootmem allocator has been released and the unreserved pages given to the page allocator. This allows us to reserve bootmem and then release it if we later discover it was not needed. ( This new API will be used by the swiotlb code to recover a significant amount of RAM (64MB). ) Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Acked-by: Pekka Enberg <penberg@cs.helsinki.fi> Cc: chrisw@sous-sol.org Cc: dwmw2@infradead.org Cc: joerg.roedel@amd.com Cc: muli@il.ibm.com Cc: hannes@cmpxchg.org Cc: tj@kernel.org Cc: akpm@linux-foundation.org Cc: Linus Torvalds <torvalds@linux-foundation.org> LKML-Reference: <1257849980-22640-7-git-send-email-fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-10x86: intel-iommu: Convert detect_intel_iommu to use iommu_init hookFUJITA Tomonori
This changes detect_intel_iommu() to set intel_iommu_init() to iommu_init hook if detect_intel_iommu() finds the IOMMU. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: chrisw@sous-sol.org Cc: dwmw2@infradead.org Cc: joerg.roedel@amd.com Cc: muli@il.ibm.com LKML-Reference: <1257849980-22640-6-git-send-email-fujita.tomonori@lab.ntt.co.jp> [ -v2: build fix for the !CONFIG_DMAR case ] Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-10security: report the module name to security_module_requestEric Paris
For SELinux to do better filtering in userspace we send the name of the module along with the AVC denial when a program is denied module_request. Example output: type=SYSCALL msg=audit(11/03/2009 10:59:43.510:9) : arch=x86_64 syscall=write success=yes exit=2 a0=3 a1=7fc28c0d56c0 a2=2 a3=7fffca0d7440 items=0 ppid=1727 pid=1729 auid=unset uid=root gid=root euid=root suid=root fsuid=root egid=root sgid=root fsgid=root tty=(none) ses=unset comm=rpc.nfsd exe=/usr/sbin/rpc.nfsd subj=system_u:system_r:nfsd_t:s0 key=(null) type=AVC msg=audit(11/03/2009 10:59:43.510:9) : avc: denied { module_request } for pid=1729 comm=rpc.nfsd kmod="net-pf-10" scontext=system_u:system_r:nfsd_t:s0 tcontext=system_u:system_r:kernel_t:s0 tclass=system Signed-off-by: Eric Paris <eparis@redhat.com> Signed-off-by: James Morris <jmorris@namei.org>
2009-11-09Merge branch 'i2c-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging * 'i2c-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging: i2c: Add an interface to lock/unlock an I2C bus segment i2c-piix4: Modify code name SB900 to Hudson-2
2009-11-09Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (34 commits) net/fsl_pq_mdio: add module license GPL can: fix WARN_ON dump in net/core/rtnetlink.c:rtmsg_ifinfo() can: should not use __dev_get_by_index() without locks hisax: remove bad udelay call to fix build error on ARM ipip: Fix handling of DF packets when pmtudisc is OFF qlge: Set PCIe reset type for EEH to fundamental. qlge: Fix early exit from mbox cmd complete wait. ixgbe: fix traffic hangs on Tx with ioatdma loaded ixgbe: Fix checking TFCS register for TXOFF status when DCB is enabled ixgbe: Fix gso_max_size for 82599 when DCB is enabled macsonic: fix crash on PowerBook 520 NET: cassini, fix lock imbalance ems_usb: Fix byte order issues on big endian machines be2net: Bug fix to send config commands to hardware after netdev_register be2net: fix to set proper flow control on resume netfilter: xt_connlimit: fix regression caused by zero family value rt2x00: Don't queue ieee80211 work after USB removal Revert "ipw2200: fix oops on missing firmware" decnet: netdevice refcount leak netfilter: nf_nat: fix NAT issue in 2.6.30.4+ ...
2009-11-09PCMCIA: ss: allow PCI IRQs > 255Russell King - ARM Linux
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2009-11-09pcmcia: remove now-defunct cs_error, pcmcia_error_{func,ret}Dominik Brodowski
As all in-tree drivers have been converted to not use cs_error() any more, drop these functions and definitions, and update the Documentation. Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2009-11-08pcmcia: remove pcmcia_get_{first,next}_tuple()Dominik Brodowski
Remove the pcmcia_get_{first,next}_tuple() calls no longer needed by (current) pcmcia device drivers. Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2009-11-08pcmcia: add new CIS access helpersDominik Brodowski
As a replacement to pcmcia_get_{first,next}_tuple() and pcmcia_get_tuple_data(), three new -- and easier to use -- functions are added: - pcmcia_get_tuple() to get the very first CIS entry of one type. - pcmcia_loop_tuple() to loop over all CIS entries of one type. - pcmcia_get_mac_from_cis() to read out the hardware MAC address from CISTPL_FUNCE. Only a handful of drivers need these functions anyway, as most CIS access is already handled by pcmcia_loop_config(), which now shares the same backed (pccard_loop_tuple()) with pcmcia_loop_tuple(). A pcmcia_get_mac_from_cis() bug noted by Komuro <komurojun-mbn@nifty.com> has been fixed in this revision. Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>