aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2007-02-16RDMA/cma: Add multicast communication supportSean Hefty
Extend rdma_cm to support multicast communication. Multicast support is added to the existing RDMA_PS_UDP port space, as well as a new RDMA_PS_IPOIB port space. The latter port space allows joining the multicast groups used by IPoIB, which enables offloading IPoIB traffic to a separate QP. The port space determines the signature used in the MGID when joining the group. The newly added RDMA_PS_IPOIB also allows for unicast operations, similar to RDMA_PS_UDP. Supporting the RDMA_PS_IPOIB requires changing how UD QPs are initialized, since we can no longer assume that the qkey is constant. This requires saving the Q_Key to use when attaching to a device, so that it is available when creating the QP. The Q_Key information is exported to the user through the existing rdma_init_qp_attr() interface. Multicast support is also exported to userspace through the rdma_ucm. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-02-16IB/sa: Track multicast join/leave requestsSean Hefty
The IB SA tracks multicast join/leave requests on a per port basis and does not do any reference counting: if two users of the same port join the same group, and one leaves that group, then the SA will remove the port from the group even though there is one user who wants to stay a member left. Therefore, in order to support multiple users of the same multicast group from the same port, we need to perform reference counting locally. To do this, add an multicast submodule to ib_sa to perform reference counting of multicast join/leave operations. Modify ib_ipoib (the only in-kernel user of multicast) to use the new interface. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-02-16IPoIB: CM error handling thinko fixMichael S. Tsirkin
ipoib_cm_alloc_rx_skb() might be called from IRQ context, so it must use dev_kfree_skb_any(), not kfree_skb(). Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-02-16RDMA/cxgb3: Remove Open Grid Computing copyrights in iw_cxgb3 driverSteve Wise
Remove the Open Grid Computing copyright. It shouldn't be there. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-02-16RDMA/cxgb3: Fail posts synchronously when in TERMINATE stateSteve Wise
For T3B devices, mark user QP in error once we transition to TERMINATE. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-02-16RDMA/iwcm: iw_cm_id destruction race fixesSteve Wise
iwcm iw_cm_id destruction race condition fixes: - iwcm_deref_id() always wakes up if there's another reference. - clean up race condition in cm_work_handler(). - create static void free_cm_id() which deallocs the work entries and then kfrees the cm_id memory. This reduces code replication. - rem_ref() if this is the last reference -and- the IWCM owns freeing the cm_id, then free it. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Tom Tucker <tom@opengridcomputing.com> Acked-by: Krishna Kumar <krkumar2@in.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-02-16IB/ehca: Change query_port() to return LINK_UP instead UNKNOWNHoang-Nam Nguyen
Set the port phys state as returned from ehca_query_port() to LINK_UP. ehca actually represents a logical HCA, whose phys/link state always is LINK_UP. Signed-off-by: Hoang-Nam Nguyen <hnguyen@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-02-16IB/ehca: Allow en/disabling scaling code via module parameterHoang-Nam Nguyen
Allow users to en/disable scaling code when loading ib_ehca module, rather than requiring the module to be rebuilt to change the setting. Signed-off-by: Hoang-Nam Nguyen <hnguyen@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-02-16IB/ehca: Fix race condition/locking issues in scaling codeHoang-Nam Nguyen
Fix a race condition in find_next_cpu_online() and some other locking issues in ehca scaling code. Signed-off-by: Hoang-Nam Nguyen <hnguyen@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-02-16IB/ehca: Rework irq handlerHoang-Nam Nguyen
Rework ehca interrupt handling to avoid/reduce missed irq events. Signed-off-by: Hoang-Nam Nguyen <hnguyen@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-02-16IPoIB: Only allow root to change between datagram and connected modeRoland Dreier
Change the permissions of the "mode" sysfs attribute to be S_IWUSR instead of S_IWUGO. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-02-16IB/mthca: Fix allocation of ICM chunks in coherent memoryRoland Dreier
The change to allow allocating ICM chunks from coherent memory did not increment the count of sg entries properly, so a chunk that required more than allocation would not be mapped properly by the HCA. Fix this by adding the missing increment of chunk->nsg. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-02-16IB/mthca: Allow the QP state transition RESET->RESETDotan Barak
RESET->RESET is an allowed QP state transition, so mthca should handle it correctly, by just returning success without involving the firmware. Signed-off-by: Dotan Barak <dotanb@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-02-16Merge branch 'for-linus' of git://www.atmel.no/~hskinnemoen/linux/kernel/avr32Linus Torvalds
* 'for-linus' of git://www.atmel.no/~hskinnemoen/linux/kernel/avr32: [AVR32] Use per-controller spi_board_info structures [AVR32] Warn, don't BUG if clk_disable is called too many times [AVR32] Make sure all genclocks have a parent [AVR32] Remove unnecessary sys_nfsservctl conditional [AVR32] Wire up the SysV IPC calls properly [AVR32] Define ioremap_nocache, ioport_map and ioport_unmap [AVR32] Fix prototypes for __raw_writesb and friends
2007-02-16Merge master.kernel.org:/pub/scm/linux/kernel/git/davej/agpgartLinus Torvalds
* master.kernel.org:/pub/scm/linux/kernel/git/davej/agpgart: [AGPGART] allow drm populated agp memory types cleanups [AGPGART] intel-agp: Use ARRAY_SIZE macro when appropriate [AGPGART] Add agp-type-to-mask-type method missing from some drivers. [AGPGART] Don't try to remap i810 registers on resume. [AGPGART] Allow drm-populated agp memory types [AGPGART] compat ioctl
2007-02-16Merge master.kernel.org:/pub/scm/linux/kernel/git/davej/cpufreqLinus Torvalds
* master.kernel.org:/pub/scm/linux/kernel/git/davej/cpufreq: [CPUFREQ] Longhaul - Redo Longhaul ver. 2 [CPUFREQ] EPS - Correct 2nd brand test [CPUFREQ] Longhaul - Separate frequency and voltage transition [CPUFREQ] Longhaul - Models of Nehemiah [CPUFREQ] Whitespace fixup [CPUFREQ] Longhaul - Simplier minmult [CPUFREQ] CPU_FREQ_TABLE shouldn't be a def_tristate [CPUFREQ] ondemand governor use new cpufreq rwsem locking in work callback [CPUFREQ] ondemand governor restructure the work callback [CPUFREQ] Rewrite lock in cpufreq to eliminate cpufreq/hotplug related issues [CPUFREQ] Remove hotplug cpu crap [CPUFREQ] Enhanced PowerSaver driver [CPUFREQ] Longhaul - Add VT8235 support [CPUFREQ] Longhaul - Fix guess_fsb function [CPUFREQ] Longhaul - Remove duplicate tables [CPUFREQ] Longhaul - Introduce Nehemiah C [CPUFREQ] fix cpuinfo_cur_freq for CPU_HW_PSTATE [CPUFREQ] Longhaul - Remove "ignore_latency" option
2007-02-16[PATCH] s3c2410fb: fix un-initialised dev fieldBen Dooks
The current driver is not setting the dev field in the private data structure, which can lead to an OOPS if the driver tries to report an error. Signed-off-by: Ben Dooks <ben-linux@fluff.org> Cc: James Simmons <jsimmons@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-16[PATCH] ecryptfs: fix forgotten format specifierThomas Hisch
Add format specifier %d for uid in ecryptfs_printk Signed-off-by: Thomas Hisch <t.hisch@gmail.com> Cc: Michael Halcrow <mhalcrow@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-16[PATCH] eCryptfs: Reduce stack usage in ecryptfs_generate_key_packet_set()Michael Halcrow
eCryptfs is gobbling a lot of stack in ecryptfs_generate_key_packet_set() because it allocates a temporary memory-hungry ecryptfs_key_record struct. This patch introduces a new kmem_cache for that struct and converts ecryptfs_generate_key_packet_set() to use it. Signed-off-by: Michael Halcrow <mhalcrow@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-16[PATCH] knfsd: stop NFSD writes from being broken into lots of little writes ↵NeilBrown
to filesystem When NFSD receives a write request, the data is typically in a number of 1448 byte segments and writev is used to collect them together. Unfortunately, generic_file_buffered_write passes these to the filesystem one at a time, so an e.g. 32K over-write becomes a series of partial-page writes to each page, causing the filesystem to have to pre-read those pages - wasted effort. generic_file_buffered_write handles one segment of the vector at a time as it has to pre-fault in each segment to avoid deadlocks. When writing from kernel-space (and nfsd does) this is not an issue, so generic_file_buffered_write does not need to break and iovec from nfsd into little pieces. This patch avoids the splitting when get_fs is KERNEL_DS as it is from NFSd. This issue was introduced by commit 6527c2bdf1f833cc18e8f42bd97973d583e4aa83 Acked-by: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Norman Weathers <norman.r.weathers@conocophillips.com> Cc: Vladimir V. Saveliev <vs@namesys.com> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-16[PATCH] knfsd: nfsd4: fix handling of directories without default ACLsJ. Bruce Fields
When setting an ACL that lacks inheritable ACEs on a directory, we should set a default ACL of zero length, not a default ACL with all bits denied. Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-16[PATCH] knfsd: nfsd4: acls: avoid unnecessary deniesJ. Bruce Fields
We're inserting deny's between some ACEs in order to enforce posix draft acl semantics which prevent permissions from accumulating across entries in an acl. That's fine, but we're doing that by inserting a deny after *every* allow, which is overkill. We shouldn't be adding them in places where they actually make no difference. Also replaced some helper functions for creating acl entries; I prefer just assigning directly to the struct fields--it takes a few more lines, but the field names provide some documentation that I think makes the result easier understand. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-16[PATCH] knfsd: nfsd4: acls: don't return explicit maskJ. Bruce Fields
Return just the effective permissions, and forget about the mask. It isn't worth the complexity. WARNING: This breaks backwards compatibility with overly-picky nfsv4->posix acl translation, as may has been included in some patched versions of libacl. To our knowledge no such version was every distributed by anyone outside citi. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-16[PATCH] knfsd: nfsd4: fix error return on unsupported aclJ. Bruce Fields
We should be returning ATTRNOTSUPP, not NOTSUPP, when acls are unsupported. Also fix a comment. Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-16[PATCH] knfsd: nfsd4: fix memory leak on kmalloc failure in savememJ. Bruce Fields
The wrong pointer is being kfree'd in savemem() when defer_free returns with an error. Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-16[PATCH] knfsd: nfsd4: represent nfsv4 acl with array instead of linked listJ. Bruce Fields
Simplify the memory management and code a bit by representing acls with an array instead of a linked list. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-16[PATCH] knfsd: nfsd4: simplify nfsv4->posix translationJ. Bruce Fields
The code that splits an incoming nfsv4 ACL into inheritable and effective parts can be combined with the the code that translates each to a posix acl, resulting in simpler code that requires one less pass through the ACL. Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-16[PATCH] knfsd: nfsd4: relax checking of ACL inheritance bitsJ. Bruce Fields
The rfc allows us to be more permissive about the ACL inheritance bits we accept: "If the server supports a single "inherit ACE" flag that applies to both files and directories, the server may reject the request (i.e., requiring the client to set both the file and directory inheritance flags). The server may also accept the request and silently turn on the ACE4_DIRECTORY_INHERIT_ACE flag." Let's take the latter option--the ACL is a complex attribute that could be rejected for a wide variety of reasons, and the protocol gives us little ability to explain the reason for the rejection, so erroring out is a user-unfriendly last resort. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-16[PATCH] knfsd: nfsd4: fix non-terminated stringJ. Bruce Fields
The server name is expected to be a null-terminated string, so we can't pass in the raw client identifier. What's more, the client identifier is just a binary, not necessarily printable, blob. Let's just use the ip address instead. The server name appears to exist just to help debugging by making some printk's more informative. Note that the string is copies into the rpc client structure, so the pointer to the local variable does not outlive the function call. Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-16[PATCH] small irq management simplificationJan Beulich
Use mask_ack_irq() where possible. Signed-off-by: Jan Beulich <jbeulich@novell.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-16[PATCH] IRQ kernel-doc fixesRandy Dunlap
Fix kernel-doc warnings in IRQ management. Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-16[PATCH] genirq: remove IRQ_DISABLEDIngo Molnar
Now that disable_irq() defaults to delayed-disable semantics, the IRQ_DISABLED flag is not needed anymore. Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-16[PATCH] genirq: do not mask interrupts by defaultIngo Molnar
Never mask interrupts immediately upon request. Disabling interrupts in high-performance codepaths is rare, and on the other hand this change could recover lost edges (or even other types of lost interrupts) by conservatively only masking interrupts after they happen. (NOTE: with this change the highlevel irq-disable code still soft-disables this IRQ line - and if such an interrupt happens then the IRQ flow handler keeps the IRQ masked.) Mark i8529A controllers as 'never loses an edge'. Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-16[PATCH] posix timers: RCU optimization for clock_gettime()Paul E. McKenney
Use RCU to avoid the need to acquire tasklist_lock in the single-threaded case of clock_gettime(). It still acquires tasklist_lock when for a (potentially multithreaded) process. This change allows realtime applications to frequently monitor CPU consumption of individual tasks, as requested (and now deployed) by some off-list users. This has been in Ingo Molnar's -rt patchset since late 2005 with no problems reported, and tests successfully on 2.6.20-rc6, so I believe that it is long-since ready for mainline adoption. [paulmck@linux.vnet.ibm.com: fix exit()/posix_cpu_clock_get() race spotted by Oleg] Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: john stultz <johnstul@us.ibm.com> Cc: Roman Zippel <zippel@linux-m68k.org> Cc: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-16[PATCH] time: x86_64: re-enable vsyscall support for x86_64john stultz
Cleanup and re-enable vsyscall gettimeofday using the generic clocksource infrastructure. [akpm@osdl.org: cleanup] Signed-off-by: John Stultz <johnstul@us.ibm.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Andi Kleen <ak@muc.de> Cc: Roman Zippel <zippel@linux-m68k.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-16[PATCH] time: x86_64: convert x86_64 to use GENERIC_TIMEjohn stultz
This patch converts x86_64 to use the GENERIC_TIME infrastructure and adds clocksource structures for both TSC and HPET (ACPI PM is shared w/ i386). [akpm@osdl.org: fix printk timestamps] [akpm@osdl.org: fix printk ckeanups] [akpm@osdl.org: hpet build fix] Signed-off-by: John Stultz <johnstul@us.ibm.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Andi Kleen <ak@muc.de> Cc: Roman Zippel <zippel@linux-m68k.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-16[PATCH] time: x86_64: split x86_64/kernel/time.c upjohn stultz
In preparation for the x86_64 generic time conversion, this patch splits out TSC and HPET related code from arch/x86_64/kernel/time.c into respective hpet.c and tsc.c files. [akpm@osdl.org: fix printk timestamps] [akpm@osdl.org: cleanup] Signed-off-by: John Stultz <johnstul@us.ibm.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Andi Kleen <ak@muc.de> Cc: Roman Zippel <zippel@linux-m68k.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-16[PATCH] time: x86_64: hpet_address cleanupjohn stultz
In preparation for supporting generic timekeeping, this patch cleans up x86-64's use of vxtime.hpet_address, changing it to just hpet_address as is also used in i386. This is necessary since the vxtime structure will be going away. Signed-off-by: John Stultz <johnstul@us.ibm.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Andi Kleen <ak@muc.de> Cc: Roman Zippel <zippel@linux-m68k.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-16[PATCH] generic: vsyscall-gtod support for GENERIC_TIMEjohn stultz
Provides generic infrastructure for vsyscall-gtod. [akpm@osdl.org: cleanup] Signed-off-by: John Stultz <johnstul@us.ibm.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Andi Kleen <ak@muc.de> Cc: Roman Zippel <zippel@linux-m68k.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-16[PATCH] Add SysRq-Q to print timer_list debug infoIngo Molnar
Add SysRq-Q to print pending timers and other timer info. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: john stultz <johnstul@us.ibm.com> Cc: Roman Zippel <zippel@linux-m68k.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-16[PATCH] Add debugging feature /proc/timer_listIngo Molnar
add /proc/timer_list, which prints all currently pending (high-res) timers, all clock-event sources and their parameters in a human-readable form. Sample output: Timer List Version: v0.1 HRTIMER_MAX_CLOCK_BASES: 2 now at 4246046273872 nsecs cpu: 0 clock 0: .index: 0 .resolution: 1 nsecs .get_time: ktime_get_real .offset: 1273998312645738432 nsecs active timers: clock 1: .index: 1 .resolution: 1 nsecs .get_time: ktime_get .offset: 0 nsecs active timers: #0: <f5a90ec8>, hrtimer_sched_tick, hrtimer_stop_sched_tick, swapper/0 # expires at 4246432689566 nsecs [in 386415694 nsecs] #1: <f5a90ec8>, hrtimer_wakeup, do_nanosleep, pcscd/2050 # expires at 4247018194689 nsecs [in 971920817 nsecs] #2: <f5a90ec8>, hrtimer_wakeup, do_nanosleep, irqbalance/1909 # expires at 4247351358392 nsecs [in 1305084520 nsecs] #3: <f5a90ec8>, hrtimer_wakeup, do_nanosleep, crond/2157 # expires at 4249097614968 nsecs [in 3051341096 nsecs] #4: <f5a90ec8>, it_real_fn, do_setitimer, syslogd/1888 # expires at 4251329900926 nsecs [in 5283627054 nsecs] .expires_next : 4246432689566 nsecs .hres_active : 1 .check_clocks : 0 .nr_events : 31306 .idle_tick : 4246020791890 nsecs .tick_stopped : 1 .idle_jiffies : 986504 .idle_calls : 40700 .idle_sleeps : 36014 .idle_entrytime : 4246019418883 nsecs .idle_sleeptime : 4178181972709 nsecs cpu: 1 clock 0: .index: 0 .resolution: 1 nsecs .get_time: ktime_get_real .offset: 1273998312645738432 nsecs active timers: clock 1: .index: 1 .resolution: 1 nsecs .get_time: ktime_get .offset: 0 nsecs active timers: #0: <f5a90ec8>, hrtimer_sched_tick, hrtimer_restart_sched_tick, swapper/0 # expires at 4246050084568 nsecs [in 3810696 nsecs] #1: <f5a90ec8>, hrtimer_wakeup, do_nanosleep, atd/2227 # expires at 4261010635003 nsecs [in 14964361131 nsecs] #2: <f5a90ec8>, hrtimer_wakeup, do_nanosleep, smartd/2332 # expires at 5469485798970 nsecs [in 1223439525098 nsecs] .expires_next : 4246050084568 nsecs .hres_active : 1 .check_clocks : 0 .nr_events : 24043 .idle_tick : 4246046084568 nsecs .tick_stopped : 0 .idle_jiffies : 986510 .idle_calls : 26360 .idle_sleeps : 22551 .idle_entrytime : 4246043874339 nsecs .idle_sleeptime : 4170763761184 nsecs tick_broadcast_mask: 00000003 event_broadcast_mask: 00000001 CPU#0's local event device: Clock Event Device: lapic capabilities: 0000000e max_delta_ns: 807385544 min_delta_ns: 1443 mult: 44624025 shift: 32 set_next_event: lapic_next_event set_mode: lapic_timer_setup event_handler: hrtimer_interrupt .installed: 1 .expires: 4246432689566 nsecs CPU#1's local event device: Clock Event Device: lapic capabilities: 0000000e max_delta_ns: 807385544 min_delta_ns: 1443 mult: 44624025 shift: 32 set_next_event: lapic_next_event set_mode: lapic_timer_setup event_handler: hrtimer_interrupt .installed: 1 .expires: 4246050084568 nsecs Clock Event Device: hpet capabilities: 00000007 max_delta_ns: 2147483647 min_delta_ns: 3352 mult: 61496110 shift: 32 set_next_event: hpet_next_event set_mode: hpet_set_mode event_handler: handle_nextevt_broadcast Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: john stultz <johnstul@us.ibm.com> Cc: Roman Zippel <zippel@linux-m68k.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-16[PATCH] Add debugging feature /proc/timer_statIngo Molnar
Add /proc/timer_stats support: debugging feature to profile timer expiration. Both the starting site, process/PID and the expiration function is captured. This allows the quick identification of timer event sources in a system. Sample output: # echo 1 > /proc/timer_stats # cat /proc/timer_stats Timer Stats Version: v0.1 Sample period: 4.010 s 24, 0 swapper hrtimer_stop_sched_tick (hrtimer_sched_tick) 11, 0 swapper sk_reset_timer (tcp_delack_timer) 6, 0 swapper hrtimer_stop_sched_tick (hrtimer_sched_tick) 2, 1 swapper queue_delayed_work_on (delayed_work_timer_fn) 17, 0 swapper hrtimer_restart_sched_tick (hrtimer_sched_tick) 2, 1 swapper queue_delayed_work_on (delayed_work_timer_fn) 4, 2050 pcscd do_nanosleep (hrtimer_wakeup) 5, 4179 sshd sk_reset_timer (tcp_write_timer) 4, 2248 yum-updatesd schedule_timeout (process_timeout) 18, 0 swapper hrtimer_restart_sched_tick (hrtimer_sched_tick) 3, 0 swapper sk_reset_timer (tcp_delack_timer) 1, 1 swapper neigh_table_init_no_netlink (neigh_periodic_timer) 2, 1 swapper e1000_up (e1000_watchdog) 1, 1 init schedule_timeout (process_timeout) 100 total events, 25.24 events/sec [ cleanups and hrtimers support from Thomas Gleixner <tglx@linutronix.de> ] [bunk@stusta.de: nr_entries can become static] Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: john stultz <johnstul@us.ibm.com> Cc: Roman Zippel <zippel@linux-m68k.org> Cc: Andi Kleen <ak@suse.de> Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-16[PATCH] hrtimers: prevent possible itimer DoSThomas Gleixner
Fix potential setitimer DoS with high-res timers by pushing itimer rearm processing to process context. [Fixes from: Ingo Molnar <mingo@elte.hu>] Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: john stultz <johnstul@us.ibm.com> Cc: Roman Zippel <zippel@linux-m68k.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-16[PATCH] hrtimers: add high resolution timer supportThomas Gleixner
Implement high resolution timers on top of the hrtimers infrastructure and the clockevents / tick-management framework. This provides accurate timers for all hrtimer subsystem users. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: john stultz <johnstul@us.ibm.com> Cc: Roman Zippel <zippel@linux-m68k.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-16[PATCH] i386: enable dynticks in kconfigIngo Molnar
Enable dynamic ticks selection. Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: john stultz <johnstul@us.ibm.com> Cc: Roman Zippel <zippel@linux-m68k.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-16[PATCH] i386 prepare nmi watchdog for dynticksThomas Gleixner
The NMI watchdog implementation assumes that the local APIC timer interrupt is happening. This assumption is not longer true when high resolution timers and dynamic ticks come into play, as they may switch off the local APIC timer completely. Take the PIT/HPET interrupts into account too, to avoid false positives. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: Andi Kleen <ak@suse.de> Cc: Zachary Amsden <zach@vmware.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Rohit Seth <rohitseth@google.com> Cc: john stultz <johnstul@us.ibm.com> Cc: Roman Zippel <zippel@linux-m68k.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-16[PATCH] i386 prepare for dyntickIngo Molnar
Prepare i386 for dyntick: idle handler callbacks. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: john stultz <johnstul@us.ibm.com> Cc: Roman Zippel <zippel@linux-m68k.org> Cc: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-16[PATCH] i386 rework local apic timer calibrationThomas Gleixner
The local apic timer calibration has two problem cases: 1. The calibration is based on readout of the PIT/HPET timer to detect the wrap of the periodic tick. It happens that a box gets stuck in the calibration loop due to a PIT with a broken readout function. 2. CoreDuo boxen show a sporadic PIT runs too slow defect, which results in a wrong lapic calibration. The PIT goes back to normal operation once the lapic timer is switched to periodic mode. Both are existing and unfixed problems in the current upstream kernel and prevent certain laptops and other systems from booting Linux. Rework the code to address both problems: - Make the calibration interrupt driven. This removes the wait_timer_tick magic hackery from lapic.c and time_hpet.c. The clockevents framework allows easy substitution of the global tick event handler for the calibration. This is more accurate than monitoring jiffies. At this point of the boot process, nothing disturbes the interrupt delivery, so the results are very accurate. - Verify the calibration against the PM timer, when available by using the early access function. When the measured calibration period is outside of an one percent window, then the lapic timer calibration is adjusted to the pm timer result. - Verify the calibration by running the lapic timer with the calibration handler. Disable lapic timer in case of deviation. This also removes the "synchronization" of the local apic timer to the global tick. This synchronization never worked, as there is no way to synchronize PIT(HPET) and local APIC timer. The synchronization by waiting for the tick just alignes the local APIC timer for the first events, but later the events drift away due to the different clocks. Removing the "sync" is just randomizing the asynchronous behaviour at setup time. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: Zachary Amsden <zach@vmware.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Rohit Seth <rohitseth@google.com> Cc: Andi Kleen <ak@suse.de> Cc: john stultz <johnstul@us.ibm.com> Cc: Roman Zippel <zippel@linux-m68k.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-16[PATCH] clockevents: i386 driversThomas Gleixner
Add clockevent drivers for i386: lapic (local) and PIT/HPET (global). Update the timer IRQ to call into the PIT/HPET driver's event handler and the lapic-timer IRQ to call into the lapic clockevent driver. The assignement of timer functionality is delegated to the core framework code and replaces the compile and runtime evalution in do_timer_interrupt_hook() Use the clockevents broadcast support and implement the lapic_broadcast function for ACPI. No changes to existing functionality. [ kdump fix from Vivek Goyal <vgoyal@in.ibm.com> ] [ fixes based on review feedback from Arjan van de Ven <arjan@infradead.org> ] Cleanups-from: Adrian Bunk <bunk@stusta.de> Build-fixes-from: Andrew Morton <akpm@osdl.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: john stultz <johnstul@us.ibm.com> Cc: Roman Zippel <zippel@linux-m68k.org> Cc: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-16[PATCH] tick-management: dyntick / highres functionalityThomas Gleixner
With Ingo Molnar <mingo@elte.hu> Add functions to provide dynamic ticks and high resolution timers. The code which keeps track of jiffies and handles the long idle periods is shared between tick based and high resolution timer based dynticks. The dyntick functionality can be disabled on the kernel commandline. Provide also the infrastructure to support high resolution timers. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: john stultz <johnstul@us.ibm.com> Cc: Roman Zippel <zippel@linux-m68k.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>