Age | Commit message (Collapse) | Author |
|
I've only seen this on x86_64.
The vsyscall state only gets updated when a timer interrupts comes in. So
if the time is set long before the next timer, there will be a period when
a gettimeofday() won't reflect the correct time.
I added an explicit update_vsyscall() during the settimeofday(), that way
the vsyscall state doesn't get stale.
Signed-off-by: Daniel Walker <dwalker@mvista.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Ingo Molnar <mingo@elte.hu>
Acked-by: John Stultz <johnstul@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
The TIMER_SOFTIRQ runs the hrtimers during bootup until a usable
clocksource and clock event sources are registered. The switch to high
resolution mode happens inside of the TIMER_SOFTIRQ, but runs the softirq
afterwards. That way the tick emulation timer, which was set up in the
switch to highres might be executed in the softirq context, which is a BUG.
The rbtree has not to be touched by the softirq after the highres switch.
This BUG was observed by Andres Salomon, who provided the information to
debug it.
Return early from the softirq, when the switch was sucessful.
[dilinger@debian.org: add debug warning]
[akpm@linux-foundation.org: make debug warning compile]
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andres Salomon <dilinger@debian.org>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andres Salomon <dilinger@debian.org>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
The programming of periodic tick devices needs to be saved/restored
across suspend/resume - otherwise we might end up with a system coming
up that relies on getting a PIT (or HPET) interrupt, while those devices
default to 'no interrupts' after powerup. (To confuse things it worked
to a certain degree on some systems because the lapic gets initialized
as a side-effect of SMP bootup.)
This suspend / resume thing was dropped unintentionally during the
last-minute -mm code reshuffling.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Doing something like this on a two cpu system
# echo 0 > /sys/devices/system/cpu/cpu0/online
# echo 1 > /sys/devices/system/cpu/cpu0/online
# echo 0 > /sys/devices/system/cpu/cpu1/online
will give me this:
=======================================================
[ INFO: possible circular locking dependency detected ]
2.6.21-rc2-g562aa1d4-dirty #7
-------------------------------------------------------
bash/1282 is trying to acquire lock:
(&cpu_base->lock_key){.+..}, at: [<000000000005f17e>] hrtimer_cpu_notify+0xc6/0x240
but task is already holding lock:
(&cpu_base->lock_key#2){.+..}, at: [<000000000005f174>] hrtimer_cpu_notify+0xbc/0x240
which lock already depends on the new lock.
This happens because we have the following code in kernel/hrtimer.c:
migrate_hrtimers(int cpu)
[...]
old_base = &per_cpu(hrtimer_bases, cpu);
new_base = &get_cpu_var(hrtimer_bases);
[...]
spin_lock(&new_base->lock);
spin_lock(&old_base->lock);
Which means the spinlocks are taken in an order which depends on which cpu
gets shut down from which other cpu. Therefore lockdep complains that there
might be an ABBA deadlock. Since migrate_hrtimers() gets only called on
cpu hotplug it's safe to assume that it isn't executed concurrently on a
The same problem exists in kernel/timer.c: migrate_timers().
As pointed out by Christian Borntraeger one possible solution to avoid
the locking order complaints would be to make sure that the locks are
always taken in the same order. E.g. by taking the lock of the cpu with
the lower number first.
To achieve this we introduce two new spinlock functions double_spin_lock
and double_spin_unlock which lock or unlock two locks in a given order.
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Roman Zippel <zippel@linux-m68k.org>
Cc: John Stultz <johnstul@us.ibm.com>
Cc: Christian Borntraeger <cborntra@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
This patch resolves the issue found here:
http://bugme.osdl.org/show_bug.cgi?id=7426
The basic summary is:
Currently we register most of i386/x86_64 clocksources at module_init
time. Then we enable clocksource selection at late_initcall time. This
causes some problems for drivers that use gettimeofday for init
calibration routines (specifically the es1968 driver in this case),
where durring module_init, the only clocksource available is the low-res
jiffies clocksource. This may cause slight calibration errors, due to
the small sampling time used.
It should be noted that drivers that require fine grained time may not
function on architectures that do not have better then jiffies
resolution timekeeping (there are a few). However, this does not
discount the reasonable need for such fine-grained timekeeping at init
time.
Thus the solution here is to register clocksources earlier (ideally when
the hardware is being initialized), and then we enable clocksource
selection at fs_initcall (before device_initcall).
This patch should probably get some testing time in -mm, since
clocksource selection is one of the most important issues for correct
timekeeping, and I've only been able to test this on a few of my own
boxes.
Signed-off-by: John Stultz <johnstul@us.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Remove the SMT-nice feature which idles sibling cpus on SMT cpus to
facilitiate nice working properly where cpu power is shared. The idling of
cpus in the presence of runnable tasks is considered too fragile, easy to
break with outside code, and the complexity of managing this system if an
architecture comes along with many logical cores sharing cpu power will be
unworkable.
Remove the associated per_cpu_gain variable in sched_domains used only by
this code.
Also:
The reason is that with dynticks enabled, this code breaks without yet
further tweaks so dynticks brought on the rapid demise of this code. So
either we tweak this code or kill it off entirely. It was Ingo's preference
to kill it off. Either way this needs to happen for 2.6.21 since dynticks
has gone in.
Signed-off-by: Con Kolivas <kernel@kolivas.org>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
The SMT scheduler incorrectly skips kernel threads even if they are
runnable (but they are preempted by a higher-prio user-space task which got
SMT-delayed by an even higher-priority task running on a sibling CPU).
Fix this for now by only doing the SMT-nice optimization if the
to-be-delayed task is the only runnable task. (This should cover most of
the real-life cases anyway.)
This bug has been in the SMT scheduler since 2.6.17 or so, but has only
been noticed now by the active check in the dynticks code.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: Michal Piotrowski <michal.k.k.piotrowski@gmail.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
lockdep_init() is marked __init but used in several places
outside __init code. This causes following warnings:
$ scripts/mod/modpost kernel/lockdep.o
WARNING: kernel/built-in.o - Section mismatch: reference to .init.text:lockdep_init from .text.lockdep_init_map after 'lockdep_init_map' (at offset 0x105)
WARNING: kernel/built-in.o - Section mismatch: reference to .init.text:lockdep_init from .text.lockdep_reset_lock after 'lockdep_reset_lock' (at offset 0x35)
WARNING: kernel/built-in.o - Section mismatch: reference to .init.text:lockdep_init from .text.__lock_acquire after '__lock_acquire' (at offset 0xb2)
The warnings are less obviously due to heavy inlining by gcc - this is not
altered.
Fix the section mismatch warnings by removing the __init marking, which
seems obviously wrong.
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
/home/bunk/linux/kernel-2.6/linux-2.6.20-mm2/kernel/sysctl.c:1411: error: conflicting types for 'register_sysctl_table'
/home/bunk/linux/kernel-2.6/linux-2.6.20-mm2/include/linux/sysctl.h:1042: error: previous declaration of 'register_sysctl_table' was here
make[2]: *** [kernel/sysctl.o] Error 1
Caused by commit 0b4d414714f0d2f922d39424b0c5c82ad900a381.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Problem description at:
http://bugzilla.kernel.org/show_bug.cgi?id=8048
Commit b18ec80396834497933d77b81ec0918519f4e2a7
[PATCH] sched: improve migration accuracy
optimized the scheduler time calculations, but broke posix-cpu-timers.
The problem is that the p->last_ran value is not updated after a context
switch. So a subsequent call to current_sched_time() calculates with a
stale p->last_ran value, i.e. accounts the full time, which the task was
scheduled away.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Fix kernel-doc warnings in 2.6.20-git15 (lib/, mm/, kernel/, include/).
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Signed-off-by: Daniel Walker <dwalker@mvista.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
* master.kernel.org:/pub/scm/linux/kernel/git/kyle/parisc-2.6: (78 commits)
[PARISC] Use symbolic last syscall in __NR_Linux_syscalls
[PARISC] Add missing statfs64 and fstatfs64 syscalls
Revert "[PARISC] Optimize TLB flush on SMP systems"
[PARISC] Compat signal fixes for 64-bit parisc
[PARISC] Reorder syscalls to match unistd.h
Revert "[PATCH] make kernel/signal.c:kill_proc_info() static"
[PARISC] fix sys_rt_sigqueueinfo
[PARISC] fix section mismatch warnings in harmony sound driver
[PARISC] do not export get_register/set_register
[PARISC] add ENTRY()/ENDPROC() and simplify assembly of HP/UX emulation code
[PARISC] convert to use CONFIG_64BIT instead of __LP64__
[PARISC] use CONFIG_64BIT instead of __LP64__
[PARISC] add ASM_EXCEPTIONTABLE_ENTRY() macro
[PARISC] more ENTRY(), ENDPROC(), END() conversions
[PARISC] fix ENTRY() and ENDPROC() for 64bit-parisc
[PARISC] Fixes /proc/cpuinfo cache output on B160L
[PARISC] implement standard ENTRY(), END() and ENDPROC()
[PARISC] kill ENTRY_SYS_CPUS
[PARISC] clean up debugging printks in smp.c
[PARISC] factor syscall_restart code out of do_signal
...
Fix conflict in include/linux/sched.h due to kill_proc_info() being made
publicly available to PARISC again.
|
|
* master.kernel.org:/pub/scm/linux/kernel/git/davem/tick-2.6:
[TICK] tick-common: Fix one-shot handling in tick_handle_periodic().
[TIME] tick-sched: Add missing asm/irq_regs.h include.
|
|
* master.kernel.org:/pub/scm/linux/kernel/git/gregkh/driver-2.6:
Revert "Driver core: let request_module() send a /sys/modules/kmod/-uevent"
Driver core: fix error by cleanup up symlinks properly
make kernel/kmod.c:kmod_mk static
power management: fix struct layout and docs
power management: no valid states w/o pm_ops
Driver core: more fallout from class_device changes for pcmcia
sysfs: move struct sysfs_dirent to private header
driver core: refcounting fix
Driver core: remove class_device_rename
|
|
When clockevents_program_event() is given an expire time in the
past, it does not update dev->next_event, so this looping code
would loop forever once the first in-the-past expiration time
was used.
Keep advancing "next" locally to fix this bug.
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
move_native_irqs tries to do the right thing when migrating irqs
by disabling them. However disabling them is a software logical
thing, not a hardware thing. This has always been a little flaky
and after Ingo's latest round of changes it is guaranteed to not
mask the apic.
So this patch fixes move_native_irq to directly call the mask and
unmask chip methods to guarantee that we mask the irq when we
are migrating it. We must do this as it is required by
all code that call into the path.
Since we don't know the masked status when IRQ_DISABLED is
set so we will not be able to restore it. The patch makes the code
just give up and trying again the next time this routing is called.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
This reverts commit c353c3fb0700a3c17ea2b0237710a184232ccd7f.
It turns out that we end up with a loop trying to load the unix
module and calling netfilter to do that. Will redo the patch
later to not have this loop.
Acked-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
This patch makes the needlessly global struct kmod_mk static.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
Change /sys/power/state to not advertise any valid states (except for disk
if SOFTWARE_SUSPEND is enabled) when no pm_ops have been set so userspace
can easily discover what states should be available.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Cc: "Randy.Dunlap" <rdunlap@xenotime.net>
Cc: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Pavel Macheck <pavel@ucw.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
Fix a reference counting bug exposed by commit
725522b5453dd680412f2b6463a988e4fd148757. If driver.mod_name exists, we
take a reference in module_add_driver(), and never release it. Undo that
reference in module_remove_driver().
Signed-off-by: Mike Galbraith <efault@gmx.de>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
In __lock_acquire check_chain_key can turn off debug_locks, so check is
needed to assure proper return code.
Signed-off-by: Jarek Poplawski <jarkao2@o2.pl>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
This patch lists all active probes in the system by scanning through
kprobe_table[]. It takes care of aggregate handlers and prints the type of
the probe. Letter "k" for kprobes, "j" for jprobes, "r" for kretprobes.
It also lists address of the instruction,its symbolic name(function name +
offset) and the module name. One can access this file through
/sys/kernel/debug/kprobes/list.
Output looks like this
=====================
llm40:~/a # cat /sys/kernel/debug/kprobes/list
c0169ae3 r sys_read+0x0
c0169ae3 k sys_read+0x0
c01694c8 k vfs_write+0x0
c0167d20 r sys_open+0x0
f8e658a6 k reiserfs_delete_inode+0x0 reiserfs
c0120f4a k do_fork+0x0
c0120f4a j do_fork+0x0
c0169b4a r sys_write+0x0
c0169b4a k sys_write+0x0
c0169622 r vfs_read+0x0
=================================
[akpm@linux-foundation.org: cleanup]
[ananth@in.ibm.com: sparc build fix]
Signed-off-by: Srinivasa DS <srinivasa@in.ibm.com>
Cc: Prasanna S Panchamukhi <prasanna@in.ibm.com>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
The BUG_ON() in tick_nohz_stop_sched_tick() triggers on some boxen.
Remove the BUG_ON and print information about the pending softirq
to allow better debugging of the problem.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
When a CPU is needed for RCU the tick has to continue even when it was
stopped before.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current
* 'audit.b37' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current:
[PATCH] AUDIT_FD_PAIR
[PATCH] audit config lockdown
[PATCH] minor update to rule add/delete messages (ver 2)
|
|
* git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial: (25 commits)
Documentation/kernel-docs.txt update.
arch/cris: typo in KERN_INFO
Storage class should be before const qualifier
kernel/printk.c: comment fix
update I/O sched Kconfig help texts - CFQ is now default, not AS.
Remove duplicate listing of Cris arch from README
kbuild: more doc. cleanups
doc: make doc. for maxcpus= more visible
drivers/net/eexpress.c: remove duplicate comment
add a help text for BLK_DEV_GENERIC
correct a dead URL in the IP_MULTICAST help text
fix the BAYCOM_SER_HDX help text
fix SCSI_SCAN_ASYNC help text
trivial documentation patch for platform.txt
Fix typos concerning hierarchy
Fix comment typo "spin_lock_irqrestore".
Fix misspellings of "agressive".
drivers/scsi/a100u2w.c: trivial typo patch
Correct trivial typo in log2.h.
Remove useless FIND_FIRST_BIT() macro from cardbus.c.
...
|
|
Provide an audit record of the descriptor pair returned by pipe() and
socketpair(). Rewritten from the original posted to linux-audit by
John D. Ramsdell <ramsdell@mitre.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
The following patch adds a new mode to the audit system. It uses the
audit_enabled config option to introduce the idea of audit enabled, but
configuration is immutable. Any attempt to change the configuration
while in this mode is audited. To change the audit rules, you'd need to
reboot the machine.
To use this option, you'd need a modified version of auditctl and use "-e 2".
This is intended to go at the end of the audit.rules file for people that
want an immutable configuration.
This patch also adds "res=" to a number of configuration commands that did not
have it before.
Signed-off-by: Steve Grubb <sgrubb@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
I was looking at parsing some of these messages and found that I wanted what
it was doing next to an op= for the parser to key on. Also missing was the list
number and results.
Signed-off-by: Steve Grubb <sgrubb@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Signed-off-by: Patrick Pletscher <pat@pletscher.org>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
This reverts commit d3228a887cae75ef2b8b1211c31c539bef5a5698.
DeBunk this code. We need it for compat_sys_rt_sigqueueinfo.
Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
|
|
Fix source files to build with CONFIG_SYSFS=n.
module_subsys is not available.
SYSFS=n, MODULES=y: T:y
SYSFS=n, MODULES=n: T:y
SYSFS=y, MODULES=y: T:y
SYSFS=y, MODULES=n: T:y
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
Here is a patch that removes all redundant kobject_unregister argument checks.
Signed-off-by: Mariusz Kozlowski <m.kozlowski@tuxland.pl>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
On recent systems, calls to /sbin/modprobe are handled by udev depending
on the kind of device the kernel has discovered. This patch creates an
uevent for the kernels internal request_module(), to let udev take control
over the request, instead of forking the binary directly by the kernel.
The direct execution of /sbin/modprobe can be disabled by setting:
/sys/module/kmod/mod_request_helper (/proc/sys/kernel/modprobe)
to an empty string, the same way /proc/sys/kernel/hotplug is disabled on an
udev system.
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
Use mask_ack_irq() where possible.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Fix kernel-doc warnings in IRQ management.
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Never mask interrupts immediately upon request. Disabling interrupts in
high-performance codepaths is rare, and on the other hand this change could
recover lost edges (or even other types of lost interrupts) by conservatively
only masking interrupts after they happen. (NOTE: with this change the
highlevel irq-disable code still soft-disables this IRQ line - and if such an
interrupt happens then the IRQ flow handler keeps the IRQ masked.)
Mark i8529A controllers as 'never loses an edge'.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Use RCU to avoid the need to acquire tasklist_lock in the single-threaded
case of clock_gettime(). It still acquires tasklist_lock when for a
(potentially multithreaded) process. This change allows realtime
applications to frequently monitor CPU consumption of individual tasks, as
requested (and now deployed) by some off-list users.
This has been in Ingo Molnar's -rt patchset since late 2005 with no
problems reported, and tests successfully on 2.6.20-rc6, so I believe that
it is long-since ready for mainline adoption.
[paulmck@linux.vnet.ibm.com: fix exit()/posix_cpu_clock_get() race spotted by Oleg]
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Roman Zippel <zippel@linux-m68k.org>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
In preparation for the x86_64 generic time conversion, this patch splits out
TSC and HPET related code from arch/x86_64/kernel/time.c into respective
hpet.c and tsc.c files.
[akpm@osdl.org: fix printk timestamps]
[akpm@osdl.org: cleanup]
Signed-off-by: John Stultz <johnstul@us.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Andi Kleen <ak@muc.de>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Provides generic infrastructure for vsyscall-gtod.
[akpm@osdl.org: cleanup]
Signed-off-by: John Stultz <johnstul@us.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Andi Kleen <ak@muc.de>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
add /proc/timer_list, which prints all currently pending (high-res) timers,
all clock-event sources and their parameters in a human-readable form.
Sample output:
Timer List Version: v0.1
HRTIMER_MAX_CLOCK_BASES: 2
now at 4246046273872 nsecs
cpu: 0
clock 0:
.index: 0
.resolution: 1 nsecs
.get_time: ktime_get_real
.offset: 1273998312645738432 nsecs
active timers:
clock 1:
.index: 1
.resolution: 1 nsecs
.get_time: ktime_get
.offset: 0 nsecs
active timers:
#0: <f5a90ec8>, hrtimer_sched_tick, hrtimer_stop_sched_tick, swapper/0
# expires at 4246432689566 nsecs [in 386415694 nsecs]
#1: <f5a90ec8>, hrtimer_wakeup, do_nanosleep, pcscd/2050
# expires at 4247018194689 nsecs [in 971920817 nsecs]
#2: <f5a90ec8>, hrtimer_wakeup, do_nanosleep, irqbalance/1909
# expires at 4247351358392 nsecs [in 1305084520 nsecs]
#3: <f5a90ec8>, hrtimer_wakeup, do_nanosleep, crond/2157
# expires at 4249097614968 nsecs [in 3051341096 nsecs]
#4: <f5a90ec8>, it_real_fn, do_setitimer, syslogd/1888
# expires at 4251329900926 nsecs [in 5283627054 nsecs]
.expires_next : 4246432689566 nsecs
.hres_active : 1
.check_clocks : 0
.nr_events : 31306
.idle_tick : 4246020791890 nsecs
.tick_stopped : 1
.idle_jiffies : 986504
.idle_calls : 40700
.idle_sleeps : 36014
.idle_entrytime : 4246019418883 nsecs
.idle_sleeptime : 4178181972709 nsecs
cpu: 1
clock 0:
.index: 0
.resolution: 1 nsecs
.get_time: ktime_get_real
.offset: 1273998312645738432 nsecs
active timers:
clock 1:
.index: 1
.resolution: 1 nsecs
.get_time: ktime_get
.offset: 0 nsecs
active timers:
#0: <f5a90ec8>, hrtimer_sched_tick, hrtimer_restart_sched_tick, swapper/0
# expires at 4246050084568 nsecs [in 3810696 nsecs]
#1: <f5a90ec8>, hrtimer_wakeup, do_nanosleep, atd/2227
# expires at 4261010635003 nsecs [in 14964361131 nsecs]
#2: <f5a90ec8>, hrtimer_wakeup, do_nanosleep, smartd/2332
# expires at 5469485798970 nsecs [in 1223439525098 nsecs]
.expires_next : 4246050084568 nsecs
.hres_active : 1
.check_clocks : 0
.nr_events : 24043
.idle_tick : 4246046084568 nsecs
.tick_stopped : 0
.idle_jiffies : 986510
.idle_calls : 26360
.idle_sleeps : 22551
.idle_entrytime : 4246043874339 nsecs
.idle_sleeptime : 4170763761184 nsecs
tick_broadcast_mask: 00000003
event_broadcast_mask: 00000001
CPU#0's local event device:
Clock Event Device: lapic
capabilities: 0000000e
max_delta_ns: 807385544
min_delta_ns: 1443
mult: 44624025
shift: 32
set_next_event: lapic_next_event
set_mode: lapic_timer_setup
event_handler: hrtimer_interrupt
.installed: 1
.expires: 4246432689566 nsecs
CPU#1's local event device:
Clock Event Device: lapic
capabilities: 0000000e
max_delta_ns: 807385544
min_delta_ns: 1443
mult: 44624025
shift: 32
set_next_event: lapic_next_event
set_mode: lapic_timer_setup
event_handler: hrtimer_interrupt
.installed: 1
.expires: 4246050084568 nsecs
Clock Event Device: hpet
capabilities: 00000007
max_delta_ns: 2147483647
min_delta_ns: 3352
mult: 61496110
shift: 32
set_next_event: hpet_next_event
set_mode: hpet_set_mode
event_handler: handle_nextevt_broadcast
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Add /proc/timer_stats support: debugging feature to profile timer expiration.
Both the starting site, process/PID and the expiration function is captured.
This allows the quick identification of timer event sources in a system.
Sample output:
# echo 1 > /proc/timer_stats
# cat /proc/timer_stats
Timer Stats Version: v0.1
Sample period: 4.010 s
24, 0 swapper hrtimer_stop_sched_tick (hrtimer_sched_tick)
11, 0 swapper sk_reset_timer (tcp_delack_timer)
6, 0 swapper hrtimer_stop_sched_tick (hrtimer_sched_tick)
2, 1 swapper queue_delayed_work_on (delayed_work_timer_fn)
17, 0 swapper hrtimer_restart_sched_tick (hrtimer_sched_tick)
2, 1 swapper queue_delayed_work_on (delayed_work_timer_fn)
4, 2050 pcscd do_nanosleep (hrtimer_wakeup)
5, 4179 sshd sk_reset_timer (tcp_write_timer)
4, 2248 yum-updatesd schedule_timeout (process_timeout)
18, 0 swapper hrtimer_restart_sched_tick (hrtimer_sched_tick)
3, 0 swapper sk_reset_timer (tcp_delack_timer)
1, 1 swapper neigh_table_init_no_netlink (neigh_periodic_timer)
2, 1 swapper e1000_up (e1000_watchdog)
1, 1 init schedule_timeout (process_timeout)
100 total events, 25.24 events/sec
[ cleanups and hrtimers support from Thomas Gleixner <tglx@linutronix.de> ]
[bunk@stusta.de: nr_entries can become static]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Roman Zippel <zippel@linux-m68k.org>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Fix potential setitimer DoS with high-res timers by pushing itimer rearm
processing to process context.
[Fixes from: Ingo Molnar <mingo@elte.hu>]
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Implement high resolution timers on top of the hrtimers infrastructure and the
clockevents / tick-management framework. This provides accurate timers for
all hrtimer subsystem users.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
With Ingo Molnar <mingo@elte.hu>
Add functions to provide dynamic ticks and high resolution timers. The code
which keeps track of jiffies and handles the long idle periods is shared
between tick based and high resolution timer based dynticks. The dyntick
functionality can be disabled on the kernel commandline. Provide also the
infrastructure to support high resolution timers.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
With Ingo Molnar <mingo@elte.hu>
Add broadcast functionality, so per cpu clock event devices can be registered
as dummy devices or switched from/to broadcast on demand. The broadcast
function distributes the events via the broadcast function of the clock event
device. This is primarily designed to replace the switch apic timer to / from
IPI in power states, where the apic stops.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
With Ingo Molnar <mingo@elte.hu>
The tick-management code is the first user of the clockevents layer. It takes
clock event devices from the clock events core and uses them to provide the
periodic tick.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Architectures register their clock event devices, in the clock events core.
Users of the clockevents core can get clock event devices for their use. The
clockevents core code provides notification mechanisms for various clock
related management events.
This allows to control the clock event devices without the architectures
having to worry about the details of function assignment. This is also a
preliminary for high resolution timers and dynamic ticks to allow the core
code to control the clock functionality without intrusive changes to the
architecture code.
[Fixes-by: Ingo Molnar <mingo@elte.hu>]
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: Roman Zippel <zippel@linux-m68k.org>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|