aboutsummaryrefslogtreecommitdiff
path: root/kernel
AgeCommit message (Collapse)Author
2008-01-30time: track accurate idle time with tick_sched.idle_sleeptimeVenki Pallipadi
Current idle time in kstat is based on jiffies and is coarse grained. tick_sched.idle_sleeptime is making some attempt to keep track of idle time in a fine grained manner. But, it is not handling the time spent in interrupts fully. Make tick_sched.idle_sleeptime accurate with respect to time spent on handling interrupts and also add tick_sched.idle_lastupdate, which keeps track of last time when idle_sleeptime was updated. This statistics will be crucial for cpufreq-ondemand governor, which can shed some conservative gaurd band that is uses today while setting the frequency. The ondemand changes that uses the exact idle time is coming soon. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30NTP: correct inconsistent ntp interval/tick_length usagejohn stultz
I recently noticed on one of my boxes that when synched with an NTP server, the drift value reported for the system was ~283ppm. While in some cases, clock hardware can be that bad, it struck me as unusual as the system was using the acpi_pm clocksource, which is one of the more trustworthy and accurate clocksources on x86 hardware. I brought up another system and let it sync to the same NTP server, and I noticed a similar 280some ppm drift. In looking at the code, I found that the acpi_pm's constant frequency was being computed correctly at boot-up, however once the system was up, even without the ntp daemon running, the clocksource's frequency was being modified by the clocksource_adjust() function. Digging deeper, I realized that in the code that keeps track of how much the clocksource is skewing from the ntp desired time, we were using different lengths to establish how long an time interval was. The clocksource was being setup with the following interval: NTP_INTERVAL_LENGTH = NSEC_PER_SEC/NTP_INTERVAL_FREQ While the ntp code was using the tick_length_base value: tick_length_base ~= (tick_usec * NSEC_PER_USEC * USER_HZ) /NTP_INTERVAL_FREQ The subtle difference is: (tick_usec * NSEC_PER_USEC * USER_HZ) != NSEC_PER_SEC This difference in calculation was causing the clocksource correction code to apply a correction factor to the clocksource so the two intervals were the same, however this results in the actual frequency of the clocksource to be made incorrect. I believe this difference would affect all clocksources, although to differing degrees depending on the clocksource resolution. The issue was introduced when my HZ free ntp patch landed in 2.6.21-rc1, so my apologies for the mistake, and for not noticing it until now. The following patch, corrects the clocksource's initialization code so it uses the same interval length as the code in ntp.c. After applying this patch, the drift value for the same system went from ~283ppm to only 2.635ppm. I believe this patch to be good, however it does affect all arches and I've only tested on x86, so some caution is advised. I do think it would be a likely candidate for a stable 2.6.24.x release. Any thoughts or feedback would be appreciated. Signed-off-by: John Stultz <johnstul@us.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30x86: make clockevents more robustIngo Molnar
detect zero event-device multiplicators - they then cause division-by-zero crashes if a clockevent has been initialized incorrectly. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30clocksource: add unregister function to disable unusable clocksourcesThomas Gleixner
On x86 the PIT might become an unusable clocksource. Add an unregister function to provide a possibilty to remove the PIT from the list of available clock sources. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-30clocksource: make clocksource watchdog cycle through online CPUsAndi Kleen
This way it checks if the clocks are synchronized between CPUs too. This might be able to detect slowly drifting TSCs which only go wrong over longer time. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30clocksource.c: use init_timer_deferrable for clocksource_watchdogParag Warudkar
clocksource_watchdog can use a deferrable timer - reduces wakeups from idle per second. Signed-off-by: Parag Warudkar <parag.warudkar@gmail.com> Cc: john stultz <johnstul@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30time: fold __get_realtime_clock_ts() into getnstimeofday()Geert Uytterhoeven
- getnstimeofday() was just a wrapper around __get_realtime_clock_ts() - Replace calls to __get_realtime_clock_ts() by calls to getnstimeofday() - Fix bogus reference to get_realtime_clock_ts(), which never existed Signed-off-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com> Cc: john stultz <johnstul@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30timer: clean up tick-broadcast.cThomas Gleixner
clean up tick-broadcast.c Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-30time: more timer related cleanupsPavel Machek
I was confused by FSEC = 10^15 NSEC statement, plus small whitespace fixes. When there's copyright, there should be GPL. Signed-off-by: Pavel Machek <pavel@suse.cz> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30time: timer cleanupsPavel Machek
Small cleanups to tick-related code. Wrong preempt count is followed by BUG(), so it is hardly KERN_WARNING. Signed-off-by: Pavel Machek <pavel@suse.cz> Cc: john stultz <johnstul@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30time: clean hungarian notation from timersPavel Machek
Clean up hungarian notation from timer code. Signed-off-by: Pavel Machek <pavel@suse.cz> Cc: john stultz <johnstul@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-29Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6.25Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6.25: (1470 commits) [IPV6] ADDRLABEL: Fix double free on label deletion. [PPP]: Sparse warning fixes. [IPV4] fib_trie: remove unneeded NULL check [IPV4] fib_trie: More whitespace cleanup. [NET_SCHED]: Use nla_policy for attribute validation in ematches [NET_SCHED]: Use nla_policy for attribute validation in actions [NET_SCHED]: Use nla_policy for attribute validation in classifiers [NET_SCHED]: Use nla_policy for attribute validation in packet schedulers [NET_SCHED]: sch_api: introduce constant for rate table size [NET_SCHED]: Use typeful attribute parsing helpers [NET_SCHED]: Use typeful attribute construction helpers [NET_SCHED]: Use NLA_PUT_STRING for string dumping [NET_SCHED]: Use nla_nest_start/nla_nest_end [NET_SCHED]: Propagate nla_parse return value [NET_SCHED]: act_api: use PTR_ERR in tcf_action_init/tcf_action_get [NET_SCHED]: act_api: use nlmsg_parse [NET_SCHED]: act_api: fix netlink API conversion bug [NET_SCHED]: sch_netem: use nla_parse_nested_compat [NET_SCHED]: sch_atm: fix format string warning [NETNS]: Add namespace for ICMP replying code. ...
2008-01-29Module: check to see if we have a built in module with the same nameGreg Kroah-Hartman
When trying to load a module with the same name as a built-in one, a scary kobject backtrace comes up. Prevent that from checking for this condition and warning the user as to what exactly is going on. Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-01-29module: add module taint on ndiswrapperJon Masters
The struct module taints member is supposed to store per-module taint data. The kernel knows about certain specific external modules that will taint the kernel, such as ndiswrapper. Use of ndiswrapper possibly should set the per-module taint in addition to the global kernel taint flag, unless we're arguing not because wrapper module itself is not what actually causes the kernel to be tainted as such? Signed-off-by: Jon Masters <jcm@jonmasters.org> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-01-29module: fix the module name length in param_sysfs_builtinDenis Cheng
the original code use KOBJ_NAME_LEN for built-in module name length, that's defined to 20 in linux/kobject.h, but this is not enough appearntly, many module names are longer than this; #define KOBJ_NAME_LEN 20 another macro is MODULE_NAME_LEN defined in linux/module.h, I think this is enough for module names: #define MODULE_NAME_LEN (64 - sizeof(unsigned long)) Signed-off-by: Denis Cheng <crquan@gmail.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-01-29module: make module_address_lookup safeRusty Russell
module_address_lookup releases preemption then returns a pointer into the module space. The only user (kallsyms) copies the result, so just do that under the preempt disable. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-01-29module: better OOPS and lockdep coverage for loading modulesRusty Russell
If we put the module in the linked list *before* calling into to, we get the module name and functions in the OOPS (is_module_address can find the module). It also helps lockdep in a similar way. Acked-and-tested-by: Joern Engel <joern@lazybastard.org> Tested-by: Erez Zadok <ezk@cs.sunysb.edu> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-01-29module: Fix gratuitous sprintf in module.cRusty Russell
Andrew sent an older version of this patch: we shouldn't use sprintf to copy a string. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-01-29module: wait for dependent modules doing init.Rusty Russell
There have been reports of modules failing to load because the modules they depend on are still loading. This changes the modules to wait for a reasonable length of time in that case. We time out eventually, because there can be module loops or broken modules. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-01-29module: Don't report discarded init pages as kernel text.Rusty Russell
Current code could cause a bug in symbol_put_addr() if an arch used kmalloc module text: we might think the symbol belongs to the core kernel. The downside is that this might make backtraces through (discarded) init functions harder to read on some archs, but we already have that issue for modules and noone has complained. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-01-28[NET]: Remove the empty net_tablePavel Emelyanov
I have removed all the entries from this table (core_table, ipv4_table and tr_table), so now we can safely drop it. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28sysctl: Infrastructure for per namespace sysctlsEric W. Biederman
This patch implements the basic infrastructure for per namespace sysctls. A list of lists of sysctl headers is added, allowing each namespace to have it's own list of sysctl headers. Each list of sysctl headers has a lookup function to find the first sysctl header in the list, allowing the lists to have a per namespace instance. register_sysct_root is added to tell sysctl.c about additional lists of sysctl_headers. As all of the users are expected to be in kernel no unregister function is provided. sysctl_head_next is updated to walk through the list of lists. __register_sysctl_paths is added to add a new sysctl table on a non-default sysctl list. The only intrusive part of this patch is propagating the information to decided which list of sysctls to use for sysctl_check_table. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Serge Hallyn <serue@us.ibm.com> Cc: Daniel Lezcano <dlezcano@fr.ibm.com> Cc: Cedric Le Goater <clg@fr.ibm.com> Cc: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28sysctl: Remember the ctl_table we passed to register_sysctl_pathsEric W. Biederman
By doing this we allow users of register_sysctl_paths that build and dynamically allocate their ctl_table to be simpler. This allows them to just remember the ctl_table_header returned from register_sysctl_paths from which they can now find the ctl_table array they need to free. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Serge Hallyn <serue@us.ibm.com> Cc: Daniel Lezcano <dlezcano@fr.ibm.com> Cc: Cedric Le Goater <clg@fr.ibm.com> Cc: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28sysctl: Add register_sysctl_paths functionEric W. Biederman
There are a number of modules that register a sysctl table somewhere deeply nested in the sysctl hierarchy, such as fs/nfs, fs/xfs, dev/cdrom, etc. They all specify several dummy ctl_tables for the path name. This patch implements register_sysctl_path that takes an additional path name, and makes up dummy sysctl nodes for each component. This patch was originally written by Olaf Kirch and brought to my attention and reworked some by Olaf Hering. I have changed a few additional things so the bugs are mine. After converting all of the easy callers Olaf Hering observed allyesconfig ARCH=i386, the patch reduces the final binary size by 9369 bytes. .text +897 .data -7008 text data bss dec hex filename 26959310 4045899 4718592 35723801 2211a19 ../vmlinux-vanilla 26960207 4038891 4718592 35717690 221023a ../O-allyesconfig/vmlinux So this change is both a space savings and a code simplification. CC: Olaf Kirch <okir@suse.de> CC: Olaf Hering <olaf@aepfle.de> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Serge Hallyn <serue@us.ibm.com> Cc: Daniel Lezcano <dlezcano@fr.ibm.com> Cc: Cedric Le Goater <clg@fr.ibm.com> Cc: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28kernel: add CLONE_IO to specifically request sharing of IO contextsJens Axboe
syslets (or other threads/processes that want io context sharing) can set this to enforce sharing of io context. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-01-28io context sharing: preliminary supportJens Axboe
Detach task state from ioc, instead keep track of how many processes are accessing the ioc. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-01-28ioprio: move io priority from task_struct to io_contextJens Axboe
This is where it belongs and then it doesn't take up space for a process that doesn't do IO. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-01-27printk: revert ktime_get() timestampsIngo Molnar
revert 19ef9309273d26cb005cb23e6a370353dca91099. Kevin Winchester reported a lockup during X startup an bisected it to this commit. Reported-by: Kevin Winchester <kjwinchester@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-26[S390] Remove appldata include from sysctl_check.cHeiko Carstens
Forgot to remove this when removing the appldata binary sysctls. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-01-25Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (200 commits) [SCSI] usbstorage: use last_sector_bug flag universally [SCSI] libsas: abstract STP task status into a function [SCSI] ultrastor: clean up inline asm warnings [SCSI] aic7xxx: fix firmware build [SCSI] aacraid: fib context lock for management ioctls [SCSI] ch: remove forward declarations [SCSI] ch: fix device minor number management bug [SCSI] ch: handle class_device_create failure properly [SCSI] NCR5380: fix section mismatch [SCSI] sg: fix /proc/scsi/sg/devices when no SCSI devices [SCSI] IB/iSER: add logical unit reset support [SCSI] don't use __GFP_DMA for sense buffers if not required [SCSI] use dynamically allocated sense buffer [SCSI] scsi.h: add macro for enclosure bit of inquiry data [SCSI] sd: add fix for devices with last sector access problems [SCSI] fix pcmcia compile problem [SCSI] aacraid: add Voodoo Lite class of cards. [SCSI] aacraid: add new driver features flags [SCSI] qla2xxx: Update version number to 8.02.00-k7. [SCSI] qla2xxx: Issue correct MBC_INITIALIZE_FIRMWARE command. ...
2008-01-25sched: keep total / count stats in addition to the max forArjan van de Ven
Right now, the linux kernel (with scheduler statistics enabled) keeps track of the maximum time a process is waiting to be scheduled. While the maximum is a very useful metric, tracking average and total is equally useful (at least for latencytop) to figure out the accumulated effect of scheduler delays. The accumulated effect is important to judge the performance impact of scheduler tuning/behavior. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-25sched: fix: don't take a mutex from interrupt contextPeter Zijlstra
print_cfs_stats is callable from interrupt context (sysrq), hence it should not take mutexes. Change it to use RCU since the task group data is RCU freed anyway. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-25sched: print backtrace of running tasks tooNick Piggin
The attached patch is something really simple that can sometimes help in getting more info out of a hung system. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-25printk: use ktime_get()Ingo Molnar
printk timestamps: use ktime_get(). Some platforms have a functioning clocksource function only after they are done with early bootup, so delay this until out of SYSTEM_BOOTING state. it's also inherently safe now, as any bugs in this area will be caught by the printk recursion checks. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-25softlockup: fix signednessIngo Molnar
fix softlockup tunables signedness. mark tunables read-mostly. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-25sched: latencytop supportArjan van de Ven
LatencyTOP kernel infrastructure; it measures latencies in the scheduler and tracks it system wide and per process. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-25sched: fix goto retry in pick_next_task_rt()Dmitry Adamushko
looking at it one more time: (1) it looks to me that there is no need to call sched_rt_ratio_exceeded() from pick_next_rt_entity() - [ for CONFIG_FAIR_GROUP_SCHED ] queues with rt_rq->rt_throttled are not within this 'tree-like hierarchy' (or whatever we should call it :-) - there is also no need to re-check 'rt_rq->rt_time > ratio' at this point as 'rt_rq->rt_time' couldn't have been increased since the last call to update_curr_rt() (which obviously calls sched_rt_ratio_esceeded()) well, it might be that 'ratio' for this rt_rq has been re-configured (and the period over which this rt_rq was active has not yet been finished)... but I don't think we should really take this into account. (2) now pick_next_rt_entity() must never return NULL, so let's change pick_next_task_rt() accordingly. Signed-off-by: Dmitry Adamushko <dmitry.adamushko@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-25sched: monitor clock underflows in /proc/sched_debugGuillaume Chazarain
We monitor clock overflows, let's also monitor clock underflows. Signed-off-by: Guillaume Chazarain <guichaz@yahoo.fr> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-25sched: fix rq->clock warps on frequency changesGuillaume Chazarain
sched: fix rq->clock warps on frequency changes Fix 2bacec8c318ca0418c0ee9ac662ee44207765dd4 (sched: touch softlockup watchdog after idling) that reintroduced warps on frequency changes. touch_softlockup_watchdog() calls __update_rq_clock that checks rq->clock for warps, so call it after adjusting rq->clock. Signed-off-by: Guillaume Chazarain <guichaz@yahoo.fr> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-25sched: fix, always create kernel threads with normal priorityMichal Schmidt
Ensure that the kernel threads are created with the usual nice level and affinity even if kthreadd's properties were changed from the default by root. Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-25debug: clean up kernel/profile.cPaolo Ciarrocchi
Before: total: 25 errors, 13 warnings, 602 lines checked After: total: 0 errors, 2 warnings, 601 lines checked No code changed: kernel/profile.o: text data bss dec hex filename 3048 236 24 3308 cec profile.o.before 3048 236 24 3308 cec profile.o.after md5: 2501d64748a4d350dffb11203e2a5182 profile.o.before.asm 2501d64748a4d350dffb11203e2a5182 profile.o.after.asm Signed-off-by: Paolo Ciarrocchi <paolo.ciarrocchi@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-25sched: remove the !PREEMPT_BKL codeIngo Molnar
remove the !PREEMPT_BKL code. this removes 160 lines of legacy code. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-25sched: make PREEMPT_BKL the defaultIngo Molnar
make PREEMPT_BKL the default. precursor to removal of the !PREEMPT_BKL code. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-25debug: track and print last unloaded module in the oops traceArjan van de Ven
Based on a suggestion from Andi: In various cases, the unload of a module may leave some bad state around that causes a kernel crash AFTER a module is unloaded; and it's then hard to find which module caused that. This patch tracks the last unloaded module, and prints this as part of the module list in the oops trace. Right now, only the last 1 module is tracked; I expect that this is enough for the vast majority of cases where this information matters; if it turns out that tracking more is important, we can always extend it to that. [ mingo@elte.hu: build fix ] Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-25debug: show being-loaded/being-unloaded indicator for modulesArjan van de Ven
It's rather common that an oops/WARN_ON/BUG happens during the load or unload of a module. Unfortunatly, it's not always easy to see directly which module is being loaded/unloaded from the oops itself. Worse, it's not even always possible to ask the bug reporter, since there are so many components (udev etc) that auto-load modules that there's a good chance that even the reporter doesn't know which module this is. This patch extends the existing "show if it's tainting" print code, which is used as part of printing the modules in the oops/BUG/WARN_ON to include a "+" for "being loaded" and a "-" for "being unloaded". As a result this extension, the "taint_flags()" function gets renamed to "module_flags()" (and takes a module struct as argument, not a taint flags int). Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-25sched: rt-watchdog: fix .rlim_max = RLIM_INFINITYPeter Zijlstra
Remove the curious logic to set it_sched_expires in the future. It useless because rt.timeout wouldn't be incremented anyway. Explicity check for RLIM_INFINITY as a test programm that had a 1s soft limit and a inf hard limit would SIGKILL at 1s. This is because RLIM_INFINITY+d-1 is d-2. Signed-off-by: Peter Zijlsta <a.p.zijlstra@chello.nl> CC: Michal Schmidt <mschmidt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-25sched: rt-group: reduce reschedulingPeter Zijlstra
Only reschedule if the new group has a higher prio task. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-25hrtimer: unlock hrtimer_wakeupPeter Zijlstra
hrtimer_wakeup creates a base->lock rq->lock lock dependancy. Avoid this by switching to HRTIMER_CB_IRQSAFE_NO_SOFTIRQ which doesn't hold base->lock. This fully untangles hrtimer locks from the scheduler locks, and allows hrtimer usage in the scheduler proper. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-25hrtimer: fixup the HRTIMER_CB_IRQSAFE_NO_SOFTIRQ fallbackPeter Zijlstra
Currently all highres=off timers are run from softirq context, but HRTIMER_CB_IRQSAFE_NO_SOFTIRQ timers expect to run from irq context. Fix this up by splitting it similar to the highres=on case. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-25hrtimer: clean up cpu->base locking tricksPeter Zijlstra
In order to more easily allow for the scheduler to use timers, clean up the locking a bit. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>