Age | Commit message | Author
2006-12-10 | [PATCH] md: remove some old ifdefed-out code from raid5.c | NeilBrown
There are some vestiges of old code that was used for bypassing the stripe cache on reads in raid5.c. This was never updated after the change from buffer_heads to bios, but was left as a reminder. That functionality has now been implemented in a completely different way, so the old code can go. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-10 | [PATCH] MD: conditionalize some code | Jeff Garzik
The autorun code is only used if this module is built into the static kernel image. Adjust #ifdefs accordingly. Signed-off-by: Jeff Garzik <jeff@garzik.org> Acked-by: NeilBrown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-10 | [PATCH] md: fix innocuous bug in raid6 stripe_to_pdidx | NeilBrown
stripe_to_pdidx finds the index of the parity disk for a given stripe. It assumes raid5 in that it uses "disks-1" to determine the number of data disks. This is incorrect for raid6 but fortunately the two usages cancel each other out. The only way that 'data_disks' affects the calculation of pd_idx in raid5_compute_sector is when it is divided into the sector number. But as that sector number is calculated by multiplying in the wrong value of 'data_disks' the division produces the right value. So it is innocuous but needs to be fixed. Also change the calculation of raid_disks in compute_blocknr to make it more obviously correct (it seems at first to always use disks-1 too). Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
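The cancellation described above can be illustrated with a small sketch. This is a simplified model, not the raid5.c code: the function name and parameters are hypothetical, and it only shows that multiplying a stripe index by some data_disks value and then dividing by the same value returns the original stripe index, whichever value is used.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Simplified model (an assumption, not the raid5.c code): build a sector
 * number from a stripe index using some data_disks value, then recover the
 * stripe index by dividing with the same value.
 */
uint64_t stripe_round_trip(uint64_t stripe, unsigned chunk_offset,
                           unsigned sectors_per_chunk, unsigned data_disks)
{
    assert(data_disks > 0 && chunk_offset < sectors_per_chunk);

    /* Multiply in data_disks (possibly the wrong count)... */
    uint64_t r_sector = stripe * data_disks * sectors_per_chunk + chunk_offset;

    /* ...and divide it back out, as a sector-to-stripe mapping would. */
    uint64_t chunk_number = r_sector / sectors_per_chunk;
    return chunk_number / data_disks;
}
```

For a hypothetical 6-disk raid6 array, passing 5 (disks-1) or 4 (disks-2) as data_disks gives back the same stripe index, which is why the wrong value did no harm here.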
2006-12-10 | [PATCH] md: enable bypassing cache for reads | Raz Ben-Jehuda(caro)
Call the chunk_aligned_read where appropriate. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-10 | [PATCH] md: allow reads that have bypassed the cache to be retried on failure | Raz Ben-Jehuda(caro)
If a bypass-the-cache read fails, we simply try again through the cache. If it fails again it will trigger normal recovery procedures. update 1: From: NeilBrown <neilb@suse.de> 1/ chunk_aligned_read and retry_aligned_read assume that data_disks == raid_disks - 1 which is not true for raid6. So when an aligned read request bypasses the cache, we can get the wrong data. 2/ The cloned bio is being used-after-free in raid5_align_endio (to test BIO_UPTODATE). 3/ We forgot to add rdev->data_offset when submitting a bio for aligned-read. 4/ clone_bio calls blk_recount_segments and then we change bi_bdev, so we need to invalidate the segment counts. 5/ We don't de-reference the rdev when the read completes. This means we need to record the rdev so it is still available in the end_io routine. Fortunately bi_next in the original bio is unused at this point so we can stuff it in there. 6/ We leak a cloned bio if the target rdev is not usable. From: NeilBrown <neilb@suse.de> update 2: 1/ When aligned requests fail (read error) they need to be retried via the normal method (stripe cache). As we cannot be sure that we can process a single read in one go (we may not be able to allocate all the stripes needed) we store a bio-being-retried and a list of bios-that-still-need-to-be-retried. When we find a bio that needs to be retried, we should add it to the list, not to single-bio... 2/ We were never incrementing 'scnt' when resubmitting failed aligned requests. [akpm@osdl.org: build fix] Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-10 | [PATCH] md: handle bypassing the read cache (assuming nothing fails) | Raz Ben-Jehuda(caro)
Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-10 | [PATCH] md: define raid5_mergeable_bvec | Raz Ben-Jehuda(caro)
This will encourage read requests to be on only one device, so we will often be able to bypass the cache for read requests. Signed-off-by: Neil Brown <neilb@suse.de> Cc: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-10 | [PATCH] md: tidy up device-change notification when an md array is stopped | NeilBrown
An md array can be stopped leaving all the settings still in place, or it can be torn down and destroyed. set_capacity and other change notifications only happen in the latter case, but should happen in both. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-10 | [PATCH] Fbdev driver for IBM GXT4500P videocards | Paul Mackerras
This is an fbdev driver for the IBM GXT4500P display card found in some IBM System P (pSeries) machines. These cards have hardware 2D and 3D capabilities, but the driver does not use them; it just exports a dumb framebuffer. Signed-off-by: Paul Mackerras <paulus@samba.org> Acked-by: James Simmons <jsimmons@infradead.org> Cc: "Antonino A. Daplas" <adaplas@pol.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-10 | [PATCH] ide-cd: Handle strange interrupt on the Intel ESB2 | Alan Cox
The ESB2 appears to emit spurious DMA interrupts when configured for native mode and handling ATAPI devices. Stratus were able to pin this bug down and produce a patch. This is a rework which applies the fixup only to the ESB2 (for now). We can apply it to other chips later if the same problem is found. This code has been tested and confirmed to fix the problem on the tested systems. Signed-off-by: Alan Cox <alan@redhat.com> (Most of the hard work done by Stratus however) Cc: Jens Axboe <axboe@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-10 | [PATCH] kernel/sched.c: whitespace cleanups | Miguel Ojeda Sandonis
[akpm@osdl.org: additional cleanups] Signed-off-by: Miguel Ojeda Sandonis <maxextreme@gmail.com> Acked-by: Ingo Molnar <mingo@elte.hu> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-10 | [PATCH] sched: optimize activate_task for RT task | Chen, Kenneth W
An RT task does not participate in interactiveness priority and thus shouldn't be bothered with timestamp and p->sleep_type manipulation when the task is being put on the run queue. Bypass all of them with a single if (rt_task) test. Signed-off-by: Ken Chen <kenneth.w.chen@intel.com> Acked-by: Ingo Molnar <mingo@elte.hu> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-10 | [PATCH] sched: remove lb_stopbalance counter | Chen, Kenneth W
Remove scheduler stats lb_stopbalance counter. This counter can be calculated by: lb_balanced - lb_nobusyg - lb_nobusyq. There is no need to create a gazillion counters when we can derive the value. Signed-off-by: Ken Chen <kenneth.w.chen@intel.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
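The derivation given in this message can be written out as a short sketch. The struct and function names are hypothetical stand-ins; only the three counter names and the formula come from the message.

```c
#include <assert.h>

/* Hypothetical container for the three remaining schedstat counters. */
struct lb_stats_sketch {
    unsigned long lb_balanced;
    unsigned long lb_nobusyg;
    unsigned long lb_nobusyq;
};

/* Derive the removed counter: lb_balanced - lb_nobusyg - lb_nobusyq. */
unsigned long derived_lb_stopbalance(const struct lb_stats_sketch *s)
{
    /* Guard against unsigned underflow on inconsistent input. */
    assert(s->lb_balanced >= s->lb_nobusyg + s->lb_nobusyq);
    return s->lb_balanced - s->lb_nobusyg - s->lb_nobusyq;
}
```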
2006-12-10 | [PATCH] sched: decrease number of load balances | Siddha, Suresh B
Currently at a particular domain, each cpu in the sched group will do a load balance at the frequency of balance_interval. The more cores and threads there are, the more cpus will be in each sched group at the SMP and NUMA domains, and we end up spending quite a bit of time doing load balancing in those domains. Fix this by making only one cpu (the first idle cpu, or the first cpu in the group if all the cpus are busy) in the sched group do the load balance at that particular sched domain; this load will slowly percolate down to the other cpus within that group (when they do load balancing at lower domains). Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Christoph Lameter <clameter@engr.sgi.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
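The selection rule in this message (first idle cpu in the group, otherwise the first cpu of the group) can be sketched as follows. This is an illustration under assumed data structures, not the scheduler's code: the array-based group and the idle flags are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Pick the one cpu in a group that should run the load balance:
 * the first idle cpu, or the first cpu of the group if all are busy.
 * group_cpus lists cpu ids in group order; cpu_is_idle is indexed by cpu id.
 */
int pick_balance_cpu(const int *group_cpus, size_t n, const bool *cpu_is_idle)
{
    assert(n > 0);
    for (size_t i = 0; i < n; i++) {
        if (cpu_is_idle[group_cpus[i]])
            return group_cpus[i];
    }
    return group_cpus[0];
}

/* Every other cpu in the group skips balancing at this domain. */
bool should_balance_here(int this_cpu, const int *group_cpus, size_t n,
                         const bool *cpu_is_idle)
{
    return pick_balance_cpu(group_cpus, n, cpu_is_idle) == this_cpu;
}
```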
2006-12-10 | [PATCH] sched: improve migration accuracy | Mike Galbraith
Co-opt rq->timestamp_last_tick to maintain a cache_hot_time evaluation reference timestamp at both tick and sched times to prevent said reference, formerly rq->timestamp_last_tick, from being behind task->last_ran at evaluation time, and to move said reference closer to current time on the remote processor, intent being to improve cache hot evaluation and timestamp adjustment accuracy for task migration. Fix minor sched_time double accounting error which occurs when a task passing through schedule() does not schedule off, and takes the next timer tick. [kenneth.w.chen@intel.com: cleanup] Signed-off-by: Mike Galbraith <efault@gmx.de> Acked-by: Ingo Molnar <mingo@elte.hu> Acked-by: Ken Chen <kenneth.w.chen@intel.com> Cc: Don Mullis <dwm@meer.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-10 | [PATCH] sched: add option to serialize load balancing | Christoph Lameter
Large sched domains can be very expensive to scan. Add an option SD_SERIALIZE to the sched domain flags. If that flag is set then we make sure that no other such domain is being balanced. [akpm@osdl.org: build fix] Signed-off-by: Christoph Lameter <clameter@sgi.com> Cc: Peter Williams <pwil3058@bigpond.net.au> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Christoph Lameter <clameter@sgi.com> Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com> Cc: "Chen, Kenneth W" <kenneth.w.chen@intel.com> Acked-by: Ingo Molnar <mingo@elte.hu> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-10 | [PATCH] sched: call tasklet less frequently | Christoph Lameter
Trigger softirq less frequently. We trigger the softirq before this patch using an offset of sd->interval. However, if the queue is busy then it is sufficient to schedule the softirq with sd->interval * busy_factor. So we modify the calculation of the next time to balance by taking the interval added to last_balance again. This is only the right value if the idle/busy situation continues as is. There are two potential trouble spots: - If the queue was idle and now gets busy then we call rebalance early. However, that is not a problem because we will then use the longer interval for the next period. - If the queue was busy and becomes idle then we potentially wait too long before rebalancing. However, when the task goes idle then idle_balance is called. We add another calculation of the next balance time based on sd->interval in idle_balance so that we will rebalance soon. V2->V3: - Calculate rebalance time based on current jiffies and not based on the jiffies at the last time we load balanced. We no longer rely on staggering and therefore we can afford to do this now. V3->V4: - Use functions to do jiffy comparisons. Signed-off-by: Christoph Lameter <clameter@sgi.com> Cc: Peter Williams <pwil3058@bigpond.net.au> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Christoph Lameter <clameter@sgi.com> Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com> Cc: "Chen, Kenneth W" <kenneth.w.chen@intel.com> Acked-by: Ingo Molnar <mingo@elte.hu> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
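Two points from this message lend themselves to a sketch: the busy queue uses interval * busy_factor, and jiffy values are compared with helper functions so that counter wraparound is handled. The code below is a userspace illustration with hypothetical names; the comparison follows the well-known signed-difference idiom, and all values are assumed to be in jiffies already.

```c
#include <assert.h>
#include <stdbool.h>

/* True if time a is after time b, even if the counter wrapped in between. */
bool jiffies_after(unsigned long a, unsigned long b)
{
    return (long)(b - a) < 0;
}

/*
 * Next time to balance, based on the current time: an idle queue uses the
 * plain interval, a busy queue stretches it by busy_factor.
 */
unsigned long next_balance_time(unsigned long now, unsigned long interval,
                                unsigned int busy_factor, bool idle)
{
    assert(interval > 0 && busy_factor > 0);
    if (!idle)
        interval *= busy_factor;
    return now + interval;
}
```

A plain `a > b` comparison would give the wrong answer once the counter wraps past its maximum, which is the case the helper is meant to cover.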
2006-12-10 | [PATCH] sched: use softirq for load balancing | Christoph Lameter
Call rebalance_tick (renamed to run_rebalance_domains) from a newly introduced softirq. We calculate the earliest time for each layer of sched domains to be rescanned (this is the rescan time for idle) and use the earliest of those to schedule the softirq via a new field "next_balance" added to struct rq. Signed-off-by: Christoph Lameter <clameter@sgi.com> Cc: Peter Williams <pwil3058@bigpond.net.au> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Christoph Lameter <clameter@sgi.com> Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com> Cc: "Chen, Kenneth W" <kenneth.w.chen@intel.com> Acked-by: Ingo Molnar <mingo@elte.hu> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
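Choosing the earliest rescan time across the layers of sched domains might look like the sketch below. The struct is a hypothetical stand-in for a sched domain (only a last-balanced time and an interval, both in jiffies), and the comparison is written to tolerate counter wraparound.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-domain balancing state. */
struct domain_sketch {
    unsigned long last_balance; /* when this domain was last balanced */
    unsigned long interval;     /* rescan interval for this domain */
};

/* Earliest due time over all domains; this would feed rq->next_balance. */
unsigned long earliest_next_balance(const struct domain_sketch *d, size_t n)
{
    assert(n > 0);
    unsigned long earliest = d[0].last_balance + d[0].interval;

    for (size_t i = 1; i < n; i++) {
        unsigned long due = d[i].last_balance + d[i].interval;
        /* Wrap-safe test for "due is before earliest". */
        if ((long)(due - earliest) < 0)
            earliest = due;
    }
    return earliest;
}
```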
2006-12-10 | [PATCH] sched: move idle status calculation into rebalance_tick() | Christoph Lameter
Perform the idle state determination in rebalance_tick. If we separate balancing from sched_tick then we also need to determine the idle state in rebalance_tick. V2->V3 Remove useless idle != 0 check. Checking nr_running seems to be sufficient. Thanks Suresh. Signed-off-by: Christoph Lameter <clameter@sgi.com> Cc: Peter Williams <pwil3058@bigpond.net.au> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Christoph Lameter <clameter@sgi.com> Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com> Cc: "Chen, Kenneth W" <kenneth.w.chen@intel.com> Acked-by: Ingo Molnar <mingo@elte.hu> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-10 | [PATCH] sched: extract load calculation from rebalance_tick | Christoph Lameter
A load calculation is always done in rebalance_tick() in addition to the real load balancing activities that only take place when certain jiffie counts have been reached. Move that processing into a separate function and call it directly from scheduler_tick(). Also extract the time slice handling from scheduler_tick and put it into a separate function. Then we can clean up scheduler_tick significantly. It will no longer have any gotos. Signed-off-by: Christoph Lameter <clameter@sgi.com> Cc: Peter Williams <pwil3058@bigpond.net.au> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Christoph Lameter <clameter@sgi.com> Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com> Cc: "Chen, Kenneth W" <kenneth.w.chen@intel.com> Acked-by: Ingo Molnar <mingo@elte.hu> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-10 | [PATCH] sched: disable interrupts for locking in load_balance() | Christoph Lameter
Interrupts must be disabled for runqueue locks if we want to run load_balance() with interrupts enabled. Signed-off-by: Christoph Lameter <clameter@sgi.com> Cc: Peter Williams <pwil3058@bigpond.net.au> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Christoph Lameter <clameter@sgi.com> Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com> Cc: "Chen, Kenneth W" <kenneth.w.chen@intel.com> Acked-by: Ingo Molnar <mingo@elte.hu> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-10 | [PATCH] sched: remove staggering of load balancing | Christoph Lameter
Timer interrupts already are staggered. We do not need an additional layer of time staggering for short load balancing actions that take a reasonably small portion of the time slice. For load balancing on large sched_domains we will add a serialization later that avoids concurrent load balance operations and thus has the same effect as load staggering. Signed-off-by: Christoph Lameter <clameter@sgi.com> Cc: Peter Williams <pwil3058@bigpond.net.au> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Christoph Lameter <clameter@sgi.com> Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com> Cc: "Chen, Kenneth W" <kenneth.w.chen@intel.com> Acked-by: Ingo Molnar <mingo@elte.hu> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-10 | [PATCH] sched: avoid taking rq lock in wake_priority_sleeper | Christoph Lameter
Avoid taking the runqueue lock in wake_priority_sleeper if there are no running processes. Signed-off-by: Christoph Lameter <clameter@sgi.com> Cc: Peter Williams <pwil3058@bigpond.net.au> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Christoph Lameter <clameter@sgi.com> Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com> Cc: "Chen, Kenneth W" <kenneth.w.chen@intel.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-10 | [PATCH] sched domain: increase the SMT busy rebalance interval | Siddha, Suresh B
With SMT, if the logical processor is busy, load balance happens every 8msec(min)-16msec(max). There is no need to do this so often, as this is just for fairness (to maintain uniform runqueue lengths) and the default time slice is 100msec anyhow. The appended patch increases this interval to 64msec(min)-128msec(max) when the logical processor is busy. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-10 | [PATCH] move_task_off_dead_cpu() should be called with disabled ints | Kirill Korotaev
move_task_off_dead_cpu() requires interrupts to be disabled, while migrate_dead() calls it with enabled interrupts. Added appropriate comments to functions and added BUG_ON(!irqs_disabled()) into double_rq_lock() and double_lock_balance() which are the original sources of such bugs. Signed-off-by: Kirill Korotaev <dev@openvz.org> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-10 | [PATCH] sched domain: move sched group allocations to percpu area | Siddha, Suresh B
Move the sched group allocations to percpu area. This will minimize cross node memory references and also cleans up the sched groups allocation for allnodes sched domain. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Acked-by: Ingo Molnar <mingo@elte.hu> Acked-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-10 | [PATCH] sched.c: correct comment for this_rq_lock() | Robert P. J. Day
Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Robert P. J. Day <rpjday@mindspring.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-10 | [PATCH] Don't build some broken ISDN drivers on big endian MIPS | Ralf Baechle
Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Cc: Karsten Keil <kkeil@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-10 | [PATCH] io-accounting: add to getdelays | Andrew Morton
Wire up the IO accounting into getdelays.c. Usage:

To display I/O stats for each exiting task:

    vmm:/home/akpm> ./getdelays -m0,1,2,3 -i -l
    cpumask 0 maskset 1
    printing IO accounting
    listen forever
    rm: read=8192, write=0, cancelled_write=0
    cvs: read=733184, write=4255744, cancelled_write=4096
    make: read=217088, write=0, cancelled_write=0
    cc1: read=4263936, write=12288, cancelled_write=0
    as: read=811008, write=8192, cancelled_write=0
    gcc: read=323584, write=0, cancelled_write=12288
    cc1: read=0, write=8192, cancelled_write=0
    as: read=4096, write=4096, cancelled_write=0
    gcc: read=16384, write=0, cancelled_write=4096
    as: read=4096, write=4096, cancelled_write=0
    gcc: read=16384, write=0, cancelled_write=8192
    ld: read=1011712, write=16384, cancelled_write=0
    collect2: read=626688, write=0, cancelled_write=0
    gcc: read=204800, write=0, cancelled_write=0
    cc1: read=0, write=8192, cancelled_write=0
    as: read=4096, write=4096, cancelled_write=0
    gcc: read=16384, write=0, cancelled_write=8192
    ld: read=8192, write=16384, cancelled_write=0
    collect2: read=49152, write=0, cancelled_write=0
    gcc: read=0, write=0, cancelled_write=0
    cc1: read=0, write=4096, cancelled_write=0
    ld: read=4096, write=12288, cancelled_write=0
    collect2: read=49152, write=0, cancelled_write=0
    gcc: read=0, write=0, cancelled_write=0

To display I/O stats for a particular presently-running task:

    vmm:/home/akpm> ./getdelays -i -p $(pidof crond)
    printing IO accounting
    crond: read=61440, write=0, cancelled_write=0

Cc: Jay Lan <jlan@sgi.com> Cc: Shailabh Nagar <nagar@watson.ibm.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Chris Sturtivant <csturtiv@sgi.com> Cc: Tony Ernst <tee@sgi.com> Cc: Guillaume Thouvenin <guillaume.thouvenin@bull.net> Cc: David Wright <daw@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-10 | [PATCH] getdelays: various fixes | Andrew Morton
- Various cleanups - Report errors to stderr, not stdout - A printf was missing a \n and was hiding from me. Cc: Jay Lan <jlan@sgi.com> Cc: Shailabh Nagar <nagar@watson.ibm.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Chris Sturtivant <csturtiv@sgi.com> Cc: Tony Ernst <tee@sgi.com> Cc: Guillaume Thouvenin <guillaume.thouvenin@bull.net> Cc: David Wright <daw@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-10 | [PATCH] io-accounting: via taskstats | Andrew Morton
Deliver IO accounting via taskstats. Cc: Jay Lan <jlan@sgi.com> Cc: Shailabh Nagar <nagar@watson.ibm.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Chris Sturtivant <csturtiv@sgi.com> Cc: Tony Ernst <tee@sgi.com> Cc: Guillaume Thouvenin <guillaume.thouvenin@bull.net> Cc: David Wright <daw@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-10 | [PATCH] cleanup taskstats.h | Andrew Morton
Fix weird whitespace mangling in taskstats.h Cc: Jay Lan <jlan@sgi.com> Cc: Shailabh Nagar <nagar@watson.ibm.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Chris Sturtivant <csturtiv@sgi.com> Cc: Tony Ernst <tee@sgi.com> Cc: Guillaume Thouvenin <guillaume.thouvenin@bull.net> Cc: David Wright <daw@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-10 | [PATCH] io-accounting: report in procfs | Andrew Morton
Add a simple /proc/pid/io to show the IO accounting fields. Maybe this shouldn't be merged in mainline - the preferred reporting channel is taskstats. But given the poor state of our userspace support for taskstats, this is useful for developer-testing, at least. And it improves the chances that the procps developers will wire it up into top(1). Opinions are sought. The patch also wires up the existing IO-accounting fields. It's a bit racy on 32-bit machines: if process A reads process B's /proc/pid/io while process B is updating one of those 64-bit counters, process A could see an intermediate result. Cc: Jay Lan <jlan@sgi.com> Cc: Shailabh Nagar <nagar@watson.ibm.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Chris Sturtivant <csturtiv@sgi.com> Cc: Tony Ernst <tee@sgi.com> Cc: Guillaume Thouvenin <guillaume.thouvenin@bull.net> Cc: David Wright <daw@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-10 | [PATCH] io-accounting: direct-io | Andrew Morton
Account for direct-io. Cc: Jay Lan <jlan@sgi.com> Cc: Shailabh Nagar <nagar@watson.ibm.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Chris Sturtivant <csturtiv@sgi.com> Cc: Tony Ernst <tee@sgi.com> Cc: Guillaume Thouvenin <guillaume.thouvenin@bull.net> Cc: David Wright <daw@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-10 | [PATCH] io-accounting-read-accounting cifs fix | Andrew Morton
CIFS implements ->readpages and doesn't use read_cache_pages(). So wire the read IO accounting up within CIFS. Cc: Jay Lan <jlan@sgi.com> Cc: Shailabh Nagar <nagar@watson.ibm.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Chris Sturtivant <csturtiv@sgi.com> Cc: Tony Ernst <tee@sgi.com> Cc: Guillaume Thouvenin <guillaume.thouvenin@bull.net> Cc: Steven French <sfrench@us.ibm.com> Cc: David Wright <daw@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-10 | [PATCH] io-accounting-read-accounting nfs fix | Andrew Morton
nfs's ->readpages uses read_cache_pages(). Wire it up there. [wfg@mail.ustc.edu.cn: account only successful nfs/fuse reads] Cc: Jay Lan <jlan@sgi.com> Cc: Shailabh Nagar <nagar@watson.ibm.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Chris Sturtivant <csturtiv@sgi.com> Cc: Tony Ernst <tee@sgi.com> Cc: Guillaume Thouvenin <guillaume.thouvenin@bull.net> Cc: David Wright <daw@sgi.com> Signed-off-by: Fengguang Wu <wfg@mail.ustc.edu.cn> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-10 | [PATCH] io-accounting: read accounting | Andrew Morton
Wire up read accounting for block devices, within submit_bio(). Cc: Jay Lan <jlan@sgi.com> Cc: Shailabh Nagar <nagar@watson.ibm.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Chris Sturtivant <csturtiv@sgi.com> Cc: Tony Ernst <tee@sgi.com> Cc: Guillaume Thouvenin <guillaume.thouvenin@bull.net> Cc: David Wright <daw@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-10 | [PATCH] io-accounting: write-cancel accounting | Andrew Morton
Account for the number of byte writes which this process caused to not happen after all. Cc: Jay Lan <jlan@sgi.com> Cc: Shailabh Nagar <nagar@watson.ibm.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Chris Sturtivant <csturtiv@sgi.com> Cc: Tony Ernst <tee@sgi.com> Cc: Guillaume Thouvenin <guillaume.thouvenin@bull.net> Cc: David Wright <daw@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-10 | [PATCH] io-accounting: write accounting | Andrew Morton
Accounting writes is fairly simple: whenever a process flips a page from clean to dirty, we accuse it of having caused a write to underlying storage of PAGE_CACHE_SIZE bytes. This may overestimate the amount of writing: the page-dirtying may cause only one buffer_head's worth of writeout. Fixing that is possible, but probably a bit messy and isn't obviously important. Cc: Jay Lan <jlan@sgi.com> Cc: Shailabh Nagar <nagar@watson.ibm.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Chris Sturtivant <csturtiv@sgi.com> Cc: Tony Ernst <tee@sgi.com> Cc: Guillaume Thouvenin <guillaume.thouvenin@bull.net> Cc: David Wright <daw@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-10 | [PATCH] clean up __set_page_dirty_nobuffers() | Andrew Morton
Save a tabstop in __set_page_dirty_nobuffers() and __set_page_dirty_buffers() and a few other places. No functional changes. Cc: Jay Lan <jlan@sgi.com> Cc: Shailabh Nagar <nagar@watson.ibm.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Chris Sturtivant <csturtiv@sgi.com> Cc: Tony Ernst <tee@sgi.com> Cc: Guillaume Thouvenin <guillaume.thouvenin@bull.net> Cc: David Wright <daw@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-10 | [PATCH] io-accounting: core statistics | Andrew Morton
The present per-task IO accounting isn't very useful. It simply counts the number of bytes passed into read() and write(). So if a process reads 1MB from an already-cached file, it is accused of having performed 1MB of I/O, which is wrong.

(David Wright had some comments on the applicability of the present logical IO accounting:

For billing purposes it is useless but for workload analysis it is very useful

    read_bytes/read_calls     average read request size
    write_bytes/write_calls   average write request size
    read_bytes/read_blocks    ie logical/physical can indicate hit rate or thrashing
    write_bytes/write_blocks  ie logical/physical guess since pdflush writes can be missed

I often look for logical larger than physical to see filesystem cache problems. And the bytes/cpusec can help find applications that are dominating the cache and causing slow interactive response from page cache contention. I want to find the IO intensive applications and make sure they are doing efficient IO. Thus the acctcms(sysV) or csacms command would give the high IO commands).

This patchset adds new accounting which tries to be more accurate. We account for three things:

reads: attempt to count the number of bytes which this process really did cause to be fetched from the storage layer. Done at the submit_bio() level, so it is accurate for block-backed filesystems. I also attempt to wire up NFS and CIFS.

writes: attempt to count the number of bytes which this process caused to be sent to the storage layer. This is done at page-dirtying time. The big inaccuracy here is truncate. If a process writes 1MB to a file and then deletes the file, it will in fact perform no writeout. But it will have been accounted as having caused 1MB of write. So...

cancelled_writes: account the number of bytes which this process caused to not happen, by truncating pagecache. We _could_ just subtract this from the process's `write' accounting. But that means that some processes would be reported to have done negative amounts of write IO, which is silly. So we just report the raw number and punt this decision up to userspace.

Now, we _could_ account for writes at the physical I/O level. But

- This would require that we track memory-dirtying tasks at the per-page level (would require a new pointer in struct page).
- It would mean that IO statistics for a process are usually only available long after that process has exited. Which means that we probably cannot communicate this info via taskstats.

This patch: Wire up the kernel-private data structures and the accessor functions to manipulate them.

Cc: Jay Lan <jlan@sgi.com> Cc: Shailabh Nagar <nagar@watson.ibm.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Chris Sturtivant <csturtiv@sgi.com> Cc: Tony Ernst <tee@sgi.com> Cc: Guillaume Thouvenin <guillaume.thouvenin@bull.net> Cc: David Wright <daw@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
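The three counters described in this message can be modelled in a few lines to show why the raw cancelled-write figure is reported rather than subtracted. This is a userspace sketch with hypothetical names and an assumed 4096-byte page; it is not the kernel's data structure or accessor set.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u /* assumed page size for the example */

/* Hypothetical per-task counters mirroring the three things accounted. */
struct io_acct_sketch {
    uint64_t read_bytes;
    uint64_t write_bytes;
    uint64_t cancelled_write_bytes;
};

/* Reads are charged when they are submitted to the storage layer. */
void acct_read(struct io_acct_sketch *acct, uint64_t bytes)
{
    assert(acct != NULL);
    acct->read_bytes += bytes;
}

/* Writes are charged a full page when a page goes from clean to dirty. */
void acct_page_dirtied(struct io_acct_sketch *acct)
{
    assert(acct != NULL);
    acct->write_bytes += SKETCH_PAGE_SIZE;
}

/* Truncating a dirty page charges the truncating task a cancelled write. */
void acct_dirty_page_truncated(struct io_acct_sketch *acct)
{
    assert(acct != NULL);
    acct->cancelled_write_bytes += SKETCH_PAGE_SIZE;
}

/* What subtracting in the kernel would have reported; may go negative. */
int64_t net_write_bytes(const struct io_acct_sketch *acct)
{
    assert(acct != NULL);
    return (int64_t)acct->write_bytes - (int64_t)acct->cancelled_write_bytes;
}
```

If one task dirties 256 pages and a different task then truncates the file, the second task ends up with zero writes and 1MB of cancelled writes, so a subtracted figure would be negative. Reporting both raw numbers leaves that policy to userspace.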
2006-12-10 | [PATCH] pdc202xx_new: fix PLL/timing issues | Sergei Shtylyov
Fix the CRC errors in the higher UltraDMA modes with the Promise PDC20268 and newer chips that always occur on non-x86 machines and when there are more than 2 adapters on x86 machines. Fix the overclocking issue for PDC20269 and newer chips that occurs when an UltraDMA/133 capable drive is connected. Here's the summary of changes: - add code to detect the PLL input clock and set up its output clock, remove the PowerMac hacks; - replace the macros accessing the indexed registers with functions, switch to using them where appropriate, gather the PIO/MWDMA/UDMA timings into tables; - rewrite the speedproc() handler to set the drive's transfer mode first, and then override the timing registers set by hardware on UltraDMA/133 chips; - use better criterion for determining higher UltraDMA modes, and add comment concerning the doubtful value of the code enabling IORDY/prefetch; - replace the stupid 'pdcnew_new_' prefixes with mere 'pdcnew_'; - get rid of unneeded spaces, parens and type casts, clean up some printk's, add some new lines here and there... This work is loosely based on these former patches by Albert Lee: [1] http://marc.theaimsgroup.com/?l=linux-ide&m=110992442032300 [2] http://marc.theaimsgroup.com/?l=linux-ide&m=110992457729382 [3] http://marc.theaimsgroup.com/?l=linux-ide&m=110992474205555 [4] http://marc.theaimsgroup.com/?l=linux-ide&m=111019224802939 Some PLL clock detection code was backported from his pata_pdc2027x driver... This code has been successfully tested by me on PDC2026[89] chips. I tried to keep this rework as several patches but it made no sense: [2] was largely a modification of the non-working timing override code, [3] by itself extended the overclocking issue to the case of non-UltraDMA/133 drives, and finally, the cleanup patch based on [1] ended up rejected...
Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Cc: Albert Lee <albertcc@tw.ibm.com> Acked-by: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Bartlomiej Zolnierkiewicz <B.Zolnierkiewicz@elka.pw.edu.pl> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-10 | [PATCH] Fix noise in futex.h | David Woodhouse
There are some kernel-only bits in the middle of <linux/futex.h> which should be removed in what we export to userspace. Signed-off-by: David Woodhouse <dwmw2@infradead.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-10 | [PATCH] sysctl: remove unused "context" param | Alexey Dobriyan
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Cc: Andi Kleen <ak@suse.de> Cc: "David S. Miller" <davem@davemloft.net> Cc: David Howells <dhowells@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-10 | [PATCH] sysctl: remove some OPs | Alexey Dobriyan
kernel.cap-bound uses only OP_SET and OP_AND Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-10 | [PATCH] IPMI: misc fixes | Corey Minyard
Fix various problems pointed out by Andrew Morton and others: * platform_device_unregister checks for NULL, no need to check here. * Formatting fixes. * Remove big macro and convert to a function. * Use strcmp instead of defining a broken case-insensitive comparison, and make the output parameter info match the case of the input one (change "I/O" to "i/o"). * Return the length instead of 0 from the hotmod parameter handler. * Remove some unused cruft. * The trydefaults parameter only has to do with scanning the "standard" addresses, don't check for that on ACPI. Signed-off-by: Corey Minyard <cminyard@acm.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-10 | [PATCH] IPMI: remove zero inits | Randy Dunlap
Remove all =0 and =NULL from static initializers. They are not needed and removing them saves space in the object files. Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Corey Minyard <minyard@acm.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
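The reason the explicit initializers are not needed is that C guarantees objects with static storage duration start out zeroed. A small sketch with hypothetical variable names (none of them from the IPMI driver):

```c
#include <assert.h>
#include <stddef.h>

/* No "= 0" or "= NULL": static storage is zero-initialized by the language. */
static int retry_count;
static const char *last_error;
static unsigned long counters[4];

/* Aborts if any of the statics above did not start out as zero/NULL. */
void check_static_defaults(void)
{
    assert(retry_count == 0);
    assert(last_error == NULL);
    for (size_t i = 0; i < sizeof(counters) / sizeof(counters[0]); i++)
        assert(counters[i] == 0);
}
```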
2006-12-10 | [PATCH] update MAINTAINERS with rtc-linux mailing list info | Alessandro Zummo
Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-10 | [PATCH] AT91RM9200 RTC | Andrew Victor
The new Atmel AT91SAM9261 and AT91SAM9260 processors do not have the internal RTC peripheral. This RTC driver is therefore AT91RM9200-specific. This patch renames rtc-at91.c to rtc-at91rm9200.c, and changes the name of the configuration option. Signed-off-by: Andrew Victor <andrew@sanpeople.com> Cc: Alessandro Zummo <a.zummo@towertech.it> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-10 | [PATCH] RTCs don't use i2c_adapter.dev | David Brownell
Update more I2C drivers that live outside drivers/i2c to understand that using adapter->dev is not The Way. When actually referring to the adapter hardware, adapter->class_dev.dev is the answer. When referring to a device connected to it, client->dev.dev is the answer. Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Acked-by: Alessandro Zummo <a.zummo@towertech.it> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>