Kernel - My Linux kernel repository

Age	Commit message (Collapse)	Author
2006-08-29	Merge branch 'master' into gfs2	Steven Whitehouse

2006-08-27	[PATCH] swsusp: Fix swap_type_of	Rafael J. Wysocki
	There is a bug in mm/swapfile.c#swap_type_of() that makes swsusp only be able to use the first active swap partition as the resume device. Fix it. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Cc: Hugh Dickins <hugh@veritas.com> Acked-by: Pavel Machek <pavel@suse.cz> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-08-14	[PATCH] fuse: fix error case in fuse_readpages	Alexander Zarochentsev
	Don't let fuse_readpages leave the @pages list not empty when exiting on error. [akpm@osdl.org: kernel-doc fixes] Signed-off-by: Alexander Zarochentsev <zam@namesys.com> Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-08-07	Merge branch 'master'	Steven Whitehouse

2006-08-06	[PATCH] memory hotadd fixes: enhance collision check	KAMEZAWA Hiroyuki
	This patch is for collision check enhancement for memory hot add. It's better to do resouce collision check before doing memory hot add, which will touch memory management structures. And add_section() should check section exists or not before calling sparse_add_one_section(). (sparse_add_one_section() will do another check anyway. but checking in memory_hotplug.c will be easy to understand.) Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: keith mannthey <kmannth@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-08-06	[PATCH] memory hotadd fixes: find_next_system_ram catch range fix	KAMEZAWA Hiroyuki
	find_next_system_ram() is used to find available memory resource at onlining newly added memory. This patch fixes following problem. find_next_system_ram() cannot catch this case. Resource: (start)-------------(end) Section : (start)-------------(end) Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Keith Mannthey <kmannth@gmail.com> Cc: Yasunori Goto <y-goto@jp.fujitsu.com> Cc: Dave Hansen <haveblue@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-08-06	[PATCH] memory hotadd fixes: not-aligned memory hotadd handling fix	KAMEZAWA Hiroyuki
	ioresouce handling code in memory hotplug allows not-aligned memory hot add. But when memmap and other memory structures are initialized, parameters should be aligned. (if not aligned, initialization of mem_map will do wrong, it assumes parameters are aligned.) This patch fix it. And this patch allows ioresource collision check to handle -EEXIST. Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Keith Mannthey <kmannth@gmail.com> Cc: Yasunori Goto <y-goto@jp.fujitsu.com> Cc: Dave Hansen <haveblue@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-08-06	[PATCH] fadvise() make POSIX_FADV_NOREUSE a no-op	Andrew Morton
	The POSIX_FADV_NOREUSE hint means "the application will use this range of the file a single time". It seems to be intended that the implementation will use this hint to perform drop-behind of that part of the file when the application gets around to reading or writing it. However for reasons which aren't obvious (or sane?) I mapped POSIX_FADV_NOREUSE onto POSIX_FADV_WILLNEED. ie: it does readahead. That's daft. So for now, make POSIX_FADV_NOREUSE a no-op. This is a non-back-compatible change. If someone was using POSIX_FADV_NOREUSE to perform readahead, they lose. The likelihood is low. If/when we later implement POSIX_FADV_NOREUSE things will get interesting - to do it fully we'll need to maintain file offset/length ranges and peform all sorts of complex tricks, and managing the lifetime of those ranges' data structures will be interesting.. A sensible implementation would probably ignore the file range and would simply mark the entire file as needing some form of drop-behind treatment. Cc: Michael Kerrisk <mtk-manpages@gmx.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-31	[PATCH] Fix kmem_cache_alloc() been documented twice	Rolf Eike Beer
	kmem_cache_alloc() was documented twice, but kmem_cache_zalloc() never. Fix this obvious typo to get things right. Signed-off-by: Rolf Eike Beer <eike-kernel@sf-tec.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-31	[PATCH] cpu hotplug: replace __devinit* with __cpuinit* for cpu notifications	Chandra Seetharaman
	Few of the callback functions and notifier blocks that are associated with cpu notifications incorrectly have __devinit and __devinitdata. They should be __cpuinit and __cpuinitdata instead. It makes no functional difference but wastes text area when CONFIG_HOTPLUG is enabled and CONFIG_HOTPLUG_CPU is not. This patch fixes all those instances. Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com> Cc: Ashok Raj <ashok.raj@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-31	Merge branch 'master'	Steven Whitehouse

2006-07-29	[PATCH] MM: Remove rogue readahead printk	Andi Kleen
	For some reason it triggers always with NFS root and spams the kernel logs of my nfs root boxes a lot. Signed-off-by: Andi Kleen <ak@suse.de> Acked-by: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-25	[GFS2] Alter direct I/O path	Steven Whitehouse
	As per comments received, alter the GFS2 direct I/O path so that it uses the standard read functions "out of the box". Needs a small change to one of the VFS functions. This reduces the size of the code quite a lot and also removes the need for one new export. Some more work remains to be done, but this is the bones of the thing. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-07-17	Merge branch 'master'	Steven Whitehouse

2006-07-14	[PATCH] per-task-delay-accounting: sync block I/O and swapin delay collection	Shailabh Nagar
	Unlike earlier iterations of the delay accounting patches, now delays are only collected for the actual I/O waits rather than try and cover the delays seen in I/O submission paths. Account separately for block I/O delays incurred as a result of swapin page faults whose frequency can be affected by the task/process' rss limit. Hence swapin delays can act as feedback for rss limit changes independent of I/O priority changes. Signed-off-by: Shailabh Nagar <nagar@watson.ibm.com> Signed-off-by: Balbir Singh <balbir@in.ibm.com> Cc: Jes Sorensen <jes@sgi.com> Cc: Peter Chubb <peterc@gelato.unsw.edu.au> Cc: Erich Focht <efocht@ess.nec.de> Cc: Levent Serinol <lserinol@gmail.com> Cc: Jay Lan <jlan@engr.sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-14	[PATCH] nommu: export two symbols for drivers to use	Luke Yang
	nommu.c needs to export two more symbols for drivers to use: remap_pfn_range and unmap_mapping_range. Signed-off-by: Luke Yang <luke.adi@gmail.com> Cc: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-14	[PATCH] ia64: race flushing icache in COW path	Anil Keshavamurthy
	There is a race condition that showed up in a threaded JIT environment. The situation is that a process with a JIT code page forks, so the page is marked read-only, then some threads are created in the child. One of the threads attempts to add a new code block to the JIT page, so a copy-on-write fault is taken, and the kernel allocates a new page, copies the data, installs the new pte, and then calls lazy_mmu_prot_update() to flush caches to make sure that the icache and dcache are in sync. Unfortunately, the other thread runs right after the new pte is installed, but before the caches have been flushed. It tries to execute some old JIT code that was already in this page, but it sees some garbage in the i-cache from the previous users of the new physical page. Fix: we must make the caches consistent before installing the pte. This is an ia64 only fix because lazy_mmu_prot_update() is a no-op on all other architectures. Signed-off-by: Anil Keshavamurthy <anil.s.keshavamurthy@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Cc: Hugh Dickins <hugh@veritas.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-14	[PATCH] mm: fix oom roll-back of __vmalloc_area_node	Jan Kiszka
	__vunmap must not rely on area->nr_pages when picking the release methode for area->pages. It may be too small when __vmalloc_area_node failed early due to lacking memory. Instead, use a flag in vmstruct to differentiate. Signed-off-by: Jan Kiszka <jan.kiszka@web.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-13	[PATCH] revert slab.c locking change	Ingo Molnar
	Chandra Seetharaman reported SLAB crashes caused by the slab.c lock annotation patch. There is only one chunk of that patch that has a material effect on the slab logic - this patch undoes that chunk. This was confirmed to fix the slab problem by Chandra. Signed-off-by: Ingo Molnar <mingo@elte.hu> Tested-by: Chandra Seetharaman <sekharan@us.ibm.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-13	[PATCH] lockdep: annotate mm/slab.c	Arjan van de Ven
	mm/slab.c uses nested locking when dealing with 'off-slab' caches, in that case it allocates the slab header from the (on-slab) kmalloc caches. Teach the lock validator about this by putting all on-slab caches into a separate class. this patch has no effect on non-lockdep kernels. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-13	[PATCH] lockdep: undo mm/slab.c annotation	Ingo Molnar
	undo existing mm/slab.c lock-validator annotations, in preparation of a new, less intrusive annotation patch. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-10	[PATCH] vmstat: export all_vm_events()	Heiko Carstens
	Add missing EXPORT_SYMBOL for all_vm_events(). Git commit f8891e5e1f93a128c3900f82035e8541357896a7 caused this: Building modules, stage 2. MODPOST WARNING: "all_vm_events" [arch/s390/appldata/appldata_mem.ko] undefined! CC arch/s390/appldata/appldata_mem.mod.o Cc: Christoph Lameter <christoph@lameter.com> Cc: Gerald Schaefer <geraldsc@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-10	[PATCH] mm/mmzone.c: EXPORT_UNUSED_SYMBOL	Adrian Bunk
	This patch marks three unused exports as EXPORT_UNUSED_SYMBOL. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-10	[PATCH] mm/memory.c: EXPORT_UNUSED_SYMBOL	Adrian Bunk
	This patch marks an unused export as EXPORT_UNUSED_SYMBOL. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-10	[PATCH] mm/bootmem.c: EXPORT_UNUSED_SYMBOL	Adrian Bunk
	This patch marks an unused export as EXPORT_UNUSED_SYMBOL. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-10	[PATCH] fadvise: remove dead comments	Andrew Morton
	Cc: "Michael Kerrisk" <mtk-manpages@gmx.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-05	Merge branch 'master'	Steven Whitehouse

2006-07-03	[PATCH] sched: cleanup, remove task_t, convert to struct task_struct	Ingo Molnar
	cleanup: remove task_t and convert all the uses to struct task_struct. I introduced it for the scheduler anno and it was a mistake. Conversion was mostly scripted, the result was reviewed and all secondary whitespace and style impact (if any) was fixed up by hand. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-03	[PATCH] lockdep: annotate SLAB code	Ingo Molnar
	Teach special (recursive) locking code to the lock validator. Has no effect on non-lockdep kernels. Fix initialize-locks-via-memcpy assumptions. Effects on non-lockdep kernels: the subclass nesting parameter is passed into cache_free_alien() and __cache_free(), and turns one internal kmem_cache_free() call into an open-coded __cache_free() call. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Christoph Lameter <clameter@engr.sgi.com> Cc: Manfred Spraul <manfred@colorfullife.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-03	[PATCH] lockdep: annotate mm	Ingo Molnar
	Teach special (recursive) locking code to the lock validator. Has no effect on non-lockdep kernels. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-03	[PATCH] lockdep: locking init debugging improvement	Ingo Molnar
	Locking init improvement: - introduce and use __SPIN_LOCK_UNLOCKED for array initializations, to pass in the name string of locks, used by debugging Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-03	[PATCH] lockdep: better lock debugging	Ingo Molnar
	Generic lock debugging: - generalized lock debugging framework. For example, a bug in one lock subsystem turns off debugging in all lock subsystems. - got rid of the caller address passing (__IP__/__IP_DECL__/etc.) from the mutex/rtmutex debugging code: it caused way too much prototype hackery, and lockdep will give the same information anyway. - ability to do silent tests - check lock freeing in vfree too. - more finegrained debugging options, to allow distributions to turn off more expensive debugging features. There's no separate 'held mutexes' list anymore - but there's a 'held locks' stack within lockdep, which unifies deadlock detection across all lock classes. (this is independent of the lockdep validation stuff - lockdep first checks whether we are holding a lock already) Here are the current debugging options: CONFIG_DEBUG_MUTEXES=y CONFIG_DEBUG_LOCK_ALLOC=y which do: config DEBUG_MUTEXES bool "Mutex debugging, basic checks" config DEBUG_LOCK_ALLOC bool "Detect incorrect freeing of live mutexes" Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-03	[PATCH] ZVC/zone_reclaim: Leave 1% of unmapped pagecache pages for file I/O	Christoph Lameter
	It turns out that it is advantageous to leave a small portion of unmapped file backed pages if all of a zone's pages (or almost all pages) are allocated and so the page allocator has to go off-node. This allows recently used file I/O buffers to stay on the node and reduces the times that zone reclaim is invoked if file I/O occurs when we run out of memory in a zone. The problem is that zone reclaim runs too frequently when the page cache is used for file I/O (read write and therefore unmapped pages!) alone and we have almost all pages of the zone allocated. Zone reclaim may remove 32 unmapped pages. File I/O will use these pages for the next read/write requests and the unmapped pages increase. After the zone has filled up again zone reclaim will remove it again after only 32 pages. This cycle is too inefficient and there are potentially too many zone reclaim cycles. With the 1% boundary we may still remove all unmapped pages for file I/O in zone reclaim pass. However. it will take a large number of read and writes to get back to 1% again where we trigger zone reclaim again. The zone reclaim 2.6.16/17 does not show this behavior because we have a 30 second timeout. [akpm@osdl.org: rename the /proc file and the variable] Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-03	Merge rsync://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6	Steven Whitehouse
	Conflicts: include/linux/kernel.h
2006-06-30	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial	Linus Torvalds
	* git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial: Remove obsolete #include <linux/config.h> remove obsolete swsusp_encrypt arch/arm26/Kconfig typos Documentation/IPMI typos Kconfig: Typos in net/sched/Kconfig v9fs: do not include linux/version.h Documentation/DocBook/mtdnand.tmpl: typo fixes typo fixes: specfic -> specific typo fixes in Documentation/networking/pktgen.txt typo fixes: occuring -> occurring typo fixes: infomation -> information typo fixes: disadvantadge -> disadvantage typo fixes: aquire -> acquire typo fixes: mecanism -> mechanism typo fixes: bandwith -> bandwidth fix a typo in the RTC_CLASS help text smb is no longer maintained Manually merged trivial conflict in arch/um/kernel/vmlinux.lds.S
2006-06-30	[PATCH] slab: consolidate code to free slabs from freelist	Christoph Lameter
	Post and discussion: http://marc.theaimsgroup.com/?t=115074342800003&r=1&w=2 Code in __shrink_node() duplicates code in cache_reap() Add a new function drain_freelist that removes slabs with objects that are already free and use that in various places. This eliminates the __node_shrink() function and provides the interrupt holdoff reduction from slab_free to code that used to call __node_shrink. [akpm@osdl.org: build fixes] Signed-off-by: Christoph Lameter <clameter@sgi.com> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30	[PATCH] Light weight event counters	Christoph Lameter
	The remaining counters in page_state after the zoned VM counter patches have been applied are all just for show in /proc/vmstat. They have no essential function for the VM. We use a simple increment of per cpu variables. In order to avoid the most severe races we disable preempt. Preempt does not prevent the race between an increment and an interrupt handler incrementing the same statistics counter. However, that race is exceedingly rare, we may only loose one increment or so and there is no requirement (at least not in kernel) that the vm event counters have to be accurate. In the non preempt case this results in a simple increment for each counter. For many architectures this will be reduced by the compiler to a single instruction. This single instruction is atomic for i386 and x86_64. And therefore even the rare race condition in an interrupt is avoided for both architectures in most cases. The patchset also adds an off switch for embedded systems that allows a building of linux kernels without these counters. The implementation of these counters is through inline code that hopefully results in only a single instruction increment instruction being emitted (i386, x86_64) or in the increment being hidden though instruction concurrency (EPIC architectures such as ia64 can get that done). Benefits: - VM event counter operations usually reduce to a single inline instruction on i386 and x86_64. - No interrupt disable, only preempt disable for the preempt case. Preempt disable can also be avoided by moving the counter into a spinlock. - Handling is similar to zoned VM counters. - Simple and easily extendable. - Can be omitted to reduce memory use for embedded use. References: RFC http://marc.theaimsgroup.com/?l=linux-kernel&m=113512330605497&w=2 RFC http://marc.theaimsgroup.com/?l=linux-kernel&m=114988082814934&w=2 local_t http://marc.theaimsgroup.com/?l=linux-kernel&m=114991748606690&w=2 V2 http://marc.theaimsgroup.com/?t=115014808400007&r=1&w=2 V3 http://marc.theaimsgroup.com/?l=linux-kernel&m=115024767022346&w=2 V4 http://marc.theaimsgroup.com/?l=linux-kernel&m=115047968808926&w=2 Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30	[PATCH] Use Zoned VM Counters for NUMA statistics	Christoph Lameter
	The numa statistics are really event counters. But they are per node and so we have had special treatment for these counters through additional fields on the pcp structure. We can now use the per zone nature of the zoned VM counters to realize these. This will shrink the size of the pcp structure on NUMA systems. We will have some room to add additional per zone counters that will all still fit in the same cacheline. Bits Prior pcp size Size after patch We can add ------------------------------------------------------------------ 64 128 bytes (16 words) 80 bytes (10 words) 48 32 76 bytes (19 words) 56 bytes (14 words) 8 (64 byte cacheline) 72 (128 byte) Remove the special statistics for numa and replace them with zoned vm counters. This has the side effect that global sums of these events now show up in /proc/vmstat. Also take the opportunity to move the zone_statistics() function from page_alloc.c into vmstat.c. Discussions: V2 http://marc.theaimsgroup.com/?t=115048227000002&r=1&w=2 Signed-off-by: Christoph Lameter <clameter@sgi.com> Acked-by: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30	[PATCH] zoned-vm-counters: remove read_page_state()	Andrew Morton
	No callers. Cc: Christoph Lameter <clameter@engr.sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30	[PATCH] zoned vm counters: remove useless struct wbs	Christoph Lameter
	Remove writeback state We can remove some functions now that were needed to calculate the page state for writeback control since these statistics are now directly available. Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30	[PATCH] zoned vm counters: conversion of nr_bounce to per zone counter	Christoph Lameter
	Conversion of nr_bounce to a per zone counter nr_bounce is only used for proc output. So it could be left as an event counter. However, the event counters may not be accurate and nr_bounce is categorizing types of pages in a zone. So we really need this to also be a per zone counter. [akpm@osdl.org: bugfix] Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30	[PATCH] zoned vm counters: conversion of nr_unstable to per zone counter	Christoph Lameter
	Conversion of nr_unstable to a per zone counter We need to do some special modifications to the nfs code since there are multiple cases of disposition and we need to have a page ref for proper accounting. This converts the last critical page state of the VM and therefore we need to remove several functions that were depending on GET_PAGE_STATE_LAST in order to make the kernel compile again. We are only left with event type counters in page state. [akpm@osdl.org: bugfixes] Signed-off-by: Christoph Lameter <clameter@sgi.com> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30	[PATCH] zoned vm counters: conversion of nr_writeback to per zone counter	Christoph Lameter
	Conversion of nr_writeback to per zone counter. This removes the last page_state counter from arch/i386/mm/pgtable.c so we drop the page_state from there. [akpm@osdl.org: bugfix] Signed-off-by: Christoph Lameter <clameter@sgi.com> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30	[PATCH] zoned vm counters: conversion of nr_dirty to per zone counter	Christoph Lameter
	This makes nr_dirty a per zone counter. Looping over all processors is avoided during writeback state determination. The counter aggregation for nr_dirty had to be undone in the NFS layer since we summed up the page counts from multiple zones. Someone more familiar with NFS should probably review what I have done. [akpm@osdl.org: bugfix] Signed-off-by: Christoph Lameter <clameter@sgi.com> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30	[PATCH] zoned vm counters: conversion of nr_pagetables to per zone counter	Christoph Lameter
	Conversion of nr_page_table_pages to a per zone counter [akpm@osdl.org: bugfix] Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30	[PATCH] zoned vm counters: conversion of nr_slab to per zone counter	Christoph Lameter
	- Allows reclaim to access counter without looping over processor counts. - Allows accurate statistics on how many pages are used in a zone by the slab. This may become useful to balance slab allocations over various zones. [akpm@osdl.org: bugfix] Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30	[PATCH] zoned vm counters: zone_reclaim: remove ↵	Christoph Lameter
	/proc/sys/vm/zone_reclaim_interval The zone_reclaim_interval was necessary because we were not able to determine how many unmapped pages exist in a zone. Therefore we had to scan in intervals to figure out if any pages were unmapped. With the zoned counters and NR_ANON_PAGES we now know the number of pagecache pages and the number of mapped pages in a zone. So we can simply skip the reclaim if there is an insufficient number of unmapped pages. We use SWAP_CLUSTER_MAX as the boundary. Drop all support for /proc/sys/vm/zone_reclaim_interval. Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30	[PATCH] zoned vm counters: split NR_ANON_PAGES off from NR_FILE_MAPPED	Christoph Lameter
	The current NR_FILE_MAPPED is used by zone reclaim and the dirty load calculation as the number of mapped pagecache pages. However, that is not true. NR_FILE_MAPPED includes the mapped anonymous pages. This patch separates those and therefore allows an accurate tracking of the anonymous pages per zone. It then becomes possible to determine the number of unmapped pages per zone and we can avoid scanning for unmapped pages if there are none. Also it may now be possible to determine the mapped/unmapped ratio in get_dirty_limit. Isnt the number of anonymous pages irrelevant in that calculation? Note that this will change the meaning of the number of mapped pages reported in /proc/vmstat /proc/meminfo and in the per node statistics. This may affect user space tools that monitor these counters! NR_FILE_MAPPED works like NR_FILE_DIRTY. It is only valid for pagecache pages. Signed-off-by: Christoph Lameter <clameter@sgi.com> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30	[PATCH] zoned vm counters: remove NR_FILE_MAPPED from scan control structure	Christoph Lameter
	We can now access the number of pages in a mapped state in an inexpensive way in shrink_active_list. So drop the nr_mapped field from scan_control. [akpm@osdl.org: bugfix] Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30	[PATCH] zoned vm counters: conversion of nr_pagecache to per zone counter	Christoph Lameter
	Currently a single atomic variable is used to establish the size of the page cache in the whole machine. The zoned VM counters have the same method of implementation as the nr_pagecache code but also allow the determination of the pagecache size per zone. Remove the special implementation for nr_pagecache and make it a zoned counter named NR_FILE_PAGES. Updates of the page cache counters are always performed with interrupts off. We can therefore use the __ variant here. Signed-off-by: Christoph Lameter <clameter@sgi.com> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>