aboutsummaryrefslogtreecommitdiff
path: root/arch/powerpc/kernel/asm-offsets.c
AgeCommit message (Collapse)Author
2007-10-18powerpc: add scaled time accountingMichael Neuling
This adds POWERPC specific hooks for scaled time accounting. POWER6 includes a SPURR register. The SPURR is based off the PURR register but is scaled based on CPU frequency and issue rates. This gives a more accurate account of the instructions used per task. The PURR and timebase will be constant relative to the wall clock, irrespective of the CPU frequency. This implementation reads the SPURR register in account_system_vtime which is only call called on context witch and hard and soft irq entry and exit. The percentage of user and system time is then estimated using the ratio of these accounted by the PURR. If the SPURR is not present, the PURR read. An earlier implementation of this patch read the SPURR whenever the PURR was read, which included the system call entry and exit path. Unfortunately this showed a performance regression on lmbench runs, so was re-implemented. I've included the lmbench results here when run bare metal on POWER6. 1st column is the unpatch results. 2nd column is the results using the below patch and the 3rd is the % diff of these results from the base. 4th and 5th columns are the results and % differnce from the base using the older patch (SPURR read in syscall entry/exit path). Base Scaled-Acct SPURR-in-syscall Result Result % diff Result % diff Simple syscall: 0.3086 0.3086 0.0000 0.3452 11.8600 Simple read: 0.4591 0.4671 1.7425 0.5044 9.86713 Simple write: 0.4364 0.4366 0.0458 0.4731 8.40971 Simple stat: 2.0055 2.0295 1.1967 2.0669 3.06158 Simple fstat: 0.5962 0.5876 -1.442 0.6368 6.80979 Simple open/close: 3.1283 3.1009 -0.875 3.2088 2.57328 Select on 10 fd's: 0.8554 0.8457 -1.133 0.8667 1.32101 Select on 100 fd's: 3.5292 3.6329 2.9383 3.6664 3.88756 Select on 250 fd's: 7.9097 8.1881 3.5197 8.2242 3.97613 Select on 500 fd's: 15.2659 15.836 3.7357 15.873 3.97814 Select on 10 tcp fd's: 0.9576 0.9416 -1.670 0.9752 1.83792 Select on 100 tcp fd's: 7.248 7.2254 -0.311 7.2685 0.28283 Select on 250 tcp fd's: 17.7742 17.707 -0.375 17.749 -0.1406 Select on 500 tcp fd's: 35.4258 35.25 -0.496 35.286 -0.3929 Signal handler installation: 0.6131 0.6075 -0.913 0.647 5.52927 Signal handler overhead: 2.0919 2.1078 0.7600 2.1831 4.35967 Protection fault: 0.7345 0.7478 1.8107 0.8031 9.33968 Pipe latency: 33.006 16.398 -50.31 33.475 1.42368 AF_UNIX sock stream latency: 14.5093 30.910 113.03 30.715 111.692 Process fork+exit: 219.8 222.8 1.3648 229.37 4.35623 Process fork+execve: 876.14 873.28 -0.32 868.66 -0.8533 Process fork+/bin/sh -c: 2830 2876.5 1.6431 2958 4.52296 File /var/tmp/XXX write bw: 1193497 1195536 0.1708 118657 -0.5799 Pagefaults on /var/tmp/XXX: 3.1272 3.2117 2.7020 3.2521 3.99398 Also, kernel compile times show no difference with this patch applied. [pbadari@us.ibm.com: Avoid unnecessary PURR reading] Signed-off-by: Michael Neuling <mikey@neuling.org> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Jay Lan <jlan@engr.sgi.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-09-19[POWERPC] Size swapper_pg_dir correctlyStephen Rothwell
David Gibson pointed out that swapper_pg_dir actually need to be PGD_TABLE_SIZE bytes long not PAGE_SIZE. This actually saves 64k in the bss for a kernel ppc64_defconfig built with CONFIG_PPC_64K_PAGES. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-08-22[POWERPC] iSeries: Clean up lparmap messStephen Rothwell
We need to have xLparMap in head_64.S so that it is at a fixed address (because the linker will not resolve (address & 0xffffffff) for us). But the assembler miscalculates the KERNEL_VSID() expressions. So put the confusing expressions into asm-offsets.c. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-05-09Merge branch 'master' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc * 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc: [POWERPC] Further fixes for the removal of 4level-fixup hack from ppc32 [POWERPC] EEH: log all PCI-X and PCI-E AER registers [POWERPC] EEH: capture and log pci state on error [POWERPC] EEH: Split up long error msg [POWERPC] EEH: log error only after driver notification. [POWERPC] fsl_soc: Make mac_addr const in fs_enet_of_init(). [POWERPC] Don't use SLAB/SLUB for PTE pages [POWERPC] Spufs support for 64K LS mappings on 4K kernels [POWERPC] Add ability to 4K kernel to hash in 64K pages [POWERPC] Introduce address space "slices" [POWERPC] Small fixes & cleanups in segment page size demotion [POWERPC] iSeries: Make HVC_ISERIES the default [POWERPC] iSeries: suppress build warning in lparmap.c [POWERPC] Mark pages that don't exist as nosave [POWERPC] swsusp: Introduce register_nosave_region_late
2007-05-09rename thread_info to stackRoman Zippel
This finally renames the thread_info field in task structure to stack, so that the assumptions about this field are gone and archs have more freedom about placing the thread_info structure. Nonbroken archs which have a proper thread pointer can do the access to both current thread and task structure via a single pointer. It'll allow for a few more cleanups of the fork code, from which e.g. ia64 could benefit. Signed-off-by: Roman Zippel <zippel@linux-m68k.org> [akpm@linux-foundation.org: build fix] Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Ian Molton <spyro@f2s.com> Cc: Haavard Skinnemoen <hskinnemoen@atmel.com> Cc: Mikael Starvik <starvik@axis.com> Cc: David Howells <dhowells@redhat.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Hirokazu Takata <takata@linux-m32r.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Roman Zippel <zippel@linux-m68k.org> Cc: Greg Ungerer <gerg@uclinux.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Kazumoto Kojima <kkojima@rr.iij4u.or.jp> Cc: Richard Curnow <rc@rc0.org.uk> Cc: William Lee Irwin III <wli@holomorphy.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Jeff Dike <jdike@addtoit.com> Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Cc: Miles Bader <uclinux-v850@lsi.nec.co.jp> Cc: Andi Kleen <ak@muc.de> Cc: Chris Zankel <chris@zankel.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-09[POWERPC] Introduce address space "slices"Benjamin Herrenschmidt
The basic issue is to be able to do what hugetlbfs does but with different page sizes for some other special filesystems; more specifically, my need is: - Huge pages - SPE local store mappings using 64K pages on a 4K base page size kernel on Cell - Some special 4K segments in 64K-page kernels for mapping a dodgy type of powerpc-specific infiniband hardware that requires 4K MMU mappings for various reasons I won't explain here. The main issues are: - To maintain/keep track of the page size per "segment" (as we can only have one page size per segment on powerpc, which are 256MB divisions of the address space). - To make sure special mappings stay within their allotted "segments" (including MAP_FIXED crap) - To make sure everybody else doesn't mmap/brk/grow_stack into a "segment" that is used for a special mapping Some of the necessary mechanisms to handle that were present in the hugetlbfs code, but mostly in ways not suitable for anything else. The patch relies on some changes to the generic get_unmapped_area() that just got merged. It still hijacks hugetlb callbacks here or there as the generic code hasn't been entirely cleaned up yet but that shouldn't be a problem. So what is a slice ? Well, I re-used the mechanism used formerly by our hugetlbfs implementation which divides the address space in "meta-segments" which I called "slices". The division is done using 256MB slices below 4G, and 1T slices above. Thus the address space is divided currently into 16 "low" slices and 16 "high" slices. (Special case: high slice 0 is the area between 4G and 1T). Doing so simplifies significantly the tracking of segments and avoids having to keep track of all the 256MB segments in the address space. While I used the "concepts" of hugetlbfs, I mostly re-implemented everything in a more generic way and "ported" hugetlbfs to it. Slices can have an associated page size, which is encoded in the mmu context and used by the SLB miss handler to set the segment sizes. The hash code currently doesn't care, it has a specific check for hugepages, though I might add a mechanism to provide per-slice hash mapping functions in the future. The slice code provide a pair of "generic" get_unmapped_area() (bottomup and topdown) functions that should work with any slice size. There is some trickiness here so I would appreciate people to have a look at the implementation of these and let me know if I got something wrong. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-05-07[POWERPC] powermac: Suspend to disk on G5Johannes Berg
Powermac G5 suspend to disk implementation. The code is platform agnostic but only tested on powermac, no other 64-bit powerpc machines. Because nvidiafb still breaks suspend I have marked it EXPERIMENTAL on powermac and because I can't test it and some lowlevel code will need changes it is BROKEN on all other 64-bit platforms. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-04-24[POWERPC] Save trap number in bad_stackOlof Johansson
Save the trap number in the case of getting a bad stack in an exception handler. It is sometimes useful to know what exception it was that caused this to happen. Without this, no trap number is reported. Signed-off-by: Olof Johansson <olof@lixom.net> Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-03-22[POWERPC] Remove last_syscallAnton Blanchard
Remove last_syscall from 32bit powerpc, its been gone in 64bit for years. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-01-09[POWERPC] Fix manual assembly WARN_ON() in enter_rtas().David Woodhouse
When we switched over to the generic BUG mechanism we forgot to change the assembly code which open-codes a WARN_ON() in enter_rtas(), so the bug table got corrupted. This patch provides an EMIT_BUG_ENTRY macro for use in assembly code, and uses it in entry_64.S. Tested with CONFIG_DEBUG_BUGVERBOSE on ppc64 but not without -- I tried to turn it off but it wouldn't go away; I suspect Aunt Tillie probably needed it. This version gets __FILE__ and __LINE__ right in the assembly version -- rather than saying include/asm-powerpc/bug.h line 21 every time which is a little suboptimal. Signed-off-by: David Woodhouse <dwmw2@infradead.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-10-16[POWERPC] Lazy interrupt disabling for 64-bit machinesPaul Mackerras
This implements a lazy strategy for disabling interrupts. This means that local_irq_disable() et al. just clear the 'interrupts are enabled' flag in the paca. If an interrupt comes along, the interrupt entry code notices that interrupts are supposed to be disabled, and clears the EE bit in SRR1, clears the 'interrupts are hard-enabled' flag in the paca, and returns. This means that interrupts only actually get disabled in the processor when an interrupt comes along. When interrupts are enabled by local_irq_enable() et al., the code sets the interrupts-enabled flag in the paca, and then checks whether interrupts got hard-disabled. If so, it also sets the EE bit in the MSR to hard-enable the interrupts. This has the potential to improve performance, and also makes it easier to make a kernel that can boot on iSeries and on other 64-bit machines, since this lazy-disable strategy is very similar to the soft-disable strategy that iSeries already uses. This version renames paca->proc_enabled to paca->soft_enabled, and changes a couple of soft-disables in the kexec code to hard-disables, which should fix the crash that Michael Ellerman saw. This doesn't yet use a reserved CR field for the soft_enabled and hard_enabled flags. This applies on top of Stephen Rothwell's patches to make it possible to build a combined iSeries/other kernel. Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-09-14[POWERPC] Fix non-smp buildOlof Johansson
This fixes a compile error that only surfaces on CONFIG_SMP=n builds; <asm/hvcall.h> seems to get pulled in through another header file for SMP builds. This problem was introduced by the hvcall stats patch. Signed-off-by: Olof Johansson <olof@lixom.net> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-09-13[POWERPC] powerpc: Instrument Hypervisor CallsMike Kravetz
Add instrumentation for hypervisor calls on pseries. Call statistics include number of calls, wall time and cpu cycles (if available) and are made available via debugfs. Instrumentation code is behind the HCALL_STATS config option and has no impact if not enabled. Signed-off-by: Mike Kravetz <kravetz@us.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-08-25[POWERPC] Cleanup CPU initsOlof Johansson
Cleanup CPU inits a bit more, Geoff Levand already did some earlier. * Move CPU state save to cpu_setup, since cpu_setup is only ever done on cpu 0 on 64-bit and save is never done more than once. * Rename __restore_cpu_setup to __restore_cpu_ppc970 and add function pointers to the cputable to use instead. Powermac always has 970 so no need to check there. * Rename __970_cpu_preinit to __cpu_preinit_ppc970 and check PVR before calling it instead of in it, it's too early to use cputable. * Rename pSeries_secondary_smp_init to generic_secondary_smp_init since everyone but powermac and iSeries use it. Signed-off-by: Olof Johansson <olof@lixom.net> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-08-25[POWERPC] SLB shadow buffer cleanupMichael Neuling
Cleanup some of the #define magic as suggested by Milton. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-08-08[POWERPC] Implement SLB shadow bufferMichael Neuling
This adds a shadow buffer for the SLBs and regsiters it with PHYP. Only the bolted SLB entries (top 3) are shadowed. The SLB shadow buffer tells the hypervisor what the kernel needs to have in the SLB for the kernel to be able to function. The hypervisor can use this information to speed up partition context switches. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-07-13[POWERPC] iseries: Remove unnecessary include of iseries/hv_lp_event.hStephen Rothwell
Also remove unnecessary reference to struct HvLpEvent. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
2006-06-30Remove obsolete #include <linux/config.h>Jörn Engel
Signed-off-by: Jörn Engel <joern@wohnheim.fh-wedel.de> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-06-15powerpc: Use 64k pages without needing cache-inhibited large pagesPaul Mackerras
Some POWER5+ machines can do 64k hardware pages for normal memory but not for cache-inhibited pages. This patch lets us use 64k hardware pages for most user processes on such machines (assuming the kernel has been configured with CONFIG_PPC_64K_PAGES=y). User processes start out using 64k pages and get switched to 4k pages if they use any non-cacheable mappings. With this, we use 64k pages for the vmalloc region and 4k pages for the imalloc region. If anything creates a non-cacheable mapping in the vmalloc region, the vmalloc region will get switched to 4k pages. I don't know of any driver other than the DRM that would do this, though, and these machines don't have AGP. When a region gets switched from 64k pages to 4k pages, we do not have to clear out all the 64k HPTEs from the hash table immediately. We use the _PAGE_COMBO bit in the Linux PTE to indicate whether the page was hashed in as a 64k page or a set of 4k pages. If hash_page is trying to insert a 4k page for a Linux PTE and it sees that it has already been inserted as a 64k page, it first invalidates the 64k HPTE before inserting the 4k HPTE. The hash invalidation routines also use the _PAGE_COMBO bit, to determine whether to look for a 64k HPTE or a set of 4k HPTEs to remove. With those two changes, we can tolerate a mix of 4k and 64k HPTEs in the hash table, and they will all get removed when the address space is torn down. Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-06-12powerpc: Remove unused paca->pgdir fieldPaul Mackerras
The pgdir field in the paca was a leftover from the dynamic VSIDs patch, and is not used in the current kernel code. This removes it. Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-04-18powerpc: Use correct sequence for putting CPU into nap modePaul Mackerras
We weren't using the recommended sequence for putting the CPU into nap mode. When I changed the idle loop, for some reason 7447A cpus started hanging when we put them into nap mode. Changing to the recommended sequence fixes that. The complexity here is that the recommended sequence is a loop that keeps putting the cpu back into nap mode. Clearly we need some way to break out of the loop when an interrupt (external interrupt, decrementer, performance monitor) occurs. Here we use a bit in the thread_info struct to indicate that we need this, and the exception entry code notices this and arranges for the exception to return to the value in the link register, thus breaking out of the loop. We use a new `local_flags' field in the thread_info which we can alter without needing to use an atomic update sequence. The PPC970 has the same recommended sequence, so we do the same thing there too. This also fixes a bug in the kernel stack overflow handling code on 32-bit, since it was causing a value that we needed in a register to get trashed. Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-03-28[PATCH] powerpc: Kill _machine and hard-coded platform numbersBenjamin Herrenschmidt
This removes statically assigned platform numbers and reworks the powerpc platform probe code to use a better mechanism. With this, board support files can simply declare a new machine type with a macro, and implement a probe() function that uses the flattened device-tree to detect if they apply for a given machine. We now have a machine_is() macro that replaces the comparisons of _machine with the various PLATFORM_* constants. This commit also changes various drivers to use the new macro instead of looking at _machine. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-03-09Merge ../linux-2.6Paul Mackerras
2006-03-08powerpc: Fix various syscall/signal/swapcontext bugsPaul Mackerras
A careful reading of the recent changes to the system call entry/exit paths revealed several problems, plus some things that could be simplified and improved: * 32-bit wasn't testing the _TIF_NOERROR bit in the syscall fast exit path, so it was only doing anything with it once it saw some other bit being set. In other words, the noerror behaviour would apply to the next system call where we had to reschedule or deliver a signal, which is not necessarily the current system call. * 32-bit wasn't doing the call to ptrace_notify in the syscall exit path when the _TIF_SINGLESTEP bit was set. * _TIF_RESTOREALL was in both _TIF_USER_WORK_MASK and _TIF_PERSYSCALL_MASK, which is odd since _TIF_RESTOREALL is only set by system calls. I took it out of _TIF_USER_WORK_MASK. * On 64-bit, _TIF_RESTOREALL wasn't causing the non-volatile registers to be restored (unless perhaps a signal was delivered or the syscall was traced or single-stepped). Thus the non-volatile registers weren't restored on exit from a signal handler. We probably got away with it mostly because signal handlers written in C wouldn't alter the non-volatile registers. * On 32-bit I simplified the code and made it more like 64-bit by making the syscall exit path jump to ret_from_except to handle preemption and signal delivery. * 32-bit was calling do_signal unnecessarily when _TIF_RESTOREALL was set - but I think because of that 32-bit was actually restoring the non-volatile registers on exit from a signal handler. * I changed the order of enabling interrupts and saving the non-volatile registers before calling do_syscall_trace_leave; now we enable interrupts first. Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-02-24powerpc: Implement accurate task and CPU time accountingPaul Mackerras
This implements accurate task and cpu time accounting for 64-bit powerpc kernels. Instead of accounting a whole jiffy of time to a task on a timer interrupt because that task happened to be running at the time, we now account time in units of timebase ticks according to the actual time spent by the task in user mode and kernel mode. We also count the time spent processing hardware and software interrupts accurately. This is conditional on CONFIG_VIRT_CPU_ACCOUNTING. If that is not set, we do tick-based approximate accounting as before. To get this accurate information, we read either the PURR (processor utilization of resources register) on POWER5 machines, or the timebase on other machines on * each entry to the kernel from usermode * each exit to usermode * transitions between process context, hard irq context and soft irq context in kernel mode * context switches. On POWER5 systems with shared-processor logical partitioning we also read both the PURR and the timebase at each timer interrupt and context switch in order to determine how much time has been taken by the hypervisor to run other partitions ("steal" time). Unfortunately, since we need values of the PURR on both threads at the same time to accurately calculate the steal time, and since we can only calculate steal time on a per-core basis, the apportioning of the steal time between idle time (time which we ceded to the hypervisor in the idle loop) and actual stolen time is somewhat approximate at the moment. This is all based quite heavily on what s390 does, and it uses the generic interfaces that were added by the s390 developers, i.e. account_system_time(), account_user_time(), etc. This patch doesn't add any new interfaces between the kernel and userspace, and doesn't change the units in which time is reported to userspace by things such as /proc/stat, /proc/<pid>/stat, getrusage(), times(), etc. Internally the various task and cpu times are stored in timebase units, but they are converted to USER_HZ units (1/100th of a second) when reported to userspace. Some precision is therefore lost but there should not be any accumulating error, since the internal accumulation is at full precision. Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-13[PATCH] powerpc: Remove lppaca structure from the PACADavid Gibson
At present the lppaca - the structure shared with the iSeries hypervisor and phyp - is contained within the PACA, our own low-level per-cpu structure. This doesn't have to be so, the patch below removes it, making a separate array of lppaca structures. This saves approximately 500*NR_CPUS bytes of image size and kernel memory, because we don't need aligning gap between the Linux and hypervisor portions of every PACA. On the other hand it means an extra level of dereference in many accesses to the lppaca. The patch also gets rid of several places where we assign the paca address to a local variable for no particular reason. Signed-off-by: David Gibson <dwg@au1.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-09[PATCH] powerpc: Remove some unneeded fields from the pacaDavid Gibson
This patch removes several unnecessary fields from the paca: - next_jiffy_update_tb was simply unused. Remove trivially. - The exdsi exception save area was not used. There were plans to use it, but they never seem to have gone anywhere. If they ever do, we can put it back. Remove from the paca, and from asm-offsets.c - The default_decr field was used from asm, but was only ever assigned the value of tb_ticks_per_jiffy. Just access tb_ticks_per_jiffy from asm directly instead. Built and booted on POWER5 LPAR and iSeries RS64. Signed-off-by: David Gibson <dwg@au1.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-09[PATCH] syscall entry/exit revampDavid Woodhouse
This cleanup patch speeds up the null syscall path on ppc64 by about 3%, and brings the ppc32 and ppc64 code slightly closer together. The ppc64 code was checking current_thread_info()->flags twice in the syscall exit path; once for TIF_SYSCALL_T_OR_A before disabling interrupts, and then again for TIF_SIGPENDING|TIF_NEED_RESCHED etc after disabling interrupts. Now we do the same as ppc32 -- check the flags only once in the fast path, and re-enable interrupts if necessary in the ptrace case. The patch abolishes the 'syscall_noerror' member of struct thread_info and replaces it with a TIF_NOERROR bit in the flags, which is handled in the slow path. This shortens the syscall entry code, which no longer needs to clear syscall_noerror. The patch adds a TIF_SAVE_NVGPRS flag which causes the syscall exit slow path to save the non-volatile GPRs into a signal frame. This removes the need for the assembly wrappers around sys_sigsuspend(), sys_rt_sigsuspend(), et al which existed solely to save those registers in advance. It also means I don't have to add new wrappers for ppoll() and pselect(), which is what I was supposed to be doing when I got distracted into this... Finally, it unifies the ppc64 and ppc32 methods of handling syscall exit directly into a signal handler (as required by sigsuspend et al) by introducing a TIF_RESTOREALL flag which causes _all_ the registers to be reloaded from the pt_regs by taking the ret_from_exception path, instead of the normal syscall exit path which stomps on the callee-saved GPRs. It appears to pass an LTP test run on ppc64, and passes basic testing on ppc32 too. Brief tests of ptrace functionality with strace and gdb also appear OK. I wouldn't send it to Linus for 2.6.15 just yet though :) Signed-off-by: David Woodhouse <dwmw2@infradead.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
2005-11-14[PATCH] powerpc: vdso fixes (take #2)Benjamin Herrenschmidt
This fixes various errors in the new functions added in the vDSO's, I've now verified all functions on both 32 and 64 bits vDSOs. It also fix a sign extension bug getting the initial time of day at boot that could cause the monotonic clock value to be completely on bogus for 64 bits applications (with either the vDSO or the syscall) on powermacs. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
2005-11-11[PATCH] powerpc: Merge vdso's and add vdso support to 32 bits kernelBenjamin Herrenschmidt
This patch moves the vdso's to arch/powerpc, adds support for the 32 bits vdso to the 32 bits kernel, rename systemcfg (finally !), and adds some new (still untested) routines to both vdso's: clock_gettime() with support for CLOCK_REALTIME and CLOCK_MONOTONIC, clock_getres() (same clocks) and get_tbfreq() for glibc to retreive the timebase frequency. Tom,Steve: The implementation of get_tbfreq() I've done for 32 bits returns a long long (r3, r4) not a long. This is such that if we ever add support for >4Ghz timebases on ppc32, the userland interface won't have to change. I have tested gettimeofday() using some glibc patches in both ppc32 and ppc64 kernels using 32 bits userland (I haven't had a chance to test a 64 bits userland yet, but the implementation didn't change and was tested earlier). I haven't tested yet the new functions. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
2005-11-10[PATCH] powerpc: merge code values for identifying platformsPaul Mackerras
This patch merges platform codes. systemcfg->platform is no longer used, systemcfg use in general is deprecated as much as possible (and renamed _systemcfg before it gets completely moved elsewhere in a future patch), _machine is now used on ppc64 along as ppc32. Platform codes aren't gone yet but we are getting a step closer. A bunch of asm code in head[_64].S is also turned into C code. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
2005-11-06[PATCH] ppc64: support 64k pagesBenjamin Herrenschmidt
Adds a new CONFIG_PPC_64K_PAGES which, when enabled, changes the kernel base page size to 64K. The resulting kernel still boots on any hardware. On current machines with 4K pages support only, the kernel will maintain 16 "subpages" for each 64K page transparently. Note that while real 64K capable HW has been tested, the current patch will not enable it yet as such hardware is not released yet, and I'm still verifying with the firmware architects the proper to get the information from the newer hypervisors. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-02merge filename and modify references to iseries/hv_lp_event.hKelly Daly
Signed-off-by: Kelly Daly <kelly@au.ibm.com>
2005-10-28powerpc: Rename asm offset TRAP to _TRAP for 32-bitPaul Mackerras
... for consistency with 64-bit. Signed-off-by: Paul Mackerras <paulus@samba.org>
2005-10-26powerpc: Merge rtas.c into arch/powerpc/kernelPaul Mackerras
This splits arch/ppc64/kernel/rtas.c into arch/powerpc/kernel/rtas.c, which contains generic RTAS functions useful on any CHRP platform, and arch/powerpc/platforms/pseries/rtas-fw.[ch], which contain some pSeries-specific firmware flashing bits. The parts of rtas.c that are to do with pSeries-specific error logging are protected by a new CONFIG_RTAS_ERROR_LOGGING symbol. The inclusion of rtas.o is controlled by the CONFIG_PPC_RTAS symbol, and the relevant platforms select that. Signed-off-by: Paul Mackerras <paulus@samba.org>
2005-10-21[PATCH] powerpc: Merge thread_info.hDavid Gibson
Merge ppc32 and ppc64 versions of thread_info.h. They were pretty similar already, the chief changes are: - Instead of inline asm to implement current_thread_info(), which needs to be different for ppc32 and ppc64, we use C with an asm("r1") register variable. gcc turns it into the same asm as we used to have for both platforms. - We replace ppc32's 'local_flags' with the ppc64 'syscall_noerror' field. The noerror flag was in fact the only thing in the local_flags field anyway, so the ppc64 approach is simpler, and means we only need a load-immediate/store instead of load/mask/store when clearing the flag. - In readiness for 64k pages, when THREAD_SIZE will be less than a page, ppc64 used kmalloc() rather than get_free_pages() to allocate the kernel stack. With this patch we do the same for ppc32, since there's no strong reason not to. - For ppc64, we no longer export THREAD_SHIFT and THREAD_SIZE via asm-offsets, thread_info.h can now be safely included in asm, as on ppc32. Built and booted on G4 Powerbook (ARCH=ppc and ARCH=powerpc) and Power5 (ARCH=ppc64 and ARCH=powerpc). Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>
2005-10-11ppc: Various minor compile fixesPaul Mackerras
This fixes up a variety of minor problems in compiling with ARCH=ppc arising from using the merged versions of various header files. A lot of the changes are just adding #include <asm/machdep.h> to files that use ppc_md or smp_ops_t. This also arranges for us to use semaphore.c, vecemu.c, vector.S and fpu.S from arch/powerpc/kernel when compiling with ARCH=ppc. Signed-off-by: Paul Mackerras <paulus@samba.org>
2005-10-10powerpc: Get 64-bit configs to compile with ARCH=powerpcPaul Mackerras
This is a bunch of mostly small fixes that are needed to get ARCH=powerpc to compile for 64-bit. This adds setup_64.c from arch/ppc64/kernel/setup.c and locks.c from arch/ppc64/lib/locks.c. Signed-off-by: Paul Mackerras <paulus@samba.org>
2005-09-30powerpc: merge asm-offsets.cStephen Rothwell
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
2005-09-26powerpc: Merge enough to start building in arch/powerpc.Paul Mackerras
This creates the directory structure under arch/powerpc and a bunch of Kconfig files. It does a first-cut merge of arch/powerpc/mm, arch/powerpc/lib and arch/powerpc/platforms/powermac. This is enough to build a 32-bit powermac kernel with ARCH=powerpc. For now we are getting some unmerged files from arch/ppc/kernel and arch/ppc/syslib, or arch/ppc64/kernel. This makes some minor changes to files in those directories and files outside arch/powerpc. The boot directory is still not merged. That's going to be interesting. Signed-off-by: Paul Mackerras <paulus@samba.org>