aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2007-05-02[PATCH] x86: PARAVIRT: Jeremy Fitzhardinge <jeremy@goop.org>Jeremy Fitzhardinge
The other symbols used to delineate the alt-instructions sections have the form __foo/__foo_end. Rename parainstructions to match. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Andi Kleen <ak@suse.de> Cc: Andi Kleen <ak@suse.de> Cc: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2007-05-02[PATCH] i386: Convert VMI timer to use clock eventsZachary Amsden
Convert VMI timer to use clock events, making it properly able to use the NO_HZ infrastructure. On UP systems, with no local APIC, we just continue to route these events through the PIT. On systems with a local APIC, or SMP, we provide a single source interrupt chip which creates the local timer IRQ. It actually gets delivered by the APIC hardware, but we don't want to use the same local APIC clocksource processing, so we create our own handler here. Signed-off-by: Zachary Amsden <zach@vmware.com> Signed-off-by: Andi Kleen <ak@suse.de> CC: Dan Hecht <dhecht@vmware.com> CC: Ingo Molnar <mingo@elte.hu> CC: Thomas Gleixner <tglx@linutronix.de>
2007-05-02[PATCH] i386: Implement vmi_kmap_atomic_pteZachary Amsden
Implement vmi_kmap_atomic_pte in terms of the backend set_linear_mapping operation. The conversion is rather straighforward; call kmap_atomic and then inform the hypervisor of the page mapping. The _flush_tlb damage is due to macros being pulled in from highmem.h. Signed-off-by: Zachary Amsden <zach@vmware.com> Signed-off-by: Andi Kleen <ak@suse.de>
2007-05-02[PATCH] i386: Now that the VDSO can be relocated, we can support it in VMI ↵Zachary Amsden
configurations. Signed-off-by: Zachary Amsden <zach@vmware.com> Signed-off-by: Andi Kleen <ak@suse.de>
2007-05-02[PATCH] i386: Clean up arch/i386/kernel/cpu/mcheck/p4.cZachary Amsden
No, just no. You do not use goto to skip a code block. You do not return an obvious variable from a singly-inlined function and give the function a return value. You don't put unexplained comments about kmalloc in code which doesn't do dynamic allocation. And you don't leave stray warnings around for no good reason. Also, when possible, it is better to use block scoped variables because gcc can sometime generate better code. Signed-off-by: Zachary Amsden <zach@vmware.com> Signed-off-by: Andi Kleen <ak@suse.de>
2007-05-02[PATCH] i386: PARAVIRT: Allow boot-time disable of paravirt_ops patchingJeremy Fitzhardinge
Add "noreplace-paravirt" to disable paravirt_ops patching. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Andi Kleen <ak@suse.de> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2007-05-02[PATCH] i386: In compat mode, the return value here was uninitialized.Zachary Amsden
Signed-off-by: Zachary Amsden <zach@vmware.com> Signed-off-by: Andi Kleen <ak@suse.de>
2007-05-02[PATCH] x86: update for i386 and x86-64 check_bugsJeremy Fitzhardinge
Remove spurious comments, headers and keywords from x86-64 bugs.[ch]. Use identify_boot_cpu() AK: merged with other patch Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Andi Kleen <ak@suse.de>
2007-05-02[PATCH] i386: map enough initial memory to create lowmem mappingsJeremy Fitzhardinge
head.S creates the very initial pagetable for the kernel. This just maps enough space for the kernel itself, and an allocation bitmap. The amount of mapped memory is rounded up to 4Mbytes, and so this typically ends up mapping 8Mbytes of memory. When booting, pagetable_init() needs to create mappings for all lowmem, and the pagetables for these mappings are allocated from the free pages around the kernel in low memory. If the number of pagetable pages + kernel size exceeds head.S's initial mapping, it will end up faulting on an unmapped page. This will only happen with specific combinations of kernel size and memory size. This patch makes sure that head.S also maps enough space to fit the kernel pagetables as well as the kernel itself. It ends up using an additional two pages of unreclaimable memory. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Andi Kleen <ak@suse.de> Acked-by: "H. Peter Anvin" <hpa@zytor.com> Cc: Andi Kleen <ak@suse.de> Cc: Zachary Amsden <zach@vmware.com> Cc: Chris Wright <chrisw@sous-sol.org> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Linus Torvalds <torvalds@linux-foundation.org>,
2007-05-02[PATCH] i386: Fix UP gdt bugsJeremy Fitzhardinge
Fixes two problems with the GDT when compiling for uniprocessor: - There's no percpu segment, so trying to load its selector into %fs fails. Use a null selector instead. - The real gdt needs to be loaded at some point. Do it in cpu_init(). Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Andi Kleen <ak@suse.de> Cc: Rusty Russell <rusty@rustcorp.com.au>
2007-05-02[PATCH] i386: Define per_cpu_offsetJeremy Fitzhardinge
Define per_cpu_offset in asm-i386/percpu.h when SMP defined, like asm-generic/percpu.h does for UP. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Andi Kleen <ak@suse.de> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Andi Kleen <ak@suse.de>
2007-05-02[PATCH] i386: cleanups to help using per-cpu variables from asmJeremy Fitzhardinge
This patch does a few small cleanups: - use PER_CPU_NAME to generate the names of per-cpu variables - use lea to add the per_cpu offset in PER_CPU(), because it doesn't affect condition flags - add PER_CPU_VAR which allows direct access to pre-cpu variables with the %fs: prefix on SMP. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Andi Kleen <ak@suse.de> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Andi Kleen <ak@suse.de>
2007-05-02[PATCH] i386: Convert PDA into the percpu sectionJeremy Fitzhardinge
Currently x86 (similar to x84-64) has a special per-cpu structure called "i386_pda" which can be easily and efficiently referenced via the %fs register. An ELF section is more flexible than a structure, allowing any piece of code to use this area. Indeed, such a section already exists: the per-cpu area. So this patch: (1) Removes the PDA and uses per-cpu variables for each current member. (2) Replaces the __KERNEL_PDA segment with __KERNEL_PERCPU. (3) Creates a per-cpu mirror of __per_cpu_offset called this_cpu_off, which can be used to calculate addresses for this CPU's variables. (4) Simplifies startup, because %fs doesn't need to be loaded with a special segment at early boot; it can be deferred until the first percpu area is allocated (or never for UP). The result is less code and one less x86-specific concept. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Andi Kleen <ak@suse.de> Cc: Andi Kleen <ak@suse.de>
2007-05-02[PATCH] i386: Page-align the GDTJeremy Fitzhardinge
Xen wants a dedicated page for the GDT. I believe VMI likes it too. lguest, KVM and native don't care. Simple transformation to page-aligned "struct gdt_page". Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Andi Kleen <ak@suse.de> Acked-by: Jeremy Fitzhardinge <jeremy@xensource.com>
2007-05-02[PATCH] x86-64: deflate inflate_dynamic tooJeremy Fitzhardinge
inflate_dynamic() has piggy stack usage too, so heap allocate it too. I'm not sure it actually gets used, but it shows up large in "make checkstack". Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Andi Kleen <ak@suse.de>
2007-05-02[PATCH] x86: deflate stack usage in lib/inflate.cJeremy Fitzhardinge
inflate_fixed and huft_build together use around 2.7k of stack. When using 4k stacks, I saw stack overflows from interrupts arriving while unpacking the root initrd: do_IRQ: stack overflow: 384 [<c0106b64>] show_trace_log_lvl+0x1a/0x30 [<c01075e6>] show_trace+0x12/0x14 [<c010763f>] dump_stack+0x16/0x18 [<c0107ca4>] do_IRQ+0x6d/0xd9 [<c010202b>] xen_evtchn_do_upcall+0x6e/0xa2 [<c0106781>] xen_hypervisor_callback+0x25/0x2c [<c010116c>] xen_restore_fl+0x27/0x29 [<c0330f63>] _spin_unlock_irqrestore+0x4a/0x50 [<c0117aab>] change_page_attr+0x577/0x584 [<c0117b45>] kernel_map_pages+0x8d/0xb4 [<c016a314>] cache_alloc_refill+0x53f/0x632 [<c016a6c2>] __kmalloc+0xc1/0x10d [<c0463d34>] malloc+0x10/0x12 [<c04641c1>] huft_build+0x2a7/0x5fa [<c04645a5>] inflate_fixed+0x91/0x136 [<c04657e2>] unpack_to_rootfs+0x5f2/0x8c1 [<c0465acf>] populate_rootfs+0x1e/0xe4 (This was under Xen, but there's no reason it couldn't happen on bare hardware.) This patch mallocs the local variables, thereby reducing the stack usage to sane levels. Also, up the heap size for the kernel decompressor to deal with the extra allocation. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Andi Kleen <ak@suse.de> Cc: Tim Yamin <plasmaroo@gentoo.org> Cc: Andi Kleen <ak@suse.de> Cc: Matt Mackall <mpm@selenic.com> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Richard Henderson <rth@twiddle.net> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Ian Molton <spyro@f2s.com>
2007-05-02[PATCH] i386: PARAVIRT: drop unused ptep_get_and_clearJeremy Fitzhardinge
In shadow mode hypervisors, ptep_get_and_clear achieves the desired purpose of keeping the shadows in sync by issuing a native_get_and_clear, followed by a call to pte_update, which indicates the PTE has been modified. Direct mode hypervisors (Xen) have no need for this anyway, and will trap the update using writable pagetables. This means no hypervisor makes use of ptep_get_and_clear; there is no reason to have it in the paravirt-ops structure. Change confusing terminology about raw vs. native functions into consistent use of native_pte_xxx for operations which do not invoke paravirt-ops. Signed-off-by: Zachary Amsden <zach@vmware.com> Signed-off-by: Andi Kleen <ak@suse.de>
2007-05-02[PATCH] i386: PARAVIRT: Clean up paravirt patchable wrappersJeremy Fitzhardinge
Replace all the open-coded macros for generating calls with a pair of more general macros (__PVOP_CALL/VCALL), and redefine all the PVOP_V?CALL[0-4] in terms of them. [ Andrew, Andi: this should slot in immediately after "Document asm-i386/paravirt.h" (paravirt_ops-document-asm-i386-paravirth.patch) ] Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Andi Kleen <ak@suse.de> Cc: Ingo Molnar <mingo@elte.hu>
2007-05-02[PATCH] i386: PARAVIRT: Use enums for paravirt lazy flush modiJeremy Fitzhardinge
Remove #defines, add enum for PARAVIRT_LAZY_FLUSH. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Andi Kleen <ak@suse.de>
2007-05-02[PATCH] i386: PARAVIRT: flush lazy mmu updates on kunmap_atomicJeremy Fitzhardinge
kunmap_atomic should flush any pending lazy mmu updates, mainly to be consistent with kmap_atomic, and to preserve its normal behaviour. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Andi Kleen <ak@suse.de>
2007-05-02[PATCH] i386: PARAVIRT: add kmap_atomic_pte for mapping highpte pagesJeremy Fitzhardinge
Xen and VMI both have special requirements when mapping a highmem pte page into the kernel address space. These can be dealt with by adding a new kmap_atomic_pte() function for mapping highptes, and hooking it into the paravirt_ops infrastructure. Xen specifically wants to map the pte page RO, so this patch exposes a helper function, kmap_atomic_prot, which maps the page with the specified page protections. This also adds a kmap_flush_unused() function to clear out the cached kmap mappings. Xen needs this to clear out any potential stray RW mappings of pages which will become part of a pagetable. [ Zach - vmi.c will need some attention after this patch. It wasn't immediately obvious to me what needs to be done. ] Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Andi Kleen <ak@suse.de> Cc: Zachary Amsden <zach@vmware.com>
2007-05-02[PATCH] i386: PARAVIRT: revert map_pt_hook.Jeremy Fitzhardinge
Back out the map_pt_hook to clear the way for kmap_atomic_pte. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Andi Kleen <ak@suse.de> Cc: Zachary Amsden <zach@vmware.com>
2007-05-02[PATCH] i386: PARAVIRT: add flush_tlb_others paravirt_opJeremy Fitzhardinge
This patch adds a pv_op for flush_tlb_others. Linux running on native hardware uses cross-CPU IPIs to flush the TLB on any CPU which may have a particular mm's pagetable entries cached in its TLB. This is inefficient in a paravirtualized environment, since the hypervisor knows which real CPUs actually contain cached mappings, which may be a small subset of a guest's VCPUs. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Andi Kleen <ak@suse.de>
2007-05-02[PATCH] i386: PARAVIRT: add common patching machineryJeremy Fitzhardinge
Implement the actual patching machinery. paravirt_patch_default() contains the logic to automatically patch a callsite based on a few simple rules: - if the paravirt_op function is paravirt_nop, then patch nops - if the paravirt_op function is a jmp target, then jmp to it - if the paravirt_op function is callable and doesn't clobber too much for the callsite, call it directly paravirt_patch_default is suitable as a default implementation of paravirt_ops.patch, will remove most of the expensive indirect calls in favour of either a direct call or a pile of nops. Backends may implement their own patcher, however. There are several helper functions to help with this: paravirt_patch_nop nop out a callsite paravirt_patch_ignore leave the callsite as-is paravirt_patch_call patch a call if the caller and callee have compatible clobbers paravirt_patch_jmp patch in a jmp paravirt_patch_insns patch some literal instructions over the callsite, if they fit This patch also implements more direct patches for the native case, so that when running on native hardware many common operations are implemented inline. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Andi Kleen <ak@suse.de> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Zachary Amsden <zach@vmware.com> Cc: Anthony Liguori <anthony@codemonkey.ws> Acked-by: Ingo Molnar <mingo@elte.hu>
2007-05-02[PATCH] i386: PARAVIRT: Document asm-i386/paravirt.hJeremy Fitzhardinge
Clean things up, and broadly document: - the paravirt_ops functions themselves - the patching mechanism Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Andi Kleen <ak@suse.de> Cc: Rusty Russell <rusty@rustcorp.com.au>
2007-05-02[PATCH] i386: PARAVIRT: Consistently wrap paravirt ops callsites to make ↵Jeremy Fitzhardinge
them patchable Wrap a set of interesting paravirt_ops calls in a wrapper which makes the callsites available for patching. Unfortunately this is pretty ugly because there's no way to get gcc to generate a function call, but also wrap just the callsite itself with the necessary labels. This patch supports functions with 0-4 arguments, and either void or returning a value. 64-bit arguments must be split into a pair of 32-bit arguments (lower word first). Small structures are returned in registers. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Andi Kleen <ak@suse.de> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Zachary Amsden <zach@vmware.com> Cc: Anthony Liguori <anthony@codemonkey.ws>
2007-05-02[PATCH] i386: PARAVIRT: Fix patch site clobbers to include return registerJeremy Fitzhardinge
Fix a few clobbers to include the return register. The clobbers set is the set of all registers modified (or may be modified) by the code snippet, regardless of whether it was deliberate or accidental. Also, make sure that callsites which are used in contexts which don't allow clobbers actually save and restore all clobberable registers. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Andi Kleen <ak@suse.de> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Zachary Amsden <zach@vmware.com>
2007-05-02[PATCH] i386: PARAVIRT: Use patch site IDs computed from offset in ↵Jeremy Fitzhardinge
paravirt_ops structure Use patch type identifiers derived from the offset of the operation in the paravirt_ops structure. This avoids having to maintain a separate enum for patch site types. Also, since the identifier is derived from the offset into paravirt_ops, the offset can be derived from the identifier. This is used to remove replicated information in the various callsite macros, which has been a source of bugs in the past. This patch also drops the fused save_fl+cli operation, which doesn't really add much and makes things more complex - specifically because it breaks the 1:1 relationship between identifiers and offsets. If this operation turns out to be particularly beneficial, then the right answer is to define a new entrypoint for it. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Andi Kleen <ak@suse.de> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Zachary Amsden <zach@vmware.com>
2007-05-02[PATCH] i386: PARAVIRT: rename struct paravirt_patch to paravirt_patch_site ↵Jeremy Fitzhardinge
for clarity Rename struct paravirt_patch to paravirt_patch_site, so that it clearly refers to a callsite, and not the patch which may be applied to that callsite. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Andi Kleen <ak@suse.de> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Zachary Amsden <zach@vmware.com>
2007-05-02[PATCH] x86: PARAVIRT: add hooks to intercept mm creation and destructionJeremy Fitzhardinge
Add hooks to allow a paravirt implementation to track the lifetime of an mm. Paravirtualization requires three hooks, but only two are needed in common code. They are: arch_dup_mmap, which is called when a new mmap is created at fork arch_exit_mmap, which is called when the last process reference to an mm is dropped, which typically happens on exit and exec. The third hook is activate_mm, which is called from the arch-specific activate_mm() macro/function, and so doesn't need stub versions for other architectures. It's called when an mm is first used. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Andi Kleen <ak@suse.de> Cc: linux-arch@vger.kernel.org Cc: James Bottomley <James.Bottomley@SteelEye.com> Acked-by: Ingo Molnar <mingo@elte.hu>
2007-05-02[PATCH] i386: PARAVIRT: Allow paravirt backend to choose kernel PMD sharingJeremy Fitzhardinge
Normally when running in PAE mode, the 4th PMD maps the kernel address space, which can be shared among all processes (since they all need the same kernel mappings). Xen, however, does not allow guests to have the kernel pmd shared between page tables, so parameterize pgtable.c to allow both modes of operation. There are several side-effects of this. One is that vmalloc will update the kernel address space mappings, and those updates need to be propagated into all processes if the kernel mappings are not intrinsically shared. In the non-PAE case, this is done by maintaining a pgd_list of all processes; this list is used when all process pagetables must be updated. pgd_list is threaded via otherwise unused entries in the page structure for the pgd, which means that the pgd must be page-sized for this to work. Normally the PAE pgd is only 4x64 byte entries large, but Xen requires the PAE pgd to page aligned anyway, so this patch forces the pgd to be page aligned+sized when the kernel pmd is unshared, to accomodate both these requirements. Also, since there may be several distinct kernel pmds (if the user/kernel split is below 3G), there's no point in allocating them from a slab cache; they're just allocated with get_free_page and initialized appropriately. (Of course the could be cached if there is just a single kernel pmd - which is the default with a 3G user/kernel split - but it doesn't seem worthwhile to add yet another case into this code). [ Many thanks to wli for review comments. ] Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: William Lee Irwin III <wli@holomorphy.com> Signed-off-by: Andi Kleen <ak@suse.de> Cc: Zachary Amsden <zach@vmware.com> Cc: Christoph Lameter <clameter@sgi.com> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2007-05-02[PATCH] i386: PARAVIRT: Allocate a fixmap slotJeremy Fitzhardinge
Allocate a fixmap slot for use by a paravirt_ops implementation. This is intended for early-boot bootstrap mappings. Once the zones and allocator have been set up, it would be better to use get_vm_area() to allocate some virtual space. Xen uses this to map the hypervisor's shared info page, which doesn't have a pseudo-physical page number, and therefore can't be mapped ordinarily. It is needed early because it contains the vcpu state, including the interrupt mask. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Andi Kleen <ak@suse.de> Acked-by: Ingo Molnar <mingo@elte.hu>
2007-05-02[PATCH] i386: PARAVIRT: Hooks to set up initial pagetableJeremy Fitzhardinge
This patch introduces paravirt_ops hooks to control how the kernel's initial pagetable is set up. In the case of a native boot, the very early bootstrap code creates a simple non-PAE pagetable to map the kernel and physical memory. When the VM subsystem is initialized, it creates a proper pagetable which respects the PAE mode, large pages, etc. When booting under a hypervisor, there are many possibilities for what paging environment the hypervisor establishes for the guest kernel, so the constructon of the kernel's pagetable depends on the hypervisor. In the case of Xen, the hypervisor boots the kernel with a fully constructed pagetable, which is already using PAE if necessary. Also, Xen requires particular care when constructing pagetables to make sure all pagetables are always mapped read-only. In order to make this easier, kernel's initial pagetable construction has been changed to only allocate and initialize a pagetable page if there's no page already present in the pagetable. This allows the Xen paravirt backend to make a copy of the hypervisor-provided pagetable, allowing the kernel to establish any more mappings it needs while keeping the existing ones. A slightly subtle point which is worth highlighting here is that Xen requires all kernel mappings to share the same pte_t pages between all pagetables, so that updating a kernel page's mapping in one pagetable is reflected in all other pagetables. This makes it possible to allocate a page and attach it to a pagetable without having to explicitly enumerate that page's mapping in all pagetables. And: +From: "Eric W. Biederman" <ebiederm@xmission.com> If we don't set the leaf page table entries it is quite possible that will inherit and incorrect page table entry from the initial boot page table setup in head.S. So we need to redo the effort here, so we pick up PSE, PGE and the like. Hypervisors like Xen require that their page tables be read-only, which is slightly incompatible with our low identity mappings, however I discussed this with Jeremy he has modified the Xen early set_pte function to avoid problems in this area. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Andi Kleen <ak@suse.de> Acked-by: William Irwin <bill.irwin@oracle.com> Cc: Ingo Molnar <mingo@elte.hu>
2007-05-02[PATCH] i386: PARAVIRT: Add pagetable accessors to pack and unpack pagetable ↵Jeremy Fitzhardinge
entries Add a set of accessors to pack, unpack and modify page table entries (at all levels). This allows a paravirt implementation to control the contents of pgd/pmd/pte entries. For example, Xen uses this to convert the (pseudo-)physical address into a machine address when populating a pagetable entry, and converting back to pphys address when an entry is read. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Andi Kleen <ak@suse.de> Acked-by: Ingo Molnar <mingo@elte.hu>
2007-05-02[PATCH] i386: PARAVIRT: use paravirt_nop to consistently mark no-op operationsJeremy Fitzhardinge
Add a _paravirt_nop function for use as a stub for no-op operations, and paravirt_nop #defined void * version to make using it easier (since all its uses are as a void *). This is useful to allow the patcher to automatically identify noop operations so it can simply nop out the callsite. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Andi Kleen <ak@suse.de> Acked-by: Ingo Molnar <mingo@elte.hu> [mingo] but only as a cleanup of the current open-coded (void *) casts. My problem with this is that it loses the types. Not that there is much to check for, but still, this adds some assumptions about how function calls look like
2007-05-02[PATCH] i386: PARAVIRT: Remove CONFIG_DEBUG_PARAVIRTJeremy Fitzhardinge
Remove CONFIG_DEBUG_PARAVIRT. When inlining code, this option attempts to trash registers in the patch-site's "clobber" field, on the grounds that this should find bugs with incorrect clobbers. Unfortunately, the clobber field really means "registers modified by this patch site", which includes return values. Because of this, this option has outlived its usefulness, so remove it. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Andi Kleen <ak@suse.de> Cc: Rusty Russell <rusty@rustcorp.com.au>
2007-05-02[PATCH] x86-64: update MAINTAINERSJeremy Fitzhardinge
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Andi Kleen <ak@suse.de> Cc: Chris Wright <chrisw@sous-sol.org> Cc: Zachary Amsden <zach@vmware.com> Cc: Rusty Russell <rusty@rustcorp.com.au>
2007-05-02[PATCH] i386: i386 separate hardware-defined TSS from Linux additionsRusty Russell
On Thu, 2007-03-29 at 13:16 +0200, Andi Kleen wrote: > Please clean it up properly with two structs. Not sure about this, now I've done it. Running it here. If you like it, I can do x86-64 as well. == lguest defines its own TSS struct because the "struct tss_struct" contains linux-specific additions. Andi asked me to split the struct in processor.h. Unfortunately it makes usage a little awkward. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Andi Kleen <ak@suse.de>
2007-05-02[PATCH] x86-64: x86-64 system crashes when no memory populating Node 0James Puthukattukaran
I have a 4 socket AMD Operton system. The 2.6.18 kernel I have crashes when there is no memory in node0. AK: changed call to _nopanic Cc: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Andi Kleen <ak@suse.de>
2007-05-02[PATCH] x86-64: Fix x86_64 compilation with DEBUG_SIG onGlauber de Oliveira Costa
Setting the DEBUG_SIG flag breaks compilation due to a wrong struct access. Aditionally, it raises two warnings. This is one patch to fix them all. Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com> Signed-off-by: Andi Kleen <ak@suse.de>
2007-05-02[PATCH] i386: Allow boot-time disable of SMP altinstructionsJeremy Fitzhardinge
Add "noreplace-smp" to disable SMP instruction replacement. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Andi Kleen <ak@suse.de>
2007-05-02[PATCH] i386: Remove smp_alt_instructionsJeremy Fitzhardinge
The .smp_altinstructions section and its corresponding symbols are completely unused, so remove them. Also, remove stray #ifdef __KENREL__ in asm-i386/alternative.h Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Andi Kleen <ak@suse.de> Cc: Andi Kleen <ak@suse.de>
2007-05-02[PATCH] x86: Clean up x86 control register and MSR macros (corrected)H. Peter Anvin
This patch is based on Rusty's recent cleanup of the EFLAGS-related macros; it extends the same kind of cleanup to control registers and MSRs. It also unifies these between i386 and x86-64; at least with regards to MSRs, the two had definitely gotten out of sync. Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Andi Kleen <ak@suse.de>
2007-05-02[PATCH] x86: Allow percpu variables to be page-alignedJeremy Fitzhardinge
Let's allow page-alignment in general for per-cpu data (wanted by Xen, and Ingo suggested KVM as well). Because larger alignments can use more room, we increase the max per-cpu memory to 64k rather than 32k: it's getting a little tight. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Andi Kleen <ak@suse.de> Acked-by: Ingo Molnar <mingo@elte.hu> Cc: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2007-05-02[PATCH] i386: Enable bank 0 on non K7 AthlonAndi Kleen
As a bug workaround bank 0 on K7s is normally disabled, but no need to do that on other AMD CPUs. Cc: davej@redhat.com Signed-off-by: Andi Kleen <ak@suse.de>
2007-05-02[PATCH] i386: Update smp_call_function* commentsJeremy Fitzhardinge
Update documentation for i386 smp_call_function* functions. As reported by Randy Dunlap <rdunlap@xenotime.net> [ I've posted this before but it seems to have been lost along the way. ] Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Andi Kleen <ak@suse.de> Cc: Randy Dunlap <rdunlap@xenotime.net>
2007-05-02[PATCH] i386: Use menuconfig objects - APMJan Engelhardt
(I hope Andi is the right one to Cc, otherwise please add, thanks!) Use menuconfigs instead of menus, so the whole menu can be disabled at once instead of going through all options. Signed-off-by: Jan Engelhardt <jengelh@gmx.de> Signed-off-by: Andi Kleen <ak@suse.de>
2007-05-02[PATCH] x86: Don't use MWAIT on AMD Family 10Andi Kleen
It doesn't put the CPU into deeper sleep states, so it's better to use the standard idle loop to save power. But allow to reenable it anyways for benchmarking. I also removed the obsolete idle=halt on i386 Cc: andreas.herrmann@amd.com Signed-off-by: Andi Kleen <ak@suse.de>
2007-05-02[PATCH] x86-64: Clean up asm-x86_64/bugs.hJeremy Fitzhardinge
Most of asm-x86_64/bugs.h is code which should be in a C file, so put it there. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Andi Kleen <ak@suse.de> Cc: Andi Kleen <ak@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-02[PATCH] i386: Make COMPAT_VDSO runtime selectable.Jeremy Fitzhardinge
Now that relocation of the VDSO for COMPAT_VDSO users is done at runtime rather than compile time, it is possible to enable/disable compat mode at runtime. This patch allows you to enable COMPAT_VDSO mode with "vdso=2" on the kernel command line, or via sysctl. (Switching on a running system shouldn't be done lightly; any process which was relying on the compat VDSO will be upset if it goes away.) The COMPAT_VDSO config option still exists, but if enabled it just makes vdso_enabled default to VDSO_COMPAT. +From: Hugh Dickins <hugh@veritas.com> Fix oops from i386-make-compat_vdso-runtime-selectable.patch. Even mingetty at system startup finds it easy to trigger an oops while reading /proc/PID/maps: though it has a good hold on the mm itself, that cannot stop exit_mm() from resetting tsk->mm to NULL. (It is usually show_map()'s call to get_gate_vma() which oopses, and I expect we could change that to check priv->tail_vma instead; but no matter, even m_start()'s call just after get_task_mm() is racy.) Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Andi Kleen <ak@suse.de> Cc: Zachary Amsden <zach@vmware.com> Cc: "Jan Beulich" <JBeulich@novell.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Andi Kleen <ak@suse.de> Cc: Ingo Molnar <mingo@elte.hu> Cc: Roland McGrath <roland@redhat.com>