aboutsummaryrefslogtreecommitdiff
path: root/arch/x86
AgeCommit message (Collapse)Author
2008-10-15KVM: x86: unhalt vcpu0 on resetMarcelo Tosatti
Since "KVM: x86: do not execute halted vcpus", HLT by vcpu0 before system reset by the IO thread will hang the guest. Mark vcpu as runnable in such case. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-10-15KVM: x86 emulator: Add call near absolute instruction (opcode 0xff/2)Mohammed Gamal
Add call near absolute instruction. Signed-off-by: Mohammed Gamal <m.gamal005@gmail.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-10-15KVM: x86: do not execute halted vcpusMarcelo Tosatti
Offline or uninitialized vcpu's can be executed if requested to perform userspace work. Follow Avi's suggestion to handle halted vcpu's in the main loop, simplifying kvm_emulate_halt(). Introduce a new vcpu->requests bit to indicate events that promote state from halted to running. Also standardize vcpu wake sites. Signed-off-by: Marcelo Tosatti <mtosatti <at> redhat.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-10-15KVM: x86 emulator: Add in/out instructions (opcodes 0xe4-0xe7, 0xec-0xef)Mohammed Gamal
The patch adds in/out instructions to the x86 emulator. The instruction was encountered while running the BIOS while using the invalid guest state emulation patch. Signed-off-by: Mohammed Gamal <m.gamal005@gmail.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-10-15KVM: Add statistics for guest irq injectionsAvi Kivity
These can help show whether a guest is making progress or not. Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-10-15KVM: MMU: Modify kvm_shadow_walk.entry to accept u64 addrSheng Yang
EPT is 4 level by default in 32pae(48 bits), but the addr parameter of kvm_shadow_walk->entry() only accept unsigned long as virtual address, which is 32bit in 32pae. This result in SHADOW_PT_INDEX() overflow when try to fetch level 4 index. Fix it by extend kvm_shadow_walk->entry() to accept 64bit addr in parameter. Signed-off-by: Sheng Yang <sheng.yang@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-10-15KVM: x86 emulator: Add std and cld instructions (opcodes 0xfc-0xfd)Mohammed Gamal
This adds the std and cld instructions to the emulator. Encountered while running the BIOS with invalid guest state emulation enabled. Signed-off-by: Mohammed Gamal <m.gamal005@gmail.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-10-15KVM: add MC5_MISC msr read supportJoerg Roedel
Currently KVM implements MC0-MC4_MISC read support. When booting Linux this results in KVM warnings in the kernel log when the guest tries to read MC5_MISC. Fix this warnings with this patch. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-10-15KVM: SVM: No need to unprotect memory during event injection when using nptAvi Kivity
No memory is protected anyway. Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-10-15KVM: MMU: Fix setting the accessed bit on non-speculative sptesAvi Kivity
The accessed bit was accidentally turned on in a random flag word, rather than, the spte itself, which was lucky, since it used the non-EPT compatible PT_ACCESSED_MASK. Fix by turning the bit on in the spte and changing it to use the portable accessed mask. Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-10-15KVM: MMU: Flush tlbs after clearing write permission when accessing dirty logAvi Kivity
Otherwise, the cpu may allow writes to the tracked pages, and we lose some display bits or fail to migrate correctly. Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-10-15KVM: MMU: Add locking around kvm_mmu_slot_remove_write_access()Avi Kivity
It was generally safe due to slots_lock being held for write, but it wasn't very nice. Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-10-15KVM: MMU: Account for npt/ept/realmode page faultsAvi Kivity
Now that two-dimensional paging is becoming common, account for tdp page faults. Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-10-15KVM: x86 emulator: Add mov r, imm instructions (opcodes 0xb0-0xbf)Mohammed Gamal
The emulator only supported one instance of mov r, imm instruction (opcode 0xb8), this adds the rest of these instructions. Signed-off-by: Mohammed Gamal <m.gamal005@gmail.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-10-15KVM: Allocate guest memory as MAP_PRIVATE, not MAP_SHAREDAvi Kivity
There is no reason to share internal memory slots with fork()ed instances. Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-10-15KVM: MMU: Convert the paging mode shadow walk to use the generic walkerAvi Kivity
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-10-15KVM: MMU: Convert direct maps to use the generic shadow walkerAvi Kivity
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-10-15KVM: MMU: Add generic shadow walkerAvi Kivity
We currently walk the shadow page tables in two places: direct map (for real mode and two dimensional paging) and paging mode shadow. Since we anticipate requiring a third walk (for invlpg), it makes sense to have a generic facility for shadow walk. This patch adds such a shadow walker, walks the page tables and calls a method for every spte encountered. The method can examine the spte, modify it, or even instantiate it. The walk can be aborted by returning nonzero from the method. Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-10-15KVM: MMU: Infer shadow root level in direct_map()Avi Kivity
In all cases the shadow root level is available in mmu.shadow_root_level, so there is no need to pass it as a parameter. Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-10-15KVM: MMU: Unify direct map 4K and large page pathsAvi Kivity
The two paths are equivalent except for one argument, which is already available. Merge the two codepaths. Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-10-15KVM: MMU: Move SHADOW_PT_INDEX to mmu.cAvi Kivity
It is not specific to the paging mode, so can be made global (and reusable). Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-10-15KVM: x86 emulator: remove bad ByteOp specifier from NEG descriptorAvi Kivity
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-10-15KVM: x86 emulator: remove duplicate SrcImmroel kluin
Signed-off-by: Roel Kluin <roel.kluin@gmail.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-10-15KVM: Load real mode segments correctlyAvi Kivity
Real mode segments to not reference the GDT or LDT; they simply compute base = selector * 16. Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-10-15KVM: VMX: Change segment dpl at reset to 3Avi Kivity
This is more emulation friendly, if not 100% correct. Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-10-15KVM: VMX: Change cs reset state to be a data segmentAvi Kivity
Real mode cs is a data segment, not a code segment. Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-10-15KVM: make irq ack notifier functions staticHarvey Harrison
sparse says: arch/x86/kvm/x86.c:107:32: warning: symbol 'kvm_find_assigned_dev' was not declared. Should it be static? arch/x86/kvm/i8254.c:225:6: warning: symbol 'kvm_pit_ack_irq' was not declared. Should it be static? Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-10-15KVM: Use kvm_set_irq to inject interruptsAmit Shah
... instead of using the pic and ioapic variants Signed-off-by: Amit Shah <amit.shah@qumranet.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-10-15KVM: SVM: Fix typoAmit Shah
Fix typo in as-yet unused macro definition. Signed-off-by: Amit Shah <amit.shah@qumranet.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-10-15KVM: VMX: Modify mode switching and vmentry functionsMohammed Gamal
This patch modifies mode switching and vmentry function in order to drive invalid guest state emulation. Signed-off-by: Mohammed Gamal <m.gamal005@gmail.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-10-15KVM: VMX: Add invalid guest state handlerMohammed Gamal
This adds the invalid guest state handler function which invokes the x86 emulator until getting the guest to a VMX-friendly state. [avi: leave atomic context if scheduling] [guillaume: return to atomic context correctly] Signed-off-by: Laurent Vivier <laurent.vivier@bull.net> Signed-off-by: Guillaume Thouvenin <guillaume.thouvenin@ext.bull.net> Signed-off-by: Mohammed Gamal <m.gamal005@gmail.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-10-15KVM: VMX: Add module parameter and emulation flag.Mohammed Gamal
The patch adds the module parameter required to enable emulating invalid guest state, as well as the emulation_required flag used to drive emulation whenever needed. Signed-off-by: Mohammed Gamal <m.gamal005@gmail.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-10-15KVM: VMX: Add Guest State Validity ChecksMohammed Gamal
This patch adds functions to check whether guest state is VMX compliant. Signed-off-by: Mohammed Gamal <m.gamal005@gmail.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-10-15KVM: Device assignment: Check for privileges before assigning irqAmit Shah
Even though we don't share irqs at the moment, we should ensure regular user processes don't try to allocate system resources. We check for capability to access IO devices (CAP_SYS_RAWIO) before we request_irq on behalf of the guest. Noticed by Avi. Signed-off-by: Amit Shah <amit.shah@qumranet.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-10-15KVM: Handle spurious acks for PIT interruptsAvi Kivity
Spurious acks can be generated, for example if the PIC is being reset. Handle those acks gracefully rather than flooding the log with warnings. Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-10-15KVM: fix i8259 reset irq ackingMarcelo Tosatti
The irq ack during pic reset has three problems: - Ignores slave/master PIC, using gsi 0-8 for both. - Generates an ACK even if the APIC is in control. - Depends upon IMR being clear, which is broken if the irq was masked at the time it was generated. The last one causes the BIOS to hang after the first reboot of Windows installation, since PIT interrupts stop. [avi: fix check whether pic interrupts are seen by cpu] Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-10-15KVM: VMX: Use interrupt queue for !irqchip_in_kernelAvi Kivity
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-10-15KVM: set debug registers after "schedulable" sectionMarcelo Tosatti
The vcpu thread can be preempted after the guest_debug_pre() callback, resulting in invalid debug registers on the new vcpu. Move it inside the non-preemptable section. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-10-15KVM: VMX: Clean up magic number 0x66 in init_rmode_tssSheng Yang
Signed-off-by: Sheng Yang <sheng.yang@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-10-15KVM: Reduce stack usage in kvm_pv_mmu_op()Dave Hansen
We're in a hot path. We can't use kmalloc() because it might impact performance. So, we just stick the buffer that we need into the kvm_vcpu_arch structure. This is used very often, so it is not really a waste. We also have to move the buffer structure's definition to the arch-specific x86 kvm header. Signed-off-by: Dave Hansen <dave@linux.vnet.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-10-15KVM: Reduce stack usage in kvm_arch_vcpu_ioctl()Dave Hansen
[sheng: fix KVM_GET_LAPIC using wrong size] Signed-off-by: Dave Hansen <dave@linux.vnet.ibm.com> Signed-off-by: Sheng Yang <sheng.yang@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-10-15KVM: Reduce kvm stack usage in kvm_arch_vm_ioctl()Dave Hansen
On my machine with gcc 3.4, kvm uses ~2k of stack in a few select functions. This is mostly because gcc fails to notice that the different case: statements could have their stack usage combined. It overflows very nicely if interrupts happen during one of these large uses. This patch uses two methods for reducing stack usage. 1. dynamically allocate large objects instead of putting on the stack. 2. Use a union{} member for all of the case variables. This tricks gcc into combining them all into a single stack allocation. (There's also a comment on this) Signed-off-by: Dave Hansen <dave@linux.vnet.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-10-15KVM: pci device assignmentBen-Ami Yassour
Based on a patch from: Amit Shah <amit.shah@qumranet.com> This patch adds support for handling PCI devices that are assigned to the guest. The device to be assigned to the guest is registered in the host kernel and interrupt delivery is handled. If a device is already assigned, or the device driver for it is still loaded on the host, the device assignment is failed by conveying a -EBUSY reply to the userspace. Devices that share their interrupt line are not supported at the moment. By itself, this patch will not make devices work within the guest. The VT-d extension is required to enable the device to perform DMA. Another alternative is PVDMA. Signed-off-by: Amit Shah <amit.shah@qumranet.com> Signed-off-by: Ben-Ami Yassour <benami@il.ibm.com> Signed-off-by: Weidong Han <weidong.han@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-10-15x86: KVM guest: use paravirt function to calculate cpu khzGlauber Costa
We're currently facing timing problems in guests that do calibration under heavy load, and then the load vanishes. This means we'll have a much lower lpj than we actually should, and delays end up taking less time than they should, which is a nasty bug. Solution is to pass on the lpj value from host to guest, and have it preset. Signed-off-by: Glauber Costa <gcosta@redhat.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-10-15x86: paravirt: factor out cpu_khz to common codeGlauber Costa
KVM intends to use paravirt code to calibrate khz. Xen current code will do just fine. So as a first step, factor out code to pvclock.c. Signed-off-by: Glauber Costa <gcosta@redhat.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-10-15KVM: PIT: fix injection logic and countMarcelo Tosatti
The PIT injection logic is problematic under the following cases: 1) If there is a higher priority vector to be delivered by the time kvm_pit_timer_intr_post is invoked ps->inject_pending won't be set. This opens the possibility for missing many PIT event injections (say if guest executes hlt at this point). 2) ps->inject_pending is racy with more than two vcpus. Since there's no locking around read/dec of pt->pending, two vcpu's can inject two interrupts for a single pt->pending count. Fix 1 by using an irq ack notifier: only reinject when the previous irq has been acked. Fix 2 with appropriate locking around manipulation of pending count and irq_ack by the injection / ack paths. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-10-15KVM: irq ack notificationMarcelo Tosatti
Based on a patch from: Ben-Ami Yassour <benami@il.ibm.com> which was based on a patch from: Amit Shah <amit.shah@qumranet.com> Notify IRQ acking on PIC/APIC emulation. The previous patch missed two things: - Edge triggered interrupts on IOAPIC - PIC reset with IRR/ISR set should be equivalent to ack (LAPIC probably needs something similar). Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> CC: Amit Shah <amit.shah@qumranet.com> CC: Ben-Ami Yassour <benami@il.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-10-15KVM: Add irq ack notifier listAvi Kivity
This can be used by kvm subsystems that are interested in when interrupts are acked, for example time drift compensation. Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-10-15KVM: Ignore DEBUGCTL MSRs with no effectAlexander Graf
Netware writes to DEBUGCTL and reads from the DEBUGCTL and LAST*IP MSRs without further checks and is really confused to receive a #GP during that. To make it happy we should just make them stubs, which is exactly what SVM already does. Writes to DEBUGCTL that are vendor-specific are resembled to behave as if the virtual CPU does not know them. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-10-15KVM: VMX: Avoid vmwrite(HOST_RSP) when possibleAvi Kivity
Usually HOST_RSP retains its value across guest entries. Take advantage of this and avoid a vmwrite() when this is so. Signed-off-by: Avi Kivity <avi@qumranet.com>