aboutsummaryrefslogtreecommitdiff
path: root/arch/x86/kvm/x86.c
AgeCommit message (Collapse)Author
2008-07-20KVM: Apply the kernel sigmask to vcpus blocked due to being uninitializedAvi Kivity
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20KVM: move slots_lock acquision down to vapic_exitMarcelo Tosatti
There is no need to grab slots_lock if the vapic_page will not be touched. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20KVM: x86 emulator: lazily evaluate segment registersAvi Kivity
Instead of prefetching all segment bases before emulation, read them at the last moment. Since most of them are unneeded, we save some cycles on Intel machines where this is a bit expensive. Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20KVM: Use printk_rlimit() instead of reporting emulation failures just onceAvi Kivity
Emulation failure reports are useful, so allow more than one per the lifetime of the module. Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20KVM: Do not calculate linear rip in emulation failure reportGlauber Costa
If we're not gonna do anything (case in which failure is already reported), we do not need to even bother with calculating the linear rip. Signed-off-by: Glauber Costa <gcosta@redhat.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20KVM: Add coalesced MMIO support (x86 part)Laurent Vivier
This patch enables coalesced MMIO for x86 architecture. It defines KVM_MMIO_PAGE_OFFSET and KVM_CAP_COALESCED_MMIO. It enables the compilation of coalesced_mmio.c. Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20KVM: kvm_io_device: extend in_range() to manage len and write attributeLaurent Vivier
Modify member in_range() of structure kvm_io_device to pass length and the type of the I/O (write or read). This modification allows to use kvm_io_device with coalesced MMIO. Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20KVM: Prefixes segment functions that will be exported with "kvm_"Guillaume Thouvenin
Prefixes functions that will be exported with kvm_. We also prefixed set_segment() even if it still static to be coherent. signed-off-by: Guillaume Thouvenin <guillaume.thouvenin@ext.bull.net> Signed-off-by: Laurent Vivier <laurent.vivier@bull.net> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20KVM: MTRR supportAvi Kivity
Add emulation for the memory type range registers, needed by VMware esx 3.5, and by pci device assignment. Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20KVM: VMX: Enable NMI with in-kernel irqchipSheng Yang
Signed-off-by: Sheng Yang <sheng.yang@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20KVM: IOAPIC/LAPIC: Enable NMI supportSheng Yang
[avi: fix ia64 build breakage] Signed-off-by: Sheng Yang <sheng.yang@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20KVM: Remove unnecessary ->decache_regs() callAvi Kivity
Since we aren't modifying any register, there's no need to decache the register state. Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20KVM: Remove decache_vcpus_on_cpu() and related callbacksAvi Kivity
Obsoleted by the vmx-specific per-cpu list. Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20KVM: VMX: Add list of potentially locally cached vcpusAvi Kivity
VMX hardware can cache the contents of a vcpu's vmcs. This cache needs to be flushed when migrating a vcpu to another cpu, or (which is the case that interests us here) when disabling hardware virtualization on a cpu. The current implementation of decaching iterates over the list of all vcpus, picks the ones that are potentially cached on the cpu that is being offlined, and flushes the cache. The problem is that it uses mutex_trylock() to gain exclusive access to the vcpu, which fires off a (benign) warning about using the mutex in an interrupt context. To avoid this, and to make things generally nicer, add a new per-cpu list of potentially cached vcus. This makes the decaching code much simpler. The list is vmx-specific since other hardware doesn't have this issue. [andrea: fix crash on suspend/resume] Signed-off-by: Andrea Arcangeli <andrea@qumranet.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20KVM: add missing kvmtrace bitsJoerg Roedel
This patch adds some kvmtrace bits to the generic x86 code where it is instrumented from SVM. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20KVM: add statics were possible, function definition in lapic.hHarvey Harrison
Noticed by sparse: arch/x86/kvm/vmx.c:1583:6: warning: symbol 'vmx_disable_intercept_for_msr' was not declared. Should it be static? arch/x86/kvm/x86.c:3406:5: warning: symbol 'kvm_task_switch_16' was not declared. Should it be static? arch/x86/kvm/x86.c:3429:5: warning: symbol 'kvm_task_switch_32' was not declared. Should it be static? arch/x86/kvm/mmu.c:1968:6: warning: symbol 'kvm_mmu_remove_one_alloc_mmu_page' was not declared. Should it be static? arch/x86/kvm/mmu.c:2014:6: warning: symbol 'mmu_destroy_caches' was not declared. Should it be static? arch/x86/kvm/lapic.c:862:5: warning: symbol 'kvm_lapic_get_base' was not declared. Should it be static? arch/x86/kvm/i8254.c:94:5: warning: symbol 'pit_get_gate' was not declared. Should it be static? arch/x86/kvm/i8254.c:196:5: warning: symbol '__pit_timer_fn' was not declared. Should it be static? arch/x86/kvm/i8254.c:561:6: warning: symbol '__inject_pit_timer_intr' was not declared. Should it be static? Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-06-26smp_call_function: get rid of the unused nonatomic/retry argumentJens Axboe
It's never used and the comments refer to nonatomic and retry interchangably. So get rid of it. Acked-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-06-24KVM: Make kvm host use the paravirt clocksource structsGerd Hoffmann
This patch updates the kvm host code to use the pvclock structs. It also makes the paravirt clock compatible with Xen. Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-06-24KVM: close timer injection race window in __vcpu_runMarcelo Tosatti
If a timer fires after kvm_inject_pending_timer_irqs() but before local_irq_disable() the code will enter guest mode and only inject such timer interrupt the next time an unrelated event causes an exit. It would be simpler if the timer->pending irq conversion could be done with IRQ's disabled, so that the above problem cannot happen. For now introduce a new vcpu requests bit to cancel guest entry. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-06-24KVM: Fix race between timer migration and vcpu migrationMarcelo Tosatti
A guest vcpu instance can be scheduled to a different physical CPU between the test for KVM_REQ_MIGRATE_TIMER and local_irq_disable(). If that happens, the timer will only be migrated to the current pCPU on the next exit, meaning that guest LAPIC timer event can be delayed until a host interrupt is triggered. Fix it by cancelling guest entry if any vcpu request is pending. This has the side effect of nicely consolidating vcpu->requests checks. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-06-06KVM: migrate PIT timerMarcelo Tosatti
Migrate the PIT timer to the physical CPU which vcpu0 is scheduled on, similarly to what is done for the LAPIC timers, otherwise PIT interrupts will be delayed until an unrelated event causes an exit. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-05-04KVM: avoid fx_init() schedule in atomicAndrea Arcangeli
This make sure not to schedule in atomic during fx_init. I also changed the name of fpu_init to fx_finit to avoid duplicating the name with fpu_init that is already used in the kernel, this makes grep simpler if nothing else. Signed-off-by: Andrea Arcangeli <andrea@qumranet.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-05-04KVM: Avoid spurious execeptions after setting registersJan Kiszka
Clear pending exceptions when setting new register values. This avoids spurious exceptions after restoring a vcpu state or after reset-on-triple-fault. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-05-04KVM: x86: task switch: fix wrong bit setting for the busy flagIzik Eidus
The busy bit is bit 1 of the type field, not bit 8. Signed-off-by: Izik Eidus <izike@qumranet.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-05-04KVM: VMX: Prepare an identity page table for EPT in real modeSheng Yang
[aliguory: plug leak] Signed-off-by: Sheng Yang <sheng.yang@intel.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-05-04KVM: MMU: Add EPT supportSheng Yang
Enable kvm_set_spte() to generate EPT entries. Signed-off-by: Sheng Yang <sheng.yang@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: fix kvm_vcpu_kick vs __vcpu_run raceMarcelo Tosatti
There is a window open between testing of pending IRQ's and assignment of guest_mode in __vcpu_run. Injection of IRQ's can race with __vcpu_run as follows: CPU0 CPU1 kvm_x86_ops->run() vcpu->guest_mode = 0 SET_IRQ_LINE ioctl .. kvm_x86_ops->inject_pending_irq kvm_cpu_has_interrupt() apic_test_and_set_irr() kvm_vcpu_kick if (vcpu->guest_mode) send_ipi() vcpu->guest_mode = 1 So move guest_mode=1 assignment before ->inject_pending_irq, and make sure that it won't reorder after it. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: add ioctls to save/store mpstateMarcelo Tosatti
So userspace can save/restore the mpstate during migration. [avi: export the #define constants describing the value] [christian: add s390 stubs] [avi: ditto for ia64] Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Carsten Otte <cotte@de.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: Rename VCPU_MP_STATE_* to KVM_MP_STATE_*Avi Kivity
We wish to export it to userspace, so move it into the kvm namespace. Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: Add trace markersFeng (Eric) Liu
Trace markers allow userspace to trace execution of a virtual machine in order to monitor its performance. Signed-off-by: Feng (Eric) Liu <eric.e.liu@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: Free apic access page on vm destructionAvi Kivity
Noticed by Marcelo Tosatti. Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: MMU: unify slots_lock usageMarcelo Tosatti
Unify slots_lock acquision around vcpu_run(). This is simpler and less error-prone. Also fix some callsites that were not grabbing the lock properly. [avi: drop slots_lock while in guest mode to avoid holding the lock for indefinite periods] Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: x86: hardware task switching supportIzik Eidus
This emulates the x86 hardware task switch mechanism in software, as it is unsupported by either vmx or svm. It allows operating systems which use it, like freedos, to run as kvm guests. Signed-off-by: Izik Eidus <izike@qumranet.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: MMU: hypercall based pte updates and TLB flushesMarcelo Tosatti
Hypercall based pte updates are faster than faults, and also allow use of the lazy MMU mode to batch operations. Don't report the feature if two dimensional paging is enabled. [avi: - one mmu_op hypercall instead of one per op - allow 64-bit gpa on hypercall - don't pass host errors (-ENOMEM) to guest] [akpm: warning fix on i386] Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: Provide unlocked version of emulator_write_phys()Avi Kivity
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: add basic paravirt supportMarcelo Tosatti
Add basic KVM paravirt support. Avoid vm-exits on IO delays. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: Add save/restore supporting of in kernel PITSheng Yang
Signed-off-by: Sheng Yang <sheng.yang@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: In kernel PIT modelSheng Yang
The patch moves the PIT model from userspace to kernel, and increases the timer accuracy greatly. [marcelo: make last_injected_time per-guest] Signed-off-by: Sheng Yang <sheng.yang@intel.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Tested-and-Acked-by: Alex Davis <alex14641@yahoo.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: replace remaining __FUNCTION__ occurancesHarvey Harrison
__FUNCTION__ is gcc-specific, use __func__ Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: detect if VCPU triple faultsJoerg Roedel
In the current inject_page_fault path KVM only checks if there is another PF pending and injects a DF then. But it has to check for a pending DF too to detect a shutdown condition in the VCPU. If this is not detected the VCPU goes to a PF -> DF -> PF loop when it should triple fault. This patch detects this condition and handles it with an KVM_SHUTDOWN exit to userspace. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: Prefix control register accessors with kvm_ to avoid namespace pollutionAvi Kivity
Names like 'set_cr3()' look dangerously close to affecting the host. Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: MMU: large page supportMarcelo Tosatti
Create large pages mappings if the guest PTE's are marked as such and the underlying memory is hugetlbfs backed. If the largepage contains write-protected pages, a large pte is not used. Gives a consistent 2% improvement for data copies on ram mounted filesystem, without NPT/EPT. Anthony measures a 4% improvement on 4-way kernbench, with NPT. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: MMU: ignore zapped root pagetablesMarcelo Tosatti
Mark zapped root pagetables as invalid and ignore such pages during lookup. This is a problem with the cr3-target feature, where a zapped root table fools the faulting code into creating a read-only mapping. The result is a lockup if the instruction can't be emulated. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Cc: Anthony Liguori <aliguori@us.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: Implement dummy values for MSR_PERF_STATUSAlexander Graf
Darwin relies on this and ceases to work without. Signed-off-by: Alexander Graf <alex@csgraf.de> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: sparse fixes for kvm/x86.cHarvey Harrison
In two case statements, use the ever popular 'i' instead of index: arch/x86/kvm/x86.c:1063:7: warning: symbol 'index' shadows an earlier one arch/x86/kvm/x86.c:1000:9: originally declared here arch/x86/kvm/x86.c:1079:7: warning: symbol 'index' shadows an earlier one arch/x86/kvm/x86.c:1000:9: originally declared here Make it static. arch/x86/kvm/x86.c:1945:24: warning: symbol 'emulate_ops' was not declared. Should it be static? Drop the return statements. arch/x86/kvm/x86.c:2878:2: warning: returning void-valued expression arch/x86/kvm/x86.c:2944:2: warning: returning void-valued expression Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: Add stat counter for hypercallsAmit Shah
Signed-off-by: Amit Shah <amit.shah@qumranet.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: Use x86's segment descriptor struct instead of private definitionAvi Kivity
The x86 desc_struct unification allows us to remove segment_descriptor.h. Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: Add API for determining the number of supported memory slotsAvi Kivity
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: Add API to retrieve the number of supported vcpus per vmAvi Kivity
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: paravirtualized clocksource: host partGlauber de Oliveira Costa
This is the host part of kvm clocksource implementation. As it does not include clockevents, it is a fairly simple implementation. We only have to register a per-vcpu area, and start writing to it periodically. The area is binary compatible with xen, as we use the same shadow_info structure. [marcelo: fix bad_page on MSR_KVM_SYSTEM_TIME] [avi: save full value of the msr, even if enable bit is clear] [avi: clear previous value of time_page] Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@qumranet.com>