aboutsummaryrefslogtreecommitdiff
path: root/drivers/kvm
AgeCommit message (Collapse)Author
2007-07-16KVM: VMX: Handle #SS faults from real modeNitin A Kamble
Instructions with address size override prefix opcode 0x67 Cause the #SS fault with 0 error code in VM86 mode. Forward them to the emulator. Signed-Off-By: Nitin A Kamble <nitin.a.kamble@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-07-16KVM: VMX: Use local labels in inline assemblyAvi Kivity
This makes oprofile dumps and disassebly easier to read. Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-07-16KVM: Fix vmx I/O bitmap initialization on highmem systemsAvi Kivity
kunmap() expects a struct page, not a virtual address. Fixes an oops loading kvm-intel.ko on i386 with CONFIG_HIGHMEM. Thanks to Michael Ivanov <deruhu@peterstar.ru> for reporting. Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-07-16KVM: Avoid corrupting tr in real modeAvi Kivity
The real mode tr needs to be set to a specific tss so that I/O instructions can function. Divert the new tr values to the real mode save area from where they will be restored on transition to protected mode. This fixes some crashes on reboot when the bios accesses an I/O instruction. Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-07-16KVM: VMX: Only reload guest msrs if they are already loadedAvi Kivity
If we set an msr via an ioctl() instead of by handling a guest exit, we have the host state loaded, so reloading the msrs would clobber host state instead of guest state. This fixes a host oops (and loss of a cpu) on a guest reboot. Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-07-16KVM: MMU: Store shadow page tables as kernel virtual addresses, not physicalAvi Kivity
Simpifies things a bit. Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-07-16KVM: MMU: Simplify kvm_mmu_free_page() a tiny bitAvi Kivity
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-07-16KVM: Implement IA32_EBL_CR_POWERON msrMatthew Gregan
Attempting to boot the default 'bsd' kernel of OpenBSD 4.1 i386 in a guest fails early in the kernel init inside p3_get_bus_clock while trying to read the IA32_EBL_CR_POWERON MSR. KVM logs an 'unhandled MSR' message and the guest kernel faults. This patch is sufficient to allow OpenBSD to boot, after which it seems to run fine. I'm not sure if this is the correct solution for dealing with this particular MSR, but it works for me. Signed-off-by: Matthew Gregan <kinetik@flim.org> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-07-16KVM: Set cr0.mp for guestsAvi Kivity
This allows fwait instructions to be trapped when the guest fpu is not loaded. Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-07-16KVM: Consolidate guest fpu activation and deactivationAvi Kivity
Easier to keep track of where the fpu is this way. Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-07-16KVM: Rationalize exception bitmap usageAvi Kivity
Everyone owns a piece of the exception bitmap, but they happily write to the entire thing like there's no tomorrow. Centralize handling in update_exception_bitmap() and have everyone call that. Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-07-16KVM: Move some more msr mangling into vmx_save_host_state()Avi Kivity
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-07-16KVM: Fix potential guest state leak into hostAvi Kivity
The lightweight vmexit path avoids saving and reloading certain host state. However in certain cases lightweight vmexit handling can schedule() which requires reloading the host state. So we store the host state in the vcpu structure, and reloaded it if we relinquish the vcpu. Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-07-16KVM: Increase mmu shadow cache to 1024 pagesAvi Kivity
This improves kbuild times by about 10%, bringing it within a respectable 25% of native. Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-07-16KVM: Update shadow pte on write to guest pteAvi Kivity
A typical demand page/copy on write pattern is: - page fault on vaddr - kvm propagates fault to guest - guest handles fault, updates pte - kvm traps write, clears shadow pte, resumes guest - guest returns to userspace, re-faults on same vaddr - kvm installs shadow pte, resumes guest - guest continues So, three vmexits for a single guest page fault. But if instead of clearing the page table entry, we update to correspond to the value that the guest has just written, we eliminate the third vmexit. This patch does exactly that, reducing kbuild time by about 10%. Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-07-16KVM: MMU: Respect nonpae pagetable quadrant when zapping ptesAvi Kivity
When a guest writes to a page that has an mmu shadow, we have to clear the shadow pte corresponding to the memory location touched by the guest. Now, in nonpae mode, a single guest page may have two or four shadow pages (because a nonpae page maps 4MB or 4GB, whereas the pae shadow maps 2MB or 1GB), so we when we look up the page we find up to three additional aliases for the page. Since we _clear_ the shadow pte, it doesn't matter except for a slight performance penalty, but if we want to _update_ the shadow pte instead of clearing it, it is vital that we don't modify the aliases. Fortunately, exactly which page is needed (the "quadrant") is easily computed, and is accessible in the shadow page header. All we need is to ignore shadow pages from the wrong quadrants. Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-07-16KVM: Unify kvm_mmu_pre_write() and kvm_mmu_post_write()Avi Kivity
Instead of calling two functions and repeating expensive checks, call one function and provide it with before/after information. Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-07-16KVM: Be more careful restoring fs on lightweight vmexitAvi Kivity
i386 wants fs for accessing the pda even on a lightweight exit, so ensure we can always restore it. This fixes a regression on i386 introduced by the lightweight vmexit patch. Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-07-16KVM: Reduce misfirings of the fork detectorAvi Kivity
The kvm mmu tries to detects forks by looking for repeated writes to a page table. If it sees a fork, it unshadows the page table so the page table copying can proceed at native speed instead of being emulated. However, the detector also triggered on simple demand paging access patterns: a linear walk of memory would of course cause repeated writes to the same pagetable page, causing it to unshadow prematurely. Fix by resetting the fork detector if we detect a demand fault. Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-07-16KVM: Unindent some codeAvi Kivity
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-07-16KVM: Avoid saving and restoring some host CPU state on lightweight vmexitAvi Kivity
Many msrs and the like will only be used by the host if we schedule() or return to userspace. Therefore, we avoid saving them if we handle the exit within the kernel, and if a reschedule is not requested. Based on a patch from Eddie Dong <eddie.dong@intel.com> with a couple of fixes by me. Signed-off-by: Yaozu(Eddie) Dong <eddie.dong@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-07-16KVM: Assume that writes smaller than 4 bytes are to non-pagetable pagesAvi Kivity
This allows us to remove write protection earlier than otherwise. Should some mad OS choose to use byte writes to update pagetables, it will suffer a performance hit, but still work correctly. Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-07-16KVM: SVM: Allow direct guest access to PC debug portAnthony Liguori
The PC debug port is used for IO delay and does not require emulation. Signed-off-by: Anthony Liguori <aliguori@us.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-07-16KVM: VMX: Enable io bitmaps to avoid IO port 0x80 VMEXITsHe, Qing
This patch enables IO bitmaps control on vmx and unmask the 0x80 port to avoid VMEXITs caused by accessing port 0x80. 0x80 is used as delays (see include/asm/io.h), and handling VMEXITs on its access is unnecessary but slows things down. This patch improves kernel build test at around 3%~5%. Because every VM uses the same io bitmap, it is shared between all VMs rather than a per-VM data structure. Signed-off-by: Qing He <qing.he@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-06-15KVM: Prevent guest fpu state from leaking into the hostAvi Kivity
The lazy fpu changes did not take into account that some vmexit handlers can sleep. Move loading the guest state into the inner loop so that it can be reloaded if necessary, and move loading the host state into vmx_vcpu_put() so it can be performed whenever we relinquish the vcpu. Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-06-01kvm: fix section mismatch warning in kvm-intel.oSam Ravnborg
Fix following section mismatch warning in kvm-intel.o: WARNING: o-i386/drivers/kvm/kvm-intel.o(.init.text+0xbd): Section mismatch: reference to .exit.text: (between 'hardware_setup' and 'vmx_disabled_by_bios') The function free_kvm_area is used in the function alloc_kvm_area which is marked __init. The __exit area is discarded by some archs during link-time if a module is built-in resulting in an oops. Note: This warning is only seen by my local copy of modpost but the change will soon hit upstream. Signed-off-by: Sam Ravnborg <sam@ravnborg.org> Cc: Avi Kivity <avi@qumranet.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-21Detach sched.h from mm.hAlexey Dobriyan
First thing mm.h does is including sched.h solely for can_do_mlock() inline function which has "current" dereference inside. By dealing with can_do_mlock() mm.h can be detached from sched.h which is good. See below, why. This patch a) removes unconditional inclusion of sched.h from mm.h b) makes can_do_mlock() normal function in mm/mlock.c c) exports can_do_mlock() to not break compilation d) adds sched.h inclusions back to files that were getting it indirectly. e) adds less bloated headers to some files (asm/signal.h, jiffies.h) that were getting them indirectly Net result is: a) mm.h users would get less code to open, read, preprocess, parse, ... if they don't need sched.h b) sched.h stops being dependency for significant number of files: on x86_64 allmodconfig touching sched.h results in recompile of 4083 files, after patch it's only 3744 (-8.3%). Cross-compile tested on all arm defconfigs, all mips defconfigs, all powerpc defconfigs, alpha alpha-up arm i386 i386-up i386-defconfig i386-allnoconfig ia64 ia64-up m68k mips parisc parisc-up powerpc powerpc-up s390 s390-up sparc sparc-up sparc64 sparc64-up um-x86_64 x86_64 x86_64-up x86_64-defconfig x86_64-allnoconfig as well as my two usual configs. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-10[S390] Kconfig: refine depends statements.Martin Schwidefsky
Refine some depends statements to limit their visibility to the environments that are actually supported. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2007-05-09Add suspend-related notifications for CPU hotplugRafael J. Wysocki
Since nonboot CPUs are now disabled after tasks and devices have been frozen and the CPU hotplug infrastructure is used for this purpose, we need special CPU hotplug notifications that will help the CPU-hotplug-aware subsystems distinguish normal CPU hotplug events from CPU hotplug events related to a system-wide suspend or resume operation in progress. This patch introduces such notifications and causes them to be used during suspend and resume transitions. It also changes all of the CPU-hotplug-aware subsystems to take these notifications into consideration (for now they are handled in the same way as the corresponding "normal" ones). [oleg@tv-sign.ru: cleanups] Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Cc: Gautham R Shenoy <ego@in.ibm.com> Cc: Pavel Machek <pavel@ucw.cz> Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-03KVM: Remove unused 'instruction_length'Avi Kivity
As we no longer emulate in userspace, this is meaningless. We don't compute it on SVM anyway. Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03KVM: Don't require explicit indication of completion of mmio or pioAvi Kivity
It is illegal not to return from a pio or mmio request without completing it, as mmio or pio is an atomic operation. Therefore, we can simplify the userspace interface by avoiding the completion indication. Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03KVM: Remove extraneous guest entry on mmio readAvi Kivity
When emulating an mmio read, we actually emulate twice: once to determine the physical address of the mmio, and, after we've exited to userspace to get the mmio value, we emulate again to place the value in the result register and update any flags. But we don't really need to enter the guest again for that, only to take an immediate vmexit. So, if we detect that we're doing an mmio read, emulate a single instruction before entering the guest again. Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03KVM: SVM: Only save/restore MSRs when neededAnthony Liguori
We only have to save/restore MSR_GS_BASE on every VMEXIT. The rest can be saved/restored when we leave the VCPU. Since we don't emulate the DEBUGCTL MSRs and the guest cannot write to them, we don't have to worry about saving/restoring them at all. This shaves a whopping 40% off raw vmexit costs on AMD. Signed-off-by: Anthony Liguori <aliguori@us.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03KVM: fix an if() conditionAdrian Bunk
It might have worked in this case since PT_PRESENT_MASK is 1, but let's express this correctly. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03KVM: VMX: Add lazy FPU support for VTAnthony Liguori
Only save/restore the FPU host state when the guest is actually using the FPU. Signed-off-by: Anthony Liguori <aliguori@us.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03KVM: VMX: Properly shadow the CR0 register in the vcpu structAnthony Liguori
Set all of the host mask bits for CR0 so that we can maintain a proper shadow of CR0. This exposes CR0.TS, paving the way for lazy fpu handling. Signed-off-by: Anthony Liguori <aliguori@us.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03KVM: Don't complain about cpu erratum AA15Avi Kivity
It slows down Windows x64 horribly. Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03KVM: Lazy FPU support for SVMAnthony Liguori
Avoid saving and restoring the guest fpu state on every exit. This shaves ~100 cycles off the guest/host switch. Signed-off-by: Anthony Liguori <aliguori@us.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03KVM: Allow passing 64-bit values to the emulated read/write APIAvi Kivity
This simplifies the API somewhat (by eliminating the special-case cmpxchg8b on i386). Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03KVM: Per-vcpu statisticsAvi Kivity
Make the exit statistics per-vcpu instead of global. This gives a 3.5% boost when running one virtual machine per core on my two socket dual core (4 cores total) machine. Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03KVM: VMX: Avoid unnecessary vcpu_load()/vcpu_put() cyclesYaozu Dong
By checking if a reschedule is needed, we avoid dropping the vcpu. [With changes by me, based on Anthony Liguori's observations] Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03KVM: MMU: Avoid heavy ASSERT at non debug mode.Yaozu Dong
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03KVM: VMX: Only save/restore MSR_K6_STAR if necessaryAvi Kivity
Intel hosts only support syscall/sysret in long more (and only if efer.sce is enabled), so only reload the related MSR_K6_STAR if the guest will actually be able to use it. This reduces vmexit cost by about 500 cycles (6400 -> 5870) on my setup. Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03KVM: Fold drivers/kvm/kvm_vmx.h into drivers/kvm/vmx.cAvi Kivity
No meat in that file. Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03KVM: VMX: Don't switch 64-bit msrs for 32-bit guestsAvi Kivity
Some msrs are only used by x86_64 instructions, and are therefore not needed when the guest is legacy mode. By not bothering to switch them, we reduce vmexit latency by 2400 cycles (from about 8800) when running a 32-bt guest on a 64-bit host. Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03KVM: VMX: Reduce unnecessary saving of host msrsAvi Kivity
THe automatically switched msrs are never changed on the host (with the exception of MSR_KERNEL_GS_BASE) and thus there is no need to save them on every vm entry. This reduces vmexit latency by ~400 cycles on i386 and by ~900 cycles (10%) on x86_64. Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03KVM: Handle guest page faults when emulating mmioAvi Kivity
Usually, guest page faults are detected by the kvm page fault handler, which detects if they are shadow faults, mmio faults, pagetable faults, or normal guest page faults. However, in ceratin circumstances, we can detect a page fault much later. One of these events is the following combination: - A two memory operand instruction (e.g. movsb) is executed. - The first operand is in mmio space (which is the fault reported to kvm) - The second operand is in an ummaped address (e.g. a guest page fault) The Windows 2000 installer does such an access, an promptly hangs. Fix by adding the missing page fault injection on that path. Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03KVM: SVM: Report hardware exit reason to userspace instead of dmesgAvi Kivity
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03KVM: Retry sleeping allocation if atomic allocation failsAvi Kivity
This avoids -ENOMEM under memory pressure. Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03KVM: Use slab caches to allocate mmu data structuresAvi Kivity
Better leak detection, statistics, memory use, speed -- goodness all around. Signed-off-by: Avi Kivity <avi@qumranet.com>