Age | Commit message (Collapse) | Author |
|
One of vcpu_setup responsibilities is to do mmu initialization.
However, in case we fail in kvm_arch_vcpu_reset, before we get the
chance to init mmu. OTOH, vcpu_destroy will attempt to destroy mmu,
triggering a bug. Keeping track of whether or not mmu is initialized
would unnecessarily complicate things. Rather, we just make return,
making sure any needed uninitialization is done before we return, in
case we fail.
Signed-off-by: Glauber Costa <glommer@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
|
|
The vcpu should process pending SIPI message before entering guest mode again.
kvm_arch_vcpu_runnable() returns true if the vcpu is in SIPI state, so
we can't call it here.
Signed-off-by: Gleb Natapov <gleb@qumranet.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
|
|
Signed-off-by: Avi Kivity <avi@redhat.com>
|
|
Noticed by sparse:
arch/x86/kvm/x86.c:3591:5: warning: symbol 'kvm_load_realmode_segment' was not declared. Should it be static?
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
|
|
Convert gfn_to_pfn to use get_user_pages_fast, which can do lockless
pagetable lookups on x86. Kernel compilation on 4-way guest is 3.7%
faster on VMX.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
|
|
kvm_vm_fault is invoked with mmap_sem held in read mode. Since gfn_to_page
will be converted to get_user_pages_fast, which requires this lock NOT
to be held, switch to opencoded get_user_pages.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
|
|
When an IRQ allocation fails, we free up the device structures and
disable the device so that we can unregister the device in the
userspace and not expose it to the guest at all.
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
|
|
Based on a patch by: Kay, Allen M <allen.m.kay@intel.com>
This patch enables PCI device assignment based on VT-d support.
When a device is assigned to the guest, the guest memory is pinned and
the mapping is updated in the VT-d IOMMU.
[Amit: Expose KVM_CAP_IOMMU so we can check if an IOMMU is present
and also control enable/disable from userspace]
Signed-off-by: Kay, Allen M <allen.m.kay@intel.com>
Signed-off-by: Weidong Han <weidong.han@intel.com>
Signed-off-by: Ben-Ami Yassour <benami@il.ibm.com>
Signed-off-by: Amit Shah <amit.shah@qumranet.com>
Acked-by: Mark Gross <mgross@linux.intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
|
|
This patch extends the VT-d driver to support KVM
[Ben: fixed memory pinning]
[avi: move dma_remapping.h as well]
Signed-off-by: Kay, Allen M <allen.m.kay@intel.com>
Signed-off-by: Weidong Han <weidong.han@intel.com>
Signed-off-by: Ben-Ami Yassour <benami@il.ibm.com>
Signed-off-by: Amit Shah <amit.shah@qumranet.com>
Acked-by: Mark Gross <mgross@linux.intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
|
|
For instruction 'and al,imm' we use DstAcc instead of doing
the emulation directly into the instruction's opcode.
Signed-off-by: Guillaume Thouvenin <guillaume.thouvenin@ext.bull.net>
Signed-off-by: Avi Kivity <avi@qumranet.com>
|
|
Add decode entries for these opcodes; execution is already implemented.
Signed-off-by: Guillaume Thouvenin <guillaume.thouvenin@ext.bull.net>
Signed-off-by: Avi Kivity <avi@qumranet.com>
|
|
Add DstAcc operand type. That means that there are 4 bits now for
DstMask.
"In the good old days cpus would have only one register that was able to
fully participate in arithmetic operations, typically called A for
Accumulator. The x86 retains this tradition by having special, shorter
encodings for the A register (like the cmp opcode), and even some
instructions that only operate on A (like mul).
SrcAcc and DstAcc would accommodate these instructions by decoding A
into the corresponding 'struct operand'."
-- Avi Kivity
Signed-off-by: Guillaume Thouvenin <guillaume.thouvenin@ext.bull.net>
Signed-off-by: Avi Kivity <avi@qumranet.com>
|
|
For MSR_IA32_FEATURE_CONTROL is already there.
Signed-off-by: Sheng Yang <sheng.yang@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
|
|
Signed-off-by: Sheng Yang <sheng.yang@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
|
|
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
|
|
jmp r/m64 doesn't require the rex.w prefix to indicate the operand size
is 64 bits. Set the Stack attribute (even though it doesn't involve the
stack, really) to indicate this.
Signed-off-by: Avi Kivity <avi@qumranet.com>
|
|
Commit 1c0f4f5011829dac96347b5f84ba37c2252e1e08 left a useless access
of VM_ENTRY_INTR_INFO_FIELD in vmx_intr_assist behind. Clean this up.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
|
|
And it gets in the way of get_user_pages_fast().
Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
|
|
Since "KVM: x86: do not execute halted vcpus", HLT by vcpu0 before system
reset by the IO thread will hang the guest.
Mark vcpu as runnable in such case.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
|
|
Add call near absolute instruction.
Signed-off-by: Mohammed Gamal <m.gamal005@gmail.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
|
|
Offline or uninitialized vcpu's can be executed if requested to perform
userspace work.
Follow Avi's suggestion to handle halted vcpu's in the main loop,
simplifying kvm_emulate_halt(). Introduce a new vcpu->requests bit to
indicate events that promote state from halted to running.
Also standardize vcpu wake sites.
Signed-off-by: Marcelo Tosatti <mtosatti <at> redhat.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
|
|
The patch adds in/out instructions to the x86 emulator.
The instruction was encountered while running the BIOS while using
the invalid guest state emulation patch.
Signed-off-by: Mohammed Gamal <m.gamal005@gmail.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
|
|
These can help show whether a guest is making progress or not.
Signed-off-by: Avi Kivity <avi@qumranet.com>
|
|
EPT is 4 level by default in 32pae(48 bits), but the addr parameter
of kvm_shadow_walk->entry() only accept unsigned long as virtual
address, which is 32bit in 32pae. This result in SHADOW_PT_INDEX()
overflow when try to fetch level 4 index.
Fix it by extend kvm_shadow_walk->entry() to accept 64bit addr in
parameter.
Signed-off-by: Sheng Yang <sheng.yang@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
|
|
Two ioctl arch functions are added to set vcpu's smp state.
Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
|
|
This adds the std and cld instructions to the emulator.
Encountered while running the BIOS with invalid guest
state emulation enabled.
Signed-off-by: Mohammed Gamal <m.gamal005@gmail.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
|
|
The current help text for CONFIG_S390_GUEST is not very helpful.
Lets add more text.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
|
|
Heiko Carstens pointed out, that its safer to activate working facilities
instead of disabling problematic facilities. The new code uses the host
facility bits and masks it with known good ones.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
|
|
Currently KVM implements MC0-MC4_MISC read support. When booting Linux this
results in KVM warnings in the kernel log when the guest tries to read
MC5_MISC. Fix this warnings with this patch.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
|
|
No memory is protected anyway.
Signed-off-by: Avi Kivity <avi@qumranet.com>
|
|
The accessed bit was accidentally turned on in a random flag word, rather
than, the spte itself, which was lucky, since it used the non-EPT compatible
PT_ACCESSED_MASK.
Fix by turning the bit on in the spte and changing it to use the portable
accessed mask.
Signed-off-by: Avi Kivity <avi@qumranet.com>
|
|
Otherwise, the cpu may allow writes to the tracked pages, and we lose
some display bits or fail to migrate correctly.
Signed-off-by: Avi Kivity <avi@qumranet.com>
|
|
It was generally safe due to slots_lock being held for write, but it wasn't
very nice.
Signed-off-by: Avi Kivity <avi@qumranet.com>
|
|
Now that two-dimensional paging is becoming common, account for tdp page
faults.
Signed-off-by: Avi Kivity <avi@qumranet.com>
|
|
The emulator only supported one instance of mov r, imm instruction
(opcode 0xb8), this adds the rest of these instructions.
Signed-off-by: Mohammed Gamal <m.gamal005@gmail.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
|
|
This is esoteric and only needed to break COW on MAP_SHARED mappings. Since
KVM no longer does these sorts of mappings, breaking COW on them is no longer
necessary.
Signed-off-by: Avi Kivity <avi@qumranet.com>
|
|
There is no reason to share internal memory slots with fork()ed instances.
Signed-off-by: Avi Kivity <avi@qumranet.com>
|
|
Signed-off-by: Avi Kivity <avi@qumranet.com>
|
|
Signed-off-by: Avi Kivity <avi@qumranet.com>
|
|
We currently walk the shadow page tables in two places: direct map (for
real mode and two dimensional paging) and paging mode shadow. Since we
anticipate requiring a third walk (for invlpg), it makes sense to have
a generic facility for shadow walk.
This patch adds such a shadow walker, walks the page tables and calls a
method for every spte encountered. The method can examine the spte,
modify it, or even instantiate it. The walk can be aborted by returning
nonzero from the method.
Signed-off-by: Avi Kivity <avi@qumranet.com>
|
|
In all cases the shadow root level is available in mmu.shadow_root_level,
so there is no need to pass it as a parameter.
Signed-off-by: Avi Kivity <avi@qumranet.com>
|
|
kvm/ia64 uses the virtio drivers to optimize its I/O subsytem.
Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
|
|
The two paths are equivalent except for one argument, which is already
available. Merge the two codepaths.
Signed-off-by: Avi Kivity <avi@qumranet.com>
|
|
It is not specific to the paging mode, so can be made global (and reusable).
Signed-off-by: Avi Kivity <avi@qumranet.com>
|
|
Signed-off-by: Avi Kivity <avi@qumranet.com>
|
|
Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
|
|
Real mode segments to not reference the GDT or LDT; they simply compute
base = selector * 16.
Signed-off-by: Avi Kivity <avi@qumranet.com>
|
|
This is more emulation friendly, if not 100% correct.
Signed-off-by: Avi Kivity <avi@qumranet.com>
|
|
Real mode cs is a data segment, not a code segment.
Signed-off-by: Avi Kivity <avi@qumranet.com>
|
|
Before enabling notify_acked_irq for ia64, leave the related APIs as
nop-op first.
Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
|