aboutsummaryrefslogtreecommitdiff
path: root/arch/x86/include
AgeCommit message (Collapse)Author
2009-12-08Merge branch 'timers-for-linus-hpet' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'timers-for-linus-hpet' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86: hpet: Make WARN_ON understandable x86: arch specific support for remapping HPET MSIs intr-remap: generic support for remapping HPET MSIs x86, hpet: Simplify the HPET code x86, hpet: Disable per-cpu hpet timer if ARAT is supported
2009-12-08Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: hwrng: core - Prevent too-small buffer sizes hwrng: virtio-rng - Convert to new API hwrng: core - Replace u32 in driver API with byte array crypto: ansi_cprng - Move FIPS functions under CONFIG_CRYPTO_FIPS crypto: testmgr - Add ghash algorithm test before provide to users crypto: ghash-clmulni-intel - Put proper .data section in place crypto: ghash-clmulni-intel - Use gas macro for PCLMULQDQ-NI and PSHUFB crypto: aesni-intel - Use gas macro for AES-NI instructions x86: Generate .byte code for some new instructions via gas macro crypto: ghash-intel - Fix irq_fpu_usable usage crypto: ghash-intel - Add PSHUFB macros crypto: ghash-intel - Hard-code pshufb crypto: ghash-intel - Fix building failure on x86_32 crypto: testmgr - Fix warning crypto: ansi_cprng - Fix test in get_prng_bytes crypto: hash - Remove cra_u.{digest,hash} crypto: api - Remove digest case from procfs show handler crypto: hash - Remove legacy hash/digest code crypto: ansi_cprng - Add FIPS wrapper crypto: ghash - Add PCLMULQDQ accelerated implementation
2009-12-08Merge branch 'x86-uv-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-uv-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86: UV RTC: Always enable RTC clocksource x86: UV RTC: Rename generic_interrupt to x86_platform_ipi x86: UV RTC: Clean up error handling x86: UV RTC: Add clocksource only boot option x86: UV RTC: Fix early expiry handling
2009-12-08Merge branch 'x86-process-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-process-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86-64: merge the standard and compat start_thread() functions x86-64: make compat_start_thread() match start_thread()
2009-12-08Merge branch 'x86-mm-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (36 commits) x86, mm: Correct the implementation of is_untracked_pat_range() x86/pat: Trivial: don't create debugfs for memtype if pat is disabled x86, mtrr: Fix sorting of mtrr after subtracting x86: Move find_smp_config() earlier and avoid bootmem usage x86, platform: Change is_untracked_pat_range() to bool; cleanup init x86: Change is_ISA_range() into an inline function x86, mm: is_untracked_pat_range() takes a normal semiclosed range x86, mm: Call is_untracked_pat_range() rather than is_ISA_range() x86: UV SGI: Don't track GRU space in PAT x86: SGI UV: Fix BAU initialization x86, numa: Use near(er) online node instead of roundrobin for NUMA x86, numa, bootmem: Only free bootmem on NUMA failure path x86: Change crash kernel to reserve via reserve_early() x86: Eliminate redundant/contradicting cache line size config options x86: When cleaning MTRRs, do not fold WP into UC x86: remove "extern" from function prototypes in <asm/proto.h> x86, mm: Report state of NX protections during boot x86, mm: Clean up and simplify NX enablement x86, pageattr: Make set_memory_(x|nx) aware of NX support x86, sleep: Always save the value of EFER ... Fix up conflicts (added both iommu_shutdown and is_untracked_pat_range) to 'struct x86_platform_ops') in arch/x86/include/asm/x86_init.h arch/x86/kernel/x86_init.c
2009-12-08Merge branch 'x86-microcode-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-microcode-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86: ucode-amd: Move family check to microcde_amd.c's init function x86, ucode-amd: Ensure ucode update on suspend/resume after CPU off/online cycle x86: ucode-amd: Convert printk(KERN_*...) to pr_*(...) x86: ucode-amd: Don't warn when no ucode is available for a CPU revision x86: ucode-amd: Load ucode-patches once and not separately of each CPU x86, amd-ucode: Remove needless log messages
2009-12-08Merge branch 'for-2.6.33' of git://git.kernel.dk/linux-2.6-blockLinus Torvalds
* 'for-2.6.33' of git://git.kernel.dk/linux-2.6-block: (113 commits) cfq-iosched: Do not access cfqq after freeing it block: include linux/err.h to use ERR_PTR cfq-iosched: use call_rcu() instead of doing grace period stall on queue exit blkio: Allow CFQ group IO scheduling even when CFQ is a module blkio: Implement dynamic io controlling policy registration blkio: Export some symbols from blkio as its user CFQ can be a module block: Fix io_context leak after failure of clone with CLONE_IO block: Fix io_context leak after clone with CLONE_IO cfq-iosched: make nonrot check logic consistent io controller: quick fix for blk-cgroup and modular CFQ cfq-iosched: move IO controller declerations to a header file cfq-iosched: fix compile problem with !CONFIG_CGROUP blkio: Documentation blkio: Wait on sync-noidle queue even if rq_noidle = 1 blkio: Implement group_isolation tunable blkio: Determine async workload length based on total number of queues blkio: Wait for cfq queue to get backlogged if group is empty blkio: Propagate cgroup weight updation to cfq groups blkio: Drop the reference to queue once the task changes cgroup blkio: Provide some isolation between groups ...
2009-12-08Merge branch 'kvm-updates/2.6.33' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
* 'kvm-updates/2.6.33' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (84 commits) KVM: VMX: Fix comparison of guest efer with stale host value KVM: s390: Fix prefix register checking in arch/s390/kvm/sigp.c KVM: Drop user return notifier when disabling virtualization on a cpu KVM: VMX: Disable unrestricted guest when EPT disabled KVM: x86 emulator: limit instructions to 15 bytes KVM: s390: Make psw available on all exits, not just a subset KVM: x86: Add KVM_GET/SET_VCPU_EVENTS KVM: VMX: Report unexpected simultaneous exceptions as internal errors KVM: Allow internal errors reported to userspace to carry extra data KVM: Reorder IOCTLs in main kvm.h KVM: x86: Polish exception injection via KVM_SET_GUEST_DEBUG KVM: only clear irq_source_id if irqchip is present KVM: x86: disallow KVM_{SET,GET}_LAPIC without allocated in-kernel lapic KVM: x86: disallow multiple KVM_CREATE_IRQCHIP KVM: VMX: Remove vmx->msr_offset_efer KVM: MMU: update invlpg handler comment KVM: VMX: move CR3/PDPTR update to vmx_set_cr3 KVM: remove duplicated task_switch check KVM: powerpc: Fix BUILD_BUG_ON condition KVM: VMX: Use shared msr infrastructure ... Trivial conflicts due to new Kconfig options in arch/Kconfig and kernel/Makefile
2009-12-08Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1815 commits) mac80211: fix reorder buffer release iwmc3200wifi: Enable wimax core through module parameter iwmc3200wifi: Add wifi-wimax coexistence mode as a module parameter iwmc3200wifi: Coex table command does not expect a response iwmc3200wifi: Update wiwi priority table iwlwifi: driver version track kernel version iwlwifi: indicate uCode type when fail dump error/event log iwl3945: remove duplicated event logging code b43: fix two warnings ipw2100: fix rebooting hang with driver loaded cfg80211: indent regulatory messages with spaces iwmc3200wifi: fix NULL pointer dereference in pmkid update mac80211: Fix TX status reporting for injected data frames ath9k: enable 2GHz band only if the device supports it airo: Fix integer overflow warning rt2x00: Fix padding bug on L2PAD devices. WE: Fix set events not propagated b43legacy: avoid PPC fault during resume b43: avoid PPC fault during resume tcp: fix a timewait refcnt race ... Fix up conflicts due to sysctl cleanups (dead sysctl_check code and CTL_UNNUMBERED removed) in kernel/sysctl_check.c net/ipv4/sysctl_net_ipv4.c net/ipv6/addrconf.c net/sctp/sysctl.c
2009-12-08Merge git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/sysctl-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/sysctl-2.6: (43 commits) security/tomoyo: Remove now unnecessary handling of security_sysctl. security/tomoyo: Add a special case to handle accesses through the internal proc mount. sysctl: Drop & in front of every proc_handler. sysctl: Remove CTL_NONE and CTL_UNNUMBERED sysctl: kill dead ctl_handler definitions. sysctl: Remove the last of the generic binary sysctl support sysctl net: Remove unused binary sysctl code sysctl security/tomoyo: Don't look at ctl_name sysctl arm: Remove binary sysctl support sysctl x86: Remove dead binary sysctl support sysctl sh: Remove dead binary sysctl support sysctl powerpc: Remove dead binary sysctl support sysctl ia64: Remove dead binary sysctl support sysctl s390: Remove dead sysctl binary support sysctl frv: Remove dead binary sysctl support sysctl mips/lasat: Remove dead binary sysctl support sysctl drivers: Remove dead binary sysctl support sysctl crypto: Remove dead binary sysctl support sysctl security/keys: Remove dead binary sysctl support sysctl kernel: Remove binary sysctl logic ...
2009-12-05Merge branch 'x86-cpu-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, msr, cpumask: Use struct cpumask rather than the deprecated cpumask_t x86, cpuid: Simplify the code in cpuid_open x86, cpuid: Remove the bkl from cpuid_open() x86, msr: Remove the bkl from msr_open() x86: AMD Geode LX optimizations x86, msr: Unify rdmsr_on_cpus/wrmsr_on_cpus
2009-12-05Merge branch 'x86-asm-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: include/linux/compiler-gcc4.h: Fix build bug - gcc-4.0.2 doesn't understand __builtin_object_size x86/alternatives: No need for alternatives-asm.h to re-invent stuff already in asm.h x86/alternatives: Check replacementlen <= instrlen at build time x86, 64-bit: Set data segments to null after switching to 64-bit mode x86: Clean up the loadsegment() macro x86: Optimize loadsegment() x86: Add missing might_fault() checks to copy_{to,from}_user() x86-64: __copy_from_user_inatomic() adjustments x86: Remove unused thread_return label from switch_to() x86, 64-bit: Fix bstep_iret jump x86: Don't use the strict copy checks when branch profiling is in use x86, 64-bit: Move K8 B step iret fixup to fault entry asm x86: Generate cmpxchg build failures x86: Add a Kconfig option to turn the copy_from_user warnings into errors x86: Turn the copy_from_user check into an (optional) compile time warning x86: Use __builtin_memset and __builtin_memcpy for memset/memcpy x86: Use __builtin_object_size() to validate the buffer size for copy_from_user()
2009-12-05Merge branch 'x86-apic-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (30 commits) x86, apic: Enable lapic nmi watchdog on AMD Family 11h x86: Remove unnecessary mdelay() from cpu_disable_common() x86, ioapic: Document another case when level irq is seen as an edge x86, ioapic: Fix the EOI register detection mechanism x86, io-apic: Move the effort of clearing remoteIRR explicitly before migrating the irq x86: SGI UV: Map low MMR ranges x86: apic: Print out SRAT table APIC id in hex x86: Re-get cfg_new in case reuse/move irq_desc x86: apic: Remove not needed #ifdef x86: io-apic: IO-APIC MMIO should not fail on resource insertion x86: Remove asm/apicnum.h x86: apic: Do not use stacked physid_mask_t x86, apic: Get rid of apicid_to_cpu_present assign on 64-bit x86, ioapic: Use snrpintf while set names for IO-APIC resourses x86, apic: Use PAGE_SIZE instead of numbers x86: Remove local_irq_enable()/local_irq_disable() in fixup_irqs() x86: Use EOI register in io-apic on intel platforms x86: Force irq complete move during cpu offline x86: Remove move_cleanup_count from irq_cfg x86, intr-remap: Avoid irq_chip mask/unmask in fixup_irqs() for intr-remapping ...
2009-12-05Merge branch 'perf-core-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (470 commits) x86: Fix comments of register/stack access functions perf tools: Replace %m with %a in sscanf hw-breakpoints: Keep track of user disabled breakpoints tracing/syscalls: Make syscall events print callbacks static tracing: Add DEFINE_EVENT(), DEFINE_SINGLE_EVENT() support to docbook perf: Don't free perf_mmap_data until work has been done perf_event: Fix compile error perf tools: Fix _GNU_SOURCE macro related strndup() build error trace_syscalls: Remove unused syscall_name_to_nr() trace_syscalls: Simplify syscall profile trace_syscalls: Remove duplicate init_enter_##sname() trace_syscalls: Add syscall_nr field to struct syscall_metadata trace_syscalls: Remove enter_id exit_id trace_syscalls: Set event_enter_##sname->data to its metadata trace_syscalls: Remove unused event_syscall_enter and event_syscall_exit perf_event: Initialize data.period in perf_swevent_hrtimer() perf probe: Simplify event naming perf probe: Add --list option for listing current probe events perf probe: Add argv_split() from lib/argv_split.c perf probe: Move probe event utility functions to probe-event.c ...
2009-12-05Merge branch 'master' of /home/davem/src/GIT/linux-2.6/David S. Miller
Conflicts: drivers/net/pcmcia/fmvj18x_cs.c drivers/net/pcmcia/nmclan_cs.c drivers/net/pcmcia/xirc2ps_cs.c drivers/net/wireless/ray_cs.c
2009-12-05Merge branch 'core-iommu-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core-iommu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (63 commits) x86, Calgary IOMMU quirk: Find nearest matching Calgary while walking up the PCI tree x86/amd-iommu: Remove amd_iommu_pd_table x86/amd-iommu: Move reset_iommu_command_buffer out of locked code x86/amd-iommu: Cleanup DTE flushing code x86/amd-iommu: Introduce iommu_flush_device() function x86/amd-iommu: Cleanup attach/detach_device code x86/amd-iommu: Keep devices per domain in a list x86/amd-iommu: Add device bind reference counting x86/amd-iommu: Use dev->arch->iommu to store iommu related information x86/amd-iommu: Remove support for domain sharing x86/amd-iommu: Rearrange dma_ops related functions x86/amd-iommu: Move some pte allocation functions in the right section x86/amd-iommu: Remove iommu parameter from dma_ops_domain_alloc x86/amd-iommu: Use get_device_id and check_device where appropriate x86/amd-iommu: Move find_protection_domain to helper functions x86/amd-iommu: Simplify get_device_resources() x86/amd-iommu: Let domain_for_device handle aliases x86/amd-iommu: Remove iommu specific handling from dma_ops path x86/amd-iommu: Remove iommu parameter from __(un)map_single x86/amd-iommu: Make alloc_new_range aware of multiple IOMMUs ...
2009-12-05x86: Convert BUG() to use unreachable()David Daney
Use the new unreachable() macro instead of for(;;);. When allyesconfig is built with a GCC-4.5 snapshot on i686 the size of the text segment is reduced by 3987 bytes (from 6827019 to 6823032). Signed-off-by: David Daney <ddaney@caviumnetworks.com> Acked-by: "H. Peter Anvin" <hpa@zytor.com> CC: Thomas Gleixner <tglx@linutronix.de> CC: Ingo Molnar <mingo@redhat.com> CC: x86@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-03Merge branch 'perf/mce' into perf/coreIngo Molnar
Merge reason: It's ready for v2.6.33. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-03Merge branch 'master' into for-2.6.33Jens Axboe
2009-12-03KVM: VMX: Fix comparison of guest efer with stale host valueAvi Kivity
update_transition_efer() masks out some efer bits when deciding whether to switch the msr during guest entry; for example, NX is emulated using the mmu so we don't need to disable it, and LMA/LME are handled by the hardware. However, with shared msrs, the comparison is made against a stale value; at the time of the guest switch we may be running with another guest's efer. Fix by deferring the mask/compare to the actual point of guest entry. Noted by Marcelo. Signed-off-by: Avi Kivity <avi@redhat.com>
2009-12-03KVM: x86 emulator: limit instructions to 15 bytesAvi Kivity
While we are never normally passed an instruction that exceeds 15 bytes, smp games can cause us to attempt to interpret one, which will cause large latencies in non-preempt hosts. Cc: stable@kernel.org Signed-off-by: Avi Kivity <avi@redhat.com>
2009-12-03KVM: x86: Add KVM_GET/SET_VCPU_EVENTSJan Kiszka
This new IOCTL exports all yet user-invisible states related to exceptions, interrupts, and NMIs. Together with appropriate user space changes, this fixes sporadic problems of vmsave/restore, live migration and system reset. [avi: future-proof abi by adding a flags field] Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-12-03KVM: x86 shared msr infrastructureAvi Kivity
The various syscall-related MSRs are fairly expensive to switch. Currently we switch them on every vcpu preemption, which is far too often: - if we're switching to a kernel thread (idle task, threaded interrupt, kernel-mode virtio server (vhost-net), for example) and back, then there's no need to switch those MSRs since kernel threasd won't be exiting to userspace. - if we're switching to another guest running an identical OS, most likely those MSRs will have the same value, so there's little point in reloading them. - if we're running the same OS on the guest and host, the MSRs will have identical values and reloading is unnecessary. This patch uses the new user return notifiers to implement last-minute switching, and checks the msr values to avoid unnecessary reloading. Signed-off-by: Avi Kivity <avi@redhat.com>
2009-12-03KVM: allow userspace to adjust kvmclock offsetGlauber Costa
When we migrate a kvm guest that uses pvclock between two hosts, we may suffer a large skew. This is because there can be significant differences between the monotonic clock of the hosts involved. When a new host with a much larger monotonic time starts running the guest, the view of time will be significantly impacted. Situation is much worse when we do the opposite, and migrate to a host with a smaller monotonic clock. This proposed ioctl will allow userspace to inform us what is the monotonic clock value in the source host, so we can keep the time skew short, and more importantly, never goes backwards. Userspace may also need to trigger the current data, since from the first migration onwards, it won't be reflected by a simple call to clock_gettime() anymore. [marcelo: future-proof abi with a flags field] [jan: fix KVM_GET_CLOCK by clearing flags field instead of checking it] Signed-off-by: Glauber Costa <glommer@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-12-03KVM: SVM: Cleanup NMI singlestepJan Kiszka
Push the NMI-related singlestep variable into vcpu_svm. It's dealing with an AMD-specific deficit, nothing generic for x86. Acked-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> arch/x86/include/asm/kvm_host.h | 1 - arch/x86/kvm/svm.c | 12 +++++++----- 2 files changed, 7 insertions(+), 6 deletions(-) Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2009-12-03KVM: x86: Fix guest single-stepping while interruptibleJan Kiszka
Commit 705c5323 opened the doors of hell by unconditionally injecting single-step flags as long as guest_debug signaled this. This doesn't work when the guest branches into some interrupt or exception handler and triggers a vmexit with flag reloading. Fix it by saving cs:rip when user space requests single-stepping and restricting the trace flag injection to this guest code position. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2009-12-03KVM: Xen PV-on-HVM guest supportEd Swierk
Support for Xen PV-on-HVM guests can be implemented almost entirely in userspace, except for handling one annoying MSR that maps a Xen hypercall blob into guest address space. A generic mechanism to delegate MSR writes to userspace seems overkill and risks encouraging similar MSR abuse in the future. Thus this patch adds special support for the Xen HVM MSR. I implemented a new ioctl, KVM_XEN_HVM_CONFIG, that lets userspace tell KVM which MSR the guest will write to, as well as the starting address and size of the hypercall blobs (one each for 32-bit and 64-bit) that userspace has loaded from files. When the guest writes to the MSR, KVM copies one page of the blob from userspace to the guest. I've tested this patch with a hacked-up version of Gerd's userspace code, booting a number of guests (CentOS 5.3 i386 and x86_64, and FreeBSD 8.0-RC1 amd64) and exercising PV network and block devices. [jan: fix i386 build warning] [avi: future proof abi with a flags field] Signed-off-by: Ed Swierk <eswierk@aristanetworks.com> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-12-03KVM: SVM: Support Pause Filter in AMD processorsMark Langsdorf
New AMD processors (Family 0x10 models 8+) support the Pause Filter Feature. This feature creates a new field in the VMCB called Pause Filter Count. If Pause Filter Count is greater than 0 and intercepting PAUSEs is enabled, the processor will increment an internal counter when a PAUSE instruction occurs instead of intercepting. When the internal counter reaches the Pause Filter Count value, a PAUSE intercept will occur. This feature can be used to detect contended spinlocks, especially when the lock holding VCPU is not scheduled. Rescheduling another VCPU prevents the VCPU seeking the lock from wasting its quantum by spinning idly. Experimental results show that most spinlocks are held for less than 1000 PAUSE cycles or more than a few thousand. Default the Pause Filter Counter to 3000 to detect the contended spinlocks. Processor support for this feature is indicated by a CPUID bit. On a 24 core system running 4 guests each with 16 VCPUs, this patch improved overall performance of each guest's 32 job kernbench by approximately 3-5% when combined with a scheduler algorithm thati caused the VCPU to sleep for a brief period. Further performance improvement may be possible with a more sophisticated yield algorithm. Signed-off-by: Mark Langsdorf <mark.langsdorf@amd.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2009-12-03KVM: VMX: Add support for Pause-Loop ExitingZhai, Edwin
New NHM processors will support Pause-Loop Exiting by adding 2 VM-execution control fields: PLE_Gap - upper bound on the amount of time between two successive executions of PAUSE in a loop. PLE_Window - upper bound on the amount of time a guest is allowed to execute in a PAUSE loop If the time, between this execution of PAUSE and previous one, exceeds the PLE_Gap, processor consider this PAUSE belongs to a new loop. Otherwise, processor determins the the total execution time of this loop(since 1st PAUSE in this loop), and triggers a VM exit if total time exceeds the PLE_Window. * Refer SDM volume 3b section 21.6.13 & 22.1.3. Pause-Loop Exiting can be used to detect Lock-Holder Preemption, where one VP is sched-out after hold a spinlock, then other VPs for same lock are sched-in to waste the CPU time. Our tests indicate that most spinlocks are held for less than 212 cycles. Performance tests show that with 2X LP over-commitment we can get +2% perf improvement for kernel build(Even more perf gain with more LPs). Signed-off-by: Zhai Edwin <edwin.zhai@intel.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2009-12-03KVM: x86: Rework guest single-step flag injection and filteringJan Kiszka
Push TF and RF injection and filtering on guest single-stepping into the vender get/set_rflags callbacks. This makes the whole mechanism more robust wrt user space IOCTL order and instruction emulations. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-12-03KVM: x86: Refactor guest debug IOCTL handlingJan Kiszka
Much of so far vendor-specific code for setting up guest debug can actually be handled by the generic code. This also fixes a minor deficit in the SVM part /wrt processing KVM_GUESTDBG_ENABLE. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-12-03KVM: Activate Virtualization On DemandAlexander Graf
X86 CPUs need to have some magic happening to enable the virtualization extensions on them. This magic can result in unpleasant results for users, like blocking other VMMs from working (vmx) or using invalid TLB entries (svm). Currently KVM activates virtualization when the respective kernel module is loaded. This blocks us from autoloading KVM modules without breaking other VMMs. To circumvent this problem at least a bit, this patch introduces on demand activation of virtualization. This means, that instead virtualization is enabled on creation of the first virtual machine and disabled on destruction of the last one. So using this, KVM can be easily autoloaded, while keeping other hypervisors usable. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-12-03KVM: Move irq ack notifier list to arch independent codeGleb Natapov
Mask irq notifier list is already there. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-12-03KVM: Maintain back mapping from irqchip/pin to gsiGleb Natapov
Maintain back mapping from irqchip/pin to gsi to speedup interrupt acknowledgment notifications. [avi: build fix on non-x86/ia64] Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-12-03KVM: Move irq sharing information to irqchip levelGleb Natapov
This removes assumptions that max GSIs is smaller than number of pins. Sharing is tracked on pin level not GSI level. [avi: no PIC on ia64] Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-12-03KVM: Don't pass kvm_run argumentsAvi Kivity
They're just copies of vcpu->run, which is readily accessible. Signed-off-by: Avi Kivity <avi@redhat.com>
2009-12-03Merge remote branch 'tip/x86/entry' into kvm-updates/2.6.33Avi Kivity
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-12-02x86/alternatives: No need for alternatives-asm.h to re-invent stuff already ↵Jan Beulich
in asm.h This at once also gets the alignment specification right for x86-64. Signed-off-by: Jan Beulich <jbeulich@novell.com> LKML-Reference: <4B0FF8F80200007800022708@vpn.id2.novell.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-02x86/alternatives: Check replacementlen <= instrlen at build timeJan Beulich
Having run into the run-(boot-)time check a couple of times lately, I finally took time to find a build-time check so that one doesn't need to analyze the register/stack dump and resolve this (through manual lookup in vmlinux) to the offending construct. The assembler will emit a message like "Error: value of <num> too large for field of 1 bytes at <offset>", which while not pointing out the source location still makes analysis quite a bit easier. Signed-off-by: Jan Beulich <jbeulich@novell.com> LKML-Reference: <4B0FF8AA0200007800022703@vpn.id2.novell.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-02x86: Fix comments of register/stack access functionsMasami Hiramatsu
Fix typos and some redundant comments of register/stack access functions in asm/ptrace.h. Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Cc: systemtap <systemtap@sources.redhat.com> Cc: DLE <dle-develop@lists.sourceforge.net> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Roland McGrath <roland@redhat.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Wenji Huang <wenji.huang@oracle.com> Cc: Mahesh J Salgaonkar <mahesh@linux.vnet.ibm.com> LKML-Reference: <20091201000222.7669.7477.stgit@harusame> Signed-off-by: Ingo Molnar <mingo@elte.hu> Suggested-by: Wenji Huang <wenji.huang@oracle.com>
2009-12-01Merge git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6Herbert Xu
2009-11-30x86, mm: Correct the implementation of is_untracked_pat_range()H. Peter Anvin
The semantics the PAT code expect of is_untracked_pat_range() is "is this range completely contained inside the untracked region." This means that checkin 8a27138924f64d2f30c1022f909f74480046bc3f was technically wrong, because the implementation needlessly confusing. The sane interface is for it to take a semiclosed range like just about everything else (as evidenced by the sheer number of "- 1"'s removed by that patch) so change the actual implementation to match. Reported-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Jack Steiner <steiner@sgi.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com> LKML-Reference: <20091119202341.GA4420@sgi.com>
2009-11-27x86/amd-iommu: Remove amd_iommu_pd_tableJoerg Roedel
The data that was stored in this table is now available in dev->archdata.iommu. So this table is not longer necessary. This patch removes the remaining uses of that variable and removes it from the code. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2009-11-27x86/amd-iommu: Cleanup DTE flushing codeJoerg Roedel
This patch cleans up the code to flush device table entries in the IOMMU. With this chance the driver can get rid of the iommu_queue_inv_dev_entry() function. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2009-11-27x86/amd-iommu: Keep devices per domain in a listJoerg Roedel
This patch introduces a list to each protection domain which keeps all devices associated with the domain. This can be used later to optimize certain functions and to completly remove the amd_iommu_pd_table. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2009-11-27x86/amd-iommu: Add device bind reference countingJoerg Roedel
This patch adds a reference count to each device to count how often the device was bound to that domain. This is important for single devices that act as an alias for a number of others. These devices must stay bound to their domains until all devices that alias to it are unbound from the same domain. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2009-11-27x86/amd-iommu: Use dev->arch->iommu to store iommu related informationJoerg Roedel
This patch changes IOMMU code to use dev->archdata->iommu to store information about the alias device and the domain the device is attached to. This allows the driver to get rid of the amd_iommu_pd_table in the future. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2009-11-27x86/amd-iommu: Remove support for domain sharingJoerg Roedel
This patch makes device isolation mandatory and removes support for the amd_iommu=share option. This simplifies the code in several places. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2009-11-27x86/amd-iommu: Make np-cache a global flagJoerg Roedel
The non-present cache flag was IOMMU local until now which doesn't make sense. Make this a global flag so we can remove the lase user of 'struct iommu' in the map/unmap path. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2009-11-27x86/amd-iommu: Implement protection domain listJoerg Roedel
This patch adds code to keep a global list of all protection domains. This allows to simplify the resume code. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>