aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2007-10-13KVM: Convert vm lock to a mutexShaohua Li
This allows the kvm mmu to perform sleepy operations, such as memory allocation. Signed-off-by: Shaohua Li <shaohua.li@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13KVM: Use the scheduler preemption notifiers to make kvm preemptibleAvi Kivity
Current kvm disables preemption while the new virtualization registers are in use. This of course is not very good for latency sensitive workloads (one use of virtualization is to offload user interface and other latency insensitive stuff to a container, so that it is easier to analyze the remaining workload). This patch re-enables preemption for kvm; preemption is now only disabled when switching the registers in and out, and during the switch to guest mode and back. Contains fixes from Shaohua Li <shaohua.li@intel.com>. Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13KVM: add hypercall nr to kvm_runJeff Dike
Add the hypercall number to kvm_run and initialize it. This changes the ABI, but as this particular ABI was unusable before this no users are affected. Signed-off-by: Jeff Dike <jdike@linux.intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13KVM: VMX: Improve the method of writing vmcs controlYang, Sheng
Put cpu feature detecting part in hardware_setup, and stored the vmcs condition in global variable for further check. [glommer: fix for some i386-only machines not supporting CR8 load/store exiting] Signed-off-by: Sheng Yang <sheng.yang@intel.com> Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13KVM: Dynamically allocate vcpusRusty Russell
This patch converts the vcpus array in "struct kvm" to a pointer array, and changes the "vcpu_create" and "vcpu_setup" hooks into one "vcpu_create" call which does the allocation and initialization of the vcpu (calling back into the kvm_vcpu_init core helper). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13KVM: Remove arch specific components from the general codeGregory Haskins
struct kvm_vcpu has vmx-specific members; remove them to a private structure. Signed-off-by: Gregory Haskins <ghaskins@novell.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13KVM: load_pdptrs() cleanupsRusty Russell
load_pdptrs can be handed an invalid cr3, and it should not oops. This can happen because we injected #gp in set_cr3() after we set vcpu->cr3 to the invalid value, or from kvm_vcpu_ioctl_set_sregs(), or memory configuration changes after the guest did set_cr3(). We should also copy the pdpte array once, before checking and assigning, otherwise an SMP guest can potentially alter the values between the check and the set. Finally one nitpick: ret = 1 should be done as late as possible: this allows GCC to check for unset "ret" should the function change in future. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13KVM: Remove dead code in the cmpxchg instruction emulationAurelien Jarno
The writeback fixes (02c03a326a5df825cc01de426f72e160db2b9538) let some dead code in the cmpxchg instruction emulation. Remove it. Signed-off-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13KVM: VMX: Import some constants of vmcs from IA32 SDMYang, Sheng
This patch mainly imports some constants and rename two exist constants of vmcs according to IA32 SDM. It also adds two constants to indicate Lock bit and Enable bit in MSR_IA32_FEATURE_CONTROL, and replace the hardcode _5_ with these two bits. Signed-off-by: Sheng Yang <sheng.yang@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13KVM: Move gfn_to_page out of kmap/unmap pairsShaohua Li
gfn_to_page might sleep with swap support. Move it out of the kmap calls. Signed-off-by: Shaohua Li <shaohua.li@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13KVM: Hoist kvm_mmu_reload() out of the critical sectionShaohua Li
vmx_cpu_run doesn't handle error correctly and kvm_mmu_reload might sleep with mutex changes, so I move it above. Signed-off-by: Shaohua Li <shaohua.li@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13KVM: Return if the pdptrs are invalid when the guest turns on PAE.Rusty Russell
Don't fall through and turn on PAE in this case. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13KVM: x86 emulator: fix faulty check for two-byte opcodeAvi Kivity
Right now, the bug is harmless as we never emulate one-byte 0xb6 or 0xb7. But things may change. Noted by the mysterious Gabriel C. Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13KVM: x86 emulator: fix cmov for writeback changesAvi Kivity
The writeback fixes (02c03a326a5df825cc01de426f72e160db2b9538) broke cmov emulation. Fix. Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13KVM: Use standard CR8 flags, and fix TPR definitionRusty Russell
Intel manual (and KVM definition) say the TPR is 4 bits wide. Also fix CR8_RESEVED_BITS typo. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Acked-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13KVM: Set exit_reason to KVM_EXIT_MMIO where run->mmio is initialized.Jeff Dike
Signed-off-by: Jeff Dike <jdike@linux.intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13KVM: Trivial: Use standard BITMAP macros, open-code userspace-exposed headerRusty Russell
Creating one's own BITMAP macro seems suboptimal: if we use manual arithmetic in the one place exposed to userspace, we can use standard macros elsewhere. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13KVM: Use standard CR4 flags, tighten checkingRusty Russell
On this machine (Intel), writing to the CR4 bits 0x00000800 and 0x00001000 cause a GPF. The Intel manual is a little unclear, but AFIACT they're reserved, too. Also fix spelling of CR4_RESEVED_BITS. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13KVM: Use standard CR3 flags, tighten checkingRusty Russell
The kernel now has asm/cpu-features.h: use those macros instead of inventing our own. Also spell out definition of CR3_RESEVED_BITS, fix spelling and tighten it for the non-PAE case. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13KVM: Trivial: Use standard CR0 flags macros from asm/cpu-features.hRusty Russell
The kernel now has asm/cpu-features.h: use those macros instead of inventing our own. Also spell out definition of CR0_RESEVED_BITS (no code change) and fix typo. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13KVM: Trivial: Avoid hardware_disable predeclarationRusty Russell
Don't pre-declare hardware_disable: shuffle the reboot hook down. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13KVM: Trivial: Comment spelling may escape grepRusty Russell
Speling error in comment. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13KVM: Trivial: Make decode_register() staticRusty Russell
I have shied away from touching x86_emulate.c (it could definitely use some love, but it is forked from the Xen code, and it would be more productive to cross-merge fixes). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13KVM: Trivial: Remove unused struct cpu_user_regs declarationRusty Russell
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13KVM: Trivial: /dev/kvm interface is no longer experimental.Rusty Russell
KVM interface is no longer experimental. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13KVM: In-kernel string pio write supportEddie Dong
Add string pio write support to support some version of Windows. Signed-off-by: Yaozu (Eddie) Dong <eddie.dong@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13KVM: Future-proof the exit information union ABIAvi Kivity
Note that as the size of struct kvm_run is not part of the ABI, we can add things at the end. Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13KVM: SMP: Add vcpu_id field in struct vcpuQing He
This patch adds a `vcpu_id' field in `struct vcpu', so we can differentiate BSP and APs without pointer comparison or arithmetic. Signed-off-by: Qing He <qing.he@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13KVM: Fix *nopage() in kvm_main.cNguyen Anh Quynh
*nopage() in kvm_main.c should only store the type of mmap() fault if the pointers are not NULL. This patch fixes the problem. Signed-off-by: Nguyen Anh Quynh <aquynh@gmail.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13i386: Expose IOAPIC register definitions even if CONFIG_X86_IO_APIC is not setAvi Kivity
KVM reuses the IOAPIC register definitions, and needs them even if the host is not compiled with IOAPIC support. Move the #ifdef below so that only the IOAPIC variables and functions are protected, and the register definitions are available to all. Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-12x86/pci/acpi: fix DMI const-ification falloutJeff Garzik
Fix DMI const-ification fallout that appeared when merging subsystem trees. Signed-off-by: Jeff Garzik <jgarzik@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-12x86: optimise barriersNick Piggin
According to latest memory ordering specification documents from Intel and AMD, both manufacturers are committed to in-order loads from cacheable memory for the x86 architecture. Hence, smp_rmb() may be a simple barrier. Also according to those documents, and according to existing practice in Linux (eg. spin_unlock doesn't enforce ordering), stores to cacheable memory are visible in program order too. Special string stores are safe -- their constituent stores may be out of order, but they must complete in order WRT surrounding stores. Nontemporal stores to WB memory can go out of order, and so they should be fenced explicitly to make them appear in-order WRT other stores. Hence, smp_wmb() may be a simple barrier. http://developer.intel.com/products/processor/manuals/318147.pdf http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/24593.pdf In userspace microbenchmarks on a core2 system, fence instructions range anywhere from around 15 cycles to 50, which may not be totally insignificant in performance critical paths (code size will go down too). However the primary motivation for this is to have the canonical barrier implementation for x86 architecture. smp_rmb on buggy pentium pros remains a locked op, which is apparently required. Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-12x86: fix IO write barrierNick Piggin
wmb() on x86 must always include a barrier, because stores can go out of order in many cases when dealing with devices (eg. WC memory). Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-12x86: fence oostores on 64-bitNick Piggin
movnt* instructions are not strongly ordered with respect to other stores, so if we are to assume stores are strongly ordered in the rest of the 64 bit code, we must fence these off (see similar examples in 32 bit code). [ The AMD memory ordering document seems to say that nontemporal stores can also pass earlier regular stores, so maybe we need sfences _before_ movnt* everywhere too? ] Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-12Only enable BLOCK_COMPAT if COMPAT is neededLinus Torvalds
IOW, it needs to depend on both CONFIG_BLOCK and CONFIG_COMPAT. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-12Merge branch 'upstream' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev * 'upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev: (119 commits) [libata] struct pci_dev related cleanups libata: use ata_exec_internal() for PMP register access libata: implement ATA_PFLAG_RESETTING libata: add @timeout to ata_exec_internal[_sg]() ahci: fix notification handling ahci: clean up PORT_IRQ_BAD_PMP enabling ahci: kill leftover from enabling NCQ over PMP libata: wrap schedule_timeout_uninterruptible() in loop libata: skip suppress reporting if ATA_EHI_QUIET libata: clear ehi description after initial host report pata_jmicron: match vendor and class code only libata: add ST9160821AS / 3.ALD to NCQ blacklist pata_acpi: ACPI driver support libata-core: Expose gtm methods for driver use libata: add HDT722516DLA380 to NCQ blacklist libata: blacklist NCQ on Seagate Barracuda ST380817AS [libata] Turn on ACPI by default libata_scsi: Fix ATAPI transfer lengths libata: correct handling of SRST reset sequences libata: Integrate ACPI-based PATA/SATA hotplug - version 5 ...
2007-10-12Update maintainers fileAndi Kleen
Since there is no x86-64 architecture anymore it cannot be maintained. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-12Merge master.kernel.org:/pub/scm/linux/kernel/git/gregkh/pci-2.6Linus Torvalds
* master.kernel.org:/pub/scm/linux/kernel/git/gregkh/pci-2.6: (37 commits) PCI: merge almost all of pci_32.h and pci_64.h together PCI: X86: Introduce and enable PCI domain support PCI: Add 'nodomains' boot option, and pci_domains_supported global PCI: modify PCI bridge control ISA flag for clarity PCI: use _CRS for PCI resource allocation PCI: avoid P2P prefetch window for expansion ROMs PCI: skip ISA ioresource alignment on some systems PCI: remove transparent bridge sizing pci: write file size to inode on proc bus file write pci: use size stored in proc_dir_entry for proc bus files pci: implement "pci=noaer" PCI: fix IDE legacy mode resources MSI: Use correct data offset for 32-bit MSI in read_msi_msg() PCI: Fix incorrect argument order to list_add_tail() in PCI dynamic ID code PCI: i386: Compaq EVO N800c needs PCI bus renumbering PCI: Remove no longer correct documentation regarding MSI vector assignment PCI: re-enable onboard sound on "MSI K8T Neo2-FIR" PCI: quirk_vt82c586_acpi: Omit reading PCI revision ID PCI: quirk amd_8131_mmrbc: Omit reading pci revision ID cpqphp: Use PCI_CLASS_REVISION instead of PCI_REVISION_ID for read ...
2007-10-12Merge master.kernel.org:/pub/scm/linux/kernel/git/gregkh/driver-2.6Linus Torvalds
* master.kernel.org:/pub/scm/linux/kernel/git/gregkh/driver-2.6: (75 commits) PM: merge device power-management source files sysfs: add copyrights kobject: update the copyrights kset: add some kerneldoc to help describe what these strange things are Driver core: rename ktype_edd and ktype_efivar Driver core: rename ktype_driver Driver core: rename ktype_device Driver core: rename ktype_class driver core: remove subsystem_init() sysfs: move sysfs file poll implementation to sysfs_open_dirent sysfs: implement sysfs_open_dirent sysfs: move sysfs_dirent->s_children into sysfs_dirent->s_dir sysfs: make sysfs_root a regular directory dirent sysfs: open code sysfs_attach_dentry() sysfs: make s_elem an anonymous union sysfs: make bin attr open get active reference of parent too sysfs: kill unnecessary NULL pointer check in sysfs_release() sysfs: kill unnecessary sysfs_get() in open paths sysfs: reposition sysfs_dirent->s_mode. sysfs: kill sysfs_update_file() ...
2007-10-12Merge master.kernel.org:/pub/scm/linux/kernel/git/gregkh/usb-2.6Linus Torvalds
* master.kernel.org:/pub/scm/linux/kernel/git/gregkh/usb-2.6: (142 commits) USB: fix race in autosuspend reschedule atmel_usba_udc: Keep track of the device status USB: Nikon D40X unusual_devs entry USB: serial core should respect driver requirements USB: documentation for USB power management USB: skip autosuspended devices during system resume USB: mutual exclusion for EHCI init and port resets USB: allow usbstorage to have LUNS greater than 2Tb USB: Adding support for SHARP WS011SH to ipaq.c USB: add atmel_usba_udc driver USB: ohci SSB bus glue USB: ehci build fixes on au1xxx, ppc-soc USB: add runtime frame_no quirk for big-endian OHCI USB: funsoft: Fix termios USB: visor: termios bits USB: unusual_devs entry for Nikon DSC D2Xs USB: re-remove <linux/usb_sl811.h> USB: move <linux/usb_gadget.h> to <linux/usb/gadget.h> USB: Export URB statistics for powertop USB: serial gadget: Disable endpoints on unload ...
2007-10-12Merge master.kernel.org:/pub/scm/linux/kernel/git/davej/cpufreqLinus Torvalds
* master.kernel.org:/pub/scm/linux/kernel/git/davej/cpufreq: [CPUFREQ] Don't take semaphore in cpufreq_quick_get() [CPUFREQ] Support different families in fid/did to frequency conversion [CPUFREQ] cpufreq_stats: misc cpuinit section annotations [CPUFREQ] implement !CONFIG_CPU_FREQ stub for cpufreq_unregister_notifier() [CPUFREQ] mark hotplug notifier callback as __cpuinit [CPUFREQ] Only check for transition latency on problematic governors (kconfig fix) [CPUFREQ] allow ondemand and conservative cpufreq governors to be used as default [CPUFREQ] move policy's governor initialisation out of low-level drivers into cpufreq core [CPUFREQ] Longhaul - Add support for PM133 northbridge [CPUFREQ] x86: use num_online_nodes to get physical cpus numbers for
2007-10-12Merge git://git.kernel.org/pub/scm/linux/kernel/git/tglx/linux-2.6-x86Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/tglx/linux-2.6-x86: (40 commits) x86: HPET add another ICH7 PCI id x86: HPET force enable ICH5 suspend/resume fix x86: HPET force enable for ICH5 x86: HPET try to activate force detected hpet x86: HPET force enable o ICH7 and later x86: HPET restructure hpet code for hpet force enable clock events: allow replacement of broadcast timer i386/x8664: cleanup the shared hpet code i386: Remove the useless #ifdef in i8253.h ACPI: remove the now unused ifdef code jiffies: remove unused macros x86_64: cleanup apic.c after clock events switch x86_64: remove now unused code x86: unify timex.h variants x86: kill 8253pit.h x86: disable apic timer for AMD C1E enabled CPUs x86: Fix irq0 / local apic timer accounting x86_64: convert to clock events x86_64: Add (not yet used) clock event functions x86_64: prepare idle loop for dynamic ticks ...
2007-10-12Merge branch 'upstream-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2 * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2: (23 commits) ocfs2: Optionally return filldir errors ocfs2: Write support for directories with inline data ocfs2: Read support for directories with inline data ocfs2: Write support for inline data ocfs2: Read support for inline data ocfs2: Structure updates for inline data ocfs2: Cleanup dirent size check ocfs2: Rename cleanups ocfs2: Provide convenience function for ino lookup ocfs2: Implement ocfs2_empty_dir() as a caller of ocfs2_dir_foreach() ocfs2: Remove open coded readdir() ocfs2: Pass raw u64 to filldir ocfs2: Abstract out core dir listing functionality ocfs2: Move directory manipulation code into dir.c ocfs2: Small refactor of truncate zeroing code ocfs2: move nonsparse hole-filling into ocfs2_write_begin() ocfs2: Sync ocfs2_fs.h with ocfs2-tools [PATCH] fs/ocfs2/: removed unneeded initial value and function's return value ocfs2: Implement show_options() ocfs2: Clear slot map when umounting a local volume ...
2007-10-12Merge branch 'isdn-cleanups' of ↵Linus Torvalds
master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/misc-2.6 * 'isdn-cleanups' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/misc-2.6: [ISDN] HiSax diva: split setup into three smaller functions [ISDN] HiSax sedlbauer: move ISAPNP and PCI code into functions of their own [ISDN] HiSax elsa: split huge setup function into four smaller functions [ISDN] HiSax avm_pci: split setup into three smaller functions [ISDN] Remove CONFIG_PCI ifdefs from 100% PCI source code
2007-10-12PCI: merge almost all of pci_32.h and pci_64.h togetherGreg Kroah-Hartman
It was just duplicated code... Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-10-12PCI: X86: Introduce and enable PCI domain supportJeff Garzik
* fix bug in pci_read() and pci_write() which prevented PCI domain support from working (hardcoded domain 0). * unconditionally enable CONFIG_PCI_DOMAINS * implement pci_domain_nr() and pci_proc_domain(), as required of all arches when CONFIG_PCI_DOMAINS is enabled. * store domain in struct pci_sysdata, as assigned by ACPI * support "pci=nodomains" Signed-off-by: Jeff Garzik <jgarzik@redhat.com> Cc: Andi Kleen <ak@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-10-12PCI: Add 'nodomains' boot option, and pci_domains_supported globalJeff Garzik
* Introduce pci_domains_supported global, hardcoded to zero if !CONFIG_PCI_DOMAINS. * Introduce 'nodomains' boot option, which clears pci_domains_supported on platforms that enable it by default (x86, x86-64, and others when they are converted to use this). Signed-off-by: Jeff Garzik <jgarzik@redhat.com> Cc: Andi Kleen <ak@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-10-12PCI: modify PCI bridge control ISA flag for clarityGary Hade
Modify PCI Bridge Control ISA flag for clarity This patch changes PCI_BRIDGE_CTL_NO_ISA to PCI_BRIDGE_CTL_ISA and modifies it's clarifying comment and locations where used. The change reduces the chance of future confusion since it makes the set/unset meaning of the bit the same in both the bridge control register and bridge_ctl field of the pci_bus struct. Signed-off-by: Gary Hade <garyhade@us.ibm.com> Acked-by: Linas Vepstas <linas@austin.ibm.com> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-10-12PCI: use _CRS for PCI resource allocationGary Hade
Use _CRS for PCI resource allocation This patch resolves an issue where incorrect PCI memory and i/o ranges are being assigned to hotplugged PCI devices on some IBM systems. The resource mis-allocation not only makes the PCI device unuseable but often makes the entire system unuseable due to resulting machine checks. The hotplug capable PCI slots on the affected systems are not located under a standard P2P bridge but are instead located under PCI root bridges or subtractive decode P2P bridges. For example, the IBM x3850 contains 2 hotplug capable PCI-X slots and 4 hotplug capable PCIe slots with the PCI-X slots each located under a PCI root bridge and the PCIe slots each located under a subtractive decode P2P bridge. The current i386/x86_64 PCI resource allocation code does not use _CRS returned resource information. No other resource information source is available for slots that are not below a standard P2P bridge so incorrect ranges are being allocated from e820 hole causing the bad result. This patch causes the kernel to use _CRS returned resource info. It is roughly based on a change provided by Matthew Wilcox for the ia64 kernel in 2005. Due to possible buggy BIOS factor and possible yet to be discovered kernel issues the function is disabled by default and can be enabled with pci=use_crs. Signed-off-by: Gary Hade <gary.hade@us.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-10-12PCI: avoid P2P prefetch window for expansion ROMsGary Hade
Avoid creating P2P prefetch window for expansion ROMs Because of the future possibility that P2P prefetch windows will contain address ranges above 4GB some BIOSes are providing space in the P2P non-prefetch windows for expansion ROMs. This is due to expansion ROM BAR 32-bit limitation. When expansion ROM BARs without BIOS assigned address(es) are currently found behind a P2P bridge, the kernel attempts to create a P2P prefetch window for them even though space for them has already been provided in the non-prefetch window. _CRS on some systems with certain resource conservation conscious BIOSes may not provide the extra 1MB or more memory resource needed for the expansion ROM motivated prefetch window causing resource allocation errors. This change corrects the problem by removing IORESOURCE_PREFETCH from the expansion ROM flags initialization. It also removes IORESOURCE_CACHEABLE which seems inappropriate if only non-cacheable memory is available. Signed-off-by: Gary Hade <gary.hade@us.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>