aboutsummaryrefslogtreecommitdiff
path: root/include/asm-x86_64
AgeCommit message (Collapse)Author
2006-02-08[PATCH] x86-64: Add sys_unshareAndi Kleen
Add unshare syscall for x86-64 ppoll/pselect are not ready yet, but add reservations. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-07[PATCH] x86_64: Fix the node cpumask of a cpu going downRavikiran G Thirumalai
Currently, x86_64 and ia64 arches do not clear the corresponding bits in the node's cpumask when a cpu goes down or cpu bring up is cancelled. This is buggy since there are pieces of common code where the cpumask is checked in the cpu down code path to decide on things (like in the slab down path). PPC does the right thing, but x86_64 and ia64 don't (This was the reason Sonny hit upon a slab bug during cpu offline on ppc and could not reproduce on other arches). This patch fixes it for x86_64. I won't attempt ia64 as I cannot test it. Credit for spotting this should go to Alok. (akpm: this was applied, then reverted. But it's OK now because we now use for_each_cpu() in the right places). Signed-off-by: Alok N Kataria <alokk@calsoftinc.com> Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org> Signed-off-by: Shai Fultheim <shai@scalex86.org> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-05[PATCH] Fix "value computed is not used" compile warnings with gcc-4.1Takashi Iwai
Fix gcc4.1 compile warnings "value computed is not used" with set_current_state() and set_task_state() on i386/SMP and x86-64. Signed-off-by: Takashi Iwai <tiwai@suse.de> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Andi Kleen <ak@muc.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-05Revert "[PATCH] x86_64: Fix the node cpumask of a cpu going down"Linus Torvalds
This reverts commit 10f4dc8b27ac42f930ac55adb8c521264dc997f8. Quoth Andi Kleen: "Kiran decided that it makes the problem worse than it was before. Fixing it fully requires more work which is too much for 2.6.16. So please revert that commit for now." Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-04[PATCH] i386/x86-64: Don't ack the APIC for bad interrupts when the APIC is ↵Andi Kleen
not enabled It's bad juju to touch the APIC when it hasn't been enabled. I also moved ack_bad_irq for x86-64 out of line following i386. Signed-off-by: Andi Kleen <ak@suse.de> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-04[PATCH] x86_64: Calibrate APIC timer using PM timerAndi Kleen
On some broken motherboards (at least one NForce3 based AMD64 laptop) the PIT timer runs at a incorrect frequency. This patch adds a new option "apicpmtimer" that allows to use the APIC timer and calibrate it using the PMTimer. It requires the earlier patch that allows to run the main timer from the APIC. Specifying apicpmtimer implies apicmaintimer. The option defaults to off for now. I tested it on a few systems and the resulting APIC timer frequencies were usually a bit off, but always <1%, which should be tolerable. TBD figure out heuristic to enable this automatically on the affected systems TBD perhaps do it on all NForce3s or using DMI? Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-04[PATCH] x86_64: Fix the node cpumask of a cpu going downRavikiran G Thirumalai
Currently, x86_64 and ia64 arches do not clear the corresponding bits in the node's cpumask when a cpu goes down or cpu bring up is cancelled. This is buggy since there are pieces of common code where the cpumask is checked in the cpu down code path to decide on things (like in the slab down path). PPC does the right thing, but x86_64 and ia64 don't (This was the reason Sonny hit upon a slab bug during cpu offline on ppc and could not reproduce on other arches). This patch fixes it for x86_64. I won't attempt ia64 as I cannot test it. Credit for spotting this should go to Alok. Signed-off-by: Alok N Kataria <alokk@calsoftinc.com> Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org> Signed-off-by: Shai Fultheim <shai@scalex86.org> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-04[PATCH] x86_64: Undo the earlier changes to remove unrolled copy/memset ↵Andi Kleen
functions They cause quite bad performance regressions on Netburst This is temporary until we can get new optimized functions for these CPUs. This undoes changes that were done in 2.6.15 and in 2.6.16-rc1, essentially bringing the code back to 2.6.14 level. Only change is I renamed the X86_FEATURE_K8_C flag to X86_FEATURE_REP_GOOD and fixed the check for the flag and also fixed some comments. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-04[PATCH] x86_64: [PATCH] timer resumeShaohua Li
At resume time, TSC's value or something similar might be changed a lot against suspend time. This could make system gets a very big lost ticks. See http://bugzilla.kernel.org/show_bug.cgi?id=5825 Signed-off-by: Shaohua Li<shaohua.li@intel.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-04[PATCH] x86_64: Allow to run main time keeping from the local APIC interruptAndi Kleen
Another piece from the no-idle-tick patch. This can be enabled with the "apicmaintimer" option. This is mainly useful when the PIT/HPET interrupt is unreliable. Note there are some systems that are known to stop the APIC timer in C3. For those it will never work, but this case should be automatically detected. It also only works with PM timer right now. When HPET is used the way the main timer handler computes the delay doesn't work. It should be a bit more efficient because there is one less regular interrupt to process on the boot processor. Requires earlier bugfix from Venkatesh Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-04[PATCH] x86_64: Define pmtmr_ioport to 0 when PM_TIMER is not availableAndi Kleen
Avoids some ifdef mess later. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-03[PATCH] Compilation of kexec/kdump brokenFernando Luis Vazquez Cao
The compilation of kexec/kdump seems to be broken for x86_64. Remove the dependency of kexec on CONFIG_IA32_EMULATION. Signed-off-by: Fernando Vazquez <fernando@intellilink.co.jp> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-03[PATCH] Export cpu topology in sysfsZhang, Yanmin
The patch implements cpu topology exportation by sysfs. Items (attributes) are similar to /proc/cpuinfo. 1) /sys/devices/system/cpu/cpuX/topology/physical_package_id: represent the physical package id of cpu X; 2) /sys/devices/system/cpu/cpuX/topology/core_id: represent the cpu core id to cpu X; 3) /sys/devices/system/cpu/cpuX/topology/thread_siblings: represent the thread siblings to cpu X in the same core; 4) /sys/devices/system/cpu/cpuX/topology/core_siblings: represent the thread siblings to cpu X in the same physical package; To implement it in an architecture-neutral way, a new source file, driver/base/topology.c, is to export the 5 attributes. If one architecture wants to support this feature, it just needs to implement 4 defines, typically in file include/asm-XXX/topology.h. The 4 defines are: #define topology_physical_package_id(cpu) #define topology_core_id(cpu) #define topology_thread_siblings(cpu) #define topology_core_siblings(cpu) The type of **_id is int. The type of siblings is cpumask_t. To be consistent on all architectures, the 4 attributes should have deafult values if their values are unavailable. Below is the rule. 1) physical_package_id: If cpu has no physical package id, -1 is the default value. 2) core_id: If cpu doesn't support multi-core, its core id is 0. 3) thread_siblings: Just include itself, if the cpu doesn't support HT/multi-thread. 4) core_siblings: Just include itself, if the cpu doesn't support multi-core and HT/Multi-thread. So be careful when declaring the 4 defines in include/asm-XXX/topology.h. If an attribute isn't defined on an architecture, it won't be exported. Thank Nathan, Greg, Andi, Paul and Venki. The patch provides defines for i386/x86_64/ia64. Signed-off-by: Zhang, Yanmin <yanmin.zhang@intel.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-01Merge branch 'release' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6
2006-02-01[PATCH] prototypes for *at functions & typo fixUlrich Drepper
Here's the follow-up patch which introduces the prototypes for the new syscalls. There was also a typo in one of the new symbols. Signed-off-by: Ulrich Drepper <drepper@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-24[ACPI] merge 3549 4320 4485 4588 4980 5483 5651 acpica asus fops pnpacpi ↵Len Brown
branches into release Signed-off-by: Len Brown <len.brown@intel.com>
2006-01-18[PATCH] EDAC: core EDAC support codeAlan Cox
This is a subset of the bluesmoke project core code, stripped of the NMI work which isn't ready to merge and some of the "interesting" proc functionality that needs reworking or just has no place in kernel. It requires no core kernel changes except the added scrub functions already posted. The goal is to merge further functionality only after the core code is accepted and proven in the base kernel, and only at the point the upstream extras are really ready to merge. From: doug thompson <norsk5@xmission.com> This converts EDAC to sysfs and is the final chunk neccessary before EDAC has a stable user space API and can be considered for submission into the base kernel. Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com> Signed-off-by: doug thompson <norsk5@xmission.com> Signed-off-by: Pavel Machek <pavel@suse.cz> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-18[PATCH] EDAC: atomic scrub operationsAlan Cox
EDAC requires a way to scrub memory if an ECC error is found and the chipset does not do the work automatically. That means rewriting memory locations atomically with respect to all CPUs _and_ bus masters. That means we can't use atomic_add(foo, 0) as it gets optimised for non-SMP This adds a function to include/asm-foo/atomic.h for the platforms currently supported which implements a scrub of a mapped block. It also adjusts a few other files include order where atomic.h is included before types.h as this now causes an error as atomic_scrub uses u32. Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-18[PATCH] vfs: *at functions: x86_64Ulrich Drepper
Wire up the x86_64 syscalls. Signed-off-by: Ulrich Drepper <drepper@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-16[PATCH] x86_64: Fix VSMP buildRavikiran G Thirumalai
Patch fixes a build problem with CONFIG_X86_VSMP. The vSMP bits probably gathered some fuzz on its way to mainline, and safe_halt() which was outside the #endif (CONFIG_X86_VSMP) somehow got inside the !CONFIG_X86_VSMP condition, hence being undefined and breaking CONFIG_X86_VSMP builds. Patch takes safe_halt() and halt() macros out of the #endif Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-16[PATCH] x86_64: Flexmap for 32bit and randomized mappings for 64bitAndi Kleen
Another try at this. For 32bit follow the 32bit implementation from Ingo - mappings are growing down from the end of stack now and vary randomly by 1GB. Randomized mappings for 64bit just vary the normal mmap break by 1TB. I didn't bother implementing full flex mmap for 64bit because it shouldn't be needed there. Cc: mingo@elte.hu Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-16[PATCH] x86_64: Increase NR_IRQ_VECTORS to 32 * NR_CPUSAndi Kleen
This prevents running out of GSIs on large Unisys ES7000 machines. Follows i386 Cc: "Protasevich, Natalie" <Natalie.Protasevich@unisys.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-16[PATCH] x86_64: Allow nesting of int3 by default for kprobesAndi Kleen
This unbreaks recursive kprobes which didn't work anymore due to an earlier patch which converted the debug entry point to use an IST. This also allows nesting of the debug entry point too. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-14[PATCH] mark several functions __always_inlineIngo Molnar
Arjan van de Ven <arjan@infradead.org> Mark a number of functions as 'must inline'. The functions affected by this patch need to be inlined because they use knowledge that their arguments are constant so that most of the function optimizes away. At this point this patch does not change behavior, it's for documentation only (and for future patches in the inline series) Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-12[PATCH] death of get_thread_info/put_thread_infoAl Viro
{get,put}_thread_info() were introduced in 2.5.4 and never had been called by anything in the tree. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-12[PATCH] amd64: task_pt_regs()Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-12[PATCH] amd64: task_thread_info()Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-12[PATCH] scheduler cache-hot-autodetectakpm@osdl.org
) From: Ingo Molnar <mingo@elte.hu> This is the latest version of the scheduler cache-hot-auto-tune patch. The first problem was that detection time scaled with O(N^2), which is unacceptable on larger SMP and NUMA systems. To solve this: - I've added a 'domain distance' function, which is used to cache measurement results. Each distance is only measured once. This means that e.g. on NUMA distances of 0, 1 and 2 might be measured, on HT distances 0 and 1, and on SMP distance 0 is measured. The code walks the domain tree to determine the distance, so it automatically follows whatever hierarchy an architecture sets up. This cuts down on the boot time significantly and removes the O(N^2) limit. The only assumption is that migration costs can be expressed as a function of domain distance - this covers the overwhelming majority of existing systems, and is a good guess even for more assymetric systems. [ People hacking systems that have assymetries that break this assumption (e.g. different CPU speeds) should experiment a bit with the cpu_distance() function. Adding a ->migration_distance factor to the domain structure would be one possible solution - but lets first see the problem systems, if they exist at all. Lets not overdesign. ] Another problem was that only a single cache-size was used for measuring the cost of migration, and most architectures didnt set that variable up. Furthermore, a single cache-size does not fit NUMA hierarchies with L3 caches and does not fit HT setups, where different CPUs will often have different 'effective cache sizes'. To solve this problem: - Instead of relying on a single cache-size provided by the platform and sticking to it, the code now auto-detects the 'effective migration cost' between two measured CPUs, via iterating through a wide range of cachesizes. The code searches for the maximum migration cost, which occurs when the working set of the test-workload falls just below the 'effective cache size'. I.e. real-life optimized search is done for the maximum migration cost, between two real CPUs. This, amongst other things, has the positive effect hat if e.g. two CPUs share a L2/L3 cache, a different (and accurate) migration cost will be found than between two CPUs on the same system that dont share any caches. (The reliable measurement of migration costs is tricky - see the source for details.) Furthermore i've added various boot-time options to override/tune migration behavior. Firstly, there's a blanket override for autodetection: migration_cost=1000,2000,3000 will override the depth 0/1/2 values with 1msec/2msec/3msec values. Secondly, there's a global factor that can be used to increase (or decrease) the autodetected values: migration_factor=120 will increase the autodetected values by 20%. This option is useful to tune things in a workload-dependent way - e.g. if a workload is cache-insensitive then CPU utilization can be maximized by specifying migration_factor=0. I've tested the autodetection code quite extensively on x86, on 3 P3/Xeon/2MB, and the autodetected values look pretty good: Dual Celeron (128K L2 cache): --------------------- migration cost matrix (max_cache_size: 131072, cpu: 467 MHz): --------------------- [00] [01] [00]: - 1.7(1) [01]: 1.7(1) - --------------------- cacheflush times [2]: 0.0 (0) 1.7 (1784008) --------------------- Here the slow memory subsystem dominates system performance, and even though caches are small, the migration cost is 1.7 msecs. Dual HT P4 (512K L2 cache): --------------------- migration cost matrix (max_cache_size: 524288, cpu: 2379 MHz): --------------------- [00] [01] [02] [03] [00]: - 0.4(1) 0.0(0) 0.4(1) [01]: 0.4(1) - 0.4(1) 0.0(0) [02]: 0.0(0) 0.4(1) - 0.4(1) [03]: 0.4(1) 0.0(0) 0.4(1) - --------------------- cacheflush times [2]: 0.0 (33900) 0.4 (448514) --------------------- Here it can be seen that there is no migration cost between two HT siblings (CPU#0/2 and CPU#1/3 are separate physical CPUs). A fast memory system makes inter-physical-CPU migration pretty cheap: 0.4 msecs. 8-way P3/Xeon [2MB L2 cache]: --------------------- migration cost matrix (max_cache_size: 2097152, cpu: 700 MHz): --------------------- [00] [01] [02] [03] [04] [05] [06] [07] [00]: - 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1) [01]: 19.2(1) - 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1) [02]: 19.2(1) 19.2(1) - 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1) [03]: 19.2(1) 19.2(1) 19.2(1) - 19.2(1) 19.2(1) 19.2(1) 19.2(1) [04]: 19.2(1) 19.2(1) 19.2(1) 19.2(1) - 19.2(1) 19.2(1) 19.2(1) [05]: 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1) - 19.2(1) 19.2(1) [06]: 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1) - 19.2(1) [07]: 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1) 19.2(1) - --------------------- cacheflush times [2]: 0.0 (0) 19.2 (19281756) --------------------- This one has huge caches and a relatively slow memory subsystem - so the migration cost is 19 msecs. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Ashok Raj <ashok.raj@intel.com> Signed-off-by: Ken Chen <kenneth.w.chen@intel.com> Cc: <wilder@us.ibm.com> Signed-off-by: John Hawkes <hawkes@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-12[PATCH] sched: add cacheflush() asmIngo Molnar
Add per-arch sched_cacheflush() which is a write-back cacheflush used by the migration-cost calibration code at bootup time. Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11[PATCH] x86_64: Some housekeeping in local APIC codeAndi Kleen
Remove support for obsolete hardware and cleanup. - Remove checks for non integrated APICs - Replace apic_write_around with apic_write. - Remove apic_read_around - Remove APIC version reads used by old workarounds - Remove old workaround for Simics - Fix indentation Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11[PATCH] x86_64: Display meaningful part of filename during BUG()Jan Beulich
When building in a separate objtree, file names produced by BUG() & Co. can get fairly long; printing only the first 50 characters may thus result in (almost) no useful information. The following change makes it so that rather the last 50 characters of the filename get printed. Signed-Off-By: Jan Beulich <jbeulich@novell.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11[PATCH] x86_64: Remove unused AMD K8 C stepping flagAndi Kleen
X86_FEATURE_K8_C was a synthetic Linux CPUID flag that was used for some code optimizations in Opteron C stepping or later. But support for pre C stepping optimizations has been removed, so this isn't needed anymore. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11[PATCH] x86_64: sparse warning cleanupsStephen Hemminger
Fix some trivial sparse warnings in x86_64 code. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11[PATCH] x86_64: Move NUMA page_to_pfn/pfn_to_page functions out of lineAndi Kleen
Saves about ~18K .text in defconfig There would be more optimization potential, but that's for later. Suggestion originally from Bill Irwin. Fix from Andy Whitcroft. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11[PATCH] x86_64: Remove unused segmentsAndi Kleen
They used to be used by the reboot code, but not anymore. Noticed by Jan Beulich Cc: JBeulich@novell.com Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11[PATCH] x86_64: Inclusion of ScaleMP vSMP architecture patches - vsmp_archRavikiran G Thirumalai
Introduce vSMP arch to the kernel. This patch: 1. Adds CONFIG_X86_VSMP 2. Adds machine specific macros for local_irq_disabled, local_irq_enabled and irqs_disabled 3. Writes to the vSMP CTL device to indicate kernel compiled with CONFIG_VSMP Signed-off-by: Ravikiran Thirumalai <kiran@scalemp.com> Signed-off-by: Shai Fultheim <shai@scalemp.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11[PATCH] x86_64: Inclusion of ScaleMP vSMP architecture patches - vsmp_alignRavikiran G Thirumalai
vSMP specific alignment patch to 1. Define INTERNODE_CACHE_SHIFT for vSMP 2. Use this for alignment of critical structures 3. Use INTERNODE_CACHE_SHIFT for ARCH_MIN_TASKALIGN, and let the slab align task_struct allocations to the internode cacheline size 4. Introduce and use ARCH_MIN_MMSTRUCT_ALIGN for mm_struct slab allocations. Signed-off-by: Ravikiran Thirumalai <kiran@scalemp.com> Signed-off-by: Shai Fultheim <shai@scalemp.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11[PATCH] x86_64: Make sure BITS_PER_ATOMIC is defined in asm-generic/atomic.hAndi Kleen
Fixes CC fs/nfsctl.o In file included from include2/asm/atomic.h:427, from /home/lsrc/quilt/linux/include/linux/file.h:8, from /home/lsrc/quilt/linux/fs/nfsctl.c:8: /home/lsrc/quilt/linux/include/asm-generic/atomic.h:20:5: warning: "BITS_PER_LONG" is not defined Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11[PATCH] x86_64: cleanup enter_lazy_tlb()Brian Gerst
Move the #ifdef into the function body. Signed-off-by: Brian Gerst <bgerst@didntduck.org> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11[PATCH] x86_64: Remove useless KDB vectorAndi Kleen
It was set as an NMI, but the NMI bit always forces an interrupt to end up at vector 2. So it was never used. Remove. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11[PATCH] x86_64: Don't claim too many vectors for TLB flushingJason Uhlenkott
It looks like the new scalable TLB flush code for x86_64 is claiming one more IRQ vector than it actually uses. Signed-off-by: Jason Uhlenkott <jasonuhl@sgi.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11[PATCH] x86_64: Allocate PDAs in the local nodeRavikiran G Thirumalai
Patch uses a static PDA array early at boot and reallocates processor PDA with node local memory when kmalloc is ready, just before pda_init. The boot_cpu_pda is needed since the cpu_pda is used even before pda_init for that cpu is called (to set the static per-cpu areas offset table etc) Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org> Signed-off-by: Shai Fultheim <shai@scalex86.org> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11[PATCH] x86_64: Node local pda take 2 -- cpu_pda preparationRavikiran G Thirumalai
Helper patch to change cpu_pda users to use macros to access cpu_pda instead of the cpu_pda[] array. Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org> Signed-off-by: Shai Fultheim <shai@scalex86.org> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11[PATCH] x86_64: Early initialization of cpu_to_nodeRavikiran Thirumalai
Patch enables early intialization of cpu_to_node. apicid_to_node is built by reading the SRAT table, from acpi_numa_init with ACPI_NUMA and k8_scan_nodes with K8_NUMA. x86_cpu_to_apicid is built by parsing the ACPI MADT table, from acpi_boot_init. We combine these two tables and setup cpu_to_node. Early intialization helps the static per_cpu_areas in getting pages from correct node. Change since last release: Do not initialize early init_cpu_to_node for faking node cases. Patch tested on TYAN dual core 4P board with K8 only, ACPI_NUMA. Tested on EM64T NUMA. Also tested with numa=off, numa=fake, and running a kernel compiled with NUMA on a regular EM64 2 way SMP. Signed-off-by: Alok N Kataria <alokk@calsoftinc.com> Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org> Signed-off-by: Shai Fultheim <shai@scalex86.org> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11[PATCH] x86_64: On Intel CPUs don't do an additional CPU sync before RDTSCAndi Kleen
RDTSC serialization using cpuid is not needed for Intel platforms. This increases gettimeofday performance. Cc: vojtech@suse.cz Cc: rohit.seth@intel.com Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11[PATCH] x86_64: Support alternative() with a output argument.Andi Kleen
Needed for follow on patches Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11[PATCH] x86_64: Don't try to synchronize the TSC over CPUs on Intel CPUs at ↵Andi Kleen
boot. They already do this in hardware and the Linux algorithm actually adds errors. Cc: mingo@elte.hu Cc: rohit.seth@intel.com Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11[PATCH] x86_64: Fix compile error with !CONFIG_COMPATAndi Kleen
cpumask.h wasn't included implicitely into proto.h in this case. Just move it over to smp.h Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11[PATCH] x86_64: x86_64 write apic id fixVivek Goyal
o Apic id is in most significant 8 bits of APIC_ID register. Current code is trying to write apic id to least significant 8 bits. This patch fixes it. o This fix enables booting uni kdump capture kernel on a cpu with non-zero apic id. Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11[PATCH] x86_64: Remove unused apic_write_atomicAndi Kleen
This function is never used for x86_64. Signed-off-by: Brian Gerst <bgerst@didntduck.org> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>