Kernel - My Linux kernel repository

Age	Commit message (Collapse)	Author
2008-04-21	block: fix memory hotplug and bouncing in block layer	Andi Kleen
	Only noticed this while hacking something else, no test case. blk_max_low_pfn is initialized once at bootup by the block layer from max_low_pfn. But max_low_pfn is not necessarily constant over the runtime of the system when you consider memory hotplug. What could happen if that someone adds memory later the block layer wouldn't get updated and then start bouncing memory unnecessarily. Also on 64bit blk_max_low_pfn actually isn't needed because it just disables bouncing essentially and there is no highmem. And nobody can pass pfns > max_low_pfn to the block layer, because those wouldn't have a struct page and I suspect block layer wouldn't be very happy without that. So set BLK_BOUNCE_HIGH to infinity (-1ULL) on 64bit. That avoids the problem of having to update it on memory hotadd. On 32bit I kept the same behaviour because at least on i386 memory hotadd only adds HIGHMEM, never lowmem. BLK_BOUNCE_ANY is always set to infinity on both 32 and 64bit. Signed-off-by: Andi Kleen <ak@suse.de> Cc: Jens Axboe <jens.axboe@oracle.com> Acked-by: Yasunori Goto <y-goto@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-04-21	block: move the padding adjustment to blk_rq_map_sg	FUJITA Tomonori
	blk_rq_map_user adjusts bi_size of the last bio. It breaks the rule that req->data_len (the true data length) is equal to sum(bio). It broke the scsi command completion code. commit e97a294ef6938512b655b1abf17656cf2b26f709 was introduced to fix the above issue. However, the partial completion code doesn't work with it. The commit is also a layer violation (scsi mid-layer should not know about the block layer's padding). This patch moves the padding adjustment to blk_rq_map_sg (suggested by James). The padding works like the drain buffer. This patch breaks the rule that req->data_len is equal to sum(sg), however, the drain buffer already broke it. So this patch just restores the rule that req->data_len is equal to sub(bio) without breaking anything new. Now when a low level driver needs padding, blk_rq_map_user and blk_rq_map_user_iov guarantee there's enough room for padding. blk_rq_map_sg can safely extend the last entry of a scatter list. blk_rq_map_sg must extend the last entry of a scatter list only for a request that got through bio_copy_user_iov. This patches introduces new REQ_COPY_USER flag. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: Tejun Heo <htejun@gmail.com> Cc: Mike Christie <michaelc@cs.wisc.edu> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-04-21	block: convert bio_copy_user to bio_copy_user_iov	FUJITA Tomonori
	This patch enables bio_copy_user to take struct sg_iovec (renamed bio_copy_user_iov). bio_copy_user uses bio_copy_user_iov internally as bio_map_user uses bio_map_user_iov. The major changes are: - adds sg_iovec array to struct bio_map_data - adds __bio_copy_iov that copy data between bio and sg_iovec. bio_copy_user_iov and bio_uncopy_user use it. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: Tejun Heo <htejun@gmail.com> Cc: Mike Christie <michaelc@cs.wisc.edu> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-04-21	cdrom: make unregister_cdrom() return void	Akinobu Mita
	Now unregister_cdrom() always returns 0. Make it return void and update all callers that check the return value. Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Cc: Adrian McMenamin <adrian@mcmen.demon.co.uk> Cc: Borislav Petkov <petkovbb@gmail.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-04-21	cdrom: use list_head for cdrom_device_info list	Akinobu Mita
	Use list_head for cdrom_device_info list instead of opencoded singly list handling. Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-04-21	jiffies: add time_is_after_jiffies and others which compare with jiffies	Dave Young
	Most of time_after like macros usages just compare jiffies and another number, so here add some time_is_* macros for convenience. Signed-off-by: Dave Young <hidave.darkstar@gmail.com> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-04-20	PCI: clean up resource alignment management	Ivan Kokshaysky
	Done per Linus' request and suggestions. Linus has explained that better than I'll be able to explain: On Thu, Mar 27, 2008 at 10:12:10AM -0700, Linus Torvalds wrote: > Actually, before we go any further, there might be a less intrusive > alternative: add just a couple of flags to the resource flags field (we > still have something like 8 unused bits on 32-bit), and use those to > implement a generic "resource_alignment()" routine. > > Two flags would do it: > > - IORESOURCE_SIZEALIGN: size indicates alignment (regular PCI device > resources) > > - IORESOURCE_STARTALIGN: start field is alignment (PCI bus resources > during probing) > > and then the case of both flags zero (or both bits set) would actually be > "invalid", and we would also clear the IORESOURCE_STARTALIGN flag when we > actually allocate the resource (so that we don't use the "start" field as > alignment incorrectly when it no longer indicates alignment). > > That wouldn't be totally generic, but it would have the nice property of > automatically at least add sanity checking for that whole "res->start has > the odd meaning of 'alignment' during probing" and remove the need for a > new field, and it would allow us to have a generic "resource_alignment()" > routine that just gets a resource pointer. Besides, I removed IORESOURCE_BUS_HAS_VGA flag which was unused for ages. Signed-off-by: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Gary Hade <garyhade@us.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-04-20	PCI: Expose PCI VPD through sysfs	Ben Hutchings
	Vital Product Data (VPD) may be exposed by PCI devices in several ways. It is generally unsafe to read this information through the existing interfaces to user-land because of stateful interfaces. This adds: - abstract operations for VPD access (struct pci_vpd_ops) - VPD state information in struct pci_dev (struct pci_vpd) - an implementation of the VPD access method specified in PCI 2.2 (in access.c) - a 'vpd' binary file in sysfs directories for PCI devices with VPD operations defined It adds a probe for PCI 2.2 VPD in pci_scan_device() and release of VPD state in pci_release_dev(). Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-04-20	PCI: add generic pci_enable_resources()	Bjorn Helgaas
	Each architecture has its own pcibios_enable_resources() implementation. These differ in many minor ways that have nothing to do with actual architectural differences. Follow-on patches will make most arches use this generic version instead. This version is based on powerpc, which seemed most up-to-date. The only functional difference from the x86 version is that this uses "!r->parent" to check for resource collisions instead of "!r->start && r->end". Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-04-20	PCI: add PCI Express ASPM support	Shaohua Li
	PCI Express ASPM defines a protocol for PCI Express components in the D0 state to reduce Link power by placing their Links into a low power state and instructing the other end of the Link to do likewise. This capability allows hardware-autonomous, dynamic Link power reduction beyond what is achievable by software-only controlled power management. However, The device should be configured by software appropriately. Enabling ASPM will save power, but will introduce device latency. This patch adds ASPM support in Linux. It introduces a global policy for ASPM, a sysfs file /sys/module/pcie_aspm/parameters/policy can control it. The interface can be used as a boot option too. Currently we have below setting: -default, BIOS default setting -powersave, highest power saving mode, enable all available ASPM state and clock power management -performance, highest performance, disable ASPM and clock power management By default, the 'default' policy is used currently. In my test, power difference between powersave mode and performance mode is about 1.3w in a system with 3 PCIE links. Note: some devices might not work well with aspm, either because chipset issue or device issue. The patch provide API (pci_disable_link_state), driver can disable ASPM for specific device. Signed-off-by: Shaohua Li <shaohua.li@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-04-20	PCI: #if 0 pci_cleanup_aer_correct_error_status()	Adrian Bunk
	#if 0 the no longer used pci_cleanup_aer_correct_error_status(). Signed-off-by: Adrian Bunk <bunk@kernel.org> Cc: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-04-20	PCI: remove global list of PCI devices	Greg Kroah-Hartman
	This patch finally removes the global list of PCI devices. We are relying entirely on the list held in the driver core now, and do not need a separate "shadow" list as no one uses it. Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-04-20	PCI: add is_added flag to struct pci_dev	Greg Kroah-Hartman
	This lets us check if the device is really added to the driver core or not, which is what we need when walking some of the bus lists. The flag is there in anticipation of getting rid of the other PCI device list, which is what we used to check in this situation. Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-04-20	PCI: clean up search.c a lot	Greg Kroah-Hartman
	This cleans up the search.c file, now using the pci list of devices that are created for the driver core, instead of relying on our separate list of devices. It's better to use the functions already created for this kind of thing, instead of rolling our own all the time. This work is done in anticipation of getting rid of that second list of pci devices all together. And it ends up saving code, always a nice benefit. This also removes one compiler warning for when CONFIG_PCI_LEGACY is enabled as we no longer internally use the deprecated functions anymore. Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-04-20	PCI: remove pci_get_device_reverse	Greg Kroah-Hartman
	This removes the pci_get_device_reverse function as there should not be any need to walk pci devices backwards anymore. All users of this call are now gone from the tree, so it is safe to remove it. Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-04-20	PCI: remove pci_find_present	Greg Kroah-Hartman
	No one is using this function anymore for quite some time, so remove it. Everyone calls pci_dev_present() instead anyway... Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-04-20	PCI: #if 0 pci_assign_resource_fixed()	Adrian Bunk
	An unused function that bloated the kernel only when CONFIG_EMBEDDED was enabled... Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-04-21	[CRYPTO] api: Make the crypto subsystem fully modular	Sebastian Siewior
	Signed-off-by: Sebastian Siewior <sebastian@breakpoint.cc> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2008-04-19	SCSI: convert struct class_device to struct device	Tony Jones
	It's big, but there doesn't seem to be a way to split it up smaller... Signed-off-by: Tony Jones <tonyj@suse.de> Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Cc: Roland Dreier <rolandd@cisco.com> Cc: Sean Hefty <sean.hefty@intel.com> Cc: Hal Rosenstock <hal.rosenstock@gmail.com> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-04-19	memstick: convert struct class_device to struct device	Greg Kroah-Hartman
	struct class_device is going away, struct device should be used instead. Signed-off-by: Tony Jones <tonyj@suse.de> Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Cc: Alex Dubov <oakad@yahoo.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-04-19	PM: Remove destroy_suspended_device()	Rafael J. Wysocki
	After 2.6.24 there was a plan to make the PM core acquire all device semaphores during a suspend/hibernation to protect itself from concurrent operations involving device objects. That proved to be too heavy-handed and we found a better way to achieve the goal, but before it happened, we had introduced the functions device_pm_schedule_removal() and destroy_suspended_device() to allow drivers to "safely" destroy a suspended device and we had adapted some drivers to use them. Now that these functions are no longer necessary, it seems reasonable to remove them and modify their users to use the normal device unregistration instead. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-04-19	Firmware: add iSCSI iBFT Support	Konrad Rzeszutek
	Add /sysfs/firmware/ibft/[initiator\|targetX\|ethernetX] directories along with text properties which export the the iSCSI Boot Firmware Table (iBFT) structure. What is iSCSI Boot Firmware Table? It is a mechanism for the iSCSI tools to extract from the machine NICs the iSCSI connection information so that they can automagically mount the iSCSI share/target. Currently the iSCSI information is hard-coded in the initrd. The /sysfs entries are read-only one-name-and-value fields. The usual set of data exposed is: # for a in `find /sys/firmware/ibft/ -type f -print`; do echo -n "$a: "; cat $a; done /sys/firmware/ibft/target0/target-name: iqn.2007.com.intel-sbx44:storage-10gb /sys/firmware/ibft/target0/nic-assoc: 0 /sys/firmware/ibft/target0/chap-type: 0 /sys/firmware/ibft/target0/lun: 00000000 /sys/firmware/ibft/target0/port: 3260 /sys/firmware/ibft/target0/ip-addr: 192.168.79.116 /sys/firmware/ibft/target0/flags: 3 /sys/firmware/ibft/target0/index: 0 /sys/firmware/ibft/ethernet0/mac: 00:11:25:9d:8b:01 /sys/firmware/ibft/ethernet0/vlan: 0 /sys/firmware/ibft/ethernet0/gateway: 192.168.79.254 /sys/firmware/ibft/ethernet0/origin: 0 /sys/firmware/ibft/ethernet0/subnet-mask: 255.255.252.0 /sys/firmware/ibft/ethernet0/ip-addr: 192.168.77.41 /sys/firmware/ibft/ethernet0/flags: 7 /sys/firmware/ibft/ethernet0/index: 0 /sys/firmware/ibft/initiator/initiator-name: iqn.2007-07.com:konrad.initiator /sys/firmware/ibft/initiator/flags: 3 /sys/firmware/ibft/initiator/index: 0 For full details of the IBFT structure please take a look at: ftp://ftp.software.ibm.com/systems/support/system_x_pdf/ibm_iscsi_boot_firmware_table_v1.02.pdf [akpm@linux-foundation.org: fix build] Signed-off-by: Konrad Rzeszutek <konradr@linux.vnet.ibm.com> Cc: Mike Christie <michaelc@cs.wisc.edu> Cc: Peter Jones <pjones@redhat.com> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-04-19	Driver core: make device_is_registered() work for class devices	Greg Kroah-Hartman
	device_is_registered() can use the kobject value for this, so it will now work with devices that are associated with only a class, not a bus and a driver. Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-04-19	PM: Convert wakeup flag accessors to inline functions	Alan Stern
	This patch (as1058) improves the wakeup macros in include/linux/pm.h. All but the trivial ones are converted to inline routines, which requires moving them to a separate header file since they depend on the definition of struct device. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-04-19	PM: Make wakeup flags available whenever CONFIG_PM is set	Alan Stern
	The various wakeup flags and their accessor macros in struct dev_pm_info should be available whenever CONFIG_PM is enabled, not just when CONFIG_PM_SLEEP is on. Otherwise remote wakeup won't always be configurable for runtime power management. This patch (as1056b) fixes the oversight. David Brownell adds: More accurately, fixes the "regression" ... as noted sometime last summer, after 296699de6bdc717189a331ab6bbe90e05c94db06 introduced CONFIG_SUSPEND. But that didn't make the regression list for that kernel, ergo the delay in fixing it. [rjw: rebased] Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-04-19	PM: Handle device registrations during suspend/resume	Rafael J. Wysocki
	Modify the PM core to protect its data structures, specifically the dpm_active list, from being corrupted if a child of the currently suspending device is registered concurrently with its ->suspend() callback. In that case, since the new device (the child) is added to dpm_active after its parent, the PM core will attempt to suspend it after the parent, which is wrong. Introduce a new member of struct dev_pm_info, called 'sleeping', and use it to check if the parent of the device being added to dpm_active has been suspended, in which case the device registration fails. Also, use 'sleeping' for checking if the ordering of devices on dpm_active is correct. Introduce variable 'all_sleeping' that will be set to 'true' once all devices have been suspended and make new device registrations fail until 'all_sleeping' is reset to 'false', in order to avoid having unsuspended devices around while the system is going into a sleep state. Remove pm_sleep_rwsem which is not necessary any more. Special thanks to Alan Stern for discussions and suggestions that lead to the creation of this patch. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-04-19	sysfs: small header file cleanup for SYSFS=n	David Rientjes
	Convert sysfs_remove_bin_file() to have a return type of 'void' for !CONFIG_SYSFS configurations. Also removes unnecessary colons from empty void functions. Signed-off-by: David Rientjes <rientjes@google.com> Reviewed-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-04-19	driver core: Convert debug functions declared inline __attribute__((format ↵	Joe Perches
	(printf,x,y) to statement expression macros When DEBUG is not defined, pr_debug and dev_dbg and some other local debugging functions are specified as: "inline __attribute__((format (printf, x, y)))" This is done to validate printk arguments when not debugging. Converting these functions to macros or statement expressions "do { if (0) printk(fmt, ##arg); } while (0)" or "({ if (0) printk(fmt, ##arg); 0; }) makes at least gcc 4.2.2 produce smaller objects. This has the additional benefit of allowing the optimizer to avoid calling functions like print_mac that might have been arguments to the printk. defconfig x86 current: $ size vmlinux text data bss dec hex filename 4716770 474560 618496 5809826 58a6a2 vmlinux all converted: (More patches follow) $ size vmlinux text data bss dec hex filename 4716642 474560 618496 5809698 58a622 vmlinux Even kernel/sched.o, which doesn't even use these functions, becomes smaller. It appears that merely having an indirect include of <linux/device.h> can cause bigger objects. $ size sched.inline.o sched.if0.o text data bss dec hex filename 31385 2854 328 34567 8707 sched.inline.o 31366 2854 328 34548 86f4 sched.if0.o The current preprocessed only kernel/sched.i file contains: # 612 "include/linux/device.h" static inline __attribute__((always_inline)) int __attribute__ ((format (printf, 2, 3))) dev_dbg(struct device dev, const char fmt, ...) { return 0; } # 628 "include/linux/device.h" static inline __attribute__((always_inline)) int __attribute__ ((format (printf, 2, 3))) dev_vdbg(struct device dev, const char fmt, ...) { return 0; } Removing these unused inlines from sched.i shrinks sched.o Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-04-19	driver core: memory: semaphore to mutex	Daniel Walker
	Signed-off-by: Daniel Walker <dwalker@mvista.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-04-19	Merge branch 'master' of ↵	Haavard Skinnemoen
	git://git.kernel.org/pub/scm/linux/kernel/git/hskinnemoen/usba-2.6.26 into base
2008-04-19	Merge branch 'master' of ↵	Haavard Skinnemoen
	git://git.kernel.org/pub/scm/linux/kernel/git/hskinnemoen/tclib into base
2008-04-19	sched: fair-group: de-couple load-balancing from the rb-trees	Peter Zijlstra
	De-couple load-balancing from the rb-trees, so that I can change their organization. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-19	sched: rt-group: optimize dequeue_rt_stack	Peter Zijlstra
	Now that the group hierarchy can have an arbitrary depth the O(n^2) nature of RT task dequeues will really hurt. Optimize this by providing space to store the tree path, so we can walk it the other way. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-19	sched: fair-group: SMP-nice for group scheduling	Peter Zijlstra
	Implement SMP nice support for the full group hierarchy. On each load-balance action, compile a sched_domain wide view of the full task_group tree. We compute the domain wide view when walking down the hierarchy, and readjust the weights when walking back up. After collecting and readjusting the domain wide view, we try to balance the tasks within the task_groups. The current approach is a naively balance each task group until we've moved the targeted amount of load. Inspired by Srivatsa Vaddsgiri's previous code and Abhishek Chandra's H-SMP paper. XXX: there will be some numerical issues due to the limited nature of SCHED_LOAD_SCALE wrt to representing a task_groups influence on the total weight. When the tree is deep enough, or the task weight small enough, we'll run out of bits. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> CC: Abhishek Chandra <chandra@cs.umn.edu> CC: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-19	sched, cpuset: customize sched domains, core	Hidetoshi Seto
	[rebased for sched-devel/latest] - Add a new cpuset file, having levels: sched_relax_domain_level - Modify partition_sched_domains() and build_sched_domains() to take attributes parameter passed from cpuset. - Fill newidle_idx for node domains which currently unused but might be required if sched_relax_domain_level become higher. - We can change the default level by boot option 'relax_domain_level='. Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-19	sched: fix the task_group hierarchy for UID grouping	Peter Zijlstra
	UID grouping doesn't actually have a task_group representing the root of the task_group tree. Add one. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-19	sched: allow the group scheduler to have multiple levels	Dhaval Giani
	This patch makes the group scheduler multi hierarchy aware. [a.p.zijlstra@chello.nl: rt-parts and assorted fixes] Signed-off-by: Dhaval Giani <dhaval@linux.vnet.ibm.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-19	sched: add new set_cpus_allowed_ptr function	Mike Travis
	Add a new function that accepts a pointer to the "newly allowed cpus" cpumask argument. int set_cpus_allowed_ptr(struct task_struct p, const cpumask_t new_mask) The current set_cpus_allowed() function is modified to use the above but this does not result in an ABI change. And with some compiler optimization help, it may not introduce any additional overhead. Additionally, to enforce the read only nature of the new_mask arg, the "const" property is migrated to sub-functions called by set_cpus_allowed. This silences compiler warnings. Signed-off-by: Mike Travis <travis@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-19	cpumask: add show cpu map functions	Mike Travis
	* Add cpu_sysdev_class functions to display the following maps with cpulist_scnprintf(). cpu_online_map cpu_present_map cpu_possible_map * Small change to include/linux/sysdev.h to allow the attribute name and label to be different (to avoid collision with the "attr_online" entry for bringing cpus on- and off-line.) Cc: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Mike Travis <travis@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-19	x86: convert cpumask_of_cpu macro to allocated array	Mike Travis
	* Here is a simple patch to use an allocated array of cpumasks to represent cpumask_of_cpu() instead of constructing one on the stack. It's based on the Kconfig option "HAVE_CPUMASK_OF_CPU_MAP" which is currently only set for x86_64 SMP. Otherwise the the existing cpumask_of_cpu() is used but has been changed to produce an lvalue so a pointer to it can be used. Cc: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Mike Travis <travis@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-19	cpumask: add CPU_MASK_ALL_PTR macro	Mike Travis
	* Add a static cpumask_t variable "CPU_MASK_ALL_PTR" to use as a pointer reference to CPU_MASK_ALL. This reduces where possible the instances where CPU_MASK_ALL allocates and fills a large array on the stack. Used only if NR_CPUS > BITS_PER_LONG. * Change init/main.c to use new set_cpus_allowed_ptr(). Depends on: [sched-devel]: sched: add new set_cpus_allowed_ptr function Cc: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Mike Travis <travis@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-19	cpumask: reduce stack usage in SD_x_INIT initializers	Mike Travis
	* Remove empty cpumask_t (and all non-zero/non-null) variables in SD__INIT macros. Use memset(0) to clear. Also, don't inline the initializer functions to save on stack space in build_sched_domains(). Merge change to include/linux/topology.h that uses the new node_to_cpumask_ptr function in the nr_cpus_node macro into this patch. Depends on: [mm-patch]: asm-generic-add-node_to_cpumask_ptr-macro.patch [sched-devel]: sched: add new set_cpus_allowed_ptr function Cc: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Mike Travis <travis@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-19	generic: reduce stack pressure in sched_affinity	Mike Travis
	* Modify sched_affinity functions to pass cpumask_t variables by reference instead of by value. * Use new set_cpus_allowed_ptr function. Depends on: [sched-devel]: sched: add new set_cpus_allowed_ptr function Cc: Paul Jackson <pj@sgi.com> Cc: Cliff Wickman <cpw@sgi.com> Signed-off-by: Mike Travis <travis@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-19	cpuset: modify cpuset_set_cpus_allowed to use cpumask pointer	Mike Travis
	* Modify cpuset_cpus_allowed to return the currently allowed cpuset via a pointer argument instead of as the function return value. * Use new set_cpus_allowed_ptr function. * Cleanup CPU_MASK_ALL and NODE_MASK_ALL uses. Depends on: [sched-devel]: sched: add new set_cpus_allowed_ptr function Signed-off-by: Mike Travis <travis@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-19	cpumask: add cpumask_scnprintf_len function	Mike Travis
	Add a new function cpumask_scnprintf_len() to return the number of characters needed to display "len" cpumask bits. The current method of allocating NR_CPUS bytes is incorrect as what's really needed is 9 characters per 32-bit word of cpumask bits (8 hex digits plus the seperator [','] or the terminating NULL.) This function provides the caller the means to allocate the correct string length. Cc: Paul Jackson <pj@sgi.com> Signed-off-by: Mike Travis <travis@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-19	sched: rt-group: synchonised bandwidth period	Peter Zijlstra
	Various SMP balancing algorithms require that the bandwidth period run in sync. Possible improvements are moving the rt_bandwidth thing into root_domain and keeping a span per rt_bandwidth which marks throttled cpus. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-19	time: add ns_to_ktime()	Ingo Molnar
	Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-19	sched: remove sysctl_sched_batch_wakeup_granularity	Ingo Molnar
	it's unused. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-19	generic, x86: add prctl commands PR_GET_TSC and PR_SET_TSC	Erik Bosman
	This patch adds prctl commands that make it possible to deny the execution of timestamp counters in userspace. If this is not implemented on a specific architecture, prctl will return -EINVAL. ned-off-by: Erik Bosman <ejbosman@cs.vu.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-04-19	x86: pageattr.c fix shadowed variable warning	Harvey Harrison
	irqs_disabled() uses flags internally, use _flags to avoid shadowing code calling into this macro. Introduced between 2.6.25-rc3 and -rc4 Fixes the sparse warning: arch/x86/mm/pageattr.c:383:21: warning: symbol 'flags' shadows an earlier one arch/x86/mm/pageattr.c:369:16: originally declared here Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>