aboutsummaryrefslogtreecommitdiff
path: root/include/linux
AgeCommit message (Collapse)Author
2006-08-03PCI: docking station: remove dock ueventsKristen Carlson Accardi
Remove uevent dock notifications. There are no consumers of these events at present, and uevents are likely not the correct way to send this type of event anyway. Until I get some kind of idea if anyone in userspace cares about dock events, I will just not send any. Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-08-03PCI: Unhide the SMBus on Asus PU-DLSJean Delvare
Unhide the SMBus controller on the Asus PU-DLS board. This fixes bug #6763. Signed-off-by: Jean Delvare <khali@linux-fr.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-08-03[PATCH] don't bother with aux entires for dummy contextAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-08-03[PATCH] mark context of syscall entered with no rules as dummyAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-08-03[PATCH] introduce audit rules counterAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-08-03[PATCH] fix missed create event for directory auditAmy Griffis
When an object is created via a symlink into an audited directory, audit misses the event due to not having collected the inode data for the directory. Modify __audit_inode_child() to copy the parent inode data if a parent wasn't found in audit_names[]. Signed-off-by: Amy Griffis <amy.griffis@hp.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-08-03[PATCH] fix faulty inode data collection for open() with O_CREATAmy Griffis
When the specified path is an existing file or when it is a symlink, audit collects the wrong inode number, which causes it to miss the open() event. Adding a second hook to the open() path fixes this. Also add audit_copy_inode() to consolidate some code. Signed-off-by: Amy Griffis <amy.griffis@hp.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-08-02Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds
* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: (32 commits) [NET]: Fix more per-cpu typos [SECURITY]: Fix build with CONFIG_SECURITY disabled. [I/OAT]: Remove CPU hotplug lock from net_dma_rebalance [DECNET]: Fix for routing bug [AF_UNIX]: Kernel memory leak fix for af_unix datagram getpeersec patch [NET]: skb_queue_lock_key() is no longer used. [NET]: Remove lockdep_set_class() call from skb_queue_head_init(). [IPV6]: SNMPv2 "ipv6IfStatsOutFragCreates" counter error [IPV6]: SNMPv2 "ipv6IfStatsInHdrErrors" counter error [NET]: Kill the WARN_ON() calls for checksum fixups. [NETFILTER]: xt_hashlimit/xt_string: missing string validation [NETFILTER]: SIP helper: expect RTP streams in both directions [E1000]: Convert to netdev_alloc_skb [TG3]: Convert to netdev_alloc_skb [NET]: Add netdev_alloc_skb(). [TCP]: Process linger2 timeout consistently. [SECURITY] secmark: nul-terminate secdata [NET] infiniband: Cleanup ib_addr module to use the netevents [NET]: Core net changes to generate netevents [NET]: Network Event Notifier Mechanism. ...
2006-08-02Revert "[PATCH] USB: move usb_device_class class devices to be real devices"Greg Kroah-Hartman
This reverts c182274ffe1277f4e7c564719a696a37cacf74ea commit because it required a newer version of udev to work properly than what is currently documented in Documentation/Changes. Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-08-02Revert "[PATCH] USB: convert usb class devices to real devices"Greg Kroah-Hartman
This reverts bd00949647ddcea47ce4ea8bb2cfcfc98ebf9f2a commit because it required a newer version of udev to work properly than what is currently documented in Documentation/Changes. Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-08-02usb-storage: Add US_FL_IGNORE_DEVICE flag; ignore ZyXEL G220FDaniel Drake
This patch adds a new unusual_devs flag for when usb-storage needs to ignore a device that it would otherwise claim. We need to ignore the ZyXEL G220F as it is a virtual CDROM drive which includes the windows driver for this USB-WLAN adapter. After the windows driver is installed on a windows system, it converts it into a WLAN adapter (by ejecting the virtual disc). The virtual CDROM is of no interest to Linux users. The zd1211rw driver will automatically perform the eject operation, we just need to ensure that usb-storage does not claim the device. Signed-off-by: Daniel Drake <dsd@gentoo.org> Signed-off-by: Matthew Dharm <mdharm-usb@one-eyed-alien.net> Signed-off-by: Phil Dibowitz <phil@ipom.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-08-02[SECURITY]: Fix build with CONFIG_SECURITY disabled.David S. Miller
include/linux/security.h: In function ‘security_release_secctx’: include/linux/security.h:2757: warning: ‘return’ with a value, in function returning void Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-02[AF_UNIX]: Kernel memory leak fix for af_unix datagram getpeersec patchCatherine Zhang
From: Catherine Zhang <cxzhang@watson.ibm.com> This patch implements a cleaner fix for the memory leak problem of the original unix datagram getpeersec patch. Instead of creating a security context each time a unix datagram is sent, we only create the security context when the receiver requests it. This new design requires modification of the current unix_getsecpeer_dgram LSM hook and addition of two new hooks, namely, secid_to_secctx and release_secctx. The former retrieves the security context and the latter releases it. A hook is required for releasing the security context because it is up to the security module to decide how that's done. In the case of Selinux, it's a simple kfree operation. Acked-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-02[NET]: skb_queue_lock_key() is no longer used.Adrian Bunk
Signed-off-by: Adrian Bunk <bunk@stusta.de> Acked-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-02[NET]: Remove lockdep_set_class() call from skb_queue_head_init().Arjan van de Ven
The skb_queue_head_init() function is used both in drivers for private use and in the core networking code. The usage models are vastly set of functions that is only softirq safe; while the driver usage tends to be more limited to a few hardirq safe accessor functions. Rather than annotating all 133+ driver usages, for now just split this lock into a per queue class. This change is obviously safe and probably should make 2.6.18. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-02[NET]: Add netdev_alloc_skb().Christoph Hellwig
Add a dev_alloc_skb variant that takes a struct net_device * paramater. For now that paramater is unused, but I'll use it to allocate the skb from node-local memory in a follow-up patch. Also there have been some other plans mentioned on the list that can use it. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-02[NETFILTER]: include/linux/netfilter_bridge.h: header cleanupAlexey Dobriyan
Header doesn't use anything from atomic.h. It fixes headers_check warning: include/linux/netfilter_bridge.h requires asm/atomic.h, which does not exist Compile tested on alpha arm i386-up sparc sparc64-up x86_64 alpha-up i386 sparc64 sparc-up x86_64-up Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-02Merge master.kernel.org:/pub/scm/linux/kernel/git/mchehab/v4l-dvbLinus Torvalds
* master.kernel.org:/pub/scm/linux/kernel/git/mchehab/v4l-dvb: (26 commits) V4L/DVB (4380): Bttv: Revert VBI_OFFSET to previous value, it works better V4L/DVB (4379): Videodev: Check return value of class_device_register() correctly V4L/DVB (4373): Correctly handle sysfs error leg file removal in pvrusb2 V4L/DVB (4368): Bttv: use class_device_create_file and handle errors V4L/DVB (4367): Videodev: Handle class_device related errors V4L/DVB (4365): OVERLAY flag were enabled by mistake V4L/DVB (4344): Fix broken dependencies on media Kconfig V4L/DVB (4343): Fix for compilation without V4L1 or V4L1_COMPAT V4L/DVB (4342): Fix ext_controls align on 64 bit architectures V4L/DVB (4341): VIDIOCSMICROCODE were missing on compat_ioctl32 V4L/DVB (4322): Fix dvb-pll autoprobing V4L/DVB (4311): Fix possible dvb-pll oops V4L/DVB (4337): Refine dead code elimination in pvrusb2 V4L/DVB (4323): [budget/budget-av/budget-ci/budget-patch drivers] fixed DMA start/stop code V4L/DVB (4316): Check __must_check warnings V4L/DVB (4314): Set the Auxiliary Byte when tuning LG H06xF in analog mode V4L/DVB (4313): Bugfix for keycode calculation on NPG remotes V4L/DVB (4310): Saa7134: rename dmasound_{init, exit} V4L/DVB (4306): Support non interlaced capture by default for saa713x V4L/DVB (4298): Check all __must_check warnings in bttv. ...
2006-07-31[PATCH] powermac: More powermac backlight fixesMichael Hanselmann
This patch fixes several problems: - The legacy backlight value might be set at interrupt time. Introduced a worker to prevent it from directly calling the backlight code. - via-pmu allows the backlight to be grabbed, in which case we need to prevent other kernel code from changing the brightness. - Don't send PMU requests in via-pmu-backlight when the machine is about to sleep or waking up. - More Kconfig fixes. Signed-off-by: Michael Hanselmann <linux-kernel@hansmi.ch> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: "Antonino A. Daplas" <adaplas@pol.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-31[PATCH] fbdev: statically link the framebuffer notification functionsAntonino A. Daplas
The backlight and lcd subsystems can be notified by the framebuffer layer of blanking events. However, these subsystems, as a whole, can function independently from the framebuffer layer. But in order to enable to the lcd and backlight subsystems, the framebuffer has to be compiled also, effectively sucking in a huge amount of unneeded code. To prevent dependency problems, separate out the framebuffer notification mechanism from the framebuffer layer and permanently link it to the kernel. Signed-off-by: Antonino Daplas <adaplas@pol.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-31[PATCH] Add parentheses around arguments in the SH_DIV macro.Uwe Zeisberger
There is currently no affected user in the tree, but usage is less surprising that way. Signed-off-by: Uwe Zeisberger <Uwe_Zeisberger@digi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-31[PATCH] always define IRQ_PER_CPUYoichi Yuasa
Reduce the likelihood of someone accidentally introducing namespace collisions. Signed-off-by: Yoichi Yuasa <yoichi_yuasa@tripeaks.co.jp> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-31[PATCH] delay accounting: temporarily enable by defaultShailabh Nagar
Enable delay accounting by default so that feature gets coverage testing without requiring special measures. Earlier, it was off by default and had to be enabled via a boot time param. This patch reverses the default behaviour to improve coverage testing. It can be removed late in the kernel development cycle if its believed users shouldn't have to incur any cost if they don't want delay accounting. Or it can be retained forever if the utility of the stats is deemed common enough to warrant keeping the feature on. Signed-off-by: Shailabh Nagar <nagar@watson.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-31[PATCH] make taskstats sending completely independent of delay accounting ↵Shailabh Nagar
on/off status Complete the separation of delay accounting and taskstats by ignoring the return value of delay accounting functions that fill in parts of taskstats before it is sent out (either in response to a command or as part of a task exit). Also make delayacct_add_tsk return silently when delay accounting is turned off rather than treat it as an error. Signed-off-by: Shailabh Nagar <nagar@watson.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-31[PATCH] genirq: {en,dis}able_irq_wake() need refcounting tooDavid Brownell
IRQs need refcounting and a state flag to track whether the the IRQ should be enabled or disabled as a "normal IRQ" source after a series of calls to {en,dis}able_irq(). For shared IRQs, the IRQ must be enabled so long as at least one driver needs it active. Likewise, IRQs need the same support to track whether the IRQ should be enabled or disabled as a "wakeup event" source after a series of calls to {en,dis}able_irq_wake(). For shared IRQs, the IRQ must be enabled as a wakeup source during sleep so long as at least one driver needs it. But right now they _don't have_ that refcounting ... which means sharing a wakeup-capable IRQ can't work correctly in some configurations. This patch adds the refcount and flag mechanisms to set_irq_wake() -- which is what {en,dis}able_irq_wake() call -- and minimal documentation of what the irq wake mechanism does. Drivers relying on the older (broken) "toggle" semantics will trigger a warning; that'll be a handful of drivers on ARM systems. Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Acked-by: Ingo Molnar <mingo@elte.hu> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Russell King <rmk@arm.linux.org.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-31[PATCH] Process Events: Fix biarch compatibility issue. use __u64 timestampChandra Seetharaman
Events sent by Process Events Connector from a 64-bit kernel are not binary compatible with a 32-bit userspace program because the "timestamp" field (struct timespec) is not arch independent. This affects the fields that follow "timestamp" as they will be be off by 8 bytes. This is a problem for 32-bit userspace programs running with 64-bit kernels on ppc64, s390, x86-64.. any "biarch" system. Matt had submitted a different solution to lkml as an RFC earlier. We have since switched to a solution recommended by Evgeniy Polyakov. This patch fixes the problem by changing the timestamp to be a __u64, which stores the number of nanoseconds. Tested on a x86_64 system with both 32 bit application and 64 bit application and on a i386 system. Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com> Signed-off-by: Matt Helsley <matthltc@us.ibm.com> Cc: Evgeniy Polyakov <johnpol@2ka.mipt.ru> Cc: Guillaume Thouvenin <guillaume.thouvenin@bull.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-31[PATCH] ext3: avoid triggering ext3_error on bad NFS file handleNeil Brown
The inode number out of an NFS file handle gets passed eventually to ext3_get_inode_block() without any checking. If ext3_get_inode_block() allows it to trigger an error, then bad filehandles can have unpleasant effect - ext3_error() will usually cause a forced read-only remount, or a panic if `errors=panic' was used. So remove the call to ext3_error there and put a matching check in ext3/namei.c where inode numbers are read off storage. [akpm@osdl.org: fix off-by-one error] Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Jan Kara <jack@suse.cz> Cc: Marcel Holtmann <marcel@holtmann.org> Cc: <stable@kernel.org> Cc: "Stephen C. Tweedie" <sct@redhat.com> Cc: Eric Sandeen <esandeen@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-29V4L/DVB (4343): Fix for compilation without V4L1 or V4L1_COMPATMauro Carvalho Chehab
Removed usage of HAVE_V4L1 Including videodev.h will just include videodev2.h if V4L1 is not supported V4L1 code at core drivers will honor CONFIG_V4L1_COMPAT stuff Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2006-07-29V4L/DVB (4342): Fix ext_controls align on 64 bit architecturesMauro Carvalho Chehab
u64 is aligned as 128bits on x86_64 architetures, requiring an special handling to ioctls that depends on v4l2_ext_control. Let's fix this before ext controls go to kernel mainstream to avoid one more compat32 stuff. Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2006-07-29Merge branch 'master' into upstream-fixesJeff Garzik
2006-07-28[PATCH] pi-futex: robust-futex exitIngo Molnar
Fix robust PI-futexes to be properly unlocked on unexpected exit. For this to work the kernel has to know whether a futex is a PI or a non-PI one, because the semantics are different. Since the space in relevant glibc data structures is extremely scarce, the best solution is to encode the 'PI' information in bit 0 of the robust list pointer. Existing (non-PI) glibc robust futexes have this bit always zero, so the ABI is kept. New glibc with PI-robust-futexes will set this bit. Further fixes from Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Ulrich Drepper <drepper@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-28[PATCH] ide: option to disable cache flushes for buggy drivesJens Axboe
Some drives claim they support cache flushing, but get seriously confused if you try. Add this option to be able to boot with barriers enabled by default. Signed-off-by: Jens Axboe <axboe@suse.de>
2006-07-26Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds
* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: [IPV4/IPV6]: Setting 0 for unused port field in RAW IP recvmsg(). [IPV4] ipmr: ip multicast route bug fix. [TG3]: Update version and reldate [TG3]: Handle tg3_init_rings() failures [TG3]: Add tg3_restart_hw() [IPV4]: Clear the whole IPCB, this clears also IPCB(skb)->flags. [IPV6]: Clean skb cb on IPv6 input. [NETFILTER]: Demote xt_sctp to EXPERIMENTAL [NETFILTER]: bridge netfilter: add deferred output hooks to feature-removal-schedule [NETFILTER]: xt_pkttype: fix mismatches on locally generated packets [NETFILTER]: SNMP NAT: fix byteorder confusion [NETFILTER]: conntrack: fix SYSCTL=n compile [NETFILTER]: nf_queue: handle NF_STOP and unknown verdicts in nf_reinject [NETFILTER]: H.323 helper: fix possible NULL-ptr dereference
2006-07-26[PATCH] Reorganize the cpufreq cpu hotplug locking to not be totally bizareArjan van de Ven
The patch below moves the cpu hotplugging higher up in the cpufreq layering; this is needed to avoid recursive taking of the cpu hotplug lock and to otherwise detangle the mess. The new rules are: 1. you must do lock_cpu_hotplug() around the following functions: __cpufreq_driver_target __cpufreq_governor (for CPUFREQ_GOV_LIMITS operation only) __cpufreq_set_policy 2. governer methods (.governer) must NOT take the lock_cpu_hotplug() lock in any way; they are called with the lock taken already 3. if your governer spawns a thread that does things, like calling __cpufreq_driver_target, your thread must honor rule #1. 4. the policy lock and other cpufreq internal locks nest within the lock_cpu_hotplug() lock. I'm not entirely happy about how the __cpufreq_governor rule ended up (conditional locking rule depending on the argument) but basically all callers pass this as a constant so it's not too horrible. The patch also removes the cpufreq_governor() function since during the locking audit it turned out to be entirely unused (so no need to fix it) The patch works on my testbox, but it could use more testing (otoh... it can't be much worse than the current code) Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-24[NETFILTER]: bridge netfilter: add deferred output hooks to ↵Patrick McHardy
feature-removal-schedule Add bridge netfilter deferred output hooks to feature-removal-schedule and disable them by default. Until their removal they will be activated by the physdev match when needed. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-07-24[NET]: Correct dev_alloc_skb kerneldocChristoph Hellwig
dev_alloc_skb is designated for RX descriptors, not TX. (Some drivers use it for the latter anyway, but that's a different story) Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-07-24[NET]: Remove CONFIG_HAVE_ARCH_DEV_ALLOC_SKBChristoph Hellwig
skbuff.h has an #ifndef CONFIG_HAVE_ARCH_DEV_ALLOC_SKB to allow architectures to reimplement __dev_alloc_skb. It's not set on any architecture and now that we have an architecture-overrideable NET_SKB_PAD there is not point at all to have one either. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-07-24Merge branch 'master' into upstream-fixesJeff Garzik
2006-07-23cpu hotplug: simplify and hopefully fix lockingLinus Torvalds
The CPU hotplug locking was quite messy, with a recursive lock to handle the fact that both the actual up/down sequence wanted to protect itself from being re-entered, but the callbacks that it called also tended to want to protect themselves from CPU events. This splits the lock into two (one to serialize the whole hotplug sequence, the other to protect against the CPU present bitmaps changing). The latter still allows recursive usage because some subsystems (ondemand policy for cpufreq at least) had already gotten too used to the lax locking, but the locking mistakes are hopefully now less fundamental, and we now warn about recursive lock usage when we see it, in the hope that it can be fixed. Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-21[NET]: Fix reversed error test in netif_tx_trylockHerbert Xu
A non-zero return value indicates success from spin_trylock, not error. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-07-19[PATCH] libata: improve EH action and EHI flag handlingTejun Heo
Update ata_eh_about_to_do() and ata_eh_done() to improve EH action and EHI flag handling. * There are two types of EHI flags - one which expires on successful EH and the other which expires on a successful reset. Make this distinction clear. * Unlike other EH actions, reset actions are represented by two EH action masks and a EHI modifier. Implement correct about_to_do/done semantics for resets. That is, prior to reset, related EH info is sucked in from ehi and cleared, and after reset is complete, related EH info in ehc is cleared. These changes improve consistency and remove unnecessary EH actions caused by stale EH action masks and EHI flags. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
2006-07-14Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds
* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: [VLAN]: __vlan_hwaccel_rx can use the faster ether_compare_addr [PKT_SCHED] HTB: initialize upper bound properly [IPV4]: Clear skb cb on IP input [NET]: Update frag_list in pskb_trim
2006-07-14[PATCH] per-task delay accounting taskstats interface: control exit data ↵Shailabh Nagar
through cpumasks On systems with a large number of cpus, with even a modest rate of tasks exiting per cpu, the volume of taskstats data sent on thread exit can overflow a userspace listener's buffers. One approach to avoiding overflow is to allow listeners to get data for a limited and specific set of cpus. By scaling the number of listeners and/or the cpus they monitor, userspace can handle the statistical data overload more gracefully. In this patch, each listener registers to listen to a specific set of cpus by specifying a cpumask. The interest is recorded per-cpu. When a task exits on a cpu, its taskstats data is unicast to each listener interested in that cpu. Thanks to Andrew Morton for pointing out the various scalability and general concerns of previous attempts and for suggesting this design. [akpm@osdl.org: build fix] Signed-off-by: Shailabh Nagar <nagar@watson.ibm.com> Signed-off-by: Balbir Singh <balbir@in.ibm.com> Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-14[PATCH] per-task delay accounting: avoid send without listenersShailabh Nagar
Don't send taskstats (per-pid or per-tgid) on thread exit when no one is listening for such data. Currently the taskstats interface allocates a structure, fills it in and calls netlink to send out per-pid and per-tgid stats regardless of whether a userspace listener for the data exists (netlink layer would check for that and avoid the multicast). As a result of this patch, the check for the no-listener case is performed early, avoiding the redundant allocation and filling up of the taskstats structures. Signed-off-by: Balbir Singh <balbir@in.ibm.com> Signed-off-by: Shailabh Nagar <nagar@watson.ibm.com> Cc: Jay Lan <jlan@engr.sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-14[PATCH] delay accounting taskstats interface send tgid onceShailabh Nagar
Send per-tgid data only once during exit of a thread group instead of once with each member thread exit. Currently, when a thread exits, besides its per-tid data, the per-tgid data of its thread group is also sent out, if its thread group is non-empty. The per-tgid data sent consists of the sum of per-tid stats for all *remaining* threads of the thread group. This patch modifies this sending in two ways: - the per-tgid data is sent only when the last thread of a thread group exits. This cuts down heavily on the overhead of sending/receiving per-tgid data, especially when other exploiters of the taskstats interface aren't interested in per-tgid stats - the semantics of the per-tgid data sent are changed. Instead of being the sum of per-tid data for remaining threads, the value now sent is the true total accumalated statistics for all threads that are/were part of the thread group. The patch also addresses a minor issue where failure of one accounting subsystem to fill in the taskstats structure was causing the send of taskstats to not be sent at all. The patch has been tested for stability and run cerberus for over 4 hours on an SMP. [akpm@osdl.org: bugfixes] Signed-off-by: Shailabh Nagar <nagar@watson.ibm.com> Signed-off-by: Balbir Singh <balbir@in.ibm.com> Cc: Jay Lan <jlan@engr.sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-14[PATCH] per-task-delay-accounting: /proc export of aggregated block I/O delaysShailabh Nagar
Export I/O delays seen by a task through /proc/<tgid>/stats for use in top etc. Note that delays for I/O done for swapping in pages (swapin I/O) is clubbed together with all other I/O here (this is not the case in the netlink interface where the swapin I/O is kept distinct) [akpm@osdl.org: printk warning fix] Signed-off-by: Shailabh Nagar <nagar@watson.ibm.com> Signed-off-by: Balbir Singh <balbir@in.ibm.com> Cc: Jes Sorensen <jes@sgi.com> Cc: Peter Chubb <peterc@gelato.unsw.edu.au> Cc: Erich Focht <efocht@ess.nec.de> Cc: Levent Serinol <lserinol@gmail.com> Cc: Jay Lan <jlan@engr.sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-14[PATCH] per-task-delay-accounting: delay accounting usage of taskstats interfaceShailabh Nagar
Usage of taskstats interface by delay accounting. Signed-off-by: Shailabh Nagar <nagar@us.ibm.com> Signed-off-by: Balbir Singh <balbir@in.ibm.com> Cc: Jes Sorensen <jes@sgi.com> Cc: Peter Chubb <peterc@gelato.unsw.edu.au> Cc: Erich Focht <efocht@ess.nec.de> Cc: Levent Serinol <lserinol@gmail.com> Cc: Jay Lan <jlan@engr.sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-14[PATCH] per-task-delay-accounting: taskstats interfaceShailabh Nagar
Create a "taskstats" interface based on generic netlink (NETLINK_GENERIC family), for getting statistics of tasks and thread groups during their lifetime and when they exit. The interface is intended for use by multiple accounting packages though it is being created in the context of delay accounting. This patch creates the interface without populating the fields of the data that is sent to the user in response to a command or upon the exit of a task. Each accounting package interested in using taskstats has to provide an additional patch to add its stats to the common structure. [akpm@osdl.org: cleanups, Kconfig fix] Signed-off-by: Shailabh Nagar <nagar@us.ibm.com> Signed-off-by: Balbir Singh <balbir@in.ibm.com> Cc: Jes Sorensen <jes@sgi.com> Cc: Peter Chubb <peterc@gelato.unsw.edu.au> Cc: Erich Focht <efocht@ess.nec.de> Cc: Levent Serinol <lserinol@gmail.com> Cc: Jay Lan <jlan@engr.sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-14[PATCH] per-task-delay-accounting: cpu delay collection via schedstatsChandra Seetharaman
Make the task-related schedstats functions callable by delay accounting even if schedstats collection isn't turned on. This removes the dependency of delay accounting on schedstats. Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com> Signed-off-by: Shailabh Nagar <nagar@watson.ibm.com> Signed-off-by: Balbir Singh <balbir@in.ibm.com> Cc: Jes Sorensen <jes@sgi.com> Cc: Peter Chubb <peterc@gelato.unsw.edu.au> Cc: Erich Focht <efocht@ess.nec.de> Cc: Levent Serinol <lserinol@gmail.com> Cc: Jay Lan <jlan@engr.sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-14[PATCH] per-task-delay-accounting: sync block I/O and swapin delay collectionShailabh Nagar
Unlike earlier iterations of the delay accounting patches, now delays are only collected for the actual I/O waits rather than try and cover the delays seen in I/O submission paths. Account separately for block I/O delays incurred as a result of swapin page faults whose frequency can be affected by the task/process' rss limit. Hence swapin delays can act as feedback for rss limit changes independent of I/O priority changes. Signed-off-by: Shailabh Nagar <nagar@watson.ibm.com> Signed-off-by: Balbir Singh <balbir@in.ibm.com> Cc: Jes Sorensen <jes@sgi.com> Cc: Peter Chubb <peterc@gelato.unsw.edu.au> Cc: Erich Focht <efocht@ess.nec.de> Cc: Levent Serinol <lserinol@gmail.com> Cc: Jay Lan <jlan@engr.sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>