From 64552a50bc80fecb73617336bf197375868faf6e Mon Sep 17 00:00:00 2001
From: Horms <horms@verge.net.au>
Date: Mon, 10 Jul 2006 04:43:58 -0700
Subject: [PATCH] nfs: Update Documentation/nfsroot.txt to include dhcp,
 syslinux and isolinux

* Document the ip command a little differently to make the
  interaction between defaults and autoconfiguration a little clearer
  (I hope)

* Update autoconfiguration the current set of options, including DHCP

* Update the boot methods to add syslinux and isolinux, and remove
  dd of=/dev/fd0 which is no longer supported by linux

* Add a referance to initramfs along side initrd.
  Should the latter and its document be removed some time soon?

* Various cleanups to put the text consistently into the thrid person

* Reformated a bit to fit into 80 columns a bit more nicely

* Should the bootloaders documentation be removed or split
  into a separate documentation, it seems somewhat out of scope

Signed-off-by: Horms <horms@verge.net.au>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
---
 Documentation/nfsroot.txt | 275 +++++++++++++++++++++++++++-------------------
 1 file changed, 160 insertions(+), 115 deletions(-)

(limited to 'Documentation')

diff --git a/Documentation/nfsroot.txt b/Documentation/nfsroot.txt
index d56dc71d943..3cc953cb288 100644
--- a/Documentation/nfsroot.txt
+++ b/Documentation/nfsroot.txt
@@ -4,15 +4,16 @@ Mounting the root filesystem via NFS (nfsroot)
 Written 1996 by Gero Kuhlmann <gero@gkminix.han.de>
 Updated 1997 by Martin Mares <mj@atrey.karlin.mff.cuni.cz>
 Updated 2006 by Nico Schottelius <nico-kernel-nfsroot@schottelius.org>
+Updated 2006 by Horms <horms@verge.net.au>
 
 
-If you want to use a diskless system, as an X-terminal or printer
-server for example, you have to put your root filesystem onto a
-non-disk device. This can either be a ramdisk (see initrd.txt in
-this directory for further information) or a filesystem mounted
-via NFS. The following text describes on how to use NFS for the
-root filesystem. For the rest of this text 'client' means the
+In order to use a diskless system, such as an X-terminal or printer server
+for example, it is necessary for the root filesystem to be present on a
+non-disk device. This may be an initramfs (see Documentation/filesystems/
+ramfs-rootfs-initramfs.txt), a ramdisk (see Documenation/initrd.txt) or a
+filesystem mounted via NFS. The following text describes on how to use NFS
+for the root filesystem. For the rest of this text 'client' means the
 diskless system, and 'server' means the NFS server.
 
 
@@ -21,11 +22,13 @@ diskless system, and 'server' means the NFS server.
 1.) Enabling nfsroot capabilities
     -----------------------------
 
-In order to use nfsroot you have to select support for NFS during
-kernel configuration. Note that NFS cannot be loaded as a module
-in this case. The configuration script will then ask you whether
-you want to use nfsroot, and if yes what kind of auto configuration
-system you want to use. Selecting both BOOTP and RARP is safe.
+In order to use nfsroot, NFS client support needs to be selected as
+built-in during configuration. Once this has been selected, the nfsroot
+option will become available, which should also be selected.
+
+In the networking options, kernel level autoconfiguration can be selected,
+along with the types of autoconfiguration to support. Selecting all of
+DHCP, BOOTP and RARP is safe.
 
 
@@ -33,11 +36,10 @@ system you want to use. Selecting both BOOTP and RARP is safe.
 2.) Kernel command line
     -------------------
 
-When the kernel has been loaded by a boot loader (either by loadlin,
-LILO or a network boot program) it has to be told what root fs device
-to use, and where to find the server and the name of the directory
-on the server to mount as root. This can be established by a couple
-of kernel command line parameters:
+When the kernel has been loaded by a boot loader (see below) it needs to be
+told what root fs device to use. And in the case of nfsroot, where to find
+both the server and the name of the directory on the server to mount as root.
+This can be established using the following kernel command line parameters:
 
 
 root=/dev/nfs
@@ -49,23 +51,21 @@ root=/dev/nfs
 
 nfsroot=[<server-ip>:]<root-dir>[,<nfs-options>]
 
-  If the `nfsroot' parameter is NOT given on the command line, the default
-  "/tftpboot/%s" will be used.
+  If the `nfsroot' parameter is NOT given on the command line,
+  the default "/tftpboot/%s" will be used.
 
-  <server-ip>	Specifies the IP address of the NFS server. If this field
-		is not given, the default address as determined by the
-		`ip' variable (see below) is used. One use of this
-		parameter is for example to allow using different servers
-		for RARP and NFS. Usually you can leave this blank.
+  <server-ip>	Specifies the IP address of the NFS server.
+		The default address is determined by the `ip' parameter
+		(see below). This parameter allows the use of different
+		servers for IP autoconfiguration and NFS.
 
-  <root-dir>	Name of the directory on the server to mount as root. If
-		there is a "%s" token in the string, the token will be
-		replaced by the ASCII-representation of the client's IP
-		address.
+  <root-dir>	Name of the directory on the server to mount as root.
+		If there is a "%s" token in the string, it will be
+		replaced by the ASCII-representation of the client's
+		IP address.
 
   <nfs-options>	Standard NFS options. All options are separated by commas.
-		If the options field is not given, the following defaults
-		will be used:
+		The following defaults are used:
 			port		= as given by server portmap daemon
 			rsize		= 1024
 			wsize		= 1024
@@ -81,129 +81,174 @@ nfsroot=[<server-ip>:]<root-dir>[,<nfs-options>]
 ip=<client-ip>:<server-ip>:<gw-ip>:<netmask>:<hostname>:<device>:<autoconf>
 
   This parameter tells the kernel how to configure IP addresses of devices
-  and also how to set up the IP routing table. It was originally called `nfsaddrs',
-  but now the boot-time IP configuration works independently of NFS, so it
-  was renamed to `ip' and the old name remained as an alias for compatibility
-  reasons.
+  and also how to set up the IP routing table. It was originally called
+  `nfsaddrs', but now the boot-time IP configuration works independently of
+  NFS, so it was renamed to `ip' and the old name remained as an alias for
+  compatibility reasons.
 
   If this parameter is missing from the kernel command line, all fields are
   assumed to be empty, and the defaults mentioned below apply. In general
-  this means that the kernel tries to configure everything using both
-  RARP and BOOTP (depending on what has been enabled during kernel confi-
-  guration, and if both what protocol answer got in first).
+  this means that the kernel tries to configure everything using
+  autoconfiguration.
+
+  The <autoconf> parameter can appear alone as the value to the `ip'
+  parameter (without all the ':' characters before) in which case auto-
+  configuration is used.
+
+  <client-ip>	IP address of the client.
 
-  <client-ip>	IP address of the client. If empty, the address will either
-		be determined by RARP or BOOTP. What protocol is used de-
-		pends on what has been enabled during kernel configuration
-		and on the <autoconf> parameter. If this parameter is not
-		empty, neither RARP nor BOOTP will be used.
+  		Default:  Determined using autoconfiguration.
 
   <server-ip>	IP address of the NFS server. If RARP is used to determine
 		the client address and this parameter is NOT empty only
-		replies from the specified server are accepted. To use
-		different RARP and NFS server, specify your RARP server
-		here (or leave it blank), and specify your NFS server in
-		the `nfsroot' parameter (see above). If this entry is blank
-		the address of the server is used which answered the RARP
-		or BOOTP request.
-
-  <gw-ip>	IP address of a gateway if the server is on a different
-		subnet. If this entry is empty no gateway is used and the
-		server is assumed to be on the local network, unless a
-		value has been received by BOOTP.
-
-  <netmask>	Netmask for local network interface. If this is empty,
+		replies from the specified server are accepted.
+
+		Only required for for NFS root. That is autoconfiguration
+		will not be triggered if it is missing and NFS root is not
+		in operation.
+
+		Default: Determined using autoconfiguration.
+		         The address of the autoconfiguration server is used.
+
+  <gw-ip>	IP address of a gateway if the server is on a different subnet.
+
+		Default: Determined using autoconfiguration.
+
+  <netmask>	Netmask for local network interface. If unspecified
 		the netmask is derived from the client IP address assuming
-		classful addressing, unless overridden in BOOTP reply.
+		classful addressing.
 
-  <hostname>	Name of the client. If empty, the client IP address is
-		used in ASCII notation, or the value received by BOOTP.
+		Default:  Determined using autoconfiguration.
 
-  <device>	Name of network device to use. If this is empty, all
-		devices are used for RARP and BOOTP requests, and the
-		first one we receive a reply on is configured. If you have
-		only one device, you can safely leave this blank.
+  <hostname>	Name of the client. May be supplied by autoconfiguration,
+  		but its absence will not trigger autoconfiguration.
 
-  <autoconf>	Method to use for autoconfiguration. If this is either
-		'rarp' or 'bootp', the specified protocol is used.
-		If the value is 'both' or empty, both protocols are used
-		so far as they have been enabled during kernel configura-
-		tion. 'off' means no autoconfiguration.
+  		Default: Client IP address is used in ASCII notation.
 
-  The <autoconf> parameter can appear alone as the value to the `ip'
-  parameter (without all the ':' characters before) in which case auto-
-  configuration is used.
+  <device>	Name of network device to use.
+
+		Default: If the host only has one device, it is used.
+			 Otherwise the device is determined using
+			 autoconfiguration. This is done by sending
+			 autoconfiguration requests out of all devices,
+			 and using the device that received the first reply.
 
+  <autoconf>	Method to use for autoconfiguration. In the case of options
+                which specify multiple autoconfiguration protocols,
+		requests are sent using all protocols, and the first one
+		to reply is used.
 
+		Only autoconfiguration protocols that have been compiled
+		into the kernel will be used, regardless of the value of
+		this option.
 
+                  off or none: don't use autoconfiguration (default)
+		  on or any:   use any protocol available in the kernel
+		  dhcp:        use DHCP
+		  bootp:       use BOOTP
+		  rarp:        use RARP
+		  both:        use both BOOTP and RARP but not DHCP
+		               (old option kept for backwards compatibility)
 
-3.) Kernel loader
-    -------------
+                Default: any
 
-To get the kernel into memory different approaches can be used. They
-depend on what facilities are available:
 
 
-3.1)  Writing the kernel onto a floppy using dd:
-	As always you can just write the kernel onto a floppy using dd,
-	but then it's not possible to use kernel command lines at all.
-	To substitute the 'root=' parameter, create a dummy device on any
-	linux system with major number 0 and minor number 255 using mknod:
 
-		mknod /dev/boot255 c 0 255
+3.) Boot Loader
+    ----------
 
-	Then copy the kernel zImage file onto a floppy using dd:
+To get the kernel into memory different approaches can be used.
+They depend on various facilities being available:
 
-		dd if=/usr/src/linux/arch/i386/boot/zImage of=/dev/fd0
 
-	And finally use rdev to set the root device:
+3.1)  Booting from a floppy using syslinux
 
-		rdev /dev/fd0 /dev/boot255
+	When building kernels, an easy way to create a boot floppy that uses
+	syslinux is to use the zdisk or bzdisk make targets which use
+      	and bzimage images respectively. Both targets accept the
+     	FDARGS parameter which can be used to set the kernel command line.
 
-	You can then remove the dummy device /dev/boot255 again. There
-	is no real device available for it.
-	The other two kernel command line parameters cannot be substi-
-	tuted with rdev. Therefore, using this method the kernel will
-	by default use RARP and/or BOOTP, and if it gets an answer via
-	RARP will mount the directory /tftpboot/<client-ip>/ as its
-	root. If it got a BOOTP answer the directory name in that answer
-	is used.
+	e.g.
+	   make bzdisk FDARGS="root=/dev/nfs"
+
+   	Note that the user running this command will need to have
+     	access to the floppy drive device, /dev/fd0
+
+     	For more information on syslinux, including how to create bootdisks
+     	for prebuilt kernels, see http://syslinux.zytor.com/
+
+	N.B: Previously it was possible to write a kernel directly to
+	     a floppy using dd, configure the boot device using rdev, and
+	     boot using the resulting floppy. Linux no longer supports this
+	     method of booting.
+
+3.2) Booting from a cdrom using isolinux
+
+     	When building kernels, an easy way to create a bootable cdrom that
+     	uses isolinux is to use the isoimage target which uses a bzimage
+     	image. Like zdisk and bzdisk, this target accepts the FDARGS
+     	parameter which can be used to set the kernel command line.
+
+	e.g.
+	  make isoimage FDARGS="root=/dev/nfs"
+
+     	The resulting iso image will be arch/<ARCH>/boot/image.iso
+     	This can be written to a cdrom using a variety of tools including
+     	cdrecord.
+
+	e.g.
+	  cdrecord dev=ATAPI:1,0,0 arch/i386/boot/image.iso
+
+     	For more information on isolinux, including how to create bootdisks
+     	for prebuilt kernels, see http://syslinux.zytor.com/
 
 3.2) Using LILO
-	When using LILO you can specify all necessary command line
-	parameters with the 'append=' command in the LILO configuration
-	file. However, to use the 'root=' command you also need to
-	set up a dummy device as described in 3.1 above. For how to use
-	LILO and its 'append=' command please refer to the LILO
-	documentation.
+	When using LILO all the necessary command line parameters may be
+	specified using the 'append=' directive in the LILO configuration
+	file.
+
+	However, to use the 'root=' directive you also need to create
+	a dummy root device, which may be removed after LILO is run.
+
+	mknod /dev/boot255 c 0 255
+
+	For information on configuring LILO, please refer to its documentation.
 
 3.3) Using GRUB
-	When you use GRUB, you simply append the parameters after the kernel
-	specification: "kernel <kernel> <parameters>" (without the quotes).
+	When using GRUB, kernel parameter are simply appended after the kernel
+	specification: kernel <kernel> <parameters>
 
 3.4) Using loadlin
-	When you want to boot Linux from a DOS command prompt without
-	having a local hard disk to mount as root, you can use loadlin.
-	I was told that it works, but haven't used it myself yet. In
-	general you should be able to create a kernel command line simi-
-	lar to how LILO is doing it. Please refer to the loadlin docu-
-	mentation for further information.
+	loadlin may be used to boot Linux from a DOS command prompt without
+	requiring a local hard disk to mount as root. This has not been
+	thoroughly tested by the authors of this document, but in general
+	it should be possible configure the kernel command line similarly
+	to the configuration of LILO.
+
+	Please refer to the loadlin documentation for further information.
 
 3.5) Using a boot ROM
-	This is probably the most elegant way of booting a diskless
-	client. With a boot ROM the kernel gets loaded using the TFTP
-	protocol. As far as I know, no commercial boot ROMs yet
-	support booting Linux over the network, but there are two
-	free implementations of a boot ROM available on sunsite.unc.edu
-	and its mirrors. They are called 'netboot-nfs' and 'etherboot'.
-	Both contain everything you need to boot a diskless Linux client.
+	This is probably the most elegant way of booting a diskless client.
+	With a boot ROM the kernel is loaded using the TFTP protocol. The
+	authors of this document are not aware of any no commercial boot
+	ROMs that support booting Linux over the network. However, there
+	are two free implementations of a boot ROM, netboot-nfs and
+	etherboot, both of which are available on sunsite.unc.edu, and both
+	of which contain everything you need to boot a diskless Linux client.
 
 3.6) Using pxelinux
-	Using pxelinux you specify the kernel you built with
+	Pxelinux may be used to boot linux using the PXE boot loader
+	which is present on many modern network cards.
+
+	When using pxelinux, the kernel image is specified using
 	"kernel <relative-path-below /tftpboot>". The nfsroot parameters
 	are passed to the kernel by adding them to the "append" line.
-	You may perhaps also want to fine tune the console output,
-	see Documentation/serial-console.txt for serial console help.
+	It is common to use serial console in conjunction with pxeliunx,
+	see Documentation/serial-console.txt for more information.
+
+	For more information on isolinux, including how to create bootdisks
+	for prebuilt kernels, see http://syslinux.zytor.com/
 
 
-- 
cgit v1.2.3


From 82a854ec4f46c5fbef11b06bb49078ecc5784a2d Mon Sep 17 00:00:00 2001
From: Urs Thuermann <urs@isnogud.escape.de>
Date: Mon, 10 Jul 2006 04:44:06 -0700
Subject: [PATCH] RCU Documentation fix

Updater should use _rcu variant of list_del().

Signed-off-by: Urs Thuermann <urs@isnogud.escape.de>
Acked-by: "Paul E. McKenney" <paulmck@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
---
 Documentation/RCU/whatisRCU.txt | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

(limited to 'Documentation')

diff --git a/Documentation/RCU/whatisRCU.txt b/Documentation/RCU/whatisRCU.txt
index 4f41a60e511..318df44259b 100644
--- a/Documentation/RCU/whatisRCU.txt
+++ b/Documentation/RCU/whatisRCU.txt
@@ -687,8 +687,9 @@ diff shows how closely related RCU and reader-writer locking can be.
 	+	spin_lock(&listmutex);
 		list_for_each_entry(p, head, lp) {
 			if (p->key == key) {
-				list_del(&p->list);
+	-			list_del(&p->list);
 	-			write_unlock(&listmutex);
+	+			list_del_rcu(&p->list);
 	+			spin_unlock(&listmutex);
 	+			synchronize_rcu();
 				kfree(p);
@@ -736,7 +737,7 @@ Or, for those who prefer a side-by-side listing:
  5   write_lock(&listmutex);            5   spin_lock(&listmutex);
  6   list_for_each_entry(p, head, lp) { 6   list_for_each_entry(p, head, lp) {
  7     if (p->key == key) {             7     if (p->key == key) {
- 8       list_del(&p->list);            8       list_del(&p->list);
+ 8       list_del(&p->list);            8       list_del_rcu(&p->list);
  9       write_unlock(&listmutex);      9       spin_unlock(&listmutex);
                                        10       synchronize_rcu();
 10       kfree(p);                     11       kfree(p);
-- 
cgit v1.2.3


From 5d8b2ebfa298ec4e6d9fa43e60fb013e8cd963aa Mon Sep 17 00:00:00 2001
From: Jonathan Corbet <corbet@lwn.net>
Date: Mon, 10 Jul 2006 04:44:07 -0700
Subject: [PATCH] VFS documentation tweak

As I was looking over the get_sb() changes, I stumbled across a little
mistake in the documentation updates.  Unless we're getting into an
interesting new object-oriented realm, I doubt that get_sb() should really
return "struct int"...

Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
---
 Documentation/filesystems/Locking | 4 ++--
 Documentation/filesystems/vfs.txt | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

(limited to 'Documentation')

diff --git a/Documentation/filesystems/Locking b/Documentation/filesystems/Locking
index d31efbbdfe5..247d7f619aa 100644
--- a/Documentation/filesystems/Locking
+++ b/Documentation/filesystems/Locking
@@ -142,8 +142,8 @@ see also dquot_operations section.
 
 --------------------------- file_system_type ---------------------------
 prototypes:
-	struct int (*get_sb) (struct file_system_type *, int,
-			const char *, void *, struct vfsmount *);
+	int (*get_sb) (struct file_system_type *, int,
+		       const char *, void *, struct vfsmount *);
 	void (*kill_sb) (struct super_block *);
 locking rules:
 		may block	BKL
diff --git a/Documentation/filesystems/vfs.txt b/Documentation/filesystems/vfs.txt
index 9d3aed628bc..1cb7e8be927 100644
--- a/Documentation/filesystems/vfs.txt
+++ b/Documentation/filesystems/vfs.txt
@@ -113,8 +113,8 @@ members are defined:
 struct file_system_type {
 	const char *name;
 	int fs_flags;
-        struct int (*get_sb) (struct file_system_type *, int,
-                              const char *, void *, struct vfsmount *);
+        int (*get_sb) (struct file_system_type *, int,
+                       const char *, void *, struct vfsmount *);
         void (*kill_sb) (struct super_block *);
         struct module *owner;
         struct file_system_type * next;
-- 
cgit v1.2.3


From 49c0dab7e6000888b616bedcbbc8cd4710331610 Mon Sep 17 00:00:00 2001
From: Doug Thompson <norsk5@xmission.com>
Date: Mon, 10 Jul 2006 04:45:19 -0700
Subject: [PATCH] Fix and enable EDAC sysfs operation

When EDAC was first introduced into the kernel it had a sysfs interface,
but due to some problems it was disabled in 2.6.16 and remained disabled in
2.6.17.

With feedback, several of the control and attribute files of that interface
had some good constructive feedback.  PCI Blacklist/Whitelist was a major
set which has design issues and it has been removed in this patch.  Instead
of storing PCI broken parity status in EDAC, it has been moved to the
pci_dev structure itself by a previous PCI patch.  A future patch will
enable that feature in EDAC by utilizing the pci_dev info.

The sysfs is now enabled in this patch, with a minimal set of control and
attribute files for examining EDAC state and for enabling/disabling the
memory and PCI operations.

The Documentation for EDAC has also been updated to reflect the new state
of EDAC operation.

Signed-off-by:Doug Thompson <norsk5@xmisson.com>
Cc: Greg KH <greg@kroah.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
---
 Documentation/drivers/edac/edac.txt | 152 +++++++-----------------------------
 1 file changed, 27 insertions(+), 125 deletions(-)

(limited to 'Documentation')

diff --git a/Documentation/drivers/edac/edac.txt b/Documentation/drivers/edac/edac.txt
index 70d96a62e5e..7b3d969d296 100644
--- a/Documentation/drivers/edac/edac.txt
+++ b/Documentation/drivers/edac/edac.txt
@@ -35,15 +35,14 @@ the vendor should tie the parity status bits to 0 if they do not intend
 to generate parity.  Some vendors do not do this, and thus the parity bit
 can "float" giving false positives.
 
-The PCI Parity EDAC device has the ability to "skip" known flaky
-cards during the parity scan. These are set by the parity "blacklist"
-interface in the sysfs for PCI Parity. (See the PCI section in the sysfs
-section below.) There is also a parity "whitelist" which is used as
-an explicit list of devices to scan, while the blacklist is a list
-of devices to skip.
+[There are patches in the kernel queue which will allow for storage of
+quirks of PCI devices reporting false parity positives. The 2.6.18
+kernel should have those patches included. When that becomes available,
+then EDAC will be patched to utilize that information to "skip" such
+devices.]
 
-EDAC will have future error detectors that will be added or integrated
-into EDAC in the following list:
+EDAC will have future error detectors that will be integrated with
+EDAC or added to it, in the following list:
 
 	MCE	Machine Check Exception
 	MCA	Machine Check Architecture
@@ -93,22 +92,24 @@ EDAC lives in the /sys/devices/system/edac directory. Within this directory
 there currently reside 2 'edac' components:
 
 	mc	memory controller(s) system
-	pci	PCI status system
+	pci	PCI control and status system
 
 
 ============================================================================
 Memory Controller (mc) Model
 
 First a background on the memory controller's model abstracted in EDAC.
-Each mc device controls a set of DIMM memory modules. These modules are
+Each 'mc' device controls a set of DIMM memory modules. These modules are
 laid out in a Chip-Select Row (csrowX) and Channel table (chX). There can
-be multiple csrows and two channels.
+be multiple csrows and multiple channels.
 
 Memory controllers allow for several csrows, with 8 csrows being a typical value.
 Yet, the actual number of csrows depends on the electrical "loading"
 of a given motherboard, memory controller and DIMM characteristics.
 
 Dual channels allows for 128 bit data transfers to the CPU from memory.
+Some newer chipsets allow for more than 2 channels, like Fully Buffered DIMMs
+(FB-DIMMs). The following example will assume 2 channels:
 
 
 		Channel 0	Channel 1
@@ -234,23 +235,15 @@ Polling period control file:
 	The time period, in milliseconds, for polling for error information.
 	Too small a value wastes resources.  Too large a value might delay
 	necessary handling of errors and might loose valuable information for
-	locating the error.  1000 milliseconds (once each second) is about
-	right for most uses.
+	locating the error.  1000 milliseconds (once each second) is the current
+	default. Systems which require all the bandwidth they can get, may
+	increase this.
 
 	LOAD TIME: module/kernel parameter: poll_msec=[0|1]
 
 	RUN TIME: echo "1000" >/sys/devices/system/edac/mc/poll_msec
 
 
-Module Version read-only attribute file:
-
-	'mc_version'
-
-	The EDAC CORE module's version and compile date are shown here to
-	indicate what EDAC is running.
-
-
-
 ============================================================================
 'mcX' DIRECTORIES
 
@@ -284,35 +277,6 @@ Seconds since last counter reset control file:
 
 
-DIMM capability attribute file:
-
-	'edac_capability'
-
-	The EDAC (Error Detection and Correction) capabilities/modes of
-	the memory controller hardware.
-
-
-DIMM Current Capability attribute file:
-
-	'edac_current_capability'
-
-	The EDAC capabilities available with the hardware
-	configuration.  This may not be the same as "EDAC capability"
-	if the correct memory is not used.  If a memory controller is
-	capable of EDAC, but DIMMs without check bits are in use, then
-	Parity, SECDED, S4ECD4ED capabilities will not be available
-	even though the memory controller might be capable of those
-	modes with the proper memory loaded.
-
-
-Memory Type supported on this controller attribute file:
-
-	'supported_mem_type'
-
-	This attribute file displays the memory type, usually
-	buffered and unbuffered DIMMs.
-
-
 Memory Controller name attribute file:
 
 	'mc_name'
@@ -321,16 +285,6 @@ Memory Controller name attribute file:
 	that is being utilized.
 
 
-Memory Controller Module name attribute file:
-
-	'module_name'
-
-	This attribute file displays the memory controller module name,
-	version and date built.  The name of the memory controller
-	hardware - some drivers work with multiple controllers and
-	this field shows which hardware is present.
-
-
 Total memory managed by this memory controller attribute file:
 
 	'size_mb'
@@ -432,6 +386,9 @@ Memory Type attribute file:
 
 	This attribute file will display what type of memory is currently
 	on this csrow. Normally, either buffered or unbuffered memory.
+	Examples:
+		Registered-DDR
+		Unbuffered-DDR
 
 
 EDAC Mode of operation attribute file:
@@ -446,8 +403,13 @@ Device type attribute file:
 
 	'dev_type'
 
-	This attribute file will display what type of DIMM device is
-	being utilized. Example:  x4
+	This attribute file will display what type of DRAM device is
+	being utilized on this DIMM.
+	Examples:
+		x1
+		x2
+		x4
+		x8
 
 
 Channel 0 CE Count attribute file:
@@ -522,10 +484,10 @@ SYSTEM LOGGING
 If logging for UEs and CEs are enabled then system logs will have
 error notices indicating errors that have been detected:
 
-MC0: CE page 0x283, offset 0xce0, grain 8, syndrome 0x6ec3, row 0,
+EDAC MC0: CE page 0x283, offset 0xce0, grain 8, syndrome 0x6ec3, row 0,
 channel 1 "DIMM_B1": amd76x_edac
 
-MC0: CE page 0x1e5, offset 0xfb0, grain 8, syndrome 0xb741, row 0,
+EDAC MC0: CE page 0x1e5, offset 0xfb0, grain 8, syndrome 0xb741, row 0,
 channel 1 "DIMM_B1": amd76x_edac
 
 
@@ -610,64 +572,4 @@ Parity Count:
 
 
-PCI Device Whitelist:
-
-	'pci_parity_whitelist'
-
-	This control file allows for an explicit list of PCI devices to be
-	scanned for parity errors. Only devices found on this list will
-	be examined.  The list is a line of hexadecimal VENDOR and DEVICE
-	ID tuples:
-
-	1022:7450,1434:16a6
-
-	One or more can be inserted, separated by a comma.
-
-	To write the above list doing the following as one command line:
-
-	echo "1022:7450,1434:16a6"
-		> /sys/devices/system/edac/pci/pci_parity_whitelist
-
-
-
-	To display what the whitelist is, simply 'cat' the same file.
-
-
-PCI Device Blacklist:
-
-	'pci_parity_blacklist'
-
-	This control file allows for a list of PCI devices to be
-	skipped for scanning.
-	The list is a line of hexadecimal VENDOR and DEVICE ID tuples:
-
-	1022:7450,1434:16a6
-
-	One or more can be inserted, separated by a comma.
-
-	To write the above list doing the following as one command line:
-
-	echo "1022:7450,1434:16a6"
-		> /sys/devices/system/edac/pci/pci_parity_blacklist
-
-
-	To display what the whitelist currently contains,
-	simply 'cat' the same file.
-
 =======================================================================
-
-PCI Vendor and Devices IDs can be obtained with the lspci command. Using
-the -n option lspci will display the vendor and device IDs. The system
-administrator will have to determine which devices should be scanned or
-skipped.
-
-
-
-The two lists (white and black) are prioritized. blacklist is the lower
-priority and will NOT be utilized when a whitelist has been set.
-Turn OFF a whitelist by an empty echo command:
-
-	echo > /sys/devices/system/edac/pci/pci_parity_whitelist
-
-and any previous blacklist will be utilized.
-
-- 
cgit v1.2.3


From c59923a15c12d2b3597af913bf234a0ef264a38b Mon Sep 17 00:00:00 2001
From: Christoph Hellwig <hch@lst.de>
Date: Mon, 10 Jul 2006 04:45:40 -0700
Subject: [PATCH] remove the tasklist_lock export

As announced half a year ago this patch will remove the tasklist_lock
export.  The previous two patches got rid of the remaining modular users.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
---
 Documentation/feature-removal-schedule.txt | 11 -----------
 1 file changed, 11 deletions(-)

(limited to 'Documentation')

diff --git a/Documentation/feature-removal-schedule.txt b/Documentation/feature-removal-schedule.txt
index 99f219a01e0..ee287988934 100644
--- a/Documentation/feature-removal-schedule.txt
+++ b/Documentation/feature-removal-schedule.txt
@@ -166,17 +166,6 @@ Who:	Arjan van de Ven <arjan@linux.intel.com>
 
 ---------------------------
 
-What:	remove EXPORT_SYMBOL(tasklist_lock)
-When:	August 2006
-Files:	kernel/fork.c
-Why:	tasklist_lock protects the kernel internal task list.  Modules have
-	no business looking at it, and all instances in drivers have been due
-	to use of too-lowlevel APIs.  Having this symbol exported prevents
-	moving to more scalable locking schemes for the task list.
-Who:	Christoph Hellwig <hch@lst.de>
-
----------------------------
-
 What:	mount/umount uevents
 When:	February 2007
 Why:	These events are not correct, and do not properly let userspace know
-- 
cgit v1.2.3


From e54695a59c278b9ff48cd4b263da7a1d392f5061 Mon Sep 17 00:00:00 2001
From: Andrew Morton <akpm@osdl.org>
Date: Mon, 10 Jul 2006 04:45:42 -0700
Subject: [PATCH] checklist update

Update Documentation/SubmitChecklist.

- Mention lockdep coverage

- Describe documentation requirements

- Number the various items to simplify the composition of caustic emails.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
---
 Documentation/SubmitChecklist | 76 +++++++++++++++++++++++--------------------
 1 file changed, 41 insertions(+), 35 deletions(-)

(limited to 'Documentation')

diff --git a/Documentation/SubmitChecklist b/Documentation/SubmitChecklist
index 8230098da52..a10bfb6ecd9 100644
--- a/Documentation/SubmitChecklist
+++ b/Documentation/SubmitChecklist
@@ -1,57 +1,63 @@
 Linux Kernel patch sumbittal checklist
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
-Here are some basic things that developers should do if they
-want to see their kernel patch submittals accepted quicker.
+Here are some basic things that developers should do if they want to see their
+kernel patch submissions accepted more quickly.
 
-These are all above and beyond the documentation that is provided
-in Documentation/SubmittingPatches and elsewhere about submitting
-Linux kernel patches.
+These are all above and beyond the documentation that is provided in
+Documentation/SubmittingPatches and elsewhere regarding submitting Linux
+kernel patches.
 
 
-- Builds cleanly with applicable or modified CONFIG options =y, =m, and =n.
-  No gcc warnings/errors, no linker warnings/errors.
+1: Builds cleanly with applicable or modified CONFIG options =y, =m, and
+   =n.  No gcc warnings/errors, no linker warnings/errors.
 
-- Passes allnoconfig, allmodconfig
+2: Passes allnoconfig, allmodconfig
 
-- Builds on multiple CPU arch-es by using local cross-compile tools
-  or something like PLM at OSDL.
+3: Builds on multiple CPU architectures by using local cross-compile tools
+   or something like PLM at OSDL.
 
-- ppc64 is a good architecture for cross-compilation checking because it
-  tends to use `unsigned long' for 64-bit quantities.
+4: ppc64 is a good architecture for cross-compilation checking because it
+   tends to use `unsigned long' for 64-bit quantities.
 
-- Matches kernel coding style(!)
+5: Matches kernel coding style(!)
 
-- Any new or modified CONFIG options don't muck up the config menu.
+6: Any new or modified CONFIG options don't muck up the config menu.
 
-- All new Kconfig options have help text.
+7: All new Kconfig options have help text.
 
-- Has been carefully reviewed with respect to relevant Kconfig
-  combinations.  This is very hard to get right with testing --
-  brainpower pays off here.
+8: Has been carefully reviewed with respect to relevant Kconfig
+   combinations.  This is very hard to get right with testing -- brainpower
+   pays off here.
 
-- Check cleanly with sparse.
+9: Check cleanly with sparse.
 
-- Use 'make checkstack' and 'make namespacecheck' and fix any
-  problems that they find.  Note:  checkstack does not point out
-  problems explicitly, but any one function that uses more than
-  512 bytes on the stack is a candidate for change.
+10: Use 'make checkstack' and 'make namespacecheck' and fix any problems
+    that they find.  Note: checkstack does not point out problems explicitly,
+    but any one function that uses more than 512 bytes on the stack is a
+    candidate for change.
 
-- Include kernel-doc to document global kernel APIs.  (Not required
-  for static functions, but OK there also.)  Use 'make htmldocs'
-  or 'make mandocs' to check the kernel-doc and fix any issues.
+11: Include kernel-doc to document global kernel APIs.  (Not required for
+    static functions, but OK there also.) Use 'make htmldocs' or 'make
+    mandocs' to check the kernel-doc and fix any issues.
 
-- Has been tested with CONFIG_PREEMPT, CONFIG_DEBUG_PREEMPT,
-  CONFIG_DEBUG_SLAB, CONFIG_DEBUG_PAGEALLOC, CONFIG_DEBUG_MUTEXES,
-  CONFIG_DEBUG_SPINLOCK, CONFIG_DEBUG_SPINLOCK_SLEEP all simultaneously
-  enabled.
+12: Has been tested with CONFIG_PREEMPT, CONFIG_DEBUG_PREEMPT,
+    CONFIG_DEBUG_SLAB, CONFIG_DEBUG_PAGEALLOC, CONFIG_DEBUG_MUTEXES,
+    CONFIG_DEBUG_SPINLOCK, CONFIG_DEBUG_SPINLOCK_SLEEP all simultaneously
+    enabled.
 
-- Has been build- and runtime tested with and without CONFIG_SMP and
-  CONFIG_PREEMPT.
+13: Has been build- and runtime tested with and without CONFIG_SMP and
+    CONFIG_PREEMPT.
 
-- If the patch affects IO/Disk, etc: has been tested with and without
-  CONFIG_LBD.
+14: If the patch affects IO/Disk, etc: has been tested with and without
+    CONFIG_LBD.
 
+15: All codepaths have been exercised with all lockdep features enabled.
 
-2006-APR-27
+16: All new /proc entries are documented under Documentation/
+
+17: All new kernel boot parameters are documented in
+    Documentation/kernel-parameters.txt.
+
+18: All new module parameters are documented with MODULE_PARM_DESC()
-- 
cgit v1.2.3


From f40b68903ccd511ea9d658b4bce319dd032a265a Mon Sep 17 00:00:00 2001
From: Takashi Iwai <tiwai@suse.de>
Date: Wed, 5 Jul 2006 16:51:05 +0200
Subject: [ALSA] Fix section mismatch errors in ALSA PCI drivers

Fixed 'section mismatch' errors in ALSA PCI drivers:
- removed invalid __devinitdata from pci id tables
- fix/remove __devinit of functions called in suspend/resume

Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Jaroslav Kysela <perex@suse.cz>
---
 Documentation/sound/alsa/DocBook/writing-an-alsa-driver.tmpl | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

(limited to 'Documentation')

diff --git a/Documentation/sound/alsa/DocBook/writing-an-alsa-driver.tmpl b/Documentation/sound/alsa/DocBook/writing-an-alsa-driver.tmpl
index 69866d5997a..b8dc51ca776 100644
--- a/Documentation/sound/alsa/DocBook/writing-an-alsa-driver.tmpl
+++ b/Documentation/sound/alsa/DocBook/writing-an-alsa-driver.tmpl
@@ -1172,7 +1172,7 @@
   }        
 
   /* PCI IDs */
-  static struct pci_device_id snd_mychip_ids[] __devinitdata = {
+  static struct pci_device_id snd_mychip_ids[] = {
           { PCI_VENDOR_ID_FOO, PCI_DEVICE_ID_BAR,
             PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0, },
           ....
@@ -1565,7 +1565,7 @@
         <informalexample>
           <programlisting>
 <![CDATA[
-  static struct pci_device_id snd_mychip_ids[] __devinitdata = {
+  static struct pci_device_id snd_mychip_ids[] = {
           { PCI_VENDOR_ID_FOO, PCI_DEVICE_ID_BAR,
             PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0, },
           ....
-- 
cgit v1.2.3


From 5a0174831c48df04df83339578b409c7b5f75885 Mon Sep 17 00:00:00 2001
From: Jean Delvare <khali@linux-fr.org>
Date: Sat, 1 Jul 2006 17:13:37 +0200
Subject: [PATCH] i2c-ite: Plan for removal

Plan the i2c-ite and i2c-algo-ite drivers for removal.
These drivers never compiled since they were added to the kernel
tree 5 years ago. Also see:
http://marc.theaimsgroup.com/?l=linux-mips&m=115040510817448

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
---
 Documentation/feature-removal-schedule.txt | 11 +++++++++++
 1 file changed, 11 insertions(+)

(limited to 'Documentation')

diff --git a/Documentation/feature-removal-schedule.txt b/Documentation/feature-removal-schedule.txt
index ee287988934..ffa4d6c55dc 100644
--- a/Documentation/feature-removal-schedule.txt
+++ b/Documentation/feature-removal-schedule.txt
@@ -255,3 +255,14 @@ Why:	The interrupt related SA_* flags are replaced by IRQF_* to move them
 Who:	Thomas Gleixner <tglx@linutronix.de>
 
 ---------------------------
+
+What:	i2c-ite and i2c-algo-ite drivers
+When:	September 2006
+Why:	These drivers never compiled since they were added to the kernel
+	tree 5 years ago. This feature removal can be reevaluated if
+	someone shows interest in the drivers, fixes them and takes over
+	maintenance.
+	http://marc.theaimsgroup.com/?l=linux-mips&m=115040510817448
+Who:	Jean Delvare <khali@linux-fr.org>
+
+---------------------------
-- 
cgit v1.2.3


From 5d925fecac26651e6b0e19cf4ca16933aa640f99 Mon Sep 17 00:00:00 2001
From: Jean Delvare <khali@linux-fr.org>
Date: Sat, 1 Jul 2006 17:14:32 +0200
Subject: [PATCH] i2c: New mailing list

We have a new mailing list dedicated to linux i2c:
http://lists.lm-sensors.org/mailman/listinfo/i2c

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
---
 Documentation/i2c/busses/i2c-sis96x | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

(limited to 'Documentation')

diff --git a/Documentation/i2c/busses/i2c-sis96x b/Documentation/i2c/busses/i2c-sis96x
index 00a009b977e..08d7b2dac69 100644
--- a/Documentation/i2c/busses/i2c-sis96x
+++ b/Documentation/i2c/busses/i2c-sis96x
@@ -42,8 +42,8 @@ I suspect that this driver could be made to work for the following SiS
 chipsets as well: 635, and 635T. If anyone owns a board with those chips
 AND is willing to risk crashing & burning an otherwise well-behaved kernel
 in the name of progress... please contact me at <mhoffman@lightlink.com> or
-via the project's mailing list: <lm-sensors@lm-sensors.org>.  Please
-send bug reports and/or success stories as well.
+via the project's mailing list: <i2c@lm-sensors.org>.  Please send bug
+reports and/or success stories as well.
 
 
 TO DOs
-- 
cgit v1.2.3


From 5cab828bf0f52f3697a61aa99c54ee43844f53c0 Mon Sep 17 00:00:00 2001
From: Hans de Goede <j.w.r.degoede@hhs.nl>
Date: Wed, 5 Jul 2006 18:09:09 +0200
Subject: [PATCH] hwmon: Documentation update for abituguru

Documentation update for the new bank1_types module param.

Also add what we know about different revisions of the uGuru and
a note that the abituguru driver unfortunatly does not work with the
latest and greatest motherboards, which have what I think is revision
4 of the uGuru.

Signed-off-by: Hans de Goede <j.w.r.degoede@hhs.nl>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
---
 Documentation/hwmon/abituguru | 32 ++++++++++++++++++++++++++++++--
 1 file changed, 30 insertions(+), 2 deletions(-)

(limited to 'Documentation')

diff --git a/Documentation/hwmon/abituguru b/Documentation/hwmon/abituguru
index 69cdb527d58..b2c0d61b39a 100644
--- a/Documentation/hwmon/abituguru
+++ b/Documentation/hwmon/abituguru
@@ -2,13 +2,36 @@ Kernel driver abituguru
 =======================
 
 Supported chips:
-  * Abit uGuru (Hardware Monitor part only)
+  * Abit uGuru revision 1-3 (Hardware Monitor part only)
     Prefix: 'abituguru'
     Addresses scanned: ISA 0x0E0
     Datasheet: Not available, this driver is based on reverse engineering.
 	A "Datasheet" has been written based on the reverse engineering it
 	should be available in the same dir as this file under the name
 	abituguru-datasheet.
+    Note:
+	The uGuru is a microcontroller with onboard firmware which programs
+	it to behave as a hwmon IC. There are many different revisions of the
+	firmware and thus effectivly many different revisions of the uGuru.
+	Below is an incomplete list with which revisions are used for which
+	Motherboards:
+	uGuru 1.00    ~ 1.24    (AI7, KV8-MAX3, AN7) (1)
+	uGuru 2.0.0.0 ~ 2.0.4.2 (KV8-PRO)
+	uGuru 2.1.0.0 ~ 2.1.2.8 (AS8, AV8, AA8, AG8, AA8XE, AX8)
+	uGuru 2.2.0.0 ~ 2.2.0.6 (AA8 Fatal1ty)
+	uGuru 2.3.0.0 ~ 2.3.0.9 (AN8)
+	uGuru 3.0.0.0 ~ 3.0.1.2 (AW8, AL8, NI8)
+	uGuru 4.xxxxx?          (AT8 32X) (2)
+	1) For revisions 2 and 3 uGuru's the driver can autodetect the
+	   sensortype (Volt or Temp) for bank1 sensors, for revision 1 uGuru's
+	   this doesnot always work. For these uGuru's the autodection can
+	   be overriden with the bank1_types module param. For all 3 known
+	   revison 1 motherboards the correct use of this param is:
+	   bank1_types=1,1,0,0,0,0,0,2,0,0,0,0,2,0,0,1
+	   You may also need to specify the fan_sensors option for these boards
+	   fan_sensors=5
+	2) The current version of the abituguru driver is known to NOT work
+	   on these Motherboards
 
 Authors:
 	Hans de Goede <j.w.r.degoede@hhs.nl>,
@@ -22,6 +45,11 @@ Module Parameters
 * force: bool		Force detection. Note this parameter only causes the
 			detection to be skipped, if the uGuru can't be read
 			the module initialization (insmod) will still fail.
+* bank1_types: int[]	Bank1 sensortype autodetection override:
+			  -1 autodetect (default)
+			   0 volt sensor
+			   1 temp sensor
+			   2 not connected
 * fan_sensors: int	Tell the driver how many fan speed sensors there are
 			on your motherboard. Default: 0 (autodetect).
 * pwms: int		Tell the driver how many fan speed controls (fan
@@ -29,7 +57,7 @@ Module Parameters
 * verbose: int		How verbose should the driver be? (0-3):
 			   0 normal output
 			   1 + verbose error reporting
-			   2 + sensors type probing info\n"
+			   2 + sensors type probing info (default)
 			   3 + retryable error reporting
 			Default: 2 (the driver is still in the testing phase)
 
-- 
cgit v1.2.3


From 3d861494729c70d9ebeb7d93caa107897925c355 Mon Sep 17 00:00:00 2001
From: Peter Moulder <Peter.Moulder@infotech.monash.edu.au>
Date: Mon, 19 Jun 2006 22:47:49 +1000
Subject: [PATCH] USB: Addition of vendor/product id pair for pl2303 driver

Text from the back of the box, for your information/amusement:

 USB DATA CABLE
   FOR K700 Series

 The USB Cable is an ideal link between your mobile phone and PC. Employing
 the user-friendiy [sic] USB standard,its capacity for rapid data transfer enables functions
 such as synchronization of phone book and calendar,as well as Internet browsing via
 a modem-enabled phone.Autual [sic] connection speed is dependent on phone capacity.

 MADE IN CHINA

From: Peter Moulder <Peter.Moulder@infotech.monash.edu.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
---
 Documentation/usb/usb-serial.txt | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

(limited to 'Documentation')

diff --git a/Documentation/usb/usb-serial.txt b/Documentation/usb/usb-serial.txt
index f001cd93b79..02b0f7beb6d 100644
--- a/Documentation/usb/usb-serial.txt
+++ b/Documentation/usb/usb-serial.txt
@@ -399,10 +399,10 @@ REINER SCT cyberJack pinpad/e-com USB chipcard reader
 
 Prolific PL2303 Driver
 
-  This driver support any device that has the PL2303 chip from Prolific
+  This driver supports any device that has the PL2303 chip from Prolific
   in it.  This includes a number of single port USB to serial
   converters and USB GPS devices.  Devices from Aten (the UC-232) and
-  IO-Data work with this driver.
+  IO-Data work with this driver, as does the DCU-11 mobile-phone cable.
 
   For any questions or problems with this driver, please contact Greg
   Kroah-Hartman at greg@kroah.com
-- 
cgit v1.2.3


From cd6ef2ada54aa4788d5a3dee3cffaad41383a52a Mon Sep 17 00:00:00 2001
From: Adrian Bunk <bunk@stusta.de>
Date: Fri, 30 Jun 2006 02:15:42 -0700
Subject: [PATCH] The scheduled unexport of insert_resource

Implement the scheduled unexport of insert_resource.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
---
 Documentation/feature-removal-schedule.txt | 8 --------
 1 file changed, 8 deletions(-)

(limited to 'Documentation')

diff --git a/Documentation/feature-removal-schedule.txt b/Documentation/feature-removal-schedule.txt
index ee287988934..47d714d7082 100644
--- a/Documentation/feature-removal-schedule.txt
+++ b/Documentation/feature-removal-schedule.txt
@@ -55,14 +55,6 @@ Who:	Mauro Carvalho Chehab <mchehab@brturbo.com.br>
 
 ---------------------------
 
-What:	remove EXPORT_SYMBOL(insert_resource)
-When:	April 2006
-Files:	kernel/resource.c
-Why:	No modular usage in the kernel.
-Who:	Adrian Bunk <bunk@stusta.de>
-
----------------------------
-
 What:	PCMCIA control ioctl (needed for pcmcia-cs [cardmgr, cardctl])
 When:	November 2005
 Files:	drivers/pcmcia/: pcmcia_ioctl.c
-- 
cgit v1.2.3


From 54d0a216f40e060ba4265bb851cc36b3ca55d1a8 Mon Sep 17 00:00:00 2001
From: Ralf Baechle <ralf@linux-mips.org>
Date: Sun, 9 Jul 2006 21:38:56 +0100
Subject: [MIPS] Replace board_timer_setup function pointer by
 plat_timer_setup.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>

---
---
 Documentation/mips/time.README | 10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)

(limited to 'Documentation')

diff --git a/Documentation/mips/time.README b/Documentation/mips/time.README
index 70bc0dd43d6..69ddc5c14b7 100644
--- a/Documentation/mips/time.README
+++ b/Documentation/mips/time.README
@@ -65,7 +65,7 @@ the following functions or values:
 	1. (optional) set up RTC routines
 	2. (optional) calibrate and set the mips_counter_frequency
 
-  b) board_timer_setup - a function pointer.  Invoked at the end of time_init()
+  b) plat_timer_setup - a function pointer.  Invoked at the end of time_init()
 	1. (optional) over-ride any decisions made in time_init()
 	2. set up the irqaction for timer interrupt.
 	3. enable the timer interrupt
@@ -116,19 +116,17 @@ Step 2:  the machine setup() function
 
   If you supply board_time_init(), set the function poointer.
 
-  Set the function pointer board_timer_setup() (mandatory)
 
-
-Step 3: implement rtc routines, board_time_init() and board_timer_setup()
+Step 3: implement rtc routines, board_time_init() and plat_timer_setup()
   if needed.
 
-  board_time_init() - 
+  board_time_init() -
   	a) (optional) set up RTC routines, 
         b) (optional) calibrate and set the mips_counter_frequency
  	    (only needed if you intended to use fixed_rate_gettimeoffset
  	     or use cpu counter as timer interrupt source)
 
-  board_timer_setup() - 
+  plat_timer_setup() -
  	a) (optional) over-write any choices made above by time_init().
  	b) machine specific code should setup the timer irqaction.
  	c) enable the timer interrupt
-- 
cgit v1.2.3


From 086626a747300e37043a553dac639c5900c4a2c0 Mon Sep 17 00:00:00 2001
From: Nathan Scott <nathans@sgi.com>
Date: Fri, 14 Jul 2006 00:24:10 -0700
Subject: [PATCH] Update ramdisk documentation

The default ramdisk blocksize is actually 1024, not 512 bytes.  Also fixes
up some trailing whitespace issues.

Signed-off-by: Nathan Scott <nathans@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
---
 Documentation/ramdisk.txt | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

(limited to 'Documentation')

diff --git a/Documentation/ramdisk.txt b/Documentation/ramdisk.txt
index 7c25584e082..52f75b7d51c 100644
--- a/Documentation/ramdisk.txt
+++ b/Documentation/ramdisk.txt
@@ -6,7 +6,7 @@ Contents:
 	1) Overview
 	2) Kernel Command Line Parameters
 	3) Using "rdev -r"
-	4) An Example of Creating a Compressed RAM Disk 
+	4) An Example of Creating a Compressed RAM Disk
 
 
 1) Overview
@@ -34,7 +34,7 @@ make it clearer.  The original "ramdisk=<ram_size>" has been kept around for
 compatibility reasons, but it may be removed in the future.
 
 The new RAM disk also has the ability to load compressed RAM disk images,
-allowing one to squeeze more programs onto an average installation or 
+allowing one to squeeze more programs onto an average installation or
 rescue floppy disk.
 
 
@@ -51,7 +51,7 @@ default is 4096 (4 MB) (8192 (8 MB) on S390).
 	===================
 
 This parameter tells the RAM disk driver how many bytes to use per block.  The
-default is 512.
+default is 1024 (BLOCK_SIZE).
 
 
 3) Using "rdev -r"
@@ -70,7 +70,7 @@ These numbers are no magical secrets, as seen below:
 ./arch/i386/kernel/setup.c:#define RAMDISK_PROMPT_FLAG          0x8000
 ./arch/i386/kernel/setup.c:#define RAMDISK_LOAD_FLAG            0x4000
 
-Consider a typical two floppy disk setup, where you will have the 
+Consider a typical two floppy disk setup, where you will have the
 kernel on disk one, and have already put a RAM disk image onto disk #2.
 
 Hence you want to set bits 0 to 13 as 0, meaning that your RAM disk
@@ -97,12 +97,12 @@ Since the default start = 0 and the default prompt = 1, you could use:
 	append = "load_ramdisk=1"
 
 
-4) An Example of Creating a Compressed RAM Disk 
+4) An Example of Creating a Compressed RAM Disk
 ----------------------------------------------
 
 To create a RAM disk image, you will need a spare block device to
 construct it on. This can be the RAM disk device itself, or an
-unused disk partition (such as an unmounted swap partition). For this 
+unused disk partition (such as an unmounted swap partition). For this
 example, we will use the RAM disk device, "/dev/ram0".
 
 Note: This technique should not be done on a machine with less than 8 MB
-- 
cgit v1.2.3


From b9432e4d8866606466117664472c58ac981ea4f4 Mon Sep 17 00:00:00 2001
From: Rolf Eike Beer <eike-kernel@sf-tec.de>
Date: Fri, 14 Jul 2006 00:24:24 -0700
Subject: [PATCH] Remove pci_dac_set_dma_mask() from
 Documentation/DMA-mapping.txt

pci_dac_set_dma_mask() gives only a single match in the whole kernel tree
and that's in this doc file.  The best candidate for replacement is
pci_dac_dma_supported().

Signed-off-by: Rolf Eike Beer <eike-kernel@sf-tec.de>
Cc: Greg KH <greg@kroah.com>
Acked-by: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
---
 Documentation/DMA-mapping.txt | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

(limited to 'Documentation')

diff --git a/Documentation/DMA-mapping.txt b/Documentation/DMA-mapping.txt
index 7c717699032..63392c9132b 100644
--- a/Documentation/DMA-mapping.txt
+++ b/Documentation/DMA-mapping.txt
@@ -698,12 +698,12 @@ these interfaces.  Remember that, as defined, consistent mappings are
 always going to be SAC addressable.
 
 The first thing your driver needs to do is query the PCI platform
-layer with your devices DAC addressing capabilities:
+layer if it is capable of handling your devices DAC addressing
+capabilities:
 
-	int pci_dac_set_dma_mask(struct pci_dev *pdev, u64 mask);
+	int pci_dac_dma_supported(struct pci_dev *hwdev, u64 mask);
 
-This routine behaves identically to pci_set_dma_mask.  You may not
-use the following interfaces if this routine fails.
+You may not use the following interfaces if this routine fails.
 
 Next, DMA addresses using this API are kept track of using the
 dma64_addr_t type.  It is guaranteed to be big enough to hold any
-- 
cgit v1.2.3


From ca74e92b4698276b6696f15a801759f50944f387 Mon Sep 17 00:00:00 2001
From: Shailabh Nagar <nagar@watson.ibm.com>
Date: Fri, 14 Jul 2006 00:24:36 -0700
Subject: [PATCH] per-task-delay-accounting: setup

Initialization code related to collection of per-task "delay" statistics which
measure how long it had to wait for cpu, sync block io, swapping etc.  The
collection of statistics and the interface are in other patches.  This patch
sets up the data structures and allows the statistics collection to be
disabled through a kernel boot parameter.

Signed-off-by: Shailabh Nagar <nagar@watson.ibm.com>
Signed-off-by: Balbir Singh <balbir@in.ibm.com>
Cc: Jes Sorensen <jes@sgi.com>
Cc: Peter Chubb <peterc@gelato.unsw.edu.au>
Cc: Erich Focht <efocht@ess.nec.de>
Cc: Levent Serinol <lserinol@gmail.com>
Cc: Jay Lan <jlan@engr.sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
---
 Documentation/kernel-parameters.txt | 2 ++
 1 file changed, 2 insertions(+)

(limited to 'Documentation')

diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index 149f62ba14a..e11f7728ec6 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -448,6 +448,8 @@ running once the system is up.
 			Format: <area>[,<node>]
 			See also Documentation/networking/decnet.txt.
 
+	delayacct	[KNL] Enable per-task delay accounting
+
 	dhash_entries=	[KNL]
 			Set number of hash buckets for dentry cache.
 
-- 
cgit v1.2.3


From c757249af152c59fd74b85e52e8c090acb33d9c0 Mon Sep 17 00:00:00 2001
From: Shailabh Nagar <nagar@watson.ibm.com>
Date: Fri, 14 Jul 2006 00:24:40 -0700
Subject: [PATCH] per-task-delay-accounting: taskstats interface

Create a "taskstats" interface based on generic netlink (NETLINK_GENERIC
family), for getting statistics of tasks and thread groups during their
lifetime and when they exit.  The interface is intended for use by multiple
accounting packages though it is being created in the context of delay
accounting.

This patch creates the interface without populating the fields of the data
that is sent to the user in response to a command or upon the exit of a task.
Each accounting package interested in using taskstats has to provide an
additional patch to add its stats to the common structure.

[akpm@osdl.org: cleanups, Kconfig fix]
Signed-off-by: Shailabh Nagar <nagar@us.ibm.com>
Signed-off-by: Balbir Singh <balbir@in.ibm.com>
Cc: Jes Sorensen <jes@sgi.com>
Cc: Peter Chubb <peterc@gelato.unsw.edu.au>
Cc: Erich Focht <efocht@ess.nec.de>
Cc: Levent Serinol <lserinol@gmail.com>
Cc: Jay Lan <jlan@engr.sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
---
 Documentation/accounting/taskstats.txt | 146 +++++++++++++++++++++++++++++++++
 1 file changed, 146 insertions(+)
 create mode 100644 Documentation/accounting/taskstats.txt

(limited to 'Documentation')

diff --git a/Documentation/accounting/taskstats.txt b/Documentation/accounting/taskstats.txt
new file mode 100644
index 00000000000..ad9b6997e16
--- /dev/null
+++ b/Documentation/accounting/taskstats.txt
@@ -0,0 +1,146 @@
+Per-task statistics interface
+-----------------------------
+
+
+Taskstats is a netlink-based interface for sending per-task and
+per-process statistics from the kernel to userspace.
+
+Taskstats was designed for the following benefits:
+
+- efficiently provide statistics during lifetime of a task and on its exit
+- unified interface for multiple accounting subsystems
+- extensibility for use by future accounting patches
+
+Terminology
+-----------
+
+"pid", "tid" and "task" are used interchangeably and refer to the standard
+Linux task defined by struct task_struct.  per-pid stats are the same as
+per-task stats.
+
+"tgid", "process" and "thread group" are used interchangeably and refer to the
+tasks that share an mm_struct i.e. the traditional Unix process. Despite the
+use of tgid, there is no special treatment for the task that is thread group
+leader - a process is deemed alive as long as it has any task belonging to it.
+
+Usage
+-----
+
+To get statistics during task's lifetime, userspace opens a unicast netlink
+socket (NETLINK_GENERIC family) and sends commands specifying a pid or a tgid.
+The response contains statistics for a task (if pid is specified) or the sum of
+statistics for all tasks of the process (if tgid is specified).
+
+To obtain statistics for tasks which are exiting, userspace opens a multicast
+netlink socket. Each time a task exits, two records are sent by the kernel to
+each listener on the multicast socket. The first the per-pid task's statistics
+and the second is the sum for all tasks of the process to which the task
+belongs (the task does not need to be the thread group leader). The need for
+per-tgid stats to be sent for each exiting task is explained in the per-tgid
+stats section below.
+
+
+Interface
+---------
+
+The user-kernel interface is encapsulated in include/linux/taskstats.h
+
+To avoid this documentation becoming obsolete as the interface evolves, only
+an outline of the current version is given. taskstats.h always overrides the
+description here.
+
+struct taskstats is the common accounting structure for both per-pid and
+per-tgid data. It is versioned and can be extended by each accounting subsystem
+that is added to the kernel. The fields and their semantics are defined in the
+taskstats.h file.
+
+The data exchanged between user and kernel space is a netlink message belonging
+to the NETLINK_GENERIC family and using the netlink attributes interface.
+The messages are in the format
+
+    +----------+- - -+-------------+-------------------+
+    | nlmsghdr | Pad |  genlmsghdr | taskstats payload |
+    +----------+- - -+-------------+-------------------+
+
+
+The taskstats payload is one of the following three kinds:
+
+1. Commands: Sent from user to kernel. The payload is one attribute, of type
+TASKSTATS_CMD_ATTR_PID/TGID, containing a u32 pid or tgid in the attribute
+payload. The pid/tgid denotes the task/process for which userspace wants
+statistics.
+
+2. Response for a command: sent from the kernel in response to a userspace
+command. The payload is a series of three attributes of type:
+
+a) TASKSTATS_TYPE_AGGR_PID/TGID : attribute containing no payload but indicates
+a pid/tgid will be followed by some stats.
+
+b) TASKSTATS_TYPE_PID/TGID: attribute whose payload is the pid/tgid whose stats
+is being returned.
+
+c) TASKSTATS_TYPE_STATS: attribute with a struct taskstsats as payload. The
+same structure is used for both per-pid and per-tgid stats.
+
+3. New message sent by kernel whenever a task exits. The payload consists of a
+   series of attributes of the following type:
+
+a) TASKSTATS_TYPE_AGGR_PID: indicates next two attributes will be pid+stats
+b) TASKSTATS_TYPE_PID: contains exiting task's pid
+c) TASKSTATS_TYPE_STATS: contains the exiting task's per-pid stats
+d) TASKSTATS_TYPE_AGGR_TGID: indicates next two attributes will be tgid+stats
+e) TASKSTATS_TYPE_TGID: contains tgid of process to which task belongs
+f) TASKSTATS_TYPE_STATS: contains the per-tgid stats for exiting task's process
+
+
+per-tgid stats
+--------------
+
+Taskstats provides per-process stats, in addition to per-task stats, since
+resource management is often done at a process granularity and aggregating task
+stats in userspace alone is inefficient and potentially inaccurate (due to lack
+of atomicity).
+
+However, maintaining per-process, in addition to per-task stats, within the
+kernel has space and time overheads. Hence the taskstats implementation
+dynamically sums up the per-task stats for each task belonging to a process
+whenever per-process stats are needed.
+
+Not maintaining per-tgid stats creates a problem when userspace is interested
+in getting these stats when the process dies i.e. the last thread of
+a process exits. It isn't possible to simply return some aggregated per-process
+statistic from the kernel.
+
+The approach taken by taskstats is to return the per-tgid stats *each* time
+a task exits, in addition to the per-pid stats for that task. Userspace can
+maintain task<->process mappings and use them to maintain the per-process stats
+in userspace, updating the aggregate appropriately as the tasks of a process
+exit.
+
+Extending taskstats
+-------------------
+
+There are two ways to extend the taskstats interface to export more
+per-task/process stats as patches to collect them get added to the kernel
+in future:
+
+1. Adding more fields to the end of the existing struct taskstats. Backward
+   compatibility is ensured by the version number within the
+   structure. Userspace will use only the fields of the struct that correspond
+   to the version its using.
+
+2. Defining separate statistic structs and using the netlink attributes
+   interface to return them. Since userspace processes each netlink attribute
+   independently, it can always ignore attributes whose type it does not
+   understand (because it is using an older version of the interface).
+
+
+Choosing between 1. and 2. is a matter of trading off flexibility and
+overhead. If only a few fields need to be added, then 1. is the preferable
+path since the kernel and userspace don't need to incur the overhead of
+processing new netlink attributes. But if the new fields expand the existing
+struct too much, requiring disparate userspace accounting utilities to
+unnecessarily receive large structures whose fields are of no interest, then
+extending the attributes structure would be worthwhile.
+
+----
-- 
cgit v1.2.3


From a3baf649ca9ca0a96fba538f03b0f17c043b755c Mon Sep 17 00:00:00 2001
From: Shailabh Nagar <nagar@watson.ibm.com>
Date: Fri, 14 Jul 2006 00:24:42 -0700
Subject: [PATCH] per-task-delay-accounting: documentation

Some documentation for delay accounting.

Signed-off-by: Shailabh Nagar <nagar@watson.ibm.com>
Signed-off-by: Balbir Singh <balbir@in.ibm.com>
Cc: Jes Sorensen <jes@sgi.com>
Cc: Peter Chubb <peterc@gelato.unsw.edu.au>
Cc: Erich Focht <efocht@ess.nec.de>
Cc: Levent Serinol <lserinol@gmail.com>
Cc: Jay Lan <jlan@engr.sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
---
 Documentation/accounting/delay-accounting.txt | 115 ++++++++
 Documentation/accounting/getdelays.c          | 376 ++++++++++++++++++++++++++
 Documentation/accounting/taskstats.txt        |   2 +
 3 files changed, 493 insertions(+)
 create mode 100644 Documentation/accounting/delay-accounting.txt
 create mode 100644 Documentation/accounting/getdelays.c

(limited to 'Documentation')

diff --git a/Documentation/accounting/delay-accounting.txt b/Documentation/accounting/delay-accounting.txt
new file mode 100644
index 00000000000..f3dc0ca04fa
--- /dev/null
+++ b/Documentation/accounting/delay-accounting.txt
@@ -0,0 +1,115 @@
+Delay accounting
+----------------
+
+Tasks encounter delays in execution when they wait
+for some kernel resource to become available e.g. a
+runnable task may wait for a free CPU to run on.
+
+The per-task delay accounting functionality measures
+the delays experienced by a task while
+
+a) waiting for a CPU (while being runnable)
+b) completion of synchronous block I/O initiated by the task
+c) swapping in pages
+
+and makes these statistics available to userspace through
+the taskstats interface.
+
+Such delays provide feedback for setting a task's cpu priority,
+io priority and rss limit values appropriately. Long delays for
+important tasks could be a trigger for raising its corresponding priority.
+
+The functionality, through its use of the taskstats interface, also provides
+delay statistics aggregated for all tasks (or threads) belonging to a
+thread group (corresponding to a traditional Unix process). This is a commonly
+needed aggregation that is more efficiently done by the kernel.
+
+Userspace utilities, particularly resource management applications, can also
+aggregate delay statistics into arbitrary groups. To enable this, delay
+statistics of a task are available both during its lifetime as well as on its
+exit, ensuring continuous and complete monitoring can be done.
+
+
+Interface
+---------
+
+Delay accounting uses the taskstats interface which is described
+in detail in a separate document in this directory. Taskstats returns a
+generic data structure to userspace corresponding to per-pid and per-tgid
+statistics. The delay accounting functionality populates specific fields of
+this structure. See
+     include/linux/taskstats.h
+for a description of the fields pertaining to delay accounting.
+It will generally be in the form of counters returning the cumulative
+delay seen for cpu, sync block I/O, swapin etc.
+
+Taking the difference of two successive readings of a given
+counter (say cpu_delay_total) for a task will give the delay
+experienced by the task waiting for the corresponding resource
+in that interval.
+
+When a task exits, records containing the per-task and per-process statistics
+are sent to userspace without requiring a command. More details are given in
+the taskstats interface description.
+
+The getdelays.c userspace utility in this directory allows simple commands to
+be run and the corresponding delay statistics to be displayed. It also serves
+as an example of using the taskstats interface.
+
+Usage
+-----
+
+Compile the kernel with
+	CONFIG_TASK_DELAY_ACCT=y
+	CONFIG_TASKSTATS=y
+
+Enable the accounting at boot time by adding
+the following to the kernel boot options
+	delayacct
+
+and after the system has booted up, use a utility
+similar to  getdelays.c to access the delays
+seen by a given task or a task group (tgid).
+The utility also allows a given command to be
+executed and the corresponding delays to be
+seen.
+
+General format of the getdelays command
+
+getdelays [-t tgid] [-p pid] [-c cmd...]
+
+
+Get delays, since system boot, for pid 10
+# ./getdelays -p 10
+(output similar to next case)
+
+Get sum of delays, since system boot, for all pids with tgid 5
+# ./getdelays -t 5
+
+
+CPU	count	real total	virtual total	delay total
+	7876	92005750	100000000	24001500
+IO	count	delay total
+	0	0
+MEM	count	delay total
+	0	0
+
+Get delays seen in executing a given simple command
+# ./getdelays -c ls /
+
+bin   data1  data3  data5  dev  home  media  opt   root  srv        sys  usr
+boot  data2  data4  data6  etc  lib   mnt    proc  sbin  subdomain  tmp  var
+
+
+CPU	count	real total	virtual total	delay total
+	6	4000250		4000000		0
+IO	count	delay total
+	0	0
+MEM	count	delay total
+	0	0
+
+
+
+
+
+
diff --git a/Documentation/accounting/getdelays.c b/Documentation/accounting/getdelays.c
new file mode 100644
index 00000000000..33de89e56a3
--- /dev/null
+++ b/Documentation/accounting/getdelays.c
@@ -0,0 +1,376 @@
+/* getdelays.c
+ *
+ * Utility to get per-pid and per-tgid delay accounting statistics
+ * Also illustrates usage of the taskstats interface
+ *
+ * Copyright (C) Shailabh Nagar, IBM Corp. 2005
+ * Copyright (C) Balbir Singh, IBM Corp. 2006
+ *
+ */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <errno.h>
+#include <unistd.h>
+#include <poll.h>
+#include <string.h>
+#include <fcntl.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <sys/socket.h>
+#include <sys/types.h>
+#include <signal.h>
+
+#include <linux/genetlink.h>
+#include <linux/taskstats.h>
+
+/*
+ * Generic macros for dealing with netlink sockets. Might be duplicated
+ * elsewhere. It is recommended that commercial grade applications use
+ * libnl or libnetlink and use the interfaces provided by the library
+ */
+#define GENLMSG_DATA(glh)	((void *)(NLMSG_DATA(glh) + GENL_HDRLEN))
+#define GENLMSG_PAYLOAD(glh)	(NLMSG_PAYLOAD(glh, 0) - GENL_HDRLEN)
+#define NLA_DATA(na)		((void *)((char*)(na) + NLA_HDRLEN))
+#define NLA_PAYLOAD(len)	(len - NLA_HDRLEN)
+
+#define err(code, fmt, arg...) do { printf(fmt, ##arg); exit(code); } while (0)
+int done = 0;
+
+/*
+ * Create a raw netlink socket and bind
+ */
+static int create_nl_socket(int protocol, int groups)
+{
+    socklen_t addr_len;
+    int fd;
+    struct sockaddr_nl local;
+
+    fd = socket(AF_NETLINK, SOCK_RAW, protocol);
+    if (fd < 0)
+	return -1;
+
+    memset(&local, 0, sizeof(local));
+    local.nl_family = AF_NETLINK;
+    local.nl_groups = groups;
+
+    if (bind(fd, (struct sockaddr *) &local, sizeof(local)) < 0)
+	goto error;
+
+    return fd;
+  error:
+    close(fd);
+    return -1;
+}
+
+int sendto_fd(int s, const char *buf, int bufLen)
+{
+    struct sockaddr_nl nladdr;
+    int r;
+
+    memset(&nladdr, 0, sizeof(nladdr));
+    nladdr.nl_family = AF_NETLINK;
+
+    while ((r = sendto(s, buf, bufLen, 0, (struct sockaddr *) &nladdr,
+		       sizeof(nladdr))) < bufLen) {
+	if (r > 0) {
+	    buf += r;
+	    bufLen -= r;
+	} else if (errno != EAGAIN)
+	    return -1;
+    }
+    return 0;
+}
+
+/*
+ * Probe the controller in genetlink to find the family id
+ * for the TASKSTATS family
+ */
+int get_family_id(int sd)
+{
+    struct {
+	struct nlmsghdr n;
+	struct genlmsghdr g;
+	char buf[256];
+    } family_req;
+    struct {
+	struct nlmsghdr n;
+	struct genlmsghdr g;
+	char buf[256];
+    } ans;
+
+    int id;
+    struct nlattr *na;
+    int rep_len;
+
+    /* Get family name */
+    family_req.n.nlmsg_type = GENL_ID_CTRL;
+    family_req.n.nlmsg_flags = NLM_F_REQUEST;
+    family_req.n.nlmsg_seq = 0;
+    family_req.n.nlmsg_pid = getpid();
+    family_req.n.nlmsg_len = NLMSG_LENGTH(GENL_HDRLEN);
+    family_req.g.cmd = CTRL_CMD_GETFAMILY;
+    family_req.g.version = 0x1;
+    na = (struct nlattr *) GENLMSG_DATA(&family_req);
+    na->nla_type = CTRL_ATTR_FAMILY_NAME;
+    na->nla_len = strlen(TASKSTATS_GENL_NAME) + 1 + NLA_HDRLEN;
+    strcpy(NLA_DATA(na), TASKSTATS_GENL_NAME);
+    family_req.n.nlmsg_len += NLMSG_ALIGN(na->nla_len);
+
+    if (sendto_fd(sd, (char *) &family_req, family_req.n.nlmsg_len) < 0)
+	err(1, "error sending message via Netlink\n");
+
+    rep_len = recv(sd, &ans, sizeof(ans), 0);
+
+    if (rep_len < 0)
+	err(1, "error receiving reply message via Netlink\n");
+
+
+    /* Validate response message */
+    if (!NLMSG_OK((&ans.n), rep_len))
+	err(1, "invalid reply message received via Netlink\n");
+
+    if (ans.n.nlmsg_type == NLMSG_ERROR) {	/* error */
+	printf("error received NACK - leaving\n");
+	exit(1);
+    }
+
+
+    na = (struct nlattr *) GENLMSG_DATA(&ans);
+    na = (struct nlattr *) ((char *) na + NLA_ALIGN(na->nla_len));
+    if (na->nla_type == CTRL_ATTR_FAMILY_ID) {
+	id = *(__u16 *) NLA_DATA(na);
+    }
+    return id;
+}
+
+void print_taskstats(struct taskstats *t)
+{
+    printf("\n\nCPU   %15s%15s%15s%15s\n"
+	   "      %15llu%15llu%15llu%15llu\n"
+	   "IO    %15s%15s\n"
+	   "      %15llu%15llu\n"
+	   "MEM   %15s%15s\n"
+	   "      %15llu%15llu\n\n",
+	   "count", "real total", "virtual total", "delay total",
+	   t->cpu_count, t->cpu_run_real_total, t->cpu_run_virtual_total,
+	   t->cpu_delay_total,
+	   "count", "delay total",
+	   t->blkio_count, t->blkio_delay_total,
+	   "count", "delay total", t->swapin_count, t->swapin_delay_total);
+}
+
+void sigchld(int sig)
+{
+    done = 1;
+}
+
+int main(int argc, char *argv[])
+{
+    int rc;
+    int sk_nl;
+    struct nlmsghdr *nlh;
+    struct genlmsghdr *genlhdr;
+    char *buf;
+    struct taskstats_cmd_param *param;
+    __u16 id;
+    struct nlattr *na;
+
+    /* For receiving */
+    struct sockaddr_nl kern_nla, from_nla;
+    socklen_t from_nla_len;
+    int recv_len;
+    struct taskstats_reply *reply;
+
+    struct {
+	struct nlmsghdr n;
+	struct genlmsghdr g;
+	char buf[256];
+    } req;
+
+    struct {
+	struct nlmsghdr n;
+	struct genlmsghdr g;
+	char buf[256];
+    } ans;
+
+    int nl_sd = -1;
+    int rep_len;
+    int len = 0;
+    int aggr_len, len2;
+    struct sockaddr_nl nladdr;
+    pid_t tid = 0;
+    pid_t rtid = 0;
+    int cmd_type = TASKSTATS_TYPE_TGID;
+    int c, status;
+    int forking = 0;
+    struct sigaction act = {
+	.sa_handler = SIG_IGN,
+	.sa_mask = SA_NOMASK,
+    };
+    struct sigaction tact ;
+
+    if (argc < 3) {
+	printf("usage %s [-t tgid][-p pid][-c cmd]\n", argv[0]);
+	exit(-1);
+    }
+
+    tact.sa_handler = sigchld;
+    sigemptyset(&tact.sa_mask);
+    if (sigaction(SIGCHLD, &tact, NULL) < 0)
+	err(1, "sigaction failed for SIGCHLD\n");
+
+    while (1) {
+
+	c = getopt(argc, argv, "t:p:c:");
+	if (c < 0)
+	    break;
+
+	switch (c) {
+	case 't':
+	    tid = atoi(optarg);
+	    if (!tid)
+		err(1, "Invalid tgid\n");
+	    cmd_type = TASKSTATS_CMD_ATTR_TGID;
+	    break;
+	case 'p':
+	    tid = atoi(optarg);
+	    if (!tid)
+		err(1, "Invalid pid\n");
+	    cmd_type = TASKSTATS_CMD_ATTR_TGID;
+	    break;
+	case 'c':
+	    opterr = 0;
+	    tid = fork();
+	    if (tid < 0)
+		err(1, "fork failed\n");
+
+	    if (tid == 0) {	/* child process */
+		if (execvp(argv[optind - 1], &argv[optind - 1]) < 0) {
+		    exit(-1);
+		}
+	    }
+	    forking = 1;
+	    break;
+	default:
+	    printf("usage %s [-t tgid][-p pid][-c cmd]\n", argv[0]);
+	    exit(-1);
+	    break;
+	}
+	if (c == 'c')
+	    break;
+    }
+
+    /* Construct Netlink request message */
+
+    /* Send Netlink request message & get reply */
+
+    if ((nl_sd =
+	 create_nl_socket(NETLINK_GENERIC, TASKSTATS_LISTEN_GROUP)) < 0)
+	err(1, "error creating Netlink socket\n");
+
+
+    id = get_family_id(nl_sd);
+
+    /* Send command needed */
+    req.n.nlmsg_len = NLMSG_LENGTH(GENL_HDRLEN);
+    req.n.nlmsg_type = id;
+    req.n.nlmsg_flags = NLM_F_REQUEST;
+    req.n.nlmsg_seq = 0;
+    req.n.nlmsg_pid = tid;
+    req.g.cmd = TASKSTATS_CMD_GET;
+    na = (struct nlattr *) GENLMSG_DATA(&req);
+    na->nla_type = cmd_type;
+    na->nla_len = sizeof(unsigned int) + NLA_HDRLEN;
+    *(__u32 *) NLA_DATA(na) = tid;
+    req.n.nlmsg_len += NLMSG_ALIGN(na->nla_len);
+
+
+    if (!forking && sendto_fd(nl_sd, (char *) &req, req.n.nlmsg_len) < 0)
+	err(1, "error sending message via Netlink\n");
+
+    act.sa_handler = SIG_IGN;
+    sigemptyset(&act.sa_mask);
+    if (sigaction(SIGINT, &act, NULL) < 0)
+	err(1, "sigaction failed for SIGINT\n");
+
+    do {
+	int i;
+	struct pollfd pfd;
+	int pollres;
+
+	pfd.events = 0xffff & ~POLLOUT;
+	pfd.fd = nl_sd;
+	pollres = poll(&pfd, 1, 5000);
+	if (pollres < 0 || done) {
+	    break;
+	}
+
+	rep_len = recv(nl_sd, &ans, sizeof(ans), 0);
+	nladdr.nl_family = AF_NETLINK;
+	nladdr.nl_groups = TASKSTATS_LISTEN_GROUP;
+
+	if (ans.n.nlmsg_type == NLMSG_ERROR) {	/* error */
+	    printf("error received NACK - leaving\n");
+	    exit(1);
+	}
+
+	if (rep_len < 0) {
+	    err(1, "error receiving reply message via Netlink\n");
+	    break;
+	}
+
+	/* Validate response message */
+	if (!NLMSG_OK((&ans.n), rep_len))
+	    err(1, "invalid reply message received via Netlink\n");
+
+	rep_len = GENLMSG_PAYLOAD(&ans.n);
+
+	na = (struct nlattr *) GENLMSG_DATA(&ans);
+	len = 0;
+	i = 0;
+	while (len < rep_len) {
+	    len += NLA_ALIGN(na->nla_len);
+	    switch (na->nla_type) {
+	    case TASKSTATS_TYPE_AGGR_PID:
+		/* Fall through */
+	    case TASKSTATS_TYPE_AGGR_TGID:
+		aggr_len = NLA_PAYLOAD(na->nla_len);
+		len2 = 0;
+		/* For nested attributes, na follows */
+		na = (struct nlattr *) NLA_DATA(na);
+		done = 0;
+		while (len2 < aggr_len) {
+		    switch (na->nla_type) {
+		    case TASKSTATS_TYPE_PID:
+			rtid = *(int *) NLA_DATA(na);
+			break;
+		    case TASKSTATS_TYPE_TGID:
+			rtid = *(int *) NLA_DATA(na);
+			break;
+		    case TASKSTATS_TYPE_STATS:
+			if (rtid == tid) {
+			    print_taskstats((struct taskstats *)
+					    NLA_DATA(na));
+			    done = 1;
+			}
+			break;
+		    }
+		    len2 += NLA_ALIGN(na->nla_len);
+		    na = (struct nlattr *) ((char *) na + len2);
+		    if (done)
+			break;
+		}
+	    }
+	    na = (struct nlattr *) (GENLMSG_DATA(&ans) + len);
+	    if (done)
+		break;
+	}
+	if (done)
+	    break;
+    }
+    while (1);
+
+    close(nl_sd);
+    return 0;
+}
diff --git a/Documentation/accounting/taskstats.txt b/Documentation/accounting/taskstats.txt
index ad9b6997e16..acc6b4f37fc 100644
--- a/Documentation/accounting/taskstats.txt
+++ b/Documentation/accounting/taskstats.txt
@@ -39,6 +39,8 @@ belongs (the task does not need to be the thread group leader). The need for
 per-tgid stats to be sent for each exiting task is explained in the per-tgid
 stats section below.
 
+getdelays.c is a simple utility demonstrating usage of the taskstats interface
+for reporting delay accounting statistics.
 
 Interface
 ---------
-- 
cgit v1.2.3


From ad4ecbcba72855a2b5319b96e2a3a65ed1ca3bfd Mon Sep 17 00:00:00 2001
From: Shailabh Nagar <nagar@watson.ibm.com>
Date: Fri, 14 Jul 2006 00:24:44 -0700
Subject: [PATCH] delay accounting taskstats interface send tgid once

Send per-tgid data only once during exit of a thread group instead of once
with each member thread exit.

Currently, when a thread exits, besides its per-tid data, the per-tgid data
of its thread group is also sent out, if its thread group is non-empty.
The per-tgid data sent consists of the sum of per-tid stats for all
*remaining* threads of the thread group.

This patch modifies this sending in two ways:

- the per-tgid data is sent only when the last thread of a thread group
  exits.  This cuts down heavily on the overhead of sending/receiving
  per-tgid data, especially when other exploiters of the taskstats
  interface aren't interested in per-tgid stats

- the semantics of the per-tgid data sent are changed.  Instead of being
  the sum of per-tid data for remaining threads, the value now sent is the
  true total accumalated statistics for all threads that are/were part of
  the thread group.

The patch also addresses a minor issue where failure of one accounting
subsystem to fill in the taskstats structure was causing the send of
taskstats to not be sent at all.

The patch has been tested for stability and run cerberus for over 4 hours
on an SMP.

[akpm@osdl.org: bugfixes]
Signed-off-by: Shailabh Nagar <nagar@watson.ibm.com>
Signed-off-by: Balbir Singh <balbir@in.ibm.com>
Cc: Jay Lan <jlan@engr.sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
---
 Documentation/accounting/delay-accounting.txt | 13 ++++-------
 Documentation/accounting/taskstats.txt        | 33 +++++++++++----------------
 2 files changed, 17 insertions(+), 29 deletions(-)

(limited to 'Documentation')

diff --git a/Documentation/accounting/delay-accounting.txt b/Documentation/accounting/delay-accounting.txt
index f3dc0ca04fa..be215e58423 100644
--- a/Documentation/accounting/delay-accounting.txt
+++ b/Documentation/accounting/delay-accounting.txt
@@ -48,9 +48,10 @@ counter (say cpu_delay_total) for a task will give the delay
 experienced by the task waiting for the corresponding resource
 in that interval.
 
-When a task exits, records containing the per-task and per-process statistics
-are sent to userspace without requiring a command. More details are given in
-the taskstats interface description.
+When a task exits, records containing the per-task statistics
+are sent to userspace without requiring a command. If it is the last exiting
+task of a thread group, the per-tgid statistics are also sent. More details
+are given in the taskstats interface description.
 
 The getdelays.c userspace utility in this directory allows simple commands to
 be run and the corresponding delay statistics to be displayed. It also serves
@@ -107,9 +108,3 @@ IO	count	delay total
 	0	0
 MEM	count	delay total
 	0	0
-
-
-
-
-
-
diff --git a/Documentation/accounting/taskstats.txt b/Documentation/accounting/taskstats.txt
index acc6b4f37fc..efd8f605bcd 100644
--- a/Documentation/accounting/taskstats.txt
+++ b/Documentation/accounting/taskstats.txt
@@ -32,12 +32,11 @@ The response contains statistics for a task (if pid is specified) or the sum of
 statistics for all tasks of the process (if tgid is specified).
 
 To obtain statistics for tasks which are exiting, userspace opens a multicast
-netlink socket. Each time a task exits, two records are sent by the kernel to
-each listener on the multicast socket. The first the per-pid task's statistics
-and the second is the sum for all tasks of the process to which the task
-belongs (the task does not need to be the thread group leader). The need for
-per-tgid stats to be sent for each exiting task is explained in the per-tgid
-stats section below.
+netlink socket. Each time a task exits, its per-pid statistics is always sent
+by the kernel to each listener on the multicast socket. In addition, if it is
+the last thread exiting its thread group, an additional record containing the
+per-tgid stats are also sent. The latter contains the sum of per-pid stats for
+all threads in the thread group, both past and present.
 
 getdelays.c is a simple utility demonstrating usage of the taskstats interface
 for reporting delay accounting statistics.
@@ -104,20 +103,14 @@ stats in userspace alone is inefficient and potentially inaccurate (due to lack
 of atomicity).
 
 However, maintaining per-process, in addition to per-task stats, within the
-kernel has space and time overheads. Hence the taskstats implementation
-dynamically sums up the per-task stats for each task belonging to a process
-whenever per-process stats are needed.
-
-Not maintaining per-tgid stats creates a problem when userspace is interested
-in getting these stats when the process dies i.e. the last thread of
-a process exits. It isn't possible to simply return some aggregated per-process
-statistic from the kernel.
-
-The approach taken by taskstats is to return the per-tgid stats *each* time
-a task exits, in addition to the per-pid stats for that task. Userspace can
-maintain task<->process mappings and use them to maintain the per-process stats
-in userspace, updating the aggregate appropriately as the tasks of a process
-exit.
+kernel has space and time overheads. To address this, the taskstats code
+accumalates each exiting task's statistics into a process-wide data structure.
+When the last task of a process exits, the process level data accumalated also
+gets sent to userspace (along with the per-task data).
+
+When a user queries to get per-tgid data, the sum of all other live threads in
+the group is added up and added to the accumalated total for previously exited
+threads of the same thread group.
 
 Extending taskstats
 -------------------
-- 
cgit v1.2.3


From 9e06d3f9f6b14f6e3120923ed215032726246c98 Mon Sep 17 00:00:00 2001
From: Shailabh Nagar <nagar@watson.ibm.com>
Date: Fri, 14 Jul 2006 00:24:45 -0700
Subject: [PATCH] per task delay accounting taskstats interface: documentation
 fix

Change documentation and example program to reflect the flow control issues
being addressed by the cpumask changes.

Signed-off-by: Shailabh Nagar <nagar@watson.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
---
 Documentation/accounting/getdelays.c   | 606 +++++++++++++++++----------------
 Documentation/accounting/taskstats.txt |  64 +++-
 2 files changed, 365 insertions(+), 305 deletions(-)

(limited to 'Documentation')

diff --git a/Documentation/accounting/getdelays.c b/Documentation/accounting/getdelays.c
index 33de89e56a3..795ca3911cc 100644
--- a/Documentation/accounting/getdelays.c
+++ b/Documentation/accounting/getdelays.c
@@ -5,6 +5,7 @@
  *
  * Copyright (C) Shailabh Nagar, IBM Corp. 2005
  * Copyright (C) Balbir Singh, IBM Corp. 2006
+ * Copyright (c) Jay Lan, SGI. 2006
  *
  */
 
@@ -36,341 +37,360 @@
 
 #define err(code, fmt, arg...) do { printf(fmt, ##arg); exit(code); } while (0)
 int done = 0;
+int rcvbufsz=0;
+
+    char name[100];
+int dbg=0, print_delays=0;
+__u64 stime, utime;
+#define PRINTF(fmt, arg...) {			\
+	    if (dbg) {				\
+		printf(fmt, ##arg);		\
+	    }					\
+	}
+
+/* Maximum size of response requested or message sent */
+#define MAX_MSG_SIZE	256
+/* Maximum number of cpus expected to be specified in a cpumask */
+#define MAX_CPUS	32
+/* Maximum length of pathname to log file */
+#define MAX_FILENAME	256
+
+struct msgtemplate {
+	struct nlmsghdr n;
+	struct genlmsghdr g;
+	char buf[MAX_MSG_SIZE];
+};
+
+char cpumask[100+6*MAX_CPUS];
 
 /*
  * Create a raw netlink socket and bind
  */
-static int create_nl_socket(int protocol, int groups)
+static int create_nl_socket(int protocol)
 {
-    socklen_t addr_len;
-    int fd;
-    struct sockaddr_nl local;
-
-    fd = socket(AF_NETLINK, SOCK_RAW, protocol);
-    if (fd < 0)
-	return -1;
+	int fd;
+	struct sockaddr_nl local;
+
+	fd = socket(AF_NETLINK, SOCK_RAW, protocol);
+	if (fd < 0)
+		return -1;
+
+	if (rcvbufsz)
+		if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF,
+				&rcvbufsz, sizeof(rcvbufsz)) < 0) {
+			printf("Unable to set socket rcv buf size to %d\n",
+			       rcvbufsz);
+			return -1;
+		}
 
-    memset(&local, 0, sizeof(local));
-    local.nl_family = AF_NETLINK;
-    local.nl_groups = groups;
+	memset(&local, 0, sizeof(local));
+	local.nl_family = AF_NETLINK;
 
-    if (bind(fd, (struct sockaddr *) &local, sizeof(local)) < 0)
-	goto error;
+	if (bind(fd, (struct sockaddr *) &local, sizeof(local)) < 0)
+		goto error;
 
-    return fd;
-  error:
-    close(fd);
-    return -1;
+	return fd;
+error:
+	close(fd);
+	return -1;
 }
 
-int sendto_fd(int s, const char *buf, int bufLen)
+
+int send_cmd(int sd, __u16 nlmsg_type, __u32 nlmsg_pid,
+	     __u8 genl_cmd, __u16 nla_type,
+	     void *nla_data, int nla_len)
 {
-    struct sockaddr_nl nladdr;
-    int r;
-
-    memset(&nladdr, 0, sizeof(nladdr));
-    nladdr.nl_family = AF_NETLINK;
-
-    while ((r = sendto(s, buf, bufLen, 0, (struct sockaddr *) &nladdr,
-		       sizeof(nladdr))) < bufLen) {
-	if (r > 0) {
-	    buf += r;
-	    bufLen -= r;
-	} else if (errno != EAGAIN)
-	    return -1;
-    }
-    return 0;
+	struct nlattr *na;
+	struct sockaddr_nl nladdr;
+	int r, buflen;
+	char *buf;
+
+	struct msgtemplate msg;
+
+	msg.n.nlmsg_len = NLMSG_LENGTH(GENL_HDRLEN);
+	msg.n.nlmsg_type = nlmsg_type;
+	msg.n.nlmsg_flags = NLM_F_REQUEST;
+	msg.n.nlmsg_seq = 0;
+	msg.n.nlmsg_pid = nlmsg_pid;
+	msg.g.cmd = genl_cmd;
+	msg.g.version = 0x1;
+	na = (struct nlattr *) GENLMSG_DATA(&msg);
+	na->nla_type = nla_type;
+	na->nla_len = nla_len + 1 + NLA_HDRLEN;
+	memcpy(NLA_DATA(na), nla_data, nla_len);
+	msg.n.nlmsg_len += NLMSG_ALIGN(na->nla_len);
+
+	buf = (char *) &msg;
+	buflen = msg.n.nlmsg_len ;
+	memset(&nladdr, 0, sizeof(nladdr));
+	nladdr.nl_family = AF_NETLINK;
+	while ((r = sendto(sd, buf, buflen, 0, (struct sockaddr *) &nladdr,
+			   sizeof(nladdr))) < buflen) {
+		if (r > 0) {
+			buf += r;
+			buflen -= r;
+		} else if (errno != EAGAIN)
+			return -1;
+	}
+	return 0;
 }
 
+
 /*
  * Probe the controller in genetlink to find the family id
  * for the TASKSTATS family
  */
 int get_family_id(int sd)
 {
-    struct {
-	struct nlmsghdr n;
-	struct genlmsghdr g;
-	char buf[256];
-    } family_req;
-    struct {
-	struct nlmsghdr n;
-	struct genlmsghdr g;
-	char buf[256];
-    } ans;
-
-    int id;
-    struct nlattr *na;
-    int rep_len;
-
-    /* Get family name */
-    family_req.n.nlmsg_type = GENL_ID_CTRL;
-    family_req.n.nlmsg_flags = NLM_F_REQUEST;
-    family_req.n.nlmsg_seq = 0;
-    family_req.n.nlmsg_pid = getpid();
-    family_req.n.nlmsg_len = NLMSG_LENGTH(GENL_HDRLEN);
-    family_req.g.cmd = CTRL_CMD_GETFAMILY;
-    family_req.g.version = 0x1;
-    na = (struct nlattr *) GENLMSG_DATA(&family_req);
-    na->nla_type = CTRL_ATTR_FAMILY_NAME;
-    na->nla_len = strlen(TASKSTATS_GENL_NAME) + 1 + NLA_HDRLEN;
-    strcpy(NLA_DATA(na), TASKSTATS_GENL_NAME);
-    family_req.n.nlmsg_len += NLMSG_ALIGN(na->nla_len);
-
-    if (sendto_fd(sd, (char *) &family_req, family_req.n.nlmsg_len) < 0)
-	err(1, "error sending message via Netlink\n");
-
-    rep_len = recv(sd, &ans, sizeof(ans), 0);
-
-    if (rep_len < 0)
-	err(1, "error receiving reply message via Netlink\n");
-
-
-    /* Validate response message */
-    if (!NLMSG_OK((&ans.n), rep_len))
-	err(1, "invalid reply message received via Netlink\n");
-
-    if (ans.n.nlmsg_type == NLMSG_ERROR) {	/* error */
-	printf("error received NACK - leaving\n");
-	exit(1);
-    }
-
-
-    na = (struct nlattr *) GENLMSG_DATA(&ans);
-    na = (struct nlattr *) ((char *) na + NLA_ALIGN(na->nla_len));
-    if (na->nla_type == CTRL_ATTR_FAMILY_ID) {
-	id = *(__u16 *) NLA_DATA(na);
-    }
-    return id;
-}
+	struct {
+		struct nlmsghdr n;
+		struct genlmsghdr g;
+		char buf[256];
+	} ans;
+
+	int id, rc;
+	struct nlattr *na;
+	int rep_len;
+
+	strcpy(name, TASKSTATS_GENL_NAME);
+	rc = send_cmd(sd, GENL_ID_CTRL, getpid(), CTRL_CMD_GETFAMILY,
+			CTRL_ATTR_FAMILY_NAME, (void *)name,
+			strlen(TASKSTATS_GENL_NAME)+1);
+
+	rep_len = recv(sd, &ans, sizeof(ans), 0);
+	if (ans.n.nlmsg_type == NLMSG_ERROR ||
+	    (rep_len < 0) || !NLMSG_OK((&ans.n), rep_len))
+		return 0;
 
-void print_taskstats(struct taskstats *t)
-{
-    printf("\n\nCPU   %15s%15s%15s%15s\n"
-	   "      %15llu%15llu%15llu%15llu\n"
-	   "IO    %15s%15s\n"
-	   "      %15llu%15llu\n"
-	   "MEM   %15s%15s\n"
-	   "      %15llu%15llu\n\n",
-	   "count", "real total", "virtual total", "delay total",
-	   t->cpu_count, t->cpu_run_real_total, t->cpu_run_virtual_total,
-	   t->cpu_delay_total,
-	   "count", "delay total",
-	   t->blkio_count, t->blkio_delay_total,
-	   "count", "delay total", t->swapin_count, t->swapin_delay_total);
+	na = (struct nlattr *) GENLMSG_DATA(&ans);
+	na = (struct nlattr *) ((char *) na + NLA_ALIGN(na->nla_len));
+	if (na->nla_type == CTRL_ATTR_FAMILY_ID) {
+		id = *(__u16 *) NLA_DATA(na);
+	}
+	return id;
 }
 
-void sigchld(int sig)
+void print_delayacct(struct taskstats *t)
 {
-    done = 1;
+	printf("\n\nCPU   %15s%15s%15s%15s\n"
+	       "      %15llu%15llu%15llu%15llu\n"
+	       "IO    %15s%15s\n"
+	       "      %15llu%15llu\n"
+	       "MEM   %15s%15s\n"
+	       "      %15llu%15llu\n\n",
+	       "count", "real total", "virtual total", "delay total",
+	       t->cpu_count, t->cpu_run_real_total, t->cpu_run_virtual_total,
+	       t->cpu_delay_total,
+	       "count", "delay total",
+	       t->blkio_count, t->blkio_delay_total,
+	       "count", "delay total", t->swapin_count, t->swapin_delay_total);
 }
 
 int main(int argc, char *argv[])
 {
-    int rc;
-    int sk_nl;
-    struct nlmsghdr *nlh;
-    struct genlmsghdr *genlhdr;
-    char *buf;
-    struct taskstats_cmd_param *param;
-    __u16 id;
-    struct nlattr *na;
-
-    /* For receiving */
-    struct sockaddr_nl kern_nla, from_nla;
-    socklen_t from_nla_len;
-    int recv_len;
-    struct taskstats_reply *reply;
-
-    struct {
-	struct nlmsghdr n;
-	struct genlmsghdr g;
-	char buf[256];
-    } req;
+	int c, rc, rep_len, aggr_len, len2, cmd_type;
+	__u16 id;
+	__u32 mypid;
+
+	struct nlattr *na;
+	int nl_sd = -1;
+	int len = 0;
+	pid_t tid = 0;
+	pid_t rtid = 0;
+
+	int fd = 0;
+	int count = 0;
+	int write_file = 0;
+	int maskset = 0;
+	char logfile[128];
+	int loop = 0;
+
+	struct msgtemplate msg;
+
+	while (1) {
+		c = getopt(argc, argv, "dw:r:m:t:p:v:l");
+		if (c < 0)
+			break;
 
-    struct {
-	struct nlmsghdr n;
-	struct genlmsghdr g;
-	char buf[256];
-    } ans;
-
-    int nl_sd = -1;
-    int rep_len;
-    int len = 0;
-    int aggr_len, len2;
-    struct sockaddr_nl nladdr;
-    pid_t tid = 0;
-    pid_t rtid = 0;
-    int cmd_type = TASKSTATS_TYPE_TGID;
-    int c, status;
-    int forking = 0;
-    struct sigaction act = {
-	.sa_handler = SIG_IGN,
-	.sa_mask = SA_NOMASK,
-    };
-    struct sigaction tact ;
-
-    if (argc < 3) {
-	printf("usage %s [-t tgid][-p pid][-c cmd]\n", argv[0]);
-	exit(-1);
-    }
-
-    tact.sa_handler = sigchld;
-    sigemptyset(&tact.sa_mask);
-    if (sigaction(SIGCHLD, &tact, NULL) < 0)
-	err(1, "sigaction failed for SIGCHLD\n");
-
-    while (1) {
-
-	c = getopt(argc, argv, "t:p:c:");
-	if (c < 0)
-	    break;
-
-	switch (c) {
-	case 't':
-	    tid = atoi(optarg);
-	    if (!tid)
-		err(1, "Invalid tgid\n");
-	    cmd_type = TASKSTATS_CMD_ATTR_TGID;
-	    break;
-	case 'p':
-	    tid = atoi(optarg);
-	    if (!tid)
-		err(1, "Invalid pid\n");
-	    cmd_type = TASKSTATS_CMD_ATTR_TGID;
-	    break;
-	case 'c':
-	    opterr = 0;
-	    tid = fork();
-	    if (tid < 0)
-		err(1, "fork failed\n");
-
-	    if (tid == 0) {	/* child process */
-		if (execvp(argv[optind - 1], &argv[optind - 1]) < 0) {
-		    exit(-1);
+		switch (c) {
+		case 'd':
+			printf("print delayacct stats ON\n");
+			print_delays = 1;
+			break;
+		case 'w':
+			strncpy(logfile, optarg, MAX_FILENAME);
+			printf("write to file %s\n", logfile);
+			write_file = 1;
+			break;
+		case 'r':
+			rcvbufsz = atoi(optarg);
+			printf("receive buf size %d\n", rcvbufsz);
+			if (rcvbufsz < 0)
+				err(1, "Invalid rcv buf size\n");
+			break;
+		case 'm':
+			strncpy(cpumask, optarg, sizeof(cpumask));
+			maskset = 1;
+			printf("cpumask %s maskset %d\n", cpumask, maskset);
+			break;
+		case 't':
+			tid = atoi(optarg);
+			if (!tid)
+				err(1, "Invalid tgid\n");
+			cmd_type = TASKSTATS_CMD_ATTR_TGID;
+			print_delays = 1;
+			break;
+		case 'p':
+			tid = atoi(optarg);
+			if (!tid)
+				err(1, "Invalid pid\n");
+			cmd_type = TASKSTATS_CMD_ATTR_PID;
+			print_delays = 1;
+			break;
+		case 'v':
+			printf("debug on\n");
+			dbg = 1;
+			break;
+		case 'l':
+			printf("listen forever\n");
+			loop = 1;
+			break;
+		default:
+			printf("Unknown option %d\n", c);
+			exit(-1);
 		}
-	    }
-	    forking = 1;
-	    break;
-	default:
-	    printf("usage %s [-t tgid][-p pid][-c cmd]\n", argv[0]);
-	    exit(-1);
-	    break;
 	}
-	if (c == 'c')
-	    break;
-    }
-
-    /* Construct Netlink request message */
-
-    /* Send Netlink request message & get reply */
 
-    if ((nl_sd =
-	 create_nl_socket(NETLINK_GENERIC, TASKSTATS_LISTEN_GROUP)) < 0)
-	err(1, "error creating Netlink socket\n");
-
-
-    id = get_family_id(nl_sd);
-
-    /* Send command needed */
-    req.n.nlmsg_len = NLMSG_LENGTH(GENL_HDRLEN);
-    req.n.nlmsg_type = id;
-    req.n.nlmsg_flags = NLM_F_REQUEST;
-    req.n.nlmsg_seq = 0;
-    req.n.nlmsg_pid = tid;
-    req.g.cmd = TASKSTATS_CMD_GET;
-    na = (struct nlattr *) GENLMSG_DATA(&req);
-    na->nla_type = cmd_type;
-    na->nla_len = sizeof(unsigned int) + NLA_HDRLEN;
-    *(__u32 *) NLA_DATA(na) = tid;
-    req.n.nlmsg_len += NLMSG_ALIGN(na->nla_len);
-
-
-    if (!forking && sendto_fd(nl_sd, (char *) &req, req.n.nlmsg_len) < 0)
-	err(1, "error sending message via Netlink\n");
+	if (write_file) {
+		fd = open(logfile, O_WRONLY | O_CREAT | O_TRUNC,
+			  S_IRUSR | S_IWUSR | S_IRGRP | S_IROTH);
+		if (fd == -1) {
+			perror("Cannot open output file\n");
+			exit(1);
+		}
+	}
 
-    act.sa_handler = SIG_IGN;
-    sigemptyset(&act.sa_mask);
-    if (sigaction(SIGINT, &act, NULL) < 0)
-	err(1, "sigaction failed for SIGINT\n");
+	if ((nl_sd = create_nl_socket(NETLINK_GENERIC)) < 0)
+		err(1, "error creating Netlink socket\n");
 
-    do {
-	int i;
-	struct pollfd pfd;
-	int pollres;
 
-	pfd.events = 0xffff & ~POLLOUT;
-	pfd.fd = nl_sd;
-	pollres = poll(&pfd, 1, 5000);
-	if (pollres < 0 || done) {
-	    break;
+	mypid = getpid();
+	id = get_family_id(nl_sd);
+	if (!id) {
+		printf("Error getting family id, errno %d", errno);
+		goto err;
 	}
-
-	rep_len = recv(nl_sd, &ans, sizeof(ans), 0);
-	nladdr.nl_family = AF_NETLINK;
-	nladdr.nl_groups = TASKSTATS_LISTEN_GROUP;
-
-	if (ans.n.nlmsg_type == NLMSG_ERROR) {	/* error */
-	    printf("error received NACK - leaving\n");
-	    exit(1);
+	PRINTF("family id %d\n", id);
+
+	if (maskset) {
+		rc = send_cmd(nl_sd, id, mypid, TASKSTATS_CMD_GET,
+			      TASKSTATS_CMD_ATTR_REGISTER_CPUMASK,
+			      &cpumask, sizeof(cpumask));
+		PRINTF("Sent register cpumask, retval %d\n", rc);
+		if (rc < 0) {
+			printf("error sending register cpumask\n");
+			goto err;
+		}
 	}
 
-	if (rep_len < 0) {
-	    err(1, "error receiving reply message via Netlink\n");
-	    break;
+	if (tid) {
+		rc = send_cmd(nl_sd, id, mypid, TASKSTATS_CMD_GET,
+			      cmd_type, &tid, sizeof(__u32));
+		PRINTF("Sent pid/tgid, retval %d\n", rc);
+		if (rc < 0) {
+			printf("error sending tid/tgid cmd\n");
+			goto done;
+		}
 	}
 
-	/* Validate response message */
-	if (!NLMSG_OK((&ans.n), rep_len))
-	    err(1, "invalid reply message received via Netlink\n");
+	do {
+		int i;
 
-	rep_len = GENLMSG_PAYLOAD(&ans.n);
+		rep_len = recv(nl_sd, &msg, sizeof(msg), 0);
+		PRINTF("received %d bytes\n", rep_len);
 
-	na = (struct nlattr *) GENLMSG_DATA(&ans);
-	len = 0;
-	i = 0;
-	while (len < rep_len) {
-	    len += NLA_ALIGN(na->nla_len);
-	    switch (na->nla_type) {
-	    case TASKSTATS_TYPE_AGGR_PID:
-		/* Fall through */
-	    case TASKSTATS_TYPE_AGGR_TGID:
-		aggr_len = NLA_PAYLOAD(na->nla_len);
-		len2 = 0;
-		/* For nested attributes, na follows */
-		na = (struct nlattr *) NLA_DATA(na);
-		done = 0;
-		while (len2 < aggr_len) {
-		    switch (na->nla_type) {
-		    case TASKSTATS_TYPE_PID:
-			rtid = *(int *) NLA_DATA(na);
-			break;
-		    case TASKSTATS_TYPE_TGID:
-			rtid = *(int *) NLA_DATA(na);
-			break;
-		    case TASKSTATS_TYPE_STATS:
-			if (rtid == tid) {
-			    print_taskstats((struct taskstats *)
-					    NLA_DATA(na));
-			    done = 1;
+		if (rep_len < 0) {
+			printf("nonfatal reply error: errno %d\n", errno);
+			continue;
+		}
+		if (msg.n.nlmsg_type == NLMSG_ERROR ||
+		    !NLMSG_OK((&msg.n), rep_len)) {
+			printf("fatal reply error,  errno %d\n", errno);
+			goto done;
+		}
+
+		PRINTF("nlmsghdr size=%d, nlmsg_len=%d, rep_len=%d\n",
+		       sizeof(struct nlmsghdr), msg.n.nlmsg_len, rep_len);
+
+
+		rep_len = GENLMSG_PAYLOAD(&msg.n);
+
+		na = (struct nlattr *) GENLMSG_DATA(&msg);
+		len = 0;
+		i = 0;
+		while (len < rep_len) {
+			len += NLA_ALIGN(na->nla_len);
+			switch (na->nla_type) {
+			case TASKSTATS_TYPE_AGGR_TGID:
+				/* Fall through */
+			case TASKSTATS_TYPE_AGGR_PID:
+				aggr_len = NLA_PAYLOAD(na->nla_len);
+				len2 = 0;
+				/* For nested attributes, na follows */
+				na = (struct nlattr *) NLA_DATA(na);
+				done = 0;
+				while (len2 < aggr_len) {
+					switch (na->nla_type) {
+					case TASKSTATS_TYPE_PID:
+						rtid = *(int *) NLA_DATA(na);
+						if (print_delays)
+							printf("PID\t%d\n", rtid);
+						break;
+					case TASKSTATS_TYPE_TGID:
+						rtid = *(int *) NLA_DATA(na);
+						if (print_delays)
+							printf("TGID\t%d\n", rtid);
+						break;
+					case TASKSTATS_TYPE_STATS:
+						count++;
+						if (print_delays)
+							print_delayacct((struct taskstats *) NLA_DATA(na));
+						if (fd) {
+							if (write(fd, NLA_DATA(na), na->nla_len) < 0) {
+								err(1,"write error\n");
+							}
+						}
+						if (!loop)
+							goto done;
+						break;
+					default:
+						printf("Unknown nested nla_type %d\n", na->nla_type);
+						break;
+					}
+					len2 += NLA_ALIGN(na->nla_len);
+					na = (struct nlattr *) ((char *) na + len2);
+				}
+				break;
+
+			default:
+				printf("Unknown nla_type %d\n", na->nla_type);
+				break;
 			}
-			break;
-		    }
-		    len2 += NLA_ALIGN(na->nla_len);
-		    na = (struct nlattr *) ((char *) na + len2);
-		    if (done)
-			break;
+			na = (struct nlattr *) (GENLMSG_DATA(&msg) + len);
 		}
-	    }
-	    na = (struct nlattr *) (GENLMSG_DATA(&ans) + len);
-	    if (done)
-		break;
+	} while (loop);
+done:
+	if (maskset) {
+		rc = send_cmd(nl_sd, id, mypid, TASKSTATS_CMD_GET,
+			      TASKSTATS_CMD_ATTR_DEREGISTER_CPUMASK,
+			      &cpumask, sizeof(cpumask));
+		printf("Sent deregister mask, retval %d\n", rc);
+		if (rc < 0)
+			err(rc, "error sending deregister cpumask\n");
 	}
-	if (done)
-	    break;
-    }
-    while (1);
-
-    close(nl_sd);
-    return 0;
+err:
+	close(nl_sd);
+	if (fd)
+		close(fd);
+	return 0;
 }
diff --git a/Documentation/accounting/taskstats.txt b/Documentation/accounting/taskstats.txt
index efd8f605bcd..92ebf29e904 100644
--- a/Documentation/accounting/taskstats.txt
+++ b/Documentation/accounting/taskstats.txt
@@ -26,20 +26,28 @@ leader - a process is deemed alive as long as it has any task belonging to it.
 Usage
 -----
 
-To get statistics during task's lifetime, userspace opens a unicast netlink
+To get statistics during a task's lifetime, userspace opens a unicast netlink
 socket (NETLINK_GENERIC family) and sends commands specifying a pid or a tgid.
 The response contains statistics for a task (if pid is specified) or the sum of
 statistics for all tasks of the process (if tgid is specified).
 
-To obtain statistics for tasks which are exiting, userspace opens a multicast
-netlink socket. Each time a task exits, its per-pid statistics is always sent
-by the kernel to each listener on the multicast socket. In addition, if it is
-the last thread exiting its thread group, an additional record containing the
-per-tgid stats are also sent. The latter contains the sum of per-pid stats for
-all threads in the thread group, both past and present.
+To obtain statistics for tasks which are exiting, the userspace listener
+sends a register command and specifies a cpumask. Whenever a task exits on
+one of the cpus in the cpumask, its per-pid statistics are sent to the
+registered listener. Using cpumasks allows the data received by one listener
+to be limited and assists in flow control over the netlink interface and is
+explained in more detail below.
+
+If the exiting task is the last thread exiting its thread group,
+an additional record containing the per-tgid stats is also sent to userspace.
+The latter contains the sum of per-pid stats for all threads in the thread
+group, both past and present.
 
 getdelays.c is a simple utility demonstrating usage of the taskstats interface
-for reporting delay accounting statistics.
+for reporting delay accounting statistics. Users can register cpumasks,
+send commands and process responses, listen for per-tid/tgid exit data,
+write the data received to a file and do basic flow control by increasing
+receive buffer sizes.
 
 Interface
 ---------
@@ -66,10 +74,20 @@ The messages are in the format
 
 The taskstats payload is one of the following three kinds:
 
-1. Commands: Sent from user to kernel. The payload is one attribute, of type
-TASKSTATS_CMD_ATTR_PID/TGID, containing a u32 pid or tgid in the attribute
-payload. The pid/tgid denotes the task/process for which userspace wants
-statistics.
+1. Commands: Sent from user to kernel. Commands to get data on
+a pid/tgid consist of one attribute, of type TASKSTATS_CMD_ATTR_PID/TGID,
+containing a u32 pid or tgid in the attribute payload. The pid/tgid denotes
+the task/process for which userspace wants statistics.
+
+Commands to register/deregister interest in exit data from a set of cpus
+consist of one attribute, of type
+TASKSTATS_CMD_ATTR_REGISTER/DEREGISTER_CPUMASK and contain a cpumask in the
+attribute payload. The cpumask is specified as an ascii string of
+comma-separated cpu ranges e.g. to listen to exit data from cpus 1,2,3,5,7,8
+the cpumask would be "1-3,5,7-8". If userspace forgets to deregister interest
+in cpus before closing the listening socket, the kernel cleans up its interest
+set over time. However, for the sake of efficiency, an explicit deregistration
+is advisable.
 
 2. Response for a command: sent from the kernel in response to a userspace
 command. The payload is a series of three attributes of type:
@@ -138,4 +156,26 @@ struct too much, requiring disparate userspace accounting utilities to
 unnecessarily receive large structures whose fields are of no interest, then
 extending the attributes structure would be worthwhile.
 
+Flow control for taskstats
+--------------------------
+
+When the rate of task exits becomes large, a listener may not be able to keep
+up with the kernel's rate of sending per-tid/tgid exit data leading to data
+loss. This possibility gets compounded when the taskstats structure gets
+extended and the number of cpus grows large.
+
+To avoid losing statistics, userspace should do one or more of the following:
+
+- increase the receive buffer sizes for the netlink sockets opened by
+listeners to receive exit data.
+
+- create more listeners and reduce the number of cpus being listened to by
+each listener. In the extreme case, there could be one listener for each cpu.
+Users may also consider setting the cpu affinity of the listener to the subset
+of cpus to which it listens, especially if they are listening to just one cpu.
+
+Despite these measures, if the userspace receives ENOBUFS error messages
+indicated overflow of receive buffers, it should take measures to handle the
+loss of data.
+
 ----
-- 
cgit v1.2.3


From f92213bae062cf88c099fbfd3040fef512b19905 Mon Sep 17 00:00:00 2001
From: Steven Rostedt <rostedt@goodmis.org>
Date: Fri, 14 Jul 2006 16:05:01 -0400
Subject: [PATCH] remove set_wmb - doc update

This patch removes the reference to set_wmb from memory-barriers.txt
since it shouldn't be used.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
---
 Documentation/memory-barriers.txt | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

(limited to 'Documentation')

diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt
index 28d1bc3edb1..46b9b389df3 100644
--- a/Documentation/memory-barriers.txt
+++ b/Documentation/memory-barriers.txt
@@ -1015,10 +1015,9 @@ CPU from reordering them.
 There are some more advanced barrier functions:
 
  (*) set_mb(var, value)
- (*) set_wmb(var, value)
 
-     These assign the value to the variable and then insert at least a write
-     barrier after it, depending on the function.  They aren't guaranteed to
+     This assigns the value to the variable and then inserts at least a write
+     barrier after it, depending on the function.  It isn't guaranteed to
      insert anything more than a compiler barrier in a UP compilation.
 
 
-- 
cgit v1.2.3


From 10ea6ac895418bd0d23900e3330daa6ba0836d26 Mon Sep 17 00:00:00 2001
From: Patrick McHardy <kaber@trash.net>
Date: Mon, 24 Jul 2006 22:54:55 -0700
Subject: [NETFILTER]: bridge netfilter: add deferred output hooks to
 feature-removal-schedule

Add bridge netfilter deferred output hooks to feature-removal-schedule
and disable them by default. Until their removal they will be
activated by the physdev match when needed.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 Documentation/feature-removal-schedule.txt | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

(limited to 'Documentation')

diff --git a/Documentation/feature-removal-schedule.txt b/Documentation/feature-removal-schedule.txt
index 9d3a0775a11..87851efb022 100644
--- a/Documentation/feature-removal-schedule.txt
+++ b/Documentation/feature-removal-schedule.txt
@@ -258,3 +258,19 @@ Why:	These drivers never compiled since they were added to the kernel
 Who:	Jean Delvare <khali@linux-fr.org>
 
 ---------------------------
+
+What:	Bridge netfilter deferred IPv4/IPv6 output hook calling
+When:	January 2007
+Why:	The deferred output hooks are a layering violation causing unusual
+	and broken behaviour on bridge devices. Examples of things they
+	break include QoS classifation using the MARK or CLASSIFY targets,
+	the IPsec policy match and connection tracking with VLANs on a
+	bridge. Their only use is to enable bridge output port filtering
+	within iptables with the physdev match, which can also be done by
+	combining iptables and ebtables using netfilter marks. Until it
+	will get removed the hook deferral is disabled by default and is
+	only enabled when needed.
+
+Who:	Patrick McHardy <kaber@trash.net>
+
+---------------------------
-- 
cgit v1.2.3


From fbf6080225a03aa2b3671acacebdf615f1d3f6ba Mon Sep 17 00:00:00 2001
From: "Ju, Seokmann" <Seokmann.Ju@lsil.com>
Date: Tue, 25 Jul 2006 08:44:48 -0600
Subject: [SCSI] megaraid_{mm,mbox}: 64-bit DMA capability checker

This patch contains
- a fix for 64-bit DMA capability check in megaraid_{mm,mbox} driver.
- includes changes (going back to 32-bit DMA mask if 64-bit DMA mask
failes) suggested by James with previous patch.
- addition of SATA 150-4/6 as commented by Vasily Averin.

With patch, the driver access PCIconfiguration space with dedicated
offset to read a signature. If the signature read, it means that the
controller has capability to handle 64-bit DMA.
Without this patch, the driver used to blindly claim 64-bit DMA
capability.
The issue has been reported by Vasily Averin [vvs@sw.ru].
Thank you Vasily for the reporting.

Signed-Off By: Seokmann Ju <seokmann.ju@lsil.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
---
 Documentation/scsi/ChangeLog.megaraid | 61 +++++++++++++++++++++++++++++++++++
 1 file changed, 61 insertions(+)

(limited to 'Documentation')

diff --git a/Documentation/scsi/ChangeLog.megaraid b/Documentation/scsi/ChangeLog.megaraid
index c173806c91f..fd8939e0045 100644
--- a/Documentation/scsi/ChangeLog.megaraid
+++ b/Documentation/scsi/ChangeLog.megaraid
@@ -1,3 +1,64 @@
+Release Date	: Fri May 19 09:31:45 EST 2006 - Seokmann Ju <sju@lsil.com>
+Current Version : 2.20.4.9 (scsi module), 2.20.2.6 (cmm module)
+Older Version	: 2.20.4.8 (scsi module), 2.20.2.6 (cmm module)
+
+1.	Fixed a bug in megaraid_init_mbox().
+	Customer reported "garbage in file on x86_64 platform".
+	Root Cause: the driver registered controllers as 64-bit DMA capable
+	for those which are not support it.
+	Fix: Made change in the function inserting identification machanism
+	identifying 64-bit DMA capable controllers.
+
+	> -----Original Message-----
+	> From: Vasily Averin [mailto:vvs@sw.ru]
+	> Sent: Thursday, May 04, 2006 2:49 PM
+	> To: linux-scsi@vger.kernel.org; Kolli, Neela; Mukker, Atul;
+	> Ju, Seokmann; Bagalkote, Sreenivas;
+	> James.Bottomley@SteelEye.com; devel@openvz.org
+	> Subject: megaraid_mbox: garbage in file
+	>
+	> Hello all,
+	>
+	> I've investigated customers claim on the unstable work of
+	> their node and found a
+	> strange effect: reading from some files leads to the
+	>  "attempt to access beyond end of device" messages.
+	>
+	> I've checked filesystem, memory on the node, motherboard BIOS
+	> version, but it
+	> does not help and issue still has been reproduced by simple
+	> file reading.
+	>
+	> Reproducer is simple:
+	>
+	> echo 0xffffffff >/proc/sys/dev/scsi/logging_level ;
+	> cat /vz/private/101/root/etc/ld.so.cache >/tmp/ttt  ;
+	> echo 0 >/proc/sys/dev/scsi/logging
+	>
+	> It leads to the following messages in dmesg
+	>
+	> sd_init_command: disk=sda, block=871769260, count=26
+	> sda : block=871769260
+	> sda : reading 26/26 512 byte blocks.
+	> scsi_add_timer: scmd: f79ed980, time: 7500, (c02b1420)
+	> sd 0:1:0:0: send 0xf79ed980                  sd 0:1:0:0:
+	>         command: Read (10): 28 00 33 f6 24 ac 00 00 1a 00
+	> buffer = 0xf7cfb540, bufflen = 13312, done = 0xc0366b40,
+	> queuecommand 0xc0344010
+	> leaving scsi_dispatch_cmnd()
+	> scsi_delete_timer: scmd: f79ed980, rtn: 1
+	> sd 0:1:0:0: done 0xf79ed980 SUCCESS        0 sd 0:1:0:0:
+	>         command: Read (10): 28 00 33 f6 24 ac 00 00 1a 00
+	> scsi host busy 1 failed 0
+	> sd 0:1:0:0: Notifying upper driver of completion (result 0)
+	> sd_rw_intr: sda: res=0x0
+	> 26 sectors total, 13312 bytes done.
+	> use_sg is 4
+	> attempt to access beyond end of device
+	> sda6: rw=0, want=1044134458, limit=951401367
+	> Buffer I/O error on device sda6, logical block 522067228
+	> attempt to access beyond end of device
+
 Release Date	: Mon Apr 11 12:27:22 EST 2006 - Seokmann Ju <sju@lsil.com>
 Current Version : 2.20.4.8 (scsi module), 2.20.2.6 (cmm module)
 Older Version	: 2.20.4.7 (scsi module), 2.20.2.6 (cmm module)
-- 
cgit v1.2.3


From aa677bc7445147f663ebde69d248a30839bada76 Mon Sep 17 00:00:00 2001
From: "Ju, Seokmann" <Seokmann.Ju@lsil.com>
Date: Tue, 25 Jul 2006 08:44:58 -0600
Subject: [SCSI] megaraid_{mm,mbox}: a fix on INQUIRY with EVPD

With this patch, driver will protect data corruption created by
INQUIRY with EVPD request to megaraid controllers.  As specified in
the changelog, megaraid F/W already has fixed the issue and being
under process of release. Meanwhile, driver will protect the system
with this patch.

Signed-Off By: Seokmann Ju <seokmann.ju@lsil.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
---
 Documentation/scsi/ChangeLog.megaraid | 7 +++++++
 1 file changed, 7 insertions(+)

(limited to 'Documentation')

diff --git a/Documentation/scsi/ChangeLog.megaraid b/Documentation/scsi/ChangeLog.megaraid
index fd8939e0045..0edb048b2ea 100644
--- a/Documentation/scsi/ChangeLog.megaraid
+++ b/Documentation/scsi/ChangeLog.megaraid
@@ -59,6 +59,13 @@ Older Version	: 2.20.4.8 (scsi module), 2.20.2.6 (cmm module)
 	> Buffer I/O error on device sda6, logical block 522067228
 	> attempt to access beyond end of device
 
+2.	When INQUIRY with EVPD bit set issued to the MegaRAID controller,
+	system memory gets corrupted.
+	Root Cause: MegaRAID F/W handle the INQUIRY with EVPD bit set
+	incorrectly.
+	Fix: MegaRAID F/W has fixed the problem and being process of release,
+	soon. Meanwhile, driver will filter out the request.
+
 Release Date	: Mon Apr 11 12:27:22 EST 2006 - Seokmann Ju <sju@lsil.com>
 Current Version : 2.20.4.8 (scsi module), 2.20.2.6 (cmm module)
 Older Version	: 2.20.4.7 (scsi module), 2.20.2.6 (cmm module)
-- 
cgit v1.2.3


From 0b4972d59170e13ab0236e8a7148112052590c01 Mon Sep 17 00:00:00 2001
From: "Ju, Seokmann" <Seokmann.Ju@lsil.com>
Date: Tue, 25 Jul 2006 08:45:06 -0600
Subject: [SCSI] megaraid_{mm,mbox}: a fix on "kernel unaligned access address"
 issue

There was an issue in the data structure defined by megaraid driver
casuing "kernel unaligned access.." messages to be displayed during
IOCTL on IA64 platform.

The issue has been reported/fixed by Sakurai Hiroomi
[sakurai_hiro@soft.fujitsu.com].

Signed-Off By: Seokmann Ju <seokmann.ju@lsil.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
---
 Documentation/scsi/ChangeLog.megaraid | 55 +++++++++++++++++++++++++++++++++++
 1 file changed, 55 insertions(+)

(limited to 'Documentation')

diff --git a/Documentation/scsi/ChangeLog.megaraid b/Documentation/scsi/ChangeLog.megaraid
index 0edb048b2ea..a056bbe67c7 100644
--- a/Documentation/scsi/ChangeLog.megaraid
+++ b/Documentation/scsi/ChangeLog.megaraid
@@ -66,6 +66,61 @@ Older Version	: 2.20.4.8 (scsi module), 2.20.2.6 (cmm module)
 	Fix: MegaRAID F/W has fixed the problem and being process of release,
 	soon. Meanwhile, driver will filter out the request.
 
+3.	One of member in the data structure of the driver leads unaligne
+	issue on 64-bit platform.
+	Customer reporeted "kernel unaligned access addrss" issue when
+	application communicates with MegaRAID HBA driver.
+	Root Cause: in uioc_t structure, one of member had misaligned and it
+	led system to display the error message.
+	Fix: A patch submitted to community from following folk.
+
+	> -----Original Message-----
+	> From: linux-scsi-owner@vger.kernel.org
+	> [mailto:linux-scsi-owner@vger.kernel.org] On Behalf Of Sakurai Hiroomi
+	> Sent: Wednesday, July 12, 2006 4:20 AM
+	> To: linux-scsi@vger.kernel.org; linux-kernel@vger.kernel.org
+	> Subject: Re: Help: strange messages from kernel on IA64 platform
+	>
+	> Hi,
+	>
+	> I saw same message.
+	>
+	> When GAM(Global Array Manager) is started, The following
+	> message output.
+	> kernel: kernel unaligned access to 0xe0000001fe1080d4,
+	> ip=0xa000000200053371
+	>
+	> The uioc structure used by ioctl is defined by packed,
+	> the allignment of each member are disturbed.
+	> In a 64 bit structure, the allignment of member doesn't fit 64 bit
+	> boundary. this causes this messages.
+	> In a 32 bit structure, we don't see the message because the allinment
+	> of member fit 32 bit boundary even if packed is specified.
+	>
+	> patch
+	> I Add 32 bit dummy member to fit 64 bit boundary. I tested.
+	> We confirmed this patch fix the problem by IA64 server.
+	>
+	> **************************************************************
+	> ****************
+	> --- linux-2.6.9/drivers/scsi/megaraid/megaraid_ioctl.h.orig
+	> 2006-04-03 17:13:03.000000000 +0900
+	> +++ linux-2.6.9/drivers/scsi/megaraid/megaraid_ioctl.h
+	> 2006-04-03 17:14:09.000000000 +0900
+	> @@ -132,6 +132,10 @@
+	>  /* Driver Data: */
+	>          void __user *           user_data;
+	>          uint32_t                user_data_len;
+	> +
+	> +        /* 64bit alignment */
+	> +        uint32_t                pad_0xBC;
+	> +
+	>          mraid_passthru_t        __user *user_pthru;
+	>
+	>          mraid_passthru_t        *pthru32;
+	> **************************************************************
+	> ****************
+
 Release Date	: Mon Apr 11 12:27:22 EST 2006 - Seokmann Ju <sju@lsil.com>
 Current Version : 2.20.4.8 (scsi module), 2.20.2.6 (cmm module)
 Older Version	: 2.20.4.7 (scsi module), 2.20.2.6 (cmm module)
-- 
cgit v1.2.3


From b783fd925cdd56d24d164e5bdcb072f2a67aedf4 Mon Sep 17 00:00:00 2001
From: Andi Kleen <ak@suse.de>
Date: Fri, 28 Jul 2006 14:44:54 +0200
Subject: [PATCH] x86_64: Document backtracer selection options

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
---
 Documentation/x86_64/boot-options.txt | 7 +++++++
 1 file changed, 7 insertions(+)

(limited to 'Documentation')

diff --git a/Documentation/x86_64/boot-options.txt b/Documentation/x86_64/boot-options.txt
index 6887d44d266..6da24e7a56c 100644
--- a/Documentation/x86_64/boot-options.txt
+++ b/Documentation/x86_64/boot-options.txt
@@ -238,6 +238,13 @@ Debugging
   pagefaulttrace Dump all page faults. Only useful for extreme debugging
 		and will create a lot of output.
 
+  call_trace=[old|both|newfallback|new]
+		old: use old inexact backtracer
+		new: use new exact dwarf2 unwinder
+ 		both: print entries from both
+		newfallback: use new unwinder but fall back to old if it gets
+			stuck (default)
+
 Misc
 
   noreplacement  Don't replace instructions with more appropriate ones
-- 
cgit v1.2.3


From 163ecdff060f2fa9e8f5238882fd0137493556a6 Mon Sep 17 00:00:00 2001
From: Shailabh Nagar <nagar@watson.ibm.com>
Date: Sun, 30 Jul 2006 03:03:11 -0700
Subject: [PATCH] delay accounting: temporarily enable by default

Enable delay accounting by default so that feature gets coverage testing
without requiring special measures.

Earlier, it was off by default and had to be enabled via a boot time param.
 This patch reverses the default behaviour to improve coverage testing.  It
can be removed late in the kernel development cycle if its believed users
shouldn't have to incur any cost if they don't want delay accounting.  Or
it can be retained forever if the utility of the stats is deemed common
enough to warrant keeping the feature on.

Signed-off-by: Shailabh Nagar <nagar@watson.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
---
 Documentation/accounting/delay-accounting.txt | 10 ++++++----
 Documentation/kernel-parameters.txt           |  4 ++--
 2 files changed, 8 insertions(+), 6 deletions(-)

(limited to 'Documentation')

diff --git a/Documentation/accounting/delay-accounting.txt b/Documentation/accounting/delay-accounting.txt
index be215e58423..1443cd71d26 100644
--- a/Documentation/accounting/delay-accounting.txt
+++ b/Documentation/accounting/delay-accounting.txt
@@ -64,11 +64,13 @@ Compile the kernel with
 	CONFIG_TASK_DELAY_ACCT=y
 	CONFIG_TASKSTATS=y
 
-Enable the accounting at boot time by adding
-the following to the kernel boot options
-	delayacct
+Delay accounting is enabled by default at boot up.
+To disable, add
+   nodelayacct
+to the kernel boot options. The rest of the instructions
+below assume this has not been done.
 
-and after the system has booted up, use a utility
+After the system has booted up, use a utility
 similar to  getdelays.c to access the delays
 seen by a given task or a task group (tgid).
 The utility also allows a given command to be
diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index e11f7728ec6..b50595a0550 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -448,8 +448,6 @@ running once the system is up.
 			Format: <area>[,<node>]
 			See also Documentation/networking/decnet.txt.
 
-	delayacct	[KNL] Enable per-task delay accounting
-
 	dhash_entries=	[KNL]
 			Set number of hash buckets for dentry cache.
 
@@ -1031,6 +1029,8 @@ running once the system is up.
 
 	nocache		[ARM]
 
+	nodelayacct	[KNL] Disable per-task delay accounting
+
 	nodisconnect	[HW,SCSI,M68K] Disables SCSI disconnects.
 
 	noexec		[IA-64]
-- 
cgit v1.2.3


From 7c7165c90801609b70492e50b2a9c69a677c573a Mon Sep 17 00:00:00 2001
From: Chandra Seetharaman <sekharan@us.ibm.com>
Date: Sun, 30 Jul 2006 03:03:36 -0700
Subject: [PATCH] cpu hotplug: fix hotplug cpu documentation for proper usage

Update hotplug cpu documentation to clearly state when to use
register_cpu_notifier() and register_hotcpu_notifier.

Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
---
 Documentation/cpu-hotplug.txt | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

(limited to 'Documentation')

diff --git a/Documentation/cpu-hotplug.txt b/Documentation/cpu-hotplug.txt
index 1bcf69996c9..bc107cb157a 100644
--- a/Documentation/cpu-hotplug.txt
+++ b/Documentation/cpu-hotplug.txt
@@ -251,16 +251,24 @@ A: This is what you would need in your kernel code to receive notifications.
 		return NOTIFY_OK;
 	}
 
-	static struct notifier_block foobar_cpu_notifer =
+	static struct notifier_block __cpuinitdata foobar_cpu_notifer =
 	{
 	   .notifier_call = foobar_cpu_callback,
 	};
 
+You need to call register_cpu_notifier() from your init function.
+Init functions could be of two types:
+1. early init (init function called when only the boot processor is online).
+2. late init (init function called _after_ all the CPUs are online).
 
-In your init function,
+For the first case, you should add the following to your init function
 
 	register_cpu_notifier(&foobar_cpu_notifier);
 
+For the second case, you should add the following to your init function
+
+	register_hotcpu_notifier(&foobar_cpu_notifier);
+
 You can fail PREPARE notifiers if something doesn't work to prepare resources.
 This will stop the activity and send a following CANCELED event back.
 
-- 
cgit v1.2.3


From 2b54960bdf8fbb57d94dd61f4ac7513535ca7168 Mon Sep 17 00:00:00 2001
From: Randy Dunlap <rdunlap@xenotime.net>
Date: Sun, 30 Jul 2006 03:03:40 -0700
Subject: [PATCH] fix kernel-api doc for kernel/resource.c

insert_resource() was unexported, so kernel-doc needs to be told to search
kernel/resource.c for internal functions instead of exported functions so that
it won't report an error.

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
---
 Documentation/DocBook/kernel-api.tmpl | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

(limited to 'Documentation')

diff --git a/Documentation/DocBook/kernel-api.tmpl b/Documentation/DocBook/kernel-api.tmpl
index 1ae4dc0fd85..db67ba6f5b1 100644
--- a/Documentation/DocBook/kernel-api.tmpl
+++ b/Documentation/DocBook/kernel-api.tmpl
@@ -300,7 +300,7 @@ X!Ekernel/module.c
      </sect1>
 
      <sect1><title>Resources Management</title>
-!Ekernel/resource.c
+!Ikernel/resource.c
      </sect1>
 
      <sect1><title>MTRR Handling</title>
-- 
cgit v1.2.3


From d75763d24063cafe28ace8863560da9c968ee099 Mon Sep 17 00:00:00 2001
From: Randy Dunlap <rdunlap@xenotime.net>
Date: Sun, 30 Jul 2006 03:03:41 -0700
Subject: [PATCH] pci/search: cleanups, add to kernel-api.tmpl

Clean up kernel-doc comments in drivers/pci/search.c (line sizes and typos).

Enable that source file in DocBook/kernel-api.tmpl.

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
---
 Documentation/DocBook/kernel-api.tmpl | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

(limited to 'Documentation')

diff --git a/Documentation/DocBook/kernel-api.tmpl b/Documentation/DocBook/kernel-api.tmpl
index db67ba6f5b1..356e34e6ce3 100644
--- a/Documentation/DocBook/kernel-api.tmpl
+++ b/Documentation/DocBook/kernel-api.tmpl
@@ -312,9 +312,7 @@ X!Ekernel/module.c
 !Edrivers/pci/pci-driver.c
 !Edrivers/pci/remove.c
 !Edrivers/pci/pci-acpi.c
-<!-- kerneldoc does not understand __devinit
-X!Edrivers/pci/search.c
- -->
+!Edrivers/pci/search.c
 !Edrivers/pci/msi.c
 !Edrivers/pci/bus.c
 <!-- FIXME: Removed for now since no structured comments in source
-- 
cgit v1.2.3


From 0fcb78c22f06340cdba884d7381adb3a0148bbb6 Mon Sep 17 00:00:00 2001
From: Rolf Eike Beer <eike-kernel@sf-tec.de>
Date: Sun, 30 Jul 2006 03:03:42 -0700
Subject: [PATCH] Add DocBook documentation for workqueue functions

kernel/workqueue.c was omitted from generating kernel documentation.  This
adds a new section "Workqueues and Kevents" and adds documentation for some
of the functions.

Some functions in this file already had DocBook-style comments, now they
finally become visible.

Signed-off-by: Rolf Eike Beer <eike-kernel@sf-tec.de>
Cc: "Randy.Dunlap" <rdunlap@xenotime.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
---
 Documentation/DocBook/kernel-api.tmpl | 3 +++
 1 file changed, 3 insertions(+)

(limited to 'Documentation')

diff --git a/Documentation/DocBook/kernel-api.tmpl b/Documentation/DocBook/kernel-api.tmpl
index 356e34e6ce3..f8fe882e33d 100644
--- a/Documentation/DocBook/kernel-api.tmpl
+++ b/Documentation/DocBook/kernel-api.tmpl
@@ -58,6 +58,9 @@
 !Iinclude/linux/ktime.h
 !Iinclude/linux/hrtimer.h
 !Ekernel/hrtimer.c
+     </sect1>
+     <sect1><title>Workqueues and Kevents</title>
+!Ekernel/workqueue.c
      </sect1>
      <sect1><title>Internal Functions</title>
 !Ikernel/exit.c
-- 
cgit v1.2.3


From bc7455fa3b5ada2a47d24755cc431f4dfff052cb Mon Sep 17 00:00:00 2001
From: Randy Dunlap <rdunlap@xenotime.net>
Date: Sun, 30 Jul 2006 03:03:45 -0700
Subject: [PATCH] Doc/SubmittingPatches cleanups

A few cleanups to SubmittingPatches:
- mention SubmitChecklist
- remove mention of my simple patch script tools
- remove last-updated line

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
---
 Documentation/SubmittingPatches | 10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)

(limited to 'Documentation')

diff --git a/Documentation/SubmittingPatches b/Documentation/SubmittingPatches
index c2c85bcb3d4..2cd7f02ffd0 100644
--- a/Documentation/SubmittingPatches
+++ b/Documentation/SubmittingPatches
@@ -10,7 +10,9 @@ kernel, the process can sometimes be daunting if you're not familiar
 with "the system."  This text is a collection of suggestions which
 can greatly increase the chances of your change being accepted.
 
-If you are submitting a driver, also read Documentation/SubmittingDrivers.
+Read Documentation/SubmitChecklist for a list of items to check
+before submitting code.  If you are submitting a driver, also read
+Documentation/SubmittingDrivers.
 
 
@@ -74,9 +76,6 @@ There are a number of scripts which can aid in this:
 Quilt:
 http://savannah.nongnu.org/projects/quilt
 
-Randy Dunlap's patch scripts:
-http://www.xenotime.net/linux/scripts/patching-scripts-002.tar.gz
-
 Andrew Morton's patch scripts:
 http://www.zip.com.au/~akpm/linux/patches/
 Instead of these scripts, quilt is the recommended patch management
@@ -484,7 +483,7 @@ Greg Kroah-Hartman "How to piss off a kernel subsystem maintainer".
   <http://www.kroah.com/log/2005/10/19/>
   <http://www.kroah.com/log/2006/01/11/>
 
-NO!!!! No more huge patch bombs to linux-kernel@vger.kernel.org people!.
+NO!!!! No more huge patch bombs to linux-kernel@vger.kernel.org people!
   <http://marc.theaimsgroup.com/?l=linux-kernel&m=112112749912944&w=2>
 
 Kernel Documentation/CodingStyle
@@ -493,4 +492,3 @@ Kernel Documentation/CodingStyle
 Linus Torvald's mail on the canonical patch format:
   <http://lkml.org/lkml/2005/4/7/183>
 --
-Last updated on 17 Nov 2005.
-- 
cgit v1.2.3


From 953a7f20667a8b6217ea2ac49c0877e957a0130a Mon Sep 17 00:00:00 2001
From: Pete Zaitcev <zaitcev@redhat.com>
Date: Sun, 30 Jul 2006 03:04:01 -0700
Subject: [PATCH] Typo in ub clause of devices.txt

Change "Thrid" into "Third", and realign similarly to other entries.

Signed-off-by: Pete Zaitcev <zaitcev@redhat.com>
Cc: <device@lanana.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
---
 Documentation/devices.txt | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

(limited to 'Documentation')

diff --git a/Documentation/devices.txt b/Documentation/devices.txt
index 4aaf68fafeb..66c725f530f 100644
--- a/Documentation/devices.txt
+++ b/Documentation/devices.txt
@@ -2565,10 +2565,10 @@ Your cooperation is appreciated.
 		243 = /dev/usb/dabusb3	Fourth dabusb device
 
 180 block	USB block devices
-		0 = /dev/uba		First USB block device
-		8 = /dev/ubb		Second USB block device
-		16 = /dev/ubc		Thrid USB block device
-		...
+		  0 = /dev/uba		First USB block device
+		  8 = /dev/ubb		Second USB block device
+		 16 = /dev/ubc		Third USB block device
+		    ...
 
 181 char	Conrad Electronic parallel port radio clocks
 		  0 = /dev/pcfclock0	First Conrad radio clock
-- 
cgit v1.2.3


From 0b0bf7a3ccb6f0b38ead71980e79f875046047b7 Mon Sep 17 00:00:00 2001
From: Roland McGrath <roland@redhat.com>
Date: Sun, 30 Jul 2006 03:04:06 -0700
Subject: [PATCH] vDSO hash-style fix

The latest toolchains can produce a new ELF section in DSOs and
dynamically-linked executables.  The new section ".gnu.hash" replaces
".hash", and allows for more efficient runtime symbol lookups by the
dynamic linker.  The new ld option --hash-style={sysv|gnu|both} controls
whether to produce the old ".hash", the new ".gnu.hash", or both.  In some
new systems such as Fedora Core 6, gcc by default passes --hash-style=gnu
to the linker, so that a standard invocation of "gcc -shared" results in
producing a DSO with only ".gnu.hash".  The new ".gnu.hash" sections need
to be dealt with the same way as ".hash" sections in all respects; only the
dynamic linker cares about their contents.  To work with older dynamic
linkers (i.e.  preexisting releases of glibc), a binary must have the old
".hash" section.  The --hash-style=both option produces binaries that a new
dynamic linker can use more efficiently, but an old dynamic linker can
still handle.

The new section runs afoul of the custom linker scripts used to build vDSO
images for the kernel.  On ia64, the failure mode for this is a boot-time
panic because the vDSO's PT_IA_64_UNWIND segment winds up ill-formed.

This patch addresses the problem in two ways.

First, it mentions ".gnu.hash" in all the linker scripts alongside ".hash".
 This produces correct vDSO images with --hash-style=sysv (or old tools),
with --hash-style=gnu, or with --hash-style=both.

Second, it passes the --hash-style=sysv option when building the vDSO
images, so that ".gnu.hash" is not actually produced.  This is the most
conservative choice for compatibility with any old userland.  There is some
concern that some ancient glibc builds (though not any known old production
system) might choke on --hash-style=both binaries.  The optimizations
provided by the new style of hash section do not really matter for a DSO
with a tiny number of symbols, as the vDSO has.  If someone wants to use
=gnu or =both for their vDSO builds and worry less about that
compatibility, just change the option and the linker script changes will
make any choice work fine.

Signed-off-by: Roland McGrath <roland@redhat.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Andi Kleen <ak@muc.de>
Cc: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
---
 Documentation/kbuild/makefiles.txt | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

(limited to 'Documentation')

diff --git a/Documentation/kbuild/makefiles.txt b/Documentation/kbuild/makefiles.txt
index 14ef3868a32..0706699c9da 100644
--- a/Documentation/kbuild/makefiles.txt
+++ b/Documentation/kbuild/makefiles.txt
@@ -407,6 +407,20 @@ more details, with real examples.
 	The second argument is optional, and if supplied will be used
 	if first argument is not supported.
 
+    ld-option
+    	ld-option is used to check if $(CC) when used to link object files
+	supports the given option.  An optional second option may be
+	specified if first option are not supported.
+
+	Example:
+		#arch/i386/kernel/Makefile
+		vsyscall-flags += $(call ld-option, -Wl$(comma)--hash-style=sysv)
+
+	In the above example vsyscall-flags will be assigned the option
+	-Wl$(comma)--hash-style=sysv if it is supported by $(CC).
+	The second argument is optional, and if supplied will be used
+	if first argument is not supported.
+
     cc-option
 	cc-option is used to check if $(CC) support a given option, and not
 	supported to use an optional second option.
-- 
cgit v1.2.3


From 0a5eca6530eb4d0120981936058537c24a2f92ce Mon Sep 17 00:00:00 2001
From: Thomas Horsley <tom.horsley@ccur.com>
Date: Sun, 30 Jul 2006 03:04:12 -0700
Subject: [PATCH] documentation: Documentation/initrd.txt

I spent a long time the other day trying to examine an initrd image on a
fedora core 5 system because the initrd.txt file is apparently obsolete.
Here is a patch which I hope will reduce future confusion for others.

Signed-off-by: Thomas Horsley <tom.horsley@ccur.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
---
 Documentation/initrd.txt | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

(limited to 'Documentation')

diff --git a/Documentation/initrd.txt b/Documentation/initrd.txt
index b1b6440237a..15f1b35deb3 100644
--- a/Documentation/initrd.txt
+++ b/Documentation/initrd.txt
@@ -72,6 +72,22 @@ initrd adds the following new options:
     initrd is mounted as root, and the normal boot procedure is followed,
     with the RAM disk still mounted as root.
 
+Compressed cpio images
+----------------------
+
+Recent kernels have support for populating a ramdisk from a compressed cpio
+archive, on such systems, the creation of a ramdisk image doesn't need to
+involve special block devices or loopbacks, you merely create a directory on
+disk with the desired initrd content, cd to that directory, and run (as an
+example):
+
+find . | cpio --quiet -c -o | gzip -9 -n > /boot/imagefile.img
+
+Examining the contents of an existing image file is just as simple:
+
+mkdir /tmp/imagefile
+cd /tmp/imagefile
+gzip -cd /boot/imagefile.img | cpio -imd --quiet
 
 Installation
 ------------
-- 
cgit v1.2.3


From 9c9a43ed2734081124407c779b36a4761c41139b Mon Sep 17 00:00:00 2001
From: Mattia Dongili <malattia@linux.it>
Date: Wed, 5 Jul 2006 23:12:20 +0200
Subject: [CPUFREQ] return error when failing to set minfreq

I just stumbled on this bug/feature, this is how to reproduce it:

# echo 450000 > /sys/devices/system/cpu/cpu0/cpufreq/scaling_min_freq
# echo 450000 > /sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq
# echo powersave > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
# cpufreq-info -p
450000 450000 powersave
# echo 1800000 > /sys/devices/system/cpu/cpu0/cpufreq/scaling_min_freq ; echo $?
0
# cpufreq-info -p
450000 450000 powersave

Here it is. The kernel refuses to set a min_freq higher than the
max_freq but it allows a max_freq lower than min_freq (lowering min_freq
also).

This behaviour is pretty straightforward (but undocumented) and it
doesn't return an error altough failing to accomplish the requested
action (set min_freq).
The problem (IMO) is basically that userspace is not allowed to set a
full policy atomically while the kernel always does that thus it must
enforce an ordering on operations.

The attached patch returns -EINVAL if trying to increase frequencies
starting from scaling_min_freq and documents the correct ordering of writes.

Signed-off-by: Mattia Dongili <malattia@linux.it>
Signed-off-by: Dominik Brodowski <linux at dominikbrodowski.net>
Signed-off-by: Dave Jones <davej@redhat.com>

--
---
 Documentation/cpu-freq/user-guide.txt | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

(limited to 'Documentation')

diff --git a/Documentation/cpu-freq/user-guide.txt b/Documentation/cpu-freq/user-guide.txt
index 7fedc00c3d3..555c8cf3650 100644
--- a/Documentation/cpu-freq/user-guide.txt
+++ b/Documentation/cpu-freq/user-guide.txt
@@ -153,10 +153,13 @@ scaling_governor,		and by "echoing" the name of another
 				that some governors won't load - they only
 				work on some specific architectures or
 				processors.
-scaling_min_freq and 
+scaling_min_freq and
 scaling_max_freq		show the current "policy limits" (in
 				kHz). By echoing new values into these
 				files, you can change these limits.
+				NOTE: when setting a policy you need to
+				first set scaling_max_freq, then
+				scaling_min_freq.
 
 
 If you have selected the "userspace" governor which allows you to
-- 
cgit v1.2.3


From 0e74b06aff598def819b44225ebfbb907fd10179 Mon Sep 17 00:00:00 2001
From: "Luiz Fernando N. Capitulino" <lcapitulino@mandriva.com.br>
Date: Thu, 27 Jul 2006 21:59:17 -0300
Subject: USB: doc: usb-help.txt update.

 http://www.suse.cz/development/linux-usb/ doesn't exist anymore.

Signed-off-by: Luiz Fernando N. Capitulino <lcapitulino@mandriva.com.br>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
---
 Documentation/usb/usb-help.txt | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

(limited to 'Documentation')

diff --git a/Documentation/usb/usb-help.txt b/Documentation/usb/usb-help.txt
index b7c32497369..a7408593829 100644
--- a/Documentation/usb/usb-help.txt
+++ b/Documentation/usb/usb-help.txt
@@ -5,8 +5,7 @@ For USB help other than the readme files that are located in
 Documentation/usb/*, see the following:
 
 Linux-USB project:  http://www.linux-usb.org
-  mirrors at        http://www.suse.cz/development/linux-usb/
-         and        http://usb.in.tum.de/linux-usb/
+  mirrors at        http://usb.in.tum.de/linux-usb/
          and        http://it.linux-usb.org
 Linux USB Guide:    http://linux-usb.sourceforge.net
 Linux-USB device overview (working devices and drivers):
-- 
cgit v1.2.3


From 064e875a4cb1dad7b3a00661877fe8cd95d1a59a Mon Sep 17 00:00:00 2001
From: "Luiz Fernando N. Capitulino" <lcapitulino@mandriva.com.br>
Date: Thu, 27 Jul 2006 22:01:34 -0300
Subject: USB: doc: fixes devio.c location in proc_usb_info.txt.

Signed-off-by: Luiz Fernando N. Capitulino <lcapitulino@mandriva.com.br>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
---
 Documentation/usb/proc_usb_info.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

(limited to 'Documentation')

diff --git a/Documentation/usb/proc_usb_info.txt b/Documentation/usb/proc_usb_info.txt
index f86550fe38e..22c5331260c 100644
--- a/Documentation/usb/proc_usb_info.txt
+++ b/Documentation/usb/proc_usb_info.txt
@@ -59,7 +59,7 @@ bind to an interface (or perhaps several) using an ioctl call.  You
 would issue more ioctls to the device to communicate to it using
 control, bulk, or other kinds of USB transfers.  The IOCTLs are
 listed in the <linux/usbdevice_fs.h> file, and at this writing the
-source code (linux/drivers/usb/devio.c) is the primary reference
+source code (linux/drivers/usb/core/devio.c) is the primary reference
 for how to access devices through those files.
 
 Note that since by default these BBB/DDD files are writable only by
-- 
cgit v1.2.3


From 8ddc7c5326064434048ec1ecfe57659e08345cc1 Mon Sep 17 00:00:00 2001
From: Or Gerlitz <ogerlitz@voltaire.com>
Date: Thu, 13 Jul 2006 11:00:39 +0300
Subject: IB/ipoib: Remove broken link from Kconfig and documentation

Remove references to the IPoIB IETF working group as it has been closed.

Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
---
 Documentation/infiniband/ipoib.txt | 2 --
 1 file changed, 2 deletions(-)

(limited to 'Documentation')

diff --git a/Documentation/infiniband/ipoib.txt b/Documentation/infiniband/ipoib.txt
index 187035560d7..864ff328378 100644
--- a/Documentation/infiniband/ipoib.txt
+++ b/Documentation/infiniband/ipoib.txt
@@ -51,8 +51,6 @@ Debugging Information
 
 References
 
-  IETF IP over InfiniBand (ipoib) Working Group
-    http://ietf.org/html.charters/ipoib-charter.html
   Transmission of IP over InfiniBand (IPoIB) (RFC 4391)
     http://ietf.org/rfc/rfc4391.txt 
   IP over InfiniBand (IPoIB) Architecture (RFC 4392)
-- 
cgit v1.2.3


From 8b23d04dd26e6d170adc644cf12a0d9dfa1f5094 Mon Sep 17 00:00:00 2001
From: Maxime Bizon <mbizon@freebox.fr>
Date: Sat, 5 Aug 2006 12:14:32 -0700
Subject: [PATCH] doc: update panic_on_oops documentation

Signed-off-by: Maxime Bizon <mbizon@freebox.fr>
Acked-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
---
 Documentation/sysctl/kernel.txt | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

(limited to 'Documentation')

diff --git a/Documentation/sysctl/kernel.txt b/Documentation/sysctl/kernel.txt
index b0c7ab93dcb..7345c338080 100644
--- a/Documentation/sysctl/kernel.txt
+++ b/Documentation/sysctl/kernel.txt
@@ -211,9 +211,8 @@ Controls the kernel's behaviour when an oops or BUG is encountered.
 
 0: try to continue operation
 
-1: delay a few seconds (to give klogd time to record the oops output) and
-   then panic.  If the `panic' sysctl is also non-zero then the machine will
-   be rebooted.
+1: panic immediatly.  If the `panic' sysctl is also non-zero then the
+   machine will be rebooted.
 
 ==============================================================
 
-- 
cgit v1.2.3


From e9539ee73a9b85bdc6eab9e15dd1ee639a815406 Mon Sep 17 00:00:00 2001
From: Brandon Philips <brandon@ifup.org>
Date: Tue, 1 Aug 2006 14:04:17 -0500
Subject: genhd.c reference in Documentation/kobjects.txt

block/genhd.c no longer in drivers/.  Update Documentation/kobjects.txt

Signed-off-by: Brandon Philips <brandon@ifup.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
---
 Documentation/kobject.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

(limited to 'Documentation')

diff --git a/Documentation/kobject.txt b/Documentation/kobject.txt
index 8d9bffbd192..949f7b5a205 100644
--- a/Documentation/kobject.txt
+++ b/Documentation/kobject.txt
@@ -247,7 +247,7 @@ the object-specific fields, which include:
 - default_attrs: Default attributes to be exported via sysfs when the
   object is registered.Note that the last attribute has to be
   initialized to NULL ! You can find a complete implementation
-  in drivers/block/genhd.c
+  in block/genhd.c
 
 
 Instances of struct kobj_type are not registered; only referenced by
-- 
cgit v1.2.3


From b64ef8afa58f397e1eaba2bd9ecaa6812064d464 Mon Sep 17 00:00:00 2001
From: Edgar Hucek <hostmaster@ed-soft.at>
Date: Sun, 13 Aug 2006 23:24:16 -0700
Subject: [PATCH] add imacfb documentation and detection

Add basic Machine detection to imacfb and some Ducumentation bits for
imacfb.

Signed-off-by: Edgar Hucek <hostmaster@ed-soft.at>
Cc: "Antonino A. Daplas" <adaplas@pol.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
---
 Documentation/fb/imacfb.txt | 31 +++++++++++++++++++++++++++++++
 1 file changed, 31 insertions(+)
 create mode 100644 Documentation/fb/imacfb.txt

(limited to 'Documentation')

diff --git a/Documentation/fb/imacfb.txt b/Documentation/fb/imacfb.txt
new file mode 100644
index 00000000000..759028545a7
--- /dev/null
+++ b/Documentation/fb/imacfb.txt
@@ -0,0 +1,31 @@
+
+What is imacfb?
+===============
+
+This is a generic EFI platform driver for Intel based Apple computers.
+Imacfb is only for EFI booted Intel Macs.
+
+Supported Hardware
+==================
+
+iMac 17"/20"
+Macbook
+Macbook Pro 15"/17"
+MacMini
+
+How to use it?
+==============
+
+Imacfb does not have any kind of autodetection of your machine.
+You have to add the fillowing kernel parameters in your elilo.conf:
+	Macbook :
+		video=imacfb:macbook
+	MacMini :
+		video=imacfb:mini
+	Macbook Pro 15", iMac 17" :
+		video=imacfb:i17
+	Macbook Pro 17", iMac 20" :
+		video=imacfb:i20
+
+--
+Edgar Hucek <gimli@dark-green.com>
-- 
cgit v1.2.3


From fb33f82568d32016b3b3c00819574f9526e52be3 Mon Sep 17 00:00:00 2001
From: "Jan \"Yenya\" Kasprzak" <kas@fi.muni.cz>
Date: Tue, 15 Aug 2006 01:33:50 -0700
Subject: [NET]: Terminology in ip-sysctl.txt

this minor patch fixes the description of net.ipv4.tcp_mem sysctl
in ip-sysctl.txt - the headline names the values "min, pressure, max",
while the description uses the "low, pressure, high" values.
Both tcp_rmem and tcp_wmem descriptions use the "min, pressure, max"
values, so I have changed the tcp_mem to match this and not vice versa.

Signed-off-by: Jan "Yenya" Kasprzak <kas@fi.muni.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 Documentation/networking/ip-sysctl.txt | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

(limited to 'Documentation')

diff --git a/Documentation/networking/ip-sysctl.txt b/Documentation/networking/ip-sysctl.txt
index d46338af600..3e0c017e787 100644
--- a/Documentation/networking/ip-sysctl.txt
+++ b/Documentation/networking/ip-sysctl.txt
@@ -294,15 +294,15 @@ tcp_rmem - vector of 3 INTEGERs: min, default, max
 	Default: 87380*2 bytes.
 
 tcp_mem - vector of 3 INTEGERs: min, pressure, max
-	low: below this number of pages TCP is not bothered about its
+	min: below this number of pages TCP is not bothered about its
 	memory appetite.
 
 	pressure: when amount of memory allocated by TCP exceeds this number
 	of pages, TCP moderates its memory consumption and enters memory
 	pressure mode, which is exited when memory consumption falls
-	under "low".
+	under "min".
 
-	high: number of pages allowed for queueing by all TCP sockets.
+	max: number of pages allowed for queueing by all TCP sockets.
 
 	Defaults are calculated at boot time from amount of available
 	memory.
-- 
cgit v1.2.3


From f583165f6a926e9f27ff8d15c0e4b22e83f0d599 Mon Sep 17 00:00:00 2001
From: Jon Loeliger <jdl@jdl.com>
Date: Thu, 17 Aug 2006 08:42:35 -0500
Subject: [POWERPC] Convert to mac-address for ethernet MAC address data.

Also accept "local-mac-address".  However the old "address"
is now obsolete, but accepted for backwards compatibility.
It should be removed after all device trees have been
converted to use "mac-address".

Signed-off-by: Jon Loeliger <jdl@freescale.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
---
 Documentation/powerpc/booting-without-of.txt | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

(limited to 'Documentation')

diff --git a/Documentation/powerpc/booting-without-of.txt b/Documentation/powerpc/booting-without-of.txt
index 3c62e66e1fc..8c48b8a27b9 100644
--- a/Documentation/powerpc/booting-without-of.txt
+++ b/Documentation/powerpc/booting-without-of.txt
@@ -1196,7 +1196,7 @@ platforms are moved over to use the flattened-device-tree model.
     - model : Model of the device.  Can be "TSEC", "eTSEC", or "FEC"
     - compatible : Should be "gianfar"
     - reg : Offset and length of the register set for the device
-    - address : List of bytes representing the ethernet address of
+    - mac-address : List of bytes representing the ethernet address of
       this controller
     - interrupts : <a b> where a is the interrupt number and b is a
       field that represents an encoding of the sense and level
@@ -1216,7 +1216,7 @@ platforms are moved over to use the flattened-device-tree model.
 		model = "TSEC";
 		compatible = "gianfar";
 		reg = <24000 1000>;
-		address = [ 00 E0 0C 00 73 00 ];
+		mac-address = [ 00 E0 0C 00 73 00 ];
 		interrupts = <d 3 e 3 12 3>;
 		interrupt-parent = <40000>;
 		phy-handle = <2452000>
@@ -1498,7 +1498,7 @@ not necessary as they are usually the same as the root node.
 			model = "TSEC";
 			compatible = "gianfar";
 			reg = <24000 1000>;
-			address = [ 00 E0 0C 00 73 00 ];
+			mac-address = [ 00 E0 0C 00 73 00 ];
 			interrupts = <d 3 e 3 12 3>;
 			interrupt-parent = <40000>;
 			phy-handle = <2452000>;
@@ -1511,7 +1511,7 @@ not necessary as they are usually the same as the root node.
 			model = "TSEC";
 			compatible = "gianfar";
 			reg = <25000 1000>;
-			address = [ 00 E0 0C 00 73 01 ];
+			mac-address = [ 00 E0 0C 00 73 01 ];
 			interrupts = <13 3 14 3 18 3>;
 			interrupt-parent = <40000>;
 			phy-handle = <2452001>;
@@ -1524,7 +1524,7 @@ not necessary as they are usually the same as the root node.
 			model = "FEC";
 			compatible = "gianfar";
 			reg = <26000 1000>;
-			address = [ 00 E0 0C 00 73 02 ];
+			mac-address = [ 00 E0 0C 00 73 02 ];
 			interrupts = <19 3>;
 			interrupt-parent = <40000>;
 			phy-handle = <2452002>;
-- 
cgit v1.2.3


From c712a9de94a5df5bc0087c14ad0b1aac2c147991 Mon Sep 17 00:00:00 2001
From: Alexey Dobriyan <adobriyan@gmail.com>
Date: Wed, 23 Aug 2006 00:48:33 -0400
Subject: Input: remove dead URLs from Doclumentation/input/joystick.txt

Closes #2804.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
---
 Documentation/input/joystick.txt | 1 -
 1 file changed, 1 deletion(-)

(limited to 'Documentation')

diff --git a/Documentation/input/joystick.txt b/Documentation/input/joystick.txt
index d53b857a371..841c353297e 100644
--- a/Documentation/input/joystick.txt
+++ b/Documentation/input/joystick.txt
@@ -39,7 +39,6 @@ them. Bug reports and success stories are also welcome.
 
   The input project website is at:
 
-	http://www.suse.cz/development/input/
 	http://atrey.karlin.mff.cuni.cz/~vojtech/input/
 
   There is also a mailing list for the driver at:
-- 
cgit v1.2.3


From 897522ea1c20691b6a65f32f03ae4e77e508b31c Mon Sep 17 00:00:00 2001
From: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
Date: Fri, 25 Aug 2006 00:52:06 -0700
Subject: [CONNECTOR]: Add userspace example code into Documentation/connector/

I was asked several times to include userspace example code into
Documentation, so if there is no policy against it, consider attached patch
for 2.6.18. This program works with included Documentation/connector/cn_test.c
connector module.

Signed-off-by: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 Documentation/connector/ucon.c | 206 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 206 insertions(+)
 create mode 100644 Documentation/connector/ucon.c

(limited to 'Documentation')

diff --git a/Documentation/connector/ucon.c b/Documentation/connector/ucon.c
new file mode 100644
index 00000000000..d738cde2a8d
--- /dev/null
+++ b/Documentation/connector/ucon.c
@@ -0,0 +1,206 @@
+/*
+ * 	ucon.c
+ *
+ * Copyright (c) 2004+ Evgeniy Polyakov <johnpol@2ka.mipt.ru>
+ *
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ */
+
+#include <asm/types.h>
+
+#include <sys/types.h>
+#include <sys/socket.h>
+#include <sys/poll.h>
+
+#include <linux/netlink.h>
+#include <linux/rtnetlink.h>
+
+#include <arpa/inet.h>
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <unistd.h>
+#include <string.h>
+#include <errno.h>
+#include <time.h>
+
+#include <linux/connector.h>
+
+#define DEBUG
+#define NETLINK_CONNECTOR 	11
+
+#ifdef DEBUG
+#define ulog(f, a...) fprintf(stdout, f, ##a)
+#else
+#define ulog(f, a...) do {} while (0)
+#endif
+
+static int need_exit;
+static __u32 seq;
+
+static int netlink_send(int s, struct cn_msg *msg)
+{
+	struct nlmsghdr *nlh;
+	unsigned int size;
+	int err;
+	char buf[128];
+	struct cn_msg *m;
+
+	size = NLMSG_SPACE(sizeof(struct cn_msg) + msg->len);
+
+	nlh = (struct nlmsghdr *)buf;
+	nlh->nlmsg_seq = seq++;
+	nlh->nlmsg_pid = getpid();
+	nlh->nlmsg_type = NLMSG_DONE;
+	nlh->nlmsg_len = NLMSG_LENGTH(size - sizeof(*nlh));
+	nlh->nlmsg_flags = 0;
+
+	m = NLMSG_DATA(nlh);
+#if 0
+	ulog("%s: [%08x.%08x] len=%u, seq=%u, ack=%u.\n",
+	       __func__, msg->id.idx, msg->id.val, msg->len, msg->seq, msg->ack);
+#endif
+	memcpy(m, msg, sizeof(*m) + msg->len);
+
+	err = send(s, nlh, size, 0);
+	if (err == -1)
+		ulog("Failed to send: %s [%d].\n",
+			strerror(errno), errno);
+
+	return err;
+}
+
+int main(int argc, char *argv[])
+{
+	int s;
+	char buf[1024];
+	int len;
+	struct nlmsghdr *reply;
+	struct sockaddr_nl l_local;
+	struct cn_msg *data;
+	FILE *out;
+	time_t tm;
+	struct pollfd pfd;
+
+	if (argc < 2)
+		out = stdout;
+	else {
+		out = fopen(argv[1], "a+");
+		if (!out) {
+			ulog("Unable to open %s for writing: %s\n",
+				argv[1], strerror(errno));
+			out = stdout;
+		}
+	}
+
+	memset(buf, 0, sizeof(buf));
+
+	s = socket(PF_NETLINK, SOCK_DGRAM, NETLINK_CONNECTOR);
+	if (s == -1) {
+		perror("socket");
+		return -1;
+	}
+
+	l_local.nl_family = AF_NETLINK;
+	l_local.nl_groups = 0x123; /* bitmask of requested groups */
+	l_local.nl_pid = 0;
+
+	if (bind(s, (struct sockaddr *)&l_local, sizeof(struct sockaddr_nl)) == -1) {
+		perror("bind");
+		close(s);
+		return -1;
+	}
+
+#if 0
+	{
+		int on = 0x57; /* Additional group number */
+		setsockopt(s, SOL_NETLINK, NETLINK_ADD_MEMBERSHIP, &on, sizeof(on));
+	}
+#endif
+	if (0) {
+		int i, j;
+
+		memset(buf, 0, sizeof(buf));
+
+		data = (struct cn_msg *)buf;
+
+		data->id.idx = 0x123;
+		data->id.val = 0x456;
+		data->seq = seq++;
+		data->ack = 0;
+		data->len = 0;
+
+		for (j=0; j<10; ++j) {
+			for (i=0; i<1000; ++i) {
+				len = netlink_send(s, data);
+			}
+
+			ulog("%d messages have been sent to %08x.%08x.\n", i, data->id.idx, data->id.val);
+		}
+
+		return 0;
+	}
+
+
+	pfd.fd = s;
+
+	while (!need_exit) {
+		pfd.events = POLLIN;
+		pfd.revents = 0;
+		switch (poll(&pfd, 1, -1)) {
+			case 0:
+				need_exit = 1;
+				break;
+			case -1:
+				if (errno != EINTR) {
+					need_exit = 1;
+					break;
+				}
+				continue;
+		}
+		if (need_exit)
+			break;
+
+		memset(buf, 0, sizeof(buf));
+		len = recv(s, buf, sizeof(buf), 0);
+		if (len == -1) {
+			perror("recv buf");
+			close(s);
+			return -1;
+		}
+		reply = (struct nlmsghdr *)buf;
+
+		switch (reply->nlmsg_type) {
+		case NLMSG_ERROR:
+			fprintf(out, "Error message received.\n");
+			fflush(out);
+			break;
+		case NLMSG_DONE:
+			data = (struct cn_msg *)NLMSG_DATA(reply);
+
+			time(&tm);
+			fprintf(out, "%.24s : [%x.%x] [%08u.%08u].\n",
+				ctime(&tm), data->id.idx, data->id.val, data->seq, data->ack);
+			fflush(out);
+			break;
+		default:
+			break;
+		}
+	}
+
+	close(s);
+	return 0;
+}
-- 
cgit v1.2.3


From a2e0b56316fa90e137802fdad6a7c6a9b85c86c3 Mon Sep 17 00:00:00 2001
From: Alexey Dobriyan <adobriyan@gmail.com>
Date: Sun, 27 Aug 2006 01:23:28 -0700
Subject: [PATCH] Fix docs for fs.suid_dumpable

Sergey Vlasov noticed that there is not kernel.suid_dumpable, but
fs.suid_dumpable.

How KERN_SETUID_DUMPABLE ended up in fs_table[]? Hell knows...

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
---
 Documentation/sysctl/fs.txt     | 20 ++++++++++++++++++++
 Documentation/sysctl/kernel.txt | 20 --------------------
 2 files changed, 20 insertions(+), 20 deletions(-)

(limited to 'Documentation')

diff --git a/Documentation/sysctl/fs.txt b/Documentation/sysctl/fs.txt
index 0b62c62142c..5c3a5190596 100644
--- a/Documentation/sysctl/fs.txt
+++ b/Documentation/sysctl/fs.txt
@@ -25,6 +25,7 @@ Currently, these files are in /proc/sys/fs:
 - inode-state
 - overflowuid
 - overflowgid
+- suid_dumpable
 - super-max
 - super-nr
 
@@ -131,6 +132,25 @@ The default is 65534.
 
 ==============================================================
 
+suid_dumpable:
+
+This value can be used to query and set the core dump mode for setuid
+or otherwise protected/tainted binaries. The modes are
+
+0 - (default) - traditional behaviour. Any process which has changed
+	privilege levels or is execute only will not be dumped
+1 - (debug) - all processes dump core when possible. The core dump is
+	owned by the current user and no security is applied. This is
+	intended for system debugging situations only. Ptrace is unchecked.
+2 - (suidsafe) - any binary which normally would not be dumped is dumped
+	readable by root only. This allows the end user to remove
+	such a dump but not access it directly. For security reasons
+	core dumps in this mode will not overwrite one another or
+	other files. This mode is appropriate when adminstrators are
+	attempting to debug problems in a normal environment.
+
+==============================================================
+
 super-max & super-nr:
 
 These numbers control the maximum number of superblocks, and
diff --git a/Documentation/sysctl/kernel.txt b/Documentation/sysctl/kernel.txt
index 7345c338080..89bf8c20a58 100644
--- a/Documentation/sysctl/kernel.txt
+++ b/Documentation/sysctl/kernel.txt
@@ -50,7 +50,6 @@ show up in /proc/sys/kernel:
 - shmmax                      [ sysv ipc ]
 - shmmni
 - stop-a                      [ SPARC only ]
-- suid_dumpable
 - sysrq                       ==> Documentation/sysrq.txt
 - tainted
 - threads-max
@@ -310,25 +309,6 @@ kernel.  This value defaults to SHMMAX.
 
 ==============================================================
 
-suid_dumpable:
-
-This value can be used to query and set the core dump mode for setuid
-or otherwise protected/tainted binaries. The modes are
-
-0 - (default) - traditional behaviour. Any process which has changed
-	privilege levels or is execute only will not be dumped
-1 - (debug) - all processes dump core when possible. The core dump is
-	owned by the current user and no security is applied. This is
-	intended for system debugging situations only. Ptrace is unchecked.
-2 - (suidsafe) - any binary which normally would not be dumped is dumped
-	readable by root only. This allows the end user to remove
-	such a dump but not access it directly. For security reasons
-	core dumps in this mode will not overwrite one another or
-	other files. This mode is appropriate when adminstrators are
-	attempting to debug problems in a normal environment.
-
-==============================================================
-
 tainted: 
 
 Non-zero if the kernel has been tainted.  Numeric values, which
-- 
cgit v1.2.3


From e88d78f6ba50d773096e26ca3f5c2464853c682d Mon Sep 17 00:00:00 2001
From: Tom Zanussi <zanussi@us.ibm.com>
Date: Sun, 27 Aug 2006 01:23:47 -0700
Subject: [PATCH] Documentation update for relay interface

Here's updated documentation for the relay interface, rewritten to match
the relayfs->relay changes.  It also moves relayfs.txt to relay.txt in the
process.

It includes the changes to relayfs.txt previously posted by Randy Dunlap,
thanks for those.

The relay-apps examples have also been updated to match, and can be found
on the sourceforge relayfs website.

Signed-off-by: Tom Zanussi <zanussi@us.ibm.com>
Cc: "Randy.Dunlap" <rdunlap@xenotime.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
---
 Documentation/filesystems/00-INDEX    |   4 +-
 Documentation/filesystems/relay.txt   | 479 ++++++++++++++++++++++++++++++++++
 Documentation/filesystems/relayfs.txt | 442 -------------------------------
 3 files changed, 481 insertions(+), 444 deletions(-)
 create mode 100644 Documentation/filesystems/relay.txt
 delete mode 100644 Documentation/filesystems/relayfs.txt

(limited to 'Documentation')

diff --git a/Documentation/filesystems/00-INDEX b/Documentation/filesystems/00-INDEX
index 66fdc0744fe..16dec61d767 100644
--- a/Documentation/filesystems/00-INDEX
+++ b/Documentation/filesystems/00-INDEX
@@ -62,8 +62,8 @@ ramfs-rootfs-initramfs.txt
 	- info on the 'in memory' filesystems ramfs, rootfs and initramfs.
 reiser4.txt
 	- info on the Reiser4 filesystem based on dancing tree algorithms.
-relayfs.txt
-	- info on relayfs, for efficient streaming from kernel to user space.
+relay.txt
+	- info on relay, for efficient streaming from kernel to user space.
 romfs.txt
 	- description of the ROMFS filesystem.
 smbfs.txt
diff --git a/Documentation/filesystems/relay.txt b/Documentation/filesystems/relay.txt
new file mode 100644
index 00000000000..d6788dae034
--- /dev/null
+++ b/Documentation/filesystems/relay.txt
@@ -0,0 +1,479 @@
+relay interface (formerly relayfs)
+==================================
+
+The relay interface provides a means for kernel applications to
+efficiently log and transfer large quantities of data from the kernel
+to userspace via user-defined 'relay channels'.
+
+A 'relay channel' is a kernel->user data relay mechanism implemented
+as a set of per-cpu kernel buffers ('channel buffers'), each
+represented as a regular file ('relay file') in user space.  Kernel
+clients write into the channel buffers using efficient write
+functions; these automatically log into the current cpu's channel
+buffer.  User space applications mmap() or read() from the relay files
+and retrieve the data as it becomes available.  The relay files
+themselves are files created in a host filesystem, e.g. debugfs, and
+are associated with the channel buffers using the API described below.
+
+The format of the data logged into the channel buffers is completely
+up to the kernel client; the relay interface does however provide
+hooks which allow kernel clients to impose some structure on the
+buffer data.  The relay interface doesn't implement any form of data
+filtering - this also is left to the kernel client.  The purpose is to
+keep things as simple as possible.
+
+This document provides an overview of the relay interface API.  The
+details of the function parameters are documented along with the
+functions in the relay interface code - please see that for details.
+
+Semantics
+=========
+
+Each relay channel has one buffer per CPU, each buffer has one or more
+sub-buffers.  Messages are written to the first sub-buffer until it is
+too full to contain a new message, in which case it it is written to
+the next (if available).  Messages are never split across sub-buffers.
+At this point, userspace can be notified so it empties the first
+sub-buffer, while the kernel continues writing to the next.
+
+When notified that a sub-buffer is full, the kernel knows how many
+bytes of it are padding i.e. unused space occurring because a complete
+message couldn't fit into a sub-buffer.  Userspace can use this
+knowledge to copy only valid data.
+
+After copying it, userspace can notify the kernel that a sub-buffer
+has been consumed.
+
+A relay channel can operate in a mode where it will overwrite data not
+yet collected by userspace, and not wait for it to be consumed.
+
+The relay channel itself does not provide for communication of such
+data between userspace and kernel, allowing the kernel side to remain
+simple and not impose a single interface on userspace.  It does
+provide a set of examples and a separate helper though, described
+below.
+
+The read() interface both removes padding and internally consumes the
+read sub-buffers; thus in cases where read(2) is being used to drain
+the channel buffers, special-purpose communication between kernel and
+user isn't necessary for basic operation.
+
+One of the major goals of the relay interface is to provide a low
+overhead mechanism for conveying kernel data to userspace.  While the
+read() interface is easy to use, it's not as efficient as the mmap()
+approach; the example code attempts to make the tradeoff between the
+two approaches as small as possible.
+
+klog and relay-apps example code
+================================
+
+The relay interface itself is ready to use, but to make things easier,
+a couple simple utility functions and a set of examples are provided.
+
+The relay-apps example tarball, available on the relay sourceforge
+site, contains a set of self-contained examples, each consisting of a
+pair of .c files containing boilerplate code for each of the user and
+kernel sides of a relay application.  When combined these two sets of
+boilerplate code provide glue to easily stream data to disk, without
+having to bother with mundane housekeeping chores.
+
+The 'klog debugging functions' patch (klog.patch in the relay-apps
+tarball) provides a couple of high-level logging functions to the
+kernel which allow writing formatted text or raw data to a channel,
+regardless of whether a channel to write into exists or not, or even
+whether the relay interface is compiled into the kernel or not.  These
+functions allow you to put unconditional 'trace' statements anywhere
+in the kernel or kernel modules; only when there is a 'klog handler'
+registered will data actually be logged (see the klog and kleak
+examples for details).
+
+It is of course possible to use the relay interface from scratch,
+i.e. without using any of the relay-apps example code or klog, but
+you'll have to implement communication between userspace and kernel,
+allowing both to convey the state of buffers (full, empty, amount of
+padding).  The read() interface both removes padding and internally
+consumes the read sub-buffers; thus in cases where read(2) is being
+used to drain the channel buffers, special-purpose communication
+between kernel and user isn't necessary for basic operation.  Things
+such as buffer-full conditions would still need to be communicated via
+some channel though.
+
+klog and the relay-apps examples can be found in the relay-apps
+tarball on http://relayfs.sourceforge.net
+
+The relay interface user space API
+==================================
+
+The relay interface implements basic file operations for user space
+access to relay channel buffer data.  Here are the file operations
+that are available and some comments regarding their behavior:
+
+open()	    enables user to open an _existing_ channel buffer.
+
+mmap()      results in channel buffer being mapped into the caller's
+	    memory space. Note that you can't do a partial mmap - you
+	    must map the entire file, which is NRBUF * SUBBUFSIZE.
+
+read()      read the contents of a channel buffer.  The bytes read are
+	    'consumed' by the reader, i.e. they won't be available
+	    again to subsequent reads.  If the channel is being used
+	    in no-overwrite mode (the default), it can be read at any
+	    time even if there's an active kernel writer.  If the
+	    channel is being used in overwrite mode and there are
+	    active channel writers, results may be unpredictable -
+	    users should make sure that all logging to the channel has
+	    ended before using read() with overwrite mode.  Sub-buffer
+	    padding is automatically removed and will not be seen by
+	    the reader.
+
+sendfile()  transfer data from a channel buffer to an output file
+	    descriptor. Sub-buffer padding is automatically removed
+	    and will not be seen by the reader.
+
+poll()      POLLIN/POLLRDNORM/POLLERR supported.  User applications are
+	    notified when sub-buffer boundaries are crossed.
+
+close()     decrements the channel buffer's refcount.  When the refcount
+	    reaches 0, i.e. when no process or kernel client has the
+	    buffer open, the channel buffer is freed.
+
+In order for a user application to make use of relay files, the
+host filesystem must be mounted.  For example,
+
+	mount -t debugfs debugfs /debug
+
+NOTE:   the host filesystem doesn't need to be mounted for kernel
+	clients to create or use channels - it only needs to be
+	mounted when user space applications need access to the buffer
+	data.
+
+
+The relay interface kernel API
+==============================
+
+Here's a summary of the API the relay interface provides to in-kernel clients:
+
+TBD(curr. line MT:/API/)
+  channel management functions:
+
+    relay_open(base_filename, parent, subbuf_size, n_subbufs,
+               callbacks)
+    relay_close(chan)
+    relay_flush(chan)
+    relay_reset(chan)
+
+  channel management typically called on instigation of userspace:
+
+    relay_subbufs_consumed(chan, cpu, subbufs_consumed)
+
+  write functions:
+
+    relay_write(chan, data, length)
+    __relay_write(chan, data, length)
+    relay_reserve(chan, length)
+
+  callbacks:
+
+    subbuf_start(buf, subbuf, prev_subbuf, prev_padding)
+    buf_mapped(buf, filp)
+    buf_unmapped(buf, filp)
+    create_buf_file(filename, parent, mode, buf, is_global)
+    remove_buf_file(dentry)
+
+  helper functions:
+
+    relay_buf_full(buf)
+    subbuf_start_reserve(buf, length)
+
+
+Creating a channel
+------------------
+
+relay_open() is used to create a channel, along with its per-cpu
+channel buffers.  Each channel buffer will have an associated file
+created for it in the host filesystem, which can be and mmapped or
+read from in user space.  The files are named basename0...basenameN-1
+where N is the number of online cpus, and by default will be created
+in the root of the filesystem (if the parent param is NULL).  If you
+want a directory structure to contain your relay files, you should
+create it using the host filesystem's directory creation function,
+e.g. debugfs_create_dir(), and pass the parent directory to
+relay_open().  Users are responsible for cleaning up any directory
+structure they create, when the channel is closed - again the host
+filesystem's directory removal functions should be used for that,
+e.g. debugfs_remove().
+
+In order for a channel to be created and the host filesystem's files
+associated with its channel buffers, the user must provide definitions
+for two callback functions, create_buf_file() and remove_buf_file().
+create_buf_file() is called once for each per-cpu buffer from
+relay_open() and allows the user to create the file which will be used
+to represent the corresponding channel buffer.  The callback should
+return the dentry of the file created to represent the channel buffer.
+remove_buf_file() must also be defined; it's responsible for deleting
+the file(s) created in create_buf_file() and is called during
+relay_close().
+
+Here are some typical definitions for these callbacks, in this case
+using debugfs:
+
+/*
+ * create_buf_file() callback.  Creates relay file in debugfs.
+ */
+static struct dentry *create_buf_file_handler(const char *filename,
+                                              struct dentry *parent,
+                                              int mode,
+                                              struct rchan_buf *buf,
+                                              int *is_global)
+{
+        return debugfs_create_file(filename, mode, parent, buf,
+	                           &relay_file_operations);
+}
+
+/*
+ * remove_buf_file() callback.  Removes relay file from debugfs.
+ */
+static int remove_buf_file_handler(struct dentry *dentry)
+{
+        debugfs_remove(dentry);
+
+        return 0;
+}
+
+/*
+ * relay interface callbacks
+ */
+static struct rchan_callbacks relay_callbacks =
+{
+        .create_buf_file = create_buf_file_handler,
+        .remove_buf_file = remove_buf_file_handler,
+};
+
+And an example relay_open() invocation using them:
+
+  chan = relay_open("cpu", NULL, SUBBUF_SIZE, N_SUBBUFS, &relay_callbacks);
+
+If the create_buf_file() callback fails, or isn't defined, channel
+creation and thus relay_open() will fail.
+
+The total size of each per-cpu buffer is calculated by multiplying the
+number of sub-buffers by the sub-buffer size passed into relay_open().
+The idea behind sub-buffers is that they're basically an extension of
+double-buffering to N buffers, and they also allow applications to
+easily implement random-access-on-buffer-boundary schemes, which can
+be important for some high-volume applications.  The number and size
+of sub-buffers is completely dependent on the application and even for
+the same application, different conditions will warrant different
+values for these parameters at different times.  Typically, the right
+values to use are best decided after some experimentation; in general,
+though, it's safe to assume that having only 1 sub-buffer is a bad
+idea - you're guaranteed to either overwrite data or lose events
+depending on the channel mode being used.
+
+The create_buf_file() implementation can also be defined in such a way
+as to allow the creation of a single 'global' buffer instead of the
+default per-cpu set.  This can be useful for applications interested
+mainly in seeing the relative ordering of system-wide events without
+the need to bother with saving explicit timestamps for the purpose of
+merging/sorting per-cpu files in a postprocessing step.
+
+To have relay_open() create a global buffer, the create_buf_file()
+implementation should set the value of the is_global outparam to a
+non-zero value in addition to creating the file that will be used to
+represent the single buffer.  In the case of a global buffer,
+create_buf_file() and remove_buf_file() will be called only once.  The
+normal channel-writing functions, e.g. relay_write(), can still be
+used - writes from any cpu will transparently end up in the global
+buffer - but since it is a global buffer, callers should make sure
+they use the proper locking for such a buffer, either by wrapping
+writes in a spinlock, or by copying a write function from relay.h and
+creating a local version that internally does the proper locking.
+
+Channel 'modes'
+---------------
+
+relay channels can be used in either of two modes - 'overwrite' or
+'no-overwrite'.  The mode is entirely determined by the implementation
+of the subbuf_start() callback, as described below.  The default if no
+subbuf_start() callback is defined is 'no-overwrite' mode.  If the
+default mode suits your needs, and you plan to use the read()
+interface to retrieve channel data, you can ignore the details of this
+section, as it pertains mainly to mmap() implementations.
+
+In 'overwrite' mode, also known as 'flight recorder' mode, writes
+continuously cycle around the buffer and will never fail, but will
+unconditionally overwrite old data regardless of whether it's actually
+been consumed.  In no-overwrite mode, writes will fail, i.e. data will
+be lost, if the number of unconsumed sub-buffers equals the total
+number of sub-buffers in the channel.  It should be clear that if
+there is no consumer or if the consumer can't consume sub-buffers fast
+enough, data will be lost in either case; the only difference is
+whether data is lost from the beginning or the end of a buffer.
+
+As explained above, a relay channel is made of up one or more
+per-cpu channel buffers, each implemented as a circular buffer
+subdivided into one or more sub-buffers.  Messages are written into
+the current sub-buffer of the channel's current per-cpu buffer via the
+write functions described below.  Whenever a message can't fit into
+the current sub-buffer, because there's no room left for it, the
+client is notified via the subbuf_start() callback that a switch to a
+new sub-buffer is about to occur.  The client uses this callback to 1)
+initialize the next sub-buffer if appropriate 2) finalize the previous
+sub-buffer if appropriate and 3) return a boolean value indicating
+whether or not to actually move on to the next sub-buffer.
+
+To implement 'no-overwrite' mode, the userspace client would provide
+an implementation of the subbuf_start() callback something like the
+following:
+
+static int subbuf_start(struct rchan_buf *buf,
+                        void *subbuf,
+			void *prev_subbuf,
+			unsigned int prev_padding)
+{
+	if (prev_subbuf)
+		*((unsigned *)prev_subbuf) = prev_padding;
+
+	if (relay_buf_full(buf))
+		return 0;
+
+	subbuf_start_reserve(buf, sizeof(unsigned int));
+
+	return 1;
+}
+
+If the current buffer is full, i.e. all sub-buffers remain unconsumed,
+the callback returns 0 to indicate that the buffer switch should not
+occur yet, i.e. until the consumer has had a chance to read the
+current set of ready sub-buffers.  For the relay_buf_full() function
+to make sense, the consumer is reponsible for notifying the relay
+interface when sub-buffers have been consumed via
+relay_subbufs_consumed().  Any subsequent attempts to write into the
+buffer will again invoke the subbuf_start() callback with the same
+parameters; only when the consumer has consumed one or more of the
+ready sub-buffers will relay_buf_full() return 0, in which case the
+buffer switch can continue.
+
+The implementation of the subbuf_start() callback for 'overwrite' mode
+would be very similar:
+
+static int subbuf_start(struct rchan_buf *buf,
+                        void *subbuf,
+			void *prev_subbuf,
+			unsigned int prev_padding)
+{
+	if (prev_subbuf)
+		*((unsigned *)prev_subbuf) = prev_padding;
+
+	subbuf_start_reserve(buf, sizeof(unsigned int));
+
+	return 1;
+}
+
+In this case, the relay_buf_full() check is meaningless and the
+callback always returns 1, causing the buffer switch to occur
+unconditionally.  It's also meaningless for the client to use the
+relay_subbufs_consumed() function in this mode, as it's never
+consulted.
+
+The default subbuf_start() implementation, used if the client doesn't
+define any callbacks, or doesn't define the subbuf_start() callback,
+implements the simplest possible 'no-overwrite' mode, i.e. it does
+nothing but return 0.
+
+Header information can be reserved at the beginning of each sub-buffer
+by calling the subbuf_start_reserve() helper function from within the
+subbuf_start() callback.  This reserved area can be used to store
+whatever information the client wants.  In the example above, room is
+reserved in each sub-buffer to store the padding count for that
+sub-buffer.  This is filled in for the previous sub-buffer in the
+subbuf_start() implementation; the padding value for the previous
+sub-buffer is passed into the subbuf_start() callback along with a
+pointer to the previous sub-buffer, since the padding value isn't
+known until a sub-buffer is filled.  The subbuf_start() callback is
+also called for the first sub-buffer when the channel is opened, to
+give the client a chance to reserve space in it.  In this case the
+previous sub-buffer pointer passed into the callback will be NULL, so
+the client should check the value of the prev_subbuf pointer before
+writing into the previous sub-buffer.
+
+Writing to a channel
+--------------------
+
+Kernel clients write data into the current cpu's channel buffer using
+relay_write() or __relay_write().  relay_write() is the main logging
+function - it uses local_irqsave() to protect the buffer and should be
+used if you might be logging from interrupt context.  If you know
+you'll never be logging from interrupt context, you can use
+__relay_write(), which only disables preemption.  These functions
+don't return a value, so you can't determine whether or not they
+failed - the assumption is that you wouldn't want to check a return
+value in the fast logging path anyway, and that they'll always succeed
+unless the buffer is full and no-overwrite mode is being used, in
+which case you can detect a failed write in the subbuf_start()
+callback by calling the relay_buf_full() helper function.
+
+relay_reserve() is used to reserve a slot in a channel buffer which
+can be written to later.  This would typically be used in applications
+that need to write directly into a channel buffer without having to
+stage data in a temporary buffer beforehand.  Because the actual write
+may not happen immediately after the slot is reserved, applications
+using relay_reserve() can keep a count of the number of bytes actually
+written, either in space reserved in the sub-buffers themselves or as
+a separate array.  See the 'reserve' example in the relay-apps tarball
+at http://relayfs.sourceforge.net for an example of how this can be
+done.  Because the write is under control of the client and is
+separated from the reserve, relay_reserve() doesn't protect the buffer
+at all - it's up to the client to provide the appropriate
+synchronization when using relay_reserve().
+
+Closing a channel
+-----------------
+
+The client calls relay_close() when it's finished using the channel.
+The channel and its associated buffers are destroyed when there are no
+longer any references to any of the channel buffers.  relay_flush()
+forces a sub-buffer switch on all the channel buffers, and can be used
+to finalize and process the last sub-buffers before the channel is
+closed.
+
+Misc
+----
+
+Some applications may want to keep a channel around and re-use it
+rather than open and close a new channel for each use.  relay_reset()
+can be used for this purpose - it resets a channel to its initial
+state without reallocating channel buffer memory or destroying
+existing mappings.  It should however only be called when it's safe to
+do so, i.e. when the channel isn't currently being written to.
+
+Finally, there are a couple of utility callbacks that can be used for
+different purposes.  buf_mapped() is called whenever a channel buffer
+is mmapped from user space and buf_unmapped() is called when it's
+unmapped.  The client can use this notification to trigger actions
+within the kernel application, such as enabling/disabling logging to
+the channel.
+
+
+Resources
+=========
+
+For news, example code, mailing list, etc. see the relay interface homepage:
+
+    http://relayfs.sourceforge.net
+
+
+Credits
+=======
+
+The ideas and specs for the relay interface came about as a result of
+discussions on tracing involving the following:
+
+Michel Dagenais		<michel.dagenais@polymtl.ca>
+Richard Moore		<richardj_moore@uk.ibm.com>
+Bob Wisniewski		<bob@watson.ibm.com>
+Karim Yaghmour		<karim@opersys.com>
+Tom Zanussi		<zanussi@us.ibm.com>
+
+Also thanks to Hubertus Franke for a lot of useful suggestions and bug
+reports.
diff --git a/Documentation/filesystems/relayfs.txt b/Documentation/filesystems/relayfs.txt
deleted file mode 100644
index 5832377b734..00000000000
--- a/Documentation/filesystems/relayfs.txt
+++ /dev/null
@@ -1,442 +0,0 @@
-
-relayfs - a high-speed data relay filesystem
-============================================
-
-relayfs is a filesystem designed to provide an efficient mechanism for
-tools and facilities to relay large and potentially sustained streams
-of data from kernel space to user space.
-
-The main abstraction of relayfs is the 'channel'.  A channel consists
-of a set of per-cpu kernel buffers each represented by a file in the
-relayfs filesystem.  Kernel clients write into a channel using
-efficient write functions which automatically log to the current cpu's
-channel buffer.  User space applications mmap() the per-cpu files and
-retrieve the data as it becomes available.
-
-The format of the data logged into the channel buffers is completely
-up to the relayfs client; relayfs does however provide hooks which
-allow clients to impose some structure on the buffer data.  Nor does
-relayfs implement any form of data filtering - this also is left to
-the client.  The purpose is to keep relayfs as simple as possible.
-
-This document provides an overview of the relayfs API.  The details of
-the function parameters are documented along with the functions in the
-filesystem code - please see that for details.
-
-Semantics
-=========
-
-Each relayfs channel has one buffer per CPU, each buffer has one or
-more sub-buffers. Messages are written to the first sub-buffer until
-it is too full to contain a new message, in which case it it is
-written to the next (if available).  Messages are never split across
-sub-buffers.  At this point, userspace can be notified so it empties
-the first sub-buffer, while the kernel continues writing to the next.
-
-When notified that a sub-buffer is full, the kernel knows how many
-bytes of it are padding i.e. unused.  Userspace can use this knowledge
-to copy only valid data.
-
-After copying it, userspace can notify the kernel that a sub-buffer
-has been consumed.
-
-relayfs can operate in a mode where it will overwrite data not yet
-collected by userspace, and not wait for it to consume it.
-
-relayfs itself does not provide for communication of such data between
-userspace and kernel, allowing the kernel side to remain simple and
-not impose a single interface on userspace. It does provide a set of
-examples and a separate helper though, described below.
-
-klog and relay-apps example code
-================================
-
-relayfs itself is ready to use, but to make things easier, a couple
-simple utility functions and a set of examples are provided.
-
-The relay-apps example tarball, available on the relayfs sourceforge
-site, contains a set of self-contained examples, each consisting of a
-pair of .c files containing boilerplate code for each of the user and
-kernel sides of a relayfs application; combined these two sets of
-boilerplate code provide glue to easily stream data to disk, without
-having to bother with mundane housekeeping chores.
-
-The 'klog debugging functions' patch (klog.patch in the relay-apps
-tarball) provides a couple of high-level logging functions to the
-kernel which allow writing formatted text or raw data to a channel,
-regardless of whether a channel to write into exists or not, or
-whether relayfs is compiled into the kernel or is configured as a
-module.  These functions allow you to put unconditional 'trace'
-statements anywhere in the kernel or kernel modules; only when there
-is a 'klog handler' registered will data actually be logged (see the
-klog and kleak examples for details).
-
-It is of course possible to use relayfs from scratch i.e. without
-using any of the relay-apps example code or klog, but you'll have to
-implement communication between userspace and kernel, allowing both to
-convey the state of buffers (full, empty, amount of padding).
-
-klog and the relay-apps examples can be found in the relay-apps
-tarball on http://relayfs.sourceforge.net
-
-
-The relayfs user space API
-==========================
-
-relayfs implements basic file operations for user space access to
-relayfs channel buffer data.  Here are the file operations that are
-available and some comments regarding their behavior:
-
-open()	 enables user to open an _existing_ buffer.
-
-mmap()	 results in channel buffer being mapped into the caller's
-	 memory space. Note that you can't do a partial mmap - you must
-	 map the entire file, which is NRBUF * SUBBUFSIZE.
-
-read()	 read the contents of a channel buffer.  The bytes read are
-	 'consumed' by the reader i.e. they won't be available again
-	 to subsequent reads.  If the channel is being used in
-	 no-overwrite mode (the default), it can be read at any time
-	 even if there's an active kernel writer.  If the channel is
-	 being used in overwrite mode and there are active channel
-	 writers, results may be unpredictable - users should make
-	 sure that all logging to the channel has ended before using
-	 read() with overwrite mode.
-
-poll()	 POLLIN/POLLRDNORM/POLLERR supported.  User applications are
-	 notified when sub-buffer boundaries are crossed.
-
-close() decrements the channel buffer's refcount.  When the refcount
-	reaches 0 i.e. when no process or kernel client has the buffer
-	open, the channel buffer is freed.
-
-
-In order for a user application to make use of relayfs files, the
-relayfs filesystem must be mounted.  For example,
-
-	mount -t relayfs relayfs /mnt/relay
-
-NOTE:	relayfs doesn't need to be mounted for kernel clients to create
-	or use channels - it only needs to be mounted when user space
-	applications need access to the buffer data.
-
-
-The relayfs kernel API
-======================
-
-Here's a summary of the API relayfs provides to in-kernel clients:
-
-
-  channel management functions:
-
-    relay_open(base_filename, parent, subbuf_size, n_subbufs,
-               callbacks)
-    relay_close(chan)
-    relay_flush(chan)
-    relay_reset(chan)
-    relayfs_create_dir(name, parent)
-    relayfs_remove_dir(dentry)
-    relayfs_create_file(name, parent, mode, fops, data)
-    relayfs_remove_file(dentry)
-
-  channel management typically called on instigation of userspace:
-
-    relay_subbufs_consumed(chan, cpu, subbufs_consumed)
-
-  write functions:
-
-    relay_write(chan, data, length)
-    __relay_write(chan, data, length)
-    relay_reserve(chan, length)
-
-  callbacks:
-
-    subbuf_start(buf, subbuf, prev_subbuf, prev_padding)
-    buf_mapped(buf, filp)
-    buf_unmapped(buf, filp)
-    create_buf_file(filename, parent, mode, buf, is_global)
-    remove_buf_file(dentry)
-
-  helper functions:
-
-    relay_buf_full(buf)
-    subbuf_start_reserve(buf, length)
-
-
-Creating a channel
-------------------
-
-relay_open() is used to create a channel, along with its per-cpu
-channel buffers.  Each channel buffer will have an associated file
-created for it in the relayfs filesystem, which can be opened and
-mmapped from user space if desired.  The files are named
-basename0...basenameN-1 where N is the number of online cpus, and by
-default will be created in the root of the filesystem.  If you want a
-directory structure to contain your relayfs files, you can create it
-with relayfs_create_dir() and pass the parent directory to
-relay_open().  Clients are responsible for cleaning up any directory
-structure they create when the channel is closed - use
-relayfs_remove_dir() for that.
-
-The total size of each per-cpu buffer is calculated by multiplying the
-number of sub-buffers by the sub-buffer size passed into relay_open().
-The idea behind sub-buffers is that they're basically an extension of
-double-buffering to N buffers, and they also allow applications to
-easily implement random-access-on-buffer-boundary schemes, which can
-be important for some high-volume applications.  The number and size
-of sub-buffers is completely dependent on the application and even for
-the same application, different conditions will warrant different
-values for these parameters at different times.  Typically, the right
-values to use are best decided after some experimentation; in general,
-though, it's safe to assume that having only 1 sub-buffer is a bad
-idea - you're guaranteed to either overwrite data or lose events
-depending on the channel mode being used.
-
-Channel 'modes'
----------------
-
-relayfs channels can be used in either of two modes - 'overwrite' or
-'no-overwrite'.  The mode is entirely determined by the implementation
-of the subbuf_start() callback, as described below.  In 'overwrite'
-mode, also known as 'flight recorder' mode, writes continuously cycle
-around the buffer and will never fail, but will unconditionally
-overwrite old data regardless of whether it's actually been consumed.
-In no-overwrite mode, writes will fail i.e. data will be lost, if the
-number of unconsumed sub-buffers equals the total number of
-sub-buffers in the channel.  It should be clear that if there is no
-consumer or if the consumer can't consume sub-buffers fast enought,
-data will be lost in either case; the only difference is whether data
-is lost from the beginning or the end of a buffer.
-
-As explained above, a relayfs channel is made of up one or more
-per-cpu channel buffers, each implemented as a circular buffer
-subdivided into one or more sub-buffers.  Messages are written into
-the current sub-buffer of the channel's current per-cpu buffer via the
-write functions described below.  Whenever a message can't fit into
-the current sub-buffer, because there's no room left for it, the
-client is notified via the subbuf_start() callback that a switch to a
-new sub-buffer is about to occur.  The client uses this callback to 1)
-initialize the next sub-buffer if appropriate 2) finalize the previous
-sub-buffer if appropriate and 3) return a boolean value indicating
-whether or not to actually go ahead with the sub-buffer switch.
-
-To implement 'no-overwrite' mode, the userspace client would provide
-an implementation of the subbuf_start() callback something like the
-following:
-
-static int subbuf_start(struct rchan_buf *buf,
-                        void *subbuf,
-			void *prev_subbuf,
-			unsigned int prev_padding)
-{
-	if (prev_subbuf)
-		*((unsigned *)prev_subbuf) = prev_padding;
-
-	if (relay_buf_full(buf))
-		return 0;
-
-	subbuf_start_reserve(buf, sizeof(unsigned int));
-
-	return 1;
-}
-
-If the current buffer is full i.e. all sub-buffers remain unconsumed,
-the callback returns 0 to indicate that the buffer switch should not
-occur yet i.e. until the consumer has had a chance to read the current
-set of ready sub-buffers.  For the relay_buf_full() function to make
-sense, the consumer is reponsible for notifying relayfs when
-sub-buffers have been consumed via relay_subbufs_consumed().  Any
-subsequent attempts to write into the buffer will again invoke the
-subbuf_start() callback with the same parameters; only when the
-consumer has consumed one or more of the ready sub-buffers will
-relay_buf_full() return 0, in which case the buffer switch can
-continue.
-
-The implementation of the subbuf_start() callback for 'overwrite' mode
-would be very similar:
-
-static int subbuf_start(struct rchan_buf *buf,
-                        void *subbuf,
-			void *prev_subbuf,
-			unsigned int prev_padding)
-{
-	if (prev_subbuf)
-		*((unsigned *)prev_subbuf) = prev_padding;
-
-	subbuf_start_reserve(buf, sizeof(unsigned int));
-
-	return 1;
-}
-
-In this case, the relay_buf_full() check is meaningless and the
-callback always returns 1, causing the buffer switch to occur
-unconditionally.  It's also meaningless for the client to use the
-relay_subbufs_consumed() function in this mode, as it's never
-consulted.
-
-The default subbuf_start() implementation, used if the client doesn't
-define any callbacks, or doesn't define the subbuf_start() callback,
-implements the simplest possible 'no-overwrite' mode i.e. it does
-nothing but return 0.
-
-Header information can be reserved at the beginning of each sub-buffer
-by calling the subbuf_start_reserve() helper function from within the
-subbuf_start() callback.  This reserved area can be used to store
-whatever information the client wants.  In the example above, room is
-reserved in each sub-buffer to store the padding count for that
-sub-buffer.  This is filled in for the previous sub-buffer in the
-subbuf_start() implementation; the padding value for the previous
-sub-buffer is passed into the subbuf_start() callback along with a
-pointer to the previous sub-buffer, since the padding value isn't
-known until a sub-buffer is filled.  The subbuf_start() callback is
-also called for the first sub-buffer when the channel is opened, to
-give the client a chance to reserve space in it.  In this case the
-previous sub-buffer pointer passed into the callback will be NULL, so
-the client should check the value of the prev_subbuf pointer before
-writing into the previous sub-buffer.
-
-Writing to a channel
---------------------
-
-kernel clients write data into the current cpu's channel buffer using
-relay_write() or __relay_write().  relay_write() is the main logging
-function - it uses local_irqsave() to protect the buffer and should be
-used if you might be logging from interrupt context.  If you know
-you'll never be logging from interrupt context, you can use
-__relay_write(), which only disables preemption.  These functions
-don't return a value, so you can't determine whether or not they
-failed - the assumption is that you wouldn't want to check a return
-value in the fast logging path anyway, and that they'll always succeed
-unless the buffer is full and no-overwrite mode is being used, in
-which case you can detect a failed write in the subbuf_start()
-callback by calling the relay_buf_full() helper function.
-
-relay_reserve() is used to reserve a slot in a channel buffer which
-can be written to later.  This would typically be used in applications
-that need to write directly into a channel buffer without having to
-stage data in a temporary buffer beforehand.  Because the actual write
-may not happen immediately after the slot is reserved, applications
-using relay_reserve() can keep a count of the number of bytes actually
-written, either in space reserved in the sub-buffers themselves or as
-a separate array.  See the 'reserve' example in the relay-apps tarball
-at http://relayfs.sourceforge.net for an example of how this can be
-done.  Because the write is under control of the client and is
-separated from the reserve, relay_reserve() doesn't protect the buffer
-at all - it's up to the client to provide the appropriate
-synchronization when using relay_reserve().
-
-Closing a channel
------------------
-
-The client calls relay_close() when it's finished using the channel.
-The channel and its associated buffers are destroyed when there are no
-longer any references to any of the channel buffers.  relay_flush()
-forces a sub-buffer switch on all the channel buffers, and can be used
-to finalize and process the last sub-buffers before the channel is
-closed.
-
-Creating non-relay files
-------------------------
-
-relay_open() automatically creates files in the relayfs filesystem to
-represent the per-cpu kernel buffers; it's often useful for
-applications to be able to create their own files alongside the relay
-files in the relayfs filesystem as well e.g. 'control' files much like
-those created in /proc or debugfs for similar purposes, used to
-communicate control information between the kernel and user sides of a
-relayfs application.  For this purpose the relayfs_create_file() and
-relayfs_remove_file() API functions exist.  For relayfs_create_file(),
-the caller passes in a set of user-defined file operations to be used
-for the file and an optional void * to a user-specified data item,
-which will be accessible via inode->u.generic_ip (see the relay-apps
-tarball for examples).  The file_operations are a required parameter
-to relayfs_create_file() and thus the semantics of these files are
-completely defined by the caller.
-
-See the relay-apps tarball at http://relayfs.sourceforge.net for
-examples of how these non-relay files are meant to be used.
-
-Creating relay files in other filesystems
------------------------------------------
-
-By default of course, relay_open() creates relay files in the relayfs
-filesystem.  Because relay_file_operations is exported, however, it's
-also possible to create and use relay files in other pseudo-filesytems
-such as debugfs.
-
-For this purpose, two callback functions are provided,
-create_buf_file() and remove_buf_file().  create_buf_file() is called
-once for each per-cpu buffer from relay_open() to allow the client to
-create a file to be used to represent the corresponding buffer; if
-this callback is not defined, the default implementation will create
-and return a file in the relayfs filesystem to represent the buffer.
-The callback should return the dentry of the file created to represent
-the relay buffer.  Note that the parent directory passed to
-relay_open() (and passed along to the callback), if specified, must
-exist in the same filesystem the new relay file is created in.  If
-create_buf_file() is defined, remove_buf_file() must also be defined;
-it's responsible for deleting the file(s) created in create_buf_file()
-and is called during relay_close().
-
-The create_buf_file() implementation can also be defined in such a way
-as to allow the creation of a single 'global' buffer instead of the
-default per-cpu set.  This can be useful for applications interested
-mainly in seeing the relative ordering of system-wide events without
-the need to bother with saving explicit timestamps for the purpose of
-merging/sorting per-cpu files in a postprocessing step.
-
-To have relay_open() create a global buffer, the create_buf_file()
-implementation should set the value of the is_global outparam to a
-non-zero value in addition to creating the file that will be used to
-represent the single buffer.  In the case of a global buffer,
-create_buf_file() and remove_buf_file() will be called only once.  The
-normal channel-writing functions e.g. relay_write() can still be used
-- writes from any cpu will transparently end up in the global buffer -
-but since it is a global buffer, callers should make sure they use the
-proper locking for such a buffer, either by wrapping writes in a
-spinlock, or by copying a write function from relayfs_fs.h and
-creating a local version that internally does the proper locking.
-
-See the 'exported-relayfile' examples in the relay-apps tarball for
-examples of creating and using relay files in debugfs.
-
-Misc
-----
-
-Some applications may want to keep a channel around and re-use it
-rather than open and close a new channel for each use.  relay_reset()
-can be used for this purpose - it resets a channel to its initial
-state without reallocating channel buffer memory or destroying
-existing mappings.  It should however only be called when it's safe to
-do so i.e. when the channel isn't currently being written to.
-
-Finally, there are a couple of utility callbacks that can be used for
-different purposes.  buf_mapped() is called whenever a channel buffer
-is mmapped from user space and buf_unmapped() is called when it's
-unmapped.  The client can use this notification to trigger actions
-within the kernel application, such as enabling/disabling logging to
-the channel.
-
-
-Resources
-=========
-
-For news, example code, mailing list, etc. see the relayfs homepage:
-
-    http://relayfs.sourceforge.net
-
-
-Credits
-=======
-
-The ideas and specs for relayfs came about as a result of discussions
-on tracing involving the following:
-
-Michel Dagenais		<michel.dagenais@polymtl.ca>
-Richard Moore		<richardj_moore@uk.ibm.com>
-Bob Wisniewski		<bob@watson.ibm.com>
-Karim Yaghmour		<karim@opersys.com>
-Tom Zanussi		<zanussi@us.ibm.com>
-
-Also thanks to Hubertus Franke for a lot of useful suggestions and bug
-reports.
-- 
cgit v1.2.3


From 4c4d50f7b39cc58f1064b93a61ad617451ae41df Mon Sep 17 00:00:00 2001
From: Paul Jackson <pj@sgi.com>
Date: Sun, 27 Aug 2006 01:23:51 -0700
Subject: [PATCH] cpuset: top_cpuset tracks hotplug changes to cpu_online_map

Change the list of cpus allowed to tasks in the top (root) cpuset to
dynamically track what cpus are online, using a CPU hotplug notifier.  Make
this top cpus file read-only.

On systems that have cpusets configured in their kernel, but that aren't
actively using cpusets (for some distros, this covers the majority of
systems) all tasks end up in the top cpuset.

If that system does support CPU hotplug, then these tasks cannot make use
of CPUs that are added after system boot, because the CPUs are not allowed
in the top cpuset.  This is a surprising regression over earlier kernels
that didn't have cpusets enabled.

In order to keep the behaviour of cpusets consistent between systems
actively making use of them and systems not using them, this patch changes
the behaviour of the 'cpus' file in the top (root) cpuset, making it read
only, and making it automatically track the value of cpu_online_map.  Thus
tasks in the top cpuset will have automatic use of hot plugged CPUs allowed
by their cpuset.

Thanks to Anton Blanchard and Nathan Lynch for reporting this problem,
driving the fix, and earlier versions of this patch.

Signed-off-by: Paul Jackson <pj@sgi.com>
Cc: Nathan Lynch <ntl@pobox.com>
Cc: Anton Blanchard <anton@samba.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
---
 Documentation/cpusets.txt | 6 ++++++
 1 file changed, 6 insertions(+)

(limited to 'Documentation')

diff --git a/Documentation/cpusets.txt b/Documentation/cpusets.txt
index 159e2a0c3e8..76b44290c15 100644
--- a/Documentation/cpusets.txt
+++ b/Documentation/cpusets.txt
@@ -217,6 +217,12 @@ exclusive cpuset.  Also, the use of a Linux virtual file system (vfs)
 to represent the cpuset hierarchy provides for a familiar permission
 and name space for cpusets, with a minimum of additional kernel code.
 
+The cpus file in the root (top_cpuset) cpuset is read-only.
+It automatically tracks the value of cpu_online_map, using a CPU
+hotplug notifier.  If and when memory nodes can be hotplugged,
+we expect to make the mems file in the root cpuset read-only
+as well, and have it track the value of node_online_map.
+
 
 1.4 What are exclusive cpusets ?
 --------------------------------
-- 
cgit v1.2.3


From 3efbdd136e52ee4028b5bb5b848a6043cf61cd6e Mon Sep 17 00:00:00 2001
From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Date: Wed, 30 Aug 2006 08:58:00 +1000
Subject: [POWERPC] Fix MPIC sense codes in documentation

The booting-without-of.txt had incorrect definition for the sense codes
for an OpenPIC controller

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
---
 Documentation/powerpc/booting-without-of.txt | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

(limited to 'Documentation')

diff --git a/Documentation/powerpc/booting-without-of.txt b/Documentation/powerpc/booting-without-of.txt
index 8c48b8a27b9..5c0ba235f5a 100644
--- a/Documentation/powerpc/booting-without-of.txt
+++ b/Documentation/powerpc/booting-without-of.txt
@@ -1136,10 +1136,10 @@ Sense and level information should be encoded as follows:
    Devices connected to openPIC-compatible controllers should encode
    sense and polarity as follows:
 
-	0 = high to low edge sensitive type enabled
+	0 = low to high edge sensitive type enabled
 	1 = active low level sensitive type enabled
-	2 = low to high edge sensitive type enabled
-	3 = active high level sensitive type enabled
+	2 = active high level sensitive type enabled
+	3 = high to low edge sensitive type enabled
 
    ISA PIC interrupt controllers should adhere to the ISA PIC
    encodings listed below:
-- 
cgit v1.2.3


From 40dd2d20f220eda1cd0da8ea3f0f9db8971ba237 Mon Sep 17 00:00:00 2001
From: Andi Kleen <ak@suse.de>
Date: Wed, 30 Aug 2006 19:37:15 +0200
Subject: [PATCH] x86: Disable MMCONFIG on Intel SDV using DMI blacklist

As a replacement for the earlier removal of the e820 MCFG check
we blacklist the Intel SDV with the original BIOS bug that
motivated that check. On those machines don't use MMCONFIG.

This also adds a new pci=mmconf parameter to override the blacklist.

Cc: Greg KH <gregkh@suse.de>
Cc: Arjan van de Ven <arjan@infradead.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
---
 Documentation/kernel-parameters.txt | 2 ++
 1 file changed, 2 insertions(+)

(limited to 'Documentation')

diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index b50595a0550..7947cede871 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -1183,6 +1183,8 @@ running once the system is up.
 				Mechanism 2.
 		nommconf	[IA-32,X86_64] Disable use of MMCONFIG for PCI
 				Configuration
+		mmconf		[IA-32,X86_64] Force MMCONFIG. This is useful
+				to override the builtin blacklist.
 		nomsi		[MSI] If the PCI_MSI kernel config parameter is
 				enabled, this kernel boot option can be used to
 				disable the use of MSI interrupts system-wide.
-- 
cgit v1.2.3


From 1e5f5e5cd65eec6ce5c24a9c29f3e52673b121a6 Mon Sep 17 00:00:00 2001
From: Adrian Bunk <bunk@stusta.de>
Date: Thu, 31 Aug 2006 21:27:46 -0700
Subject: [PATCH] schedule obsolete OSS drivers for removal, 2nd round

This patch schedules obsolete OSS drivers (with ALSA drivers that support
the same hardware) for removal.

A rationale of the patch is in
  http://lkml.org/lkml/2006/7/11/186

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
---
 Documentation/feature-removal-schedule.txt | 7 +++++++
 1 file changed, 7 insertions(+)

(limited to 'Documentation')

diff --git a/Documentation/feature-removal-schedule.txt b/Documentation/feature-removal-schedule.txt
index 87851efb022..d1cd5f93e02 100644
--- a/Documentation/feature-removal-schedule.txt
+++ b/Documentation/feature-removal-schedule.txt
@@ -120,6 +120,13 @@ Who:    Adrian Bunk <bunk@stusta.de>
 
 ---------------------------
 
+What:  drivers depending on OSS_OBSOLETE_DRIVER
+When:  options in 2.6.20, code in 2.6.22
+Why:   OSS drivers with ALSA replacements
+Who:   Adrian Bunk <bunk@stusta.de>
+
+---------------------------
+
 What:	pci_module_init(driver)
 When:	January 2007
 Why:	Is replaced by pci_register_driver(pci_driver).
-- 
cgit v1.2.3


From af45f32d25cc1e374184675eadc9f740221d8392 Mon Sep 17 00:00:00 2001
From: Greg KH <greg@kroah.com>
Date: Tue, 12 Sep 2006 20:35:52 -0700
Subject: [PATCH] We can not allow anonymous contributions to the kernel

The DCO does not mean anything if we allow anonymous contributors to the
kernel.  As this is an open source project, we need to do everything in the
open.

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
---
 Documentation/SubmittingPatches | 2 ++
 1 file changed, 2 insertions(+)

(limited to 'Documentation')

diff --git a/Documentation/SubmittingPatches b/Documentation/SubmittingPatches
index 2cd7f02ffd0..d42ab4c9e89 100644
--- a/Documentation/SubmittingPatches
+++ b/Documentation/SubmittingPatches
@@ -308,6 +308,8 @@ then you just add a line saying
 
 	Signed-off-by: Random J Developer <random@developer.example.org>
 
+using your real name (sorry, no pseudonyms or anonymous contributions.)
+
 Some people also put extra tags at the end.  They'll just be ignored for
 now, but you can do this to mark internal company procedures or just
 point out some special detail about the sign-off. 
-- 
cgit v1.2.3


From 354332ee44834819ff31a0afbeffda0c32244f8f Mon Sep 17 00:00:00 2001
From: Jeremy Fitzhardinge <jeremy@goop.org>
Date: Tue, 12 Sep 2006 20:35:57 -0700
Subject: [PATCH] x86: reserve a boot-loader ID number for Xen

Claim an ID number for Xen in the LOADER_TYPE field.

Also, keep the table in zero-page.txt consistent with boot.txt.

[hpa says: 6 was skipped because I couldn't rule out that it hadn't been
 unofficially used.  It seemed easier to skip it for now.]

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Acked-by: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
---
 Documentation/i386/boot.txt      | 1 +
 Documentation/i386/zero-page.txt | 4 ++++
 2 files changed, 5 insertions(+)

(limited to 'Documentation')

diff --git a/Documentation/i386/boot.txt b/Documentation/i386/boot.txt
index 10312bebe55..c51314b1a46 100644
--- a/Documentation/i386/boot.txt
+++ b/Documentation/i386/boot.txt
@@ -181,6 +181,7 @@ filled out, however:
 	5  ELILO
 	7  GRuB
 	8  U-BOOT
+	9  Xen
 
 	Please contact <hpa@zytor.com> if you need a bootloader ID
 	value assigned.
diff --git a/Documentation/i386/zero-page.txt b/Documentation/i386/zero-page.txt
index df28c741678..c04a421f4a7 100644
--- a/Documentation/i386/zero-page.txt
+++ b/Documentation/i386/zero-page.txt
@@ -63,6 +63,10 @@ Offset	Type		Description
 				2 for bootsect-loader
 				3 for SYSLINUX
 				4 for ETHERBOOT
+				5 for ELILO
+				7 for GRuB
+				8 for U-BOOT
+				9 for Xen
 				V = version
 0x211	char		loadflags:
 			bit0 = 1: kernel is loaded high (bzImage)
-- 
cgit v1.2.3


From 1883c5aba9973331e3ff0050e05707fe8e84fe0d Mon Sep 17 00:00:00 2001
From: Mike Miller <mike.miller@hp.com>
Date: Tue, 12 Sep 2006 20:36:07 -0700
Subject: [PATCH] cciss: version update, new hw

Add support for new hardware and bumps the version to 3.6.10.  It seems
there were several changes introduced including soft_irq.  I decided to
bump the major number to reflect these changes.  Since we're still
supporting older vendor kernels I need some way differentiate between
kernel versions <=2.6.10 and newer kernels >=2.6.16.

Signed-off-by: Mike Miller <mike.miller@hp.com>
Cc: Jens Axboe <axboe@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
---
 Documentation/cciss.txt | 1 +
 1 file changed, 1 insertion(+)

(limited to 'Documentation')

diff --git a/Documentation/cciss.txt b/Documentation/cciss.txt
index 15378422fc4..9c629ffa0e5 100644
--- a/Documentation/cciss.txt
+++ b/Documentation/cciss.txt
@@ -20,6 +20,7 @@ This driver is known to work with the following cards:
 	* SA P400i
 	* SA E200
 	* SA E200i
+	* SA E500
 
 If nodes are not already created in the /dev/cciss directory, run as root:
 
-- 
cgit v1.2.3


From b3a8a40da5751525936c88f60bbc6a007f9eee37 Mon Sep 17 00:00:00 2001
From: Stephen Hemminger <shemminger@osdl.org>
Date: Wed, 13 Sep 2006 19:51:02 -0700
Subject: [TCP]: Turn ABC off.

Turn Appropriate Byte Count off by default because it unfairly
penalizes applications that do small writes.  Add better documentation
to describe what it is so users will understand why they might want to
turn it on.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 Documentation/networking/ip-sysctl.txt | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

(limited to 'Documentation')

diff --git a/Documentation/networking/ip-sysctl.txt b/Documentation/networking/ip-sysctl.txt
index 3e0c017e787..90ed78110fd 100644
--- a/Documentation/networking/ip-sysctl.txt
+++ b/Documentation/networking/ip-sysctl.txt
@@ -102,9 +102,15 @@ inet_peer_gc_maxtime - INTEGER
 TCP variables: 
 
 tcp_abc - INTEGER
-	Controls Appropriate Byte Count defined in RFC3465. If set to
-	0 then does congestion avoid once per ack. 1 is conservative
-	value, and 2 is more agressive.
+	Controls Appropriate Byte Count (ABC) defined in RFC3465.
+	ABC is a way of increasing congestion window (cwnd) more slowly
+	in response to partial acknowledgments.
+	Possible values are:
+		0 increase cwnd once per acknowledgment (no ABC)
+		1 increase cwnd once per acknowledgment of full sized segment
+		2 allow increase cwnd by two if acknowledgment is
+		  of two segments to compensate for delayed acknowledgments.
+	Default: 0 (off)
 
 tcp_syn_retries - INTEGER
 	Number of times initial SYNs for an active TCP connection attempt
-- 
cgit v1.2.3


From 72c4a13aaa0f6271e6b962a66befd68bac923bc3 Mon Sep 17 00:00:00 2001
From: Simon Horman <horms@verge.net.au>
Date: Wed, 13 Sep 2006 19:57:18 -0700
Subject: [IPVS]: Document the ports option to ip_vs_ftp in
 kernel-parameters.txt

I'm not sure if documenting this here is appropriate, but
if it is, here is some text to put there.

Signed-Off-By: Simon Horman <horms@verge.net.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 Documentation/kernel-parameters.txt | 6 ++++++
 1 file changed, 6 insertions(+)

(limited to 'Documentation')

diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index 7947cede871..87a17337c7f 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -697,6 +697,12 @@ running once the system is up.
 	ips=		[HW,SCSI] Adaptec / IBM ServeRAID controller
 			See header of drivers/scsi/ips.c.
 
+	ports=		[IP_VS_FTP] IPVS ftp helper module
+			Default is 21.
+			Up to 8 (IP_VS_APP_MAX_PORTS) ports
+			may be specified.
+			Format: <port>,<port>....
+
 	irqfixup	[HW]
 			When an interrupt is not handled search all handlers
 			for it. Intended to get systems with badly broken
-- 
cgit v1.2.3


From 080f22c0dc544e498e57ad281a9de063fa839957 Mon Sep 17 00:00:00 2001
From: Stephen Hemminger <shemminger@osdl.org>
Date: Wed, 13 Sep 2006 21:13:54 -0700
Subject: [NET]: Mark frame diverter for future removal.

The code for frame diverter is unmaintained and has bitrotted.
The number of users is very small and the code has lots of problems.
If anyone is using it, they maybe exposing themselves to bad packet attacks.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 Documentation/feature-removal-schedule.txt | 13 +++++++++++++
 1 file changed, 13 insertions(+)

(limited to 'Documentation')

diff --git a/Documentation/feature-removal-schedule.txt b/Documentation/feature-removal-schedule.txt
index d1cd5f93e02..552507fe9a7 100644
--- a/Documentation/feature-removal-schedule.txt
+++ b/Documentation/feature-removal-schedule.txt
@@ -281,3 +281,16 @@ Why:	The deferred output hooks are a layering violation causing unusual
 Who:	Patrick McHardy <kaber@trash.net>
 
 ---------------------------
+
+What:	frame diverter
+When:	November 2006
+Why:	The frame diverter is included in most distribution kernels, but is
+	broken. It does not correctly handle many things:
+	- IPV6
+	- non-linear skb's
+	- network device RCU on removal
+	- input frames not correctly checked for protocol errors
+	It also adds allocation overhead even if not enabled.
+	It is not clear if anyone is still using it.
+Who:	Stephen Hemminger <shemminger@osdl.org>
+
-- 
cgit v1.2.3