diff options
Diffstat (limited to 'Documentation')
104 files changed, 2965 insertions, 1073 deletions
diff --git a/Documentation/ABI/testing/ima_policy b/Documentation/ABI/testing/ima_policy index 6434f0df012..6cd6daefaae 100644 --- a/Documentation/ABI/testing/ima_policy +++ b/Documentation/ABI/testing/ima_policy @@ -20,7 +20,7 @@ Description: lsm: [[subj_user=] [subj_role=] [subj_type=] [obj_user=] [obj_role=] [obj_type=]] - base: func:= [BPRM_CHECK][FILE_MMAP][INODE_PERMISSION] + base: func:= [BPRM_CHECK][FILE_MMAP][FILE_CHECK] mask:= [MAY_READ] [MAY_WRITE] [MAY_APPEND] [MAY_EXEC] fsmagic:= hex value uid:= decimal value @@ -40,11 +40,11 @@ Description: measure func=BPRM_CHECK measure func=FILE_MMAP mask=MAY_EXEC - measure func=INODE_PERM mask=MAY_READ uid=0 + measure func=FILE_CHECK mask=MAY_READ uid=0 The default policy measures all executables in bprm_check, all files mmapped executable in file_mmap, and all files - open for read by root in inode_permission. + open for read by root in do_filp_open. Examples of LSM specific definitions: @@ -54,8 +54,8 @@ Description: dont_measure obj_type=var_log_t dont_measure obj_type=auditd_log_t - measure subj_user=system_u func=INODE_PERM mask=MAY_READ - measure subj_role=system_r func=INODE_PERM mask=MAY_READ + measure subj_user=system_u func=FILE_CHECK mask=MAY_READ + measure subj_role=system_r func=FILE_CHECK mask=MAY_READ Smack: - measure subj_user=_ func=INODE_PERM mask=MAY_READ + measure subj_user=_ func=FILE_CHECK mask=MAY_READ diff --git a/Documentation/ABI/testing/sysfs-block b/Documentation/ABI/testing/sysfs-block index d2f90334bb9..4873c759d53 100644 --- a/Documentation/ABI/testing/sysfs-block +++ b/Documentation/ABI/testing/sysfs-block @@ -128,3 +128,17 @@ Description: preferred request size for workloads where sustained throughput is desired. If no optimal I/O size is reported this file contains 0. + +What: /sys/block/<disk>/queue/nomerges +Date: January 2010 +Contact: +Description: + Standard I/O elevator operations include attempts to + merge contiguous I/Os. For known random I/O loads these + attempts will always fail and result in extra cycles + being spent in the kernel. This allows one to turn off + this behavior on one of two ways: When set to 1, complex + merge checks are disabled, but the simple one-shot merges + with the previous I/O request are enabled. When set to 2, + all merge tries are disabled. The default value is 0 - + which enables all types of merge tries. diff --git a/Documentation/ABI/testing/sysfs-bus-usb b/Documentation/ABI/testing/sysfs-bus-usb index deb6b489e4e..a986e9bbba3 100644 --- a/Documentation/ABI/testing/sysfs-bus-usb +++ b/Documentation/ABI/testing/sysfs-bus-usb @@ -21,25 +21,27 @@ Contact: Alan Stern <stern@rowland.harvard.edu> Description: Each USB device directory will contain a file named power/level. This file holds a power-level setting for - the device, one of "on", "auto", or "suspend". + the device, either "on" or "auto". "on" means that the device is not allowed to autosuspend, although normal suspends for system sleep will still be honored. "auto" means the device will autosuspend and autoresume in the usual manner, according to the - capabilities of its driver. "suspend" means the device - is forced into a suspended state and it will not autoresume - in response to I/O requests. However remote-wakeup requests - from the device may still be enabled (the remote-wakeup - setting is controlled separately by the power/wakeup - attribute). + capabilities of its driver. During normal use, devices should be left in the "auto" - level. The other levels are meant for administrative uses. + level. The "on" level is meant for administrative uses. If you want to suspend a device immediately but leave it free to wake up in response to I/O requests, you should write "0" to power/autosuspend. + Device not capable of proper suspend and resume should be + left in the "on" level. Although the USB spec requires + devices to support suspend/resume, many of them do not. + In fact so many don't that by default, the USB core + initializes all non-hub devices in the "on" level. Some + drivers may change this setting when they are bound. + What: /sys/bus/usb/devices/.../power/persist Date: May 2007 KernelVersion: 2.6.23 @@ -157,3 +159,14 @@ Description: device. This is useful to ensure auto probing won't match the driver to the device. For example: # echo "046d c315" > /sys/bus/usb/drivers/foo/remove_id + +What: /sys/bus/usb/device/.../avoid_reset +Date: December 2009 +Contact: Oliver Neukum <oliver@neukum.org> +Description: + Writing 1 to this file tells the kernel that this + device will morph into another mode when it is reset. + Drivers will not use reset for error handling for + such devices. +Users: + usb_modeswitch diff --git a/Documentation/ABI/testing/sysfs-devices-power b/Documentation/ABI/testing/sysfs-devices-power new file mode 100644 index 00000000000..6123c523bfd --- /dev/null +++ b/Documentation/ABI/testing/sysfs-devices-power @@ -0,0 +1,79 @@ +What: /sys/devices/.../power/ +Date: January 2009 +Contact: Rafael J. Wysocki <rjw@sisk.pl> +Description: + The /sys/devices/.../power directory contains attributes + allowing the user space to check and modify some power + management related properties of given device. + +What: /sys/devices/.../power/wakeup +Date: January 2009 +Contact: Rafael J. Wysocki <rjw@sisk.pl> +Description: + The /sys/devices/.../power/wakeup attribute allows the user + space to check if the device is enabled to wake up the system + from sleep states, such as the memory sleep state (suspend to + RAM) and hibernation (suspend to disk), and to enable or disable + it to do that as desired. + + Some devices support "wakeup" events, which are hardware signals + used to activate the system from a sleep state. Such devices + have one of the following two values for the sysfs power/wakeup + file: + + + "enabled\n" to issue the events; + + "disabled\n" not to do so; + + In that cases the user space can change the setting represented + by the contents of this file by writing either "enabled", or + "disabled" to it. + + For the devices that are not capable of generating system wakeup + events this file contains "\n". In that cases the user space + cannot modify the contents of this file and the device cannot be + enabled to wake up the system. + +What: /sys/devices/.../power/control +Date: January 2009 +Contact: Rafael J. Wysocki <rjw@sisk.pl> +Description: + The /sys/devices/.../power/control attribute allows the user + space to control the run-time power management of the device. + + All devices have one of the following two values for the + power/control file: + + + "auto\n" to allow the device to be power managed at run time; + + "on\n" to prevent the device from being power managed; + + The default for all devices is "auto", which means that they may + be subject to automatic power management, depending on their + drivers. Changing this attribute to "on" prevents the driver + from power managing the device at run time. Doing that while + the device is suspended causes it to be woken up. + +What: /sys/devices/.../power/async +Date: January 2009 +Contact: Rafael J. Wysocki <rjw@sisk.pl> +Description: + The /sys/devices/.../async attribute allows the user space to + enable or diasble the device's suspend and resume callbacks to + be executed asynchronously (ie. in separate threads, in parallel + with the main suspend/resume thread) during system-wide power + transitions (eg. suspend to RAM, hibernation). + + All devices have one of the following two values for the + power/async file: + + + "enabled\n" to permit the asynchronous suspend/resume; + + "disabled\n" to forbid it; + + The value of this attribute may be changed by writing either + "enabled", or "disabled" to it. + + It generally is unsafe to permit the asynchronous suspend/resume + of a device unless it is certain that all of the PM dependencies + of the device are known to the PM core. However, for some + devices this attribute is set to "enabled" by bus type code or + device drivers and in that cases it should be safe to leave the + default value. diff --git a/Documentation/ABI/testing/sysfs-platform-asus-laptop b/Documentation/ABI/testing/sysfs-platform-asus-laptop index a1cb660c50c..1d775390e85 100644 --- a/Documentation/ABI/testing/sysfs-platform-asus-laptop +++ b/Documentation/ABI/testing/sysfs-platform-asus-laptop @@ -1,4 +1,4 @@ -What: /sys/devices/platform/asus-laptop/display +What: /sys/devices/platform/asus_laptop/display Date: January 2007 KernelVersion: 2.6.20 Contact: "Corentin Chary" <corentincj@iksaif.net> @@ -13,7 +13,7 @@ Description: Ex: - 0 (0000b) means no display - 3 (0011b) CRT+LCD. -What: /sys/devices/platform/asus-laptop/gps +What: /sys/devices/platform/asus_laptop/gps Date: January 2007 KernelVersion: 2.6.20 Contact: "Corentin Chary" <corentincj@iksaif.net> @@ -21,7 +21,7 @@ Description: Control the gps device. 1 means on, 0 means off. Users: Lapsus -What: /sys/devices/platform/asus-laptop/ledd +What: /sys/devices/platform/asus_laptop/ledd Date: January 2007 KernelVersion: 2.6.20 Contact: "Corentin Chary" <corentincj@iksaif.net> @@ -29,11 +29,11 @@ Description: Some models like the W1N have a LED display that can be used to display several informations. To control the LED display, use the following : - echo 0x0T000DDD > /sys/devices/platform/asus-laptop/ + echo 0x0T000DDD > /sys/devices/platform/asus_laptop/ where T control the 3 letters display, and DDD the 3 digits display. The DDD table can be found in Documentation/laptops/asus-laptop.txt -What: /sys/devices/platform/asus-laptop/bluetooth +What: /sys/devices/platform/asus_laptop/bluetooth Date: January 2007 KernelVersion: 2.6.20 Contact: "Corentin Chary" <corentincj@iksaif.net> @@ -42,7 +42,7 @@ Description: This may control the led, the device or both. Users: Lapsus -What: /sys/devices/platform/asus-laptop/wlan +What: /sys/devices/platform/asus_laptop/wlan Date: January 2007 KernelVersion: 2.6.20 Contact: "Corentin Chary" <corentincj@iksaif.net> diff --git a/Documentation/ABI/testing/sysfs-platform-eeepc-laptop b/Documentation/ABI/testing/sysfs-platform-eeepc-laptop index 7445dfb321b..5b026c69587 100644 --- a/Documentation/ABI/testing/sysfs-platform-eeepc-laptop +++ b/Documentation/ABI/testing/sysfs-platform-eeepc-laptop @@ -1,4 +1,4 @@ -What: /sys/devices/platform/eeepc-laptop/disp +What: /sys/devices/platform/eeepc/disp Date: May 2008 KernelVersion: 2.6.26 Contact: "Corentin Chary" <corentincj@iksaif.net> @@ -9,21 +9,21 @@ Description: - 3 = LCD+CRT If you run X11, you should use xrandr instead. -What: /sys/devices/platform/eeepc-laptop/camera +What: /sys/devices/platform/eeepc/camera Date: May 2008 KernelVersion: 2.6.26 Contact: "Corentin Chary" <corentincj@iksaif.net> Description: Control the camera. 1 means on, 0 means off. -What: /sys/devices/platform/eeepc-laptop/cardr +What: /sys/devices/platform/eeepc/cardr Date: May 2008 KernelVersion: 2.6.26 Contact: "Corentin Chary" <corentincj@iksaif.net> Description: Control the card reader. 1 means on, 0 means off. -What: /sys/devices/platform/eeepc-laptop/cpufv +What: /sys/devices/platform/eeepc/cpufv Date: Jun 2009 KernelVersion: 2.6.31 Contact: "Corentin Chary" <corentincj@iksaif.net> @@ -42,7 +42,7 @@ Description: `------------ Availables modes For example, 0x301 means: mode 1 selected, 3 available modes. -What: /sys/devices/platform/eeepc-laptop/available_cpufv +What: /sys/devices/platform/eeepc/available_cpufv Date: Jun 2009 KernelVersion: 2.6.31 Contact: "Corentin Chary" <corentincj@iksaif.net> diff --git a/Documentation/ABI/testing/sysfs-power b/Documentation/ABI/testing/sysfs-power index dcff4d0623a..d6a801f45b4 100644 --- a/Documentation/ABI/testing/sysfs-power +++ b/Documentation/ABI/testing/sysfs-power @@ -101,3 +101,16 @@ Description: CAUTION: Using it will cause your machine's real-time (CMOS) clock to be set to a random invalid time after a resume. + +What: /sys/power/pm_async +Date: January 2009 +Contact: Rafael J. Wysocki <rjw@sisk.pl> +Description: + The /sys/power/pm_async file controls the switch allowing the + user space to enable or disable asynchronous suspend and resume + of devices. If enabled, this feature will cause some device + drivers' suspend and resume callbacks to be executed in parallel + with each other and with the main suspend thread. It is enabled + if this file contains "1", which is the default. It may be + disabled by writing "0" to this file, in which case all devices + will be suspended and resumed synchronously. diff --git a/Documentation/DocBook/device-drivers.tmpl b/Documentation/DocBook/device-drivers.tmpl index f9a6e2c75f1..1b2dd4fc3db 100644 --- a/Documentation/DocBook/device-drivers.tmpl +++ b/Documentation/DocBook/device-drivers.tmpl @@ -45,7 +45,7 @@ </sect1> <sect1><title>Atomic and pointer manipulation</title> -!Iarch/x86/include/asm/atomic_32.h +!Iarch/x86/include/asm/atomic.h !Iarch/x86/include/asm/unaligned.h </sect1> diff --git a/Documentation/DocBook/deviceiobook.tmpl b/Documentation/DocBook/deviceiobook.tmpl index 3ed88126ab8..c1ed6a49e59 100644 --- a/Documentation/DocBook/deviceiobook.tmpl +++ b/Documentation/DocBook/deviceiobook.tmpl @@ -316,7 +316,7 @@ CPU B: spin_unlock_irqrestore(&dev_lock, flags) <chapter id="pubfunctions"> <title>Public Functions Provided</title> -!Iarch/x86/include/asm/io_32.h +!Iarch/x86/include/asm/io.h !Elib/iomap.c </chapter> diff --git a/Documentation/DocBook/mac80211.tmpl b/Documentation/DocBook/mac80211.tmpl index f3f37f141db..affb15a344a 100644 --- a/Documentation/DocBook/mac80211.tmpl +++ b/Documentation/DocBook/mac80211.tmpl @@ -144,7 +144,7 @@ usage should require reading the full document. this though and the recommendation to allow only a single interface in STA mode at first! </para> -!Finclude/net/mac80211.h ieee80211_if_init_conf +!Finclude/net/mac80211.h ieee80211_vif </chapter> <chapter id="rx-tx"> @@ -234,7 +234,6 @@ usage should require reading the full document. <title>Multiple queues and QoS support</title> <para>TBD</para> !Finclude/net/mac80211.h ieee80211_tx_queue_params -!Finclude/net/mac80211.h ieee80211_tx_queue_stats </chapter> <chapter id="AP"> diff --git a/Documentation/DocBook/mtdnand.tmpl b/Documentation/DocBook/mtdnand.tmpl index f508a8a27fe..5e7d84b4850 100644 --- a/Documentation/DocBook/mtdnand.tmpl +++ b/Documentation/DocBook/mtdnand.tmpl @@ -174,7 +174,7 @@ </para> <programlisting> static struct mtd_info *board_mtd; -static unsigned long baseaddr; +static void __iomem *baseaddr; </programlisting> <para> Static example @@ -182,7 +182,7 @@ static unsigned long baseaddr; <programlisting> static struct mtd_info board_mtd; static struct nand_chip board_chip; -static unsigned long baseaddr; +static void __iomem *baseaddr; </programlisting> </sect1> <sect1 id="Partition_defines"> @@ -283,8 +283,8 @@ int __init board_init (void) } /* map physical address */ - baseaddr = (unsigned long)ioremap(CHIP_PHYSICAL_ADDRESS, 1024); - if(!baseaddr){ + baseaddr = ioremap(CHIP_PHYSICAL_ADDRESS, 1024); + if (!baseaddr) { printk("Ioremap to access NAND chip failed\n"); err = -EIO; goto out_mtd; @@ -316,7 +316,7 @@ int __init board_init (void) goto out; out_ior: - iounmap((void *)baseaddr); + iounmap(baseaddr); out_mtd: kfree (board_mtd); out: @@ -341,7 +341,7 @@ static void __exit board_cleanup (void) nand_release (board_mtd); /* unmap physical address */ - iounmap((void *)baseaddr); + iounmap(baseaddr); /* Free the MTD device structure */ kfree (board_mtd); diff --git a/Documentation/DocBook/v4l/io.xml b/Documentation/DocBook/v4l/io.xml index f92f24323b2..e870330cbf7 100644 --- a/Documentation/DocBook/v4l/io.xml +++ b/Documentation/DocBook/v4l/io.xml @@ -589,7 +589,8 @@ number of a video input as in &v4l2-input; field <entry></entry> <entry>A place holder for future extensions and custom (driver defined) buffer types -<constant>V4L2_BUF_TYPE_PRIVATE</constant> and higher.</entry> +<constant>V4L2_BUF_TYPE_PRIVATE</constant> and higher. Applications +should set this to 0.</entry> </row> </tbody> </tgroup> diff --git a/Documentation/DocBook/v4l/vidioc-qbuf.xml b/Documentation/DocBook/v4l/vidioc-qbuf.xml index 18708177815..b843bd7b389 100644 --- a/Documentation/DocBook/v4l/vidioc-qbuf.xml +++ b/Documentation/DocBook/v4l/vidioc-qbuf.xml @@ -54,12 +54,10 @@ to enqueue an empty (capturing) or filled (output) buffer in the driver's incoming queue. The semantics depend on the selected I/O method.</para> - <para>To enqueue a <link linkend="mmap">memory mapped</link> -buffer applications set the <structfield>type</structfield> field of a -&v4l2-buffer; to the same buffer type as previously &v4l2-format; -<structfield>type</structfield> and &v4l2-requestbuffers; -<structfield>type</structfield>, the <structfield>memory</structfield> -field to <constant>V4L2_MEMORY_MMAP</constant> and the + <para>To enqueue a buffer applications set the <structfield>type</structfield> +field of a &v4l2-buffer; to the same buffer type as was previously used +with &v4l2-format; <structfield>type</structfield> and &v4l2-requestbuffers; +<structfield>type</structfield>. Applications must also set the <structfield>index</structfield> field. Valid index numbers range from zero to the number of buffers allocated with &VIDIOC-REQBUFS; (&v4l2-requestbuffers; <structfield>count</structfield>) minus one. The @@ -70,8 +68,19 @@ intended for output (<structfield>type</structfield> is <constant>V4L2_BUF_TYPE_VBI_OUTPUT</constant>) applications must also initialize the <structfield>bytesused</structfield>, <structfield>field</structfield> and -<structfield>timestamp</structfield> fields. See <xref - linkend="buffer" /> for details. When +<structfield>timestamp</structfield> fields, see <xref +linkend="buffer" /> for details. +Applications must also set <structfield>flags</structfield> to 0. If a driver +supports capturing from specific video inputs and you want to specify a video +input, then <structfield>flags</structfield> should be set to +<constant>V4L2_BUF_FLAG_INPUT</constant> and the field +<structfield>input</structfield> must be initialized to the desired input. +The <structfield>reserved</structfield> field must be set to 0. +</para> + + <para>To enqueue a <link linkend="mmap">memory mapped</link> +buffer applications set the <structfield>memory</structfield> +field to <constant>V4L2_MEMORY_MMAP</constant>. When <constant>VIDIOC_QBUF</constant> is called with a pointer to this structure the driver sets the <constant>V4L2_BUF_FLAG_MAPPED</constant> and @@ -81,14 +90,10 @@ structure the driver sets the &EINVAL;.</para> <para>To enqueue a <link linkend="userp">user pointer</link> -buffer applications set the <structfield>type</structfield> field of a -&v4l2-buffer; to the same buffer type as previously &v4l2-format; -<structfield>type</structfield> and &v4l2-requestbuffers; -<structfield>type</structfield>, the <structfield>memory</structfield> -field to <constant>V4L2_MEMORY_USERPTR</constant> and the +buffer applications set the <structfield>memory</structfield> +field to <constant>V4L2_MEMORY_USERPTR</constant>, the <structfield>m.userptr</structfield> field to the address of the -buffer and <structfield>length</structfield> to its size. When the -buffer is intended for output additional fields must be set as above. +buffer and <structfield>length</structfield> to its size. When <constant>VIDIOC_QBUF</constant> is called with a pointer to this structure the driver sets the <constant>V4L2_BUF_FLAG_QUEUED</constant> flag and clears the <constant>V4L2_BUF_FLAG_MAPPED</constant> and @@ -96,13 +101,14 @@ flag and clears the <constant>V4L2_BUF_FLAG_MAPPED</constant> and <structfield>flags</structfield> field, or it returns an error code. This ioctl locks the memory pages of the buffer in physical memory, they cannot be swapped out to disk. Buffers remain locked until -dequeued, until the &VIDIOC-STREAMOFF; or &VIDIOC-REQBUFS; ioctl are +dequeued, until the &VIDIOC-STREAMOFF; or &VIDIOC-REQBUFS; ioctl is called, or until the device is closed.</para> <para>Applications call the <constant>VIDIOC_DQBUF</constant> ioctl to dequeue a filled (capturing) or displayed (output) buffer from the driver's outgoing queue. They just set the -<structfield>type</structfield> and <structfield>memory</structfield> +<structfield>type</structfield>, <structfield>memory</structfield> +and <structfield>reserved</structfield> fields of a &v4l2-buffer; as above, when <constant>VIDIOC_DQBUF</constant> is called with a pointer to this structure the driver fills the remaining fields or returns an error code.</para> diff --git a/Documentation/DocBook/v4l/vidioc-querybuf.xml b/Documentation/DocBook/v4l/vidioc-querybuf.xml index d834993e619..e649805a490 100644 --- a/Documentation/DocBook/v4l/vidioc-querybuf.xml +++ b/Documentation/DocBook/v4l/vidioc-querybuf.xml @@ -54,12 +54,13 @@ buffer at any time after buffers have been allocated with the &VIDIOC-REQBUFS; ioctl.</para> <para>Applications set the <structfield>type</structfield> field - of a &v4l2-buffer; to the same buffer type as previously + of a &v4l2-buffer; to the same buffer type as was previously used with &v4l2-format; <structfield>type</structfield> and &v4l2-requestbuffers; <structfield>type</structfield>, and the <structfield>index</structfield> field. Valid index numbers range from zero to the number of buffers allocated with &VIDIOC-REQBUFS; (&v4l2-requestbuffers; <structfield>count</structfield>) minus one. +The <structfield>reserved</structfield> field should to set to 0. After calling <constant>VIDIOC_QUERYBUF</constant> with a pointer to this structure drivers return an error code or fill the rest of the structure.</para> @@ -68,8 +69,8 @@ the structure.</para> <constant>V4L2_BUF_FLAG_MAPPED</constant>, <constant>V4L2_BUF_FLAG_QUEUED</constant> and <constant>V4L2_BUF_FLAG_DONE</constant> flags will be valid. The -<structfield>memory</structfield> field will be set to -<constant>V4L2_MEMORY_MMAP</constant>, the <structfield>m.offset</structfield> +<structfield>memory</structfield> field will be set to the current +I/O method, the <structfield>m.offset</structfield> contains the offset of the buffer from the start of the device memory, the <structfield>length</structfield> field its size. The driver may or may not set the remaining fields and flags, they are meaningless in diff --git a/Documentation/DocBook/v4l/vidioc-reqbufs.xml b/Documentation/DocBook/v4l/vidioc-reqbufs.xml index bab38084454..1c081637207 100644 --- a/Documentation/DocBook/v4l/vidioc-reqbufs.xml +++ b/Documentation/DocBook/v4l/vidioc-reqbufs.xml @@ -54,23 +54,23 @@ I/O. Memory mapped buffers are located in device memory and must be allocated with this ioctl before they can be mapped into the application's address space. User buffers are allocated by applications themselves, and this ioctl is merely used to switch the -driver into user pointer I/O mode.</para> +driver into user pointer I/O mode and to setup some internal structures.</para> - <para>To allocate device buffers applications initialize three -fields of a <structname>v4l2_requestbuffers</structname> structure. + <para>To allocate device buffers applications initialize all +fields of the <structname>v4l2_requestbuffers</structname> structure. They set the <structfield>type</structfield> field to the respective stream or buffer type, the <structfield>count</structfield> field to -the desired number of buffers, and <structfield>memory</structfield> -must be set to <constant>V4L2_MEMORY_MMAP</constant>. When the ioctl -is called with a pointer to this structure the driver attempts to -allocate the requested number of buffers and stores the actual number +the desired number of buffers, <structfield>memory</structfield> +must be set to the requested I/O method and the reserved array +must be zeroed. When the ioctl +is called with a pointer to this structure the driver will attempt to allocate +the requested number of buffers and it stores the actual number allocated in the <structfield>count</structfield> field. It can be smaller than the number requested, even zero, when the driver runs out -of free memory. A larger number is possible when the driver requires -more buffers to function correctly.<footnote> - <para>For example video output requires at least two buffers, +of free memory. A larger number is also possible when the driver requires +more buffers to function correctly. For example video output requires at least two buffers, one displayed and one filled by the application.</para> - </footnote> When memory mapping I/O is not supported the ioctl + <para>When the I/O method is not supported the ioctl returns an &EINVAL;.</para> <para>Applications can call <constant>VIDIOC_REQBUFS</constant> @@ -81,14 +81,6 @@ in progress, an implicit &VIDIOC-STREAMOFF;. <!-- mhs: I see no reason why munmap()ping one or even all buffers must imply streamoff.--></para> - <para>To negotiate user pointer I/O, applications initialize only -the <structfield>type</structfield> field and set -<structfield>memory</structfield> to -<constant>V4L2_MEMORY_USERPTR</constant>. When the ioctl is called -with a pointer to this structure the driver prepares for user pointer -I/O, when this I/O method is not supported the ioctl returns an -&EINVAL;.</para> - <table pgwide="1" frame="none" id="v4l2-requestbuffers"> <title>struct <structname>v4l2_requestbuffers</structname></title> <tgroup cols="3"> @@ -97,9 +89,7 @@ I/O, when this I/O method is not supported the ioctl returns an <row> <entry>__u32</entry> <entry><structfield>count</structfield></entry> - <entry>The number of buffers requested or granted. This -field is only used when <structfield>memory</structfield> is set to -<constant>V4L2_MEMORY_MMAP</constant>.</entry> + <entry>The number of buffers requested or granted.</entry> </row> <row> <entry>&v4l2-buf-type;</entry> @@ -120,7 +110,7 @@ as the &v4l2-format; <structfield>type</structfield> field. See <xref <entry><structfield>reserved</structfield>[2]</entry> <entry>A place holder for future extensions and custom (driver defined) buffer types <constant>V4L2_BUF_TYPE_PRIVATE</constant> and -higher.</entry> +higher. This array should be zeroed by applications.</entry> </row> </tbody> </tgroup> diff --git a/Documentation/IO-mapping.txt b/Documentation/IO-mapping.txt index 78a440695e1..1b5aa10df84 100644 --- a/Documentation/IO-mapping.txt +++ b/Documentation/IO-mapping.txt @@ -157,7 +157,7 @@ For such memory, you can do things like * access only the 640k-1MB area, so anything else * has to be remapped. */ - char * baseptr = ioremap(0xFC000000, 1024*1024); + void __iomem *baseptr = ioremap(0xFC000000, 1024*1024); /* write a 'A' to the offset 10 of the area */ writeb('A',baseptr+10); diff --git a/Documentation/DMA-mapping.txt b/Documentation/PCI/PCI-DMA-mapping.txt index ecad88d9fe5..ecad88d9fe5 100644 --- a/Documentation/DMA-mapping.txt +++ b/Documentation/PCI/PCI-DMA-mapping.txt diff --git a/Documentation/RCU/00-INDEX b/Documentation/RCU/00-INDEX index 9bb62f7b89c..71b6f500ddb 100644 --- a/Documentation/RCU/00-INDEX +++ b/Documentation/RCU/00-INDEX @@ -6,16 +6,22 @@ checklist.txt - Review Checklist for RCU Patches listRCU.txt - Using RCU to Protect Read-Mostly Linked Lists +lockdep.txt + - RCU and lockdep checking NMI-RCU.txt - Using RCU to Protect Dynamic NMI Handlers +rcubarrier.txt + - RCU and Unloadable Modules +rculist_nulls.txt + - RCU list primitives for use with SLAB_DESTROY_BY_RCU rcuref.txt - Reference-count design for elements of lists/arrays protected by RCU rcu.txt - RCU Concepts -rcubarrier.txt - - Unloading modules that use RCU callbacks RTFP.txt - List of RCU papers (bibliography) going back to 1980. +stallwarn.txt + - RCU CPU stall warnings (CONFIG_RCU_CPU_STALL_DETECTOR) torture.txt - RCU Torture Test Operation (CONFIG_RCU_TORTURE_TEST) trace.txt diff --git a/Documentation/RCU/RTFP.txt b/Documentation/RCU/RTFP.txt index d2b85237c76..5aea459e3dd 100644 --- a/Documentation/RCU/RTFP.txt +++ b/Documentation/RCU/RTFP.txt @@ -25,10 +25,10 @@ to be referencing the data structure. However, this mechanism was not optimized for modern computer systems, which is not surprising given that these overheads were not so expensive in the mid-80s. Nonetheless, passive serialization appears to be the first deferred-destruction -mechanism to be used in production. Furthermore, the relevant patent has -lapsed, so this approach may be used in non-GPL software, if desired. -(In contrast, use of RCU is permitted only in software licensed under -GPL. Sorry!!!) +mechanism to be used in production. Furthermore, the relevant patent +has lapsed, so this approach may be used in non-GPL software, if desired. +(In contrast, implementation of RCU is permitted only in software licensed +under either GPL or LGPL. Sorry!!!) In 1990, Pugh [Pugh90] noted that explicitly tracking which threads were reading a given data structure permitted deferred free to operate @@ -150,6 +150,18 @@ preemptible RCU [PaulEMcKenney2007PreemptibleRCU], and the three-part LWN "What is RCU?" series [PaulEMcKenney2007WhatIsRCUFundamentally, PaulEMcKenney2008WhatIsRCUUsage, and PaulEMcKenney2008WhatIsRCUAPI]. +2008 saw a journal paper on real-time RCU [DinakarGuniguntala2008IBMSysJ], +a history of how Linux changed RCU more than RCU changed Linux +[PaulEMcKenney2008RCUOSR], and a design overview of hierarchical RCU +[PaulEMcKenney2008HierarchicalRCU]. + +2009 introduced user-level RCU algorithms [PaulEMcKenney2009MaliciousURCU], +which Mathieu Desnoyers is now maintaining [MathieuDesnoyers2009URCU] +[MathieuDesnoyersPhD]. TINY_RCU [PaulEMcKenney2009BloatWatchRCU] made +its appearance, as did expedited RCU [PaulEMcKenney2009expeditedRCU]. +The problem of resizeable RCU-protected hash tables may now be on a path +to a solution [JoshTriplett2009RPHash]. + Bibtex Entries @article{Kung80 @@ -730,6 +742,11 @@ Revised: " } +# +# "What is RCU?" LWN series. +# +######################################################################## + @article{DinakarGuniguntala2008IBMSysJ ,author="D. Guniguntala and P. E. McKenney and J. Triplett and J. Walpole" ,title="The read-copy-update mechanism for supporting real-time applications on shared-memory multiprocessor systems with {Linux}" @@ -820,3 +837,39 @@ Revised: Uniprocessor assumptions allow simplified RCU implementation. " } + +@unpublished{PaulEMcKenney2009expeditedRCU +,Author="Paul E. McKenney" +,Title="[{PATCH} -tip 0/3] expedited 'big hammer' {RCU} grace periods" +,month="June" +,day="25" +,year="2009" +,note="Available: +\url{http://lkml.org/lkml/2009/6/25/306} +[Viewed August 16, 2009]" +,annotation=" + First posting of expedited RCU to be accepted into -tip. +" +} + +@unpublished{JoshTriplett2009RPHash +,Author="Josh Triplett" +,Title="Scalable concurrent hash tables via relativistic programming" +,month="September" +,year="2009" +,note="Linux Plumbers Conference presentation" +,annotation=" + RP fun with hash tables. +" +} + +@phdthesis{MathieuDesnoyersPhD +, title = "Low-Impact Operating System Tracing" +, author = "Mathieu Desnoyers" +, school = "Ecole Polytechnique de Montr\'{e}al" +, month = "December" +, year = 2009 +,note="Available: +\url{http://www.lttng.org/pub/thesis/desnoyers-dissertation-2009-12.pdf} +[Viewed December 9, 2009]" +} diff --git a/Documentation/RCU/checklist.txt b/Documentation/RCU/checklist.txt index 51525a30e8b..cbc180f9019 100644 --- a/Documentation/RCU/checklist.txt +++ b/Documentation/RCU/checklist.txt @@ -8,13 +8,12 @@ would cause. This list is based on experiences reviewing such patches over a rather long period of time, but improvements are always welcome! 0. Is RCU being applied to a read-mostly situation? If the data - structure is updated more than about 10% of the time, then - you should strongly consider some other approach, unless - detailed performance measurements show that RCU is nonetheless - the right tool for the job. Yes, you might think of RCU - as simply cutting overhead off of the readers and imposing it - on the writers. That is exactly why normal uses of RCU will - do much more reading than updating. + structure is updated more than about 10% of the time, then you + should strongly consider some other approach, unless detailed + performance measurements show that RCU is nonetheless the right + tool for the job. Yes, RCU does reduce read-side overhead by + increasing write-side overhead, which is exactly why normal uses + of RCU will do much more reading than updating. Another exception is where performance is not an issue, and RCU provides a simpler implementation. An example of this situation @@ -35,13 +34,13 @@ over a rather long period of time, but improvements are always welcome! If you choose #b, be prepared to describe how you have handled memory barriers on weakly ordered machines (pretty much all of - them -- even x86 allows reads to be reordered), and be prepared - to explain why this added complexity is worthwhile. If you - choose #c, be prepared to explain how this single task does not - become a major bottleneck on big multiprocessor machines (for - example, if the task is updating information relating to itself - that other tasks can read, there by definition can be no - bottleneck). + them -- even x86 allows later loads to be reordered to precede + earlier stores), and be prepared to explain why this added + complexity is worthwhile. If you choose #c, be prepared to + explain how this single task does not become a major bottleneck on + big multiprocessor machines (for example, if the task is updating + information relating to itself that other tasks can read, there + by definition can be no bottleneck). 2. Do the RCU read-side critical sections make proper use of rcu_read_lock() and friends? These primitives are needed @@ -51,8 +50,10 @@ over a rather long period of time, but improvements are always welcome! actuarial risk of your kernel. As a rough rule of thumb, any dereference of an RCU-protected - pointer must be covered by rcu_read_lock() or rcu_read_lock_bh() - or by the appropriate update-side lock. + pointer must be covered by rcu_read_lock(), rcu_read_lock_bh(), + rcu_read_lock_sched(), or by the appropriate update-side lock. + Disabling of preemption can serve as rcu_read_lock_sched(), but + is less readable. 3. Does the update code tolerate concurrent accesses? @@ -62,25 +63,27 @@ over a rather long period of time, but improvements are always welcome! of ways to handle this concurrency, depending on the situation: a. Use the RCU variants of the list and hlist update - primitives to add, remove, and replace elements on an - RCU-protected list. Alternatively, use the RCU-protected - trees that have been added to the Linux kernel. + primitives to add, remove, and replace elements on + an RCU-protected list. Alternatively, use the other + RCU-protected data structures that have been added to + the Linux kernel. This is almost always the best approach. b. Proceed as in (a) above, but also maintain per-element locks (that are acquired by both readers and writers) that guard per-element state. Of course, fields that - the readers refrain from accessing can be guarded by the - update-side lock. + the readers refrain from accessing can be guarded by + some other lock acquired only by updaters, if desired. This works quite well, also. c. Make updates appear atomic to readers. For example, - pointer updates to properly aligned fields will appear - atomic, as will individual atomic primitives. Operations - performed under a lock and sequences of multiple atomic - primitives will -not- appear to be atomic. + pointer updates to properly aligned fields will + appear atomic, as will individual atomic primitives. + Sequences of perations performed under a lock will -not- + appear to be atomic to RCU readers, nor will sequences + of multiple atomic primitives. This can work, but is starting to get a bit tricky. @@ -98,9 +101,9 @@ over a rather long period of time, but improvements are always welcome! a new structure containing updated values. 4. Weakly ordered CPUs pose special challenges. Almost all CPUs - are weakly ordered -- even i386 CPUs allow reads to be reordered. - RCU code must take all of the following measures to prevent - memory-corruption problems: + are weakly ordered -- even x86 CPUs allow later loads to be + reordered to precede earlier stores. RCU code must take all of + the following measures to prevent memory-corruption problems: a. Readers must maintain proper ordering of their memory accesses. The rcu_dereference() primitive ensures that @@ -113,14 +116,25 @@ over a rather long period of time, but improvements are always welcome! The rcu_dereference() primitive is also an excellent documentation aid, letting the person reading the code know exactly which pointers are protected by RCU. - - The rcu_dereference() primitive is used by the various - "_rcu()" list-traversal primitives, such as the - list_for_each_entry_rcu(). Note that it is perfectly - legal (if redundant) for update-side code to use - rcu_dereference() and the "_rcu()" list-traversal - primitives. This is particularly useful in code - that is common to readers and updaters. + Please note that compilers can also reorder code, and + they are becoming increasingly aggressive about doing + just that. The rcu_dereference() primitive therefore + also prevents destructive compiler optimizations. + + The rcu_dereference() primitive is used by the + various "_rcu()" list-traversal primitives, such + as the list_for_each_entry_rcu(). Note that it is + perfectly legal (if redundant) for update-side code to + use rcu_dereference() and the "_rcu()" list-traversal + primitives. This is particularly useful in code that + is common to readers and updaters. However, lockdep + will complain if you access rcu_dereference() outside + of an RCU read-side critical section. See lockdep.txt + to learn what to do about this. + + Of course, neither rcu_dereference() nor the "_rcu()" + list-traversal primitives can substitute for a good + concurrency design coordinating among multiple updaters. b. If the list macros are being used, the list_add_tail_rcu() and list_add_rcu() primitives must be used in order @@ -135,11 +149,14 @@ over a rather long period of time, but improvements are always welcome! readers. Similarly, if the hlist macros are being used, the hlist_del_rcu() primitive is required. - The list_replace_rcu() primitive may be used to - replace an old structure with a new one in an - RCU-protected list. + The list_replace_rcu() and hlist_replace_rcu() primitives + may be used to replace an old structure with a new one + in their respective types of RCU-protected lists. + + d. Rules similar to (4b) and (4c) apply to the "hlist_nulls" + type of RCU-protected linked lists. - d. Updates must ensure that initialization of a given + e. Updates must ensure that initialization of a given structure happens before pointers to that structure are publicized. Use the rcu_assign_pointer() primitive when publicizing a pointer to a structure that can @@ -151,16 +168,31 @@ over a rather long period of time, but improvements are always welcome! it cannot block. 6. Since synchronize_rcu() can block, it cannot be called from - any sort of irq context. Ditto for synchronize_sched() and - synchronize_srcu(). - -7. If the updater uses call_rcu(), then the corresponding readers - must use rcu_read_lock() and rcu_read_unlock(). If the updater - uses call_rcu_bh(), then the corresponding readers must use - rcu_read_lock_bh() and rcu_read_unlock_bh(). If the updater - uses call_rcu_sched(), then the corresponding readers must - disable preemption. Mixing things up will result in confusion - and broken kernels. + any sort of irq context. The same rule applies for + synchronize_rcu_bh(), synchronize_sched(), synchronize_srcu(), + synchronize_rcu_expedited(), synchronize_rcu_bh_expedited(), + synchronize_sched_expedite(), and synchronize_srcu_expedited(). + + The expedited forms of these primitives have the same semantics + as the non-expedited forms, but expediting is both expensive + and unfriendly to real-time workloads. Use of the expedited + primitives should be restricted to rare configuration-change + operations that would not normally be undertaken while a real-time + workload is running. + +7. If the updater uses call_rcu() or synchronize_rcu(), then the + corresponding readers must use rcu_read_lock() and + rcu_read_unlock(). If the updater uses call_rcu_bh() or + synchronize_rcu_bh(), then the corresponding readers must + use rcu_read_lock_bh() and rcu_read_unlock_bh(). If the + updater uses call_rcu_sched() or synchronize_sched(), then + the corresponding readers must disable preemption, possibly + by calling rcu_read_lock_sched() and rcu_read_unlock_sched(). + If the updater uses synchronize_srcu(), the the corresponding + readers must use srcu_read_lock() and srcu_read_unlock(), + and with the same srcu_struct. The rules for the expedited + primitives are the same as for their non-expedited counterparts. + Mixing things up will result in confusion and broken kernels. One exception to this rule: rcu_read_lock() and rcu_read_unlock() may be substituted for rcu_read_lock_bh() and rcu_read_unlock_bh() @@ -212,6 +244,8 @@ over a rather long period of time, but improvements are always welcome! e. Periodically invoke synchronize_rcu(), permitting a limited number of updates per grace period. + The same cautions apply to call_rcu_bh() and call_rcu_sched(). + 9. All RCU list-traversal primitives, which include rcu_dereference(), list_for_each_entry_rcu(), list_for_each_continue_rcu(), and list_for_each_safe_rcu(), @@ -219,7 +253,9 @@ over a rather long period of time, but improvements are always welcome! must be protected by appropriate update-side locks. RCU read-side critical sections are delimited by rcu_read_lock() and rcu_read_unlock(), or by similar primitives such as - rcu_read_lock_bh() and rcu_read_unlock_bh(). + rcu_read_lock_bh() and rcu_read_unlock_bh(), in which case + the matching rcu_dereference() primitive must be used in order + to keep lockdep happy, in this case, rcu_dereference_bh(). The reason that it is permissible to use RCU list-traversal primitives when the update-side lock is held is that doing so @@ -229,7 +265,8 @@ over a rather long period of time, but improvements are always welcome! 10. Conversely, if you are in an RCU read-side critical section, and you don't hold the appropriate update-side lock, you -must- use the "_rcu()" variants of the list macros. Failing to do so - will break Alpha and confuse people reading your code. + will break Alpha, cause aggressive compilers to generate bad code, + and confuse people trying to read your code. 11. Note that synchronize_rcu() -only- guarantees to wait until all currently executing rcu_read_lock()-protected RCU read-side @@ -239,15 +276,21 @@ over a rather long period of time, but improvements are always welcome! rcu_read_lock()-protected read-side critical sections, do -not- use synchronize_rcu(). - If you want to wait for some of these other things, you might - instead need to use synchronize_irq() or synchronize_sched(). + Similarly, disabling preemption is not an acceptable substitute + for rcu_read_lock(). Code that attempts to use preemption + disabling where it should be using rcu_read_lock() will break + in real-time kernel builds. + + If you want to wait for interrupt handlers, NMI handlers, and + code under the influence of preempt_disable(), you instead + need to use synchronize_irq() or synchronize_sched(). 12. Any lock acquired by an RCU callback must be acquired elsewhere with softirq disabled, e.g., via spin_lock_irqsave(), spin_lock_bh(), etc. Failing to disable irq on a given - acquisition of that lock will result in deadlock as soon as the - RCU callback happens to interrupt that acquisition's critical - section. + acquisition of that lock will result in deadlock as soon as + the RCU softirq handler happens to run your RCU callback while + interrupting that acquisition's critical section. 13. RCU callbacks can be and are executed in parallel. In many cases, the callback code simply wrappers around kfree(), so that this @@ -265,29 +308,30 @@ over a rather long period of time, but improvements are always welcome! not the case, a self-spawning RCU callback would prevent the victim CPU from ever going offline.) -14. SRCU (srcu_read_lock(), srcu_read_unlock(), and synchronize_srcu()) - may only be invoked from process context. Unlike other forms of - RCU, it -is- permissible to block in an SRCU read-side critical - section (demarked by srcu_read_lock() and srcu_read_unlock()), - hence the "SRCU": "sleepable RCU". Please note that if you - don't need to sleep in read-side critical sections, you should - be using RCU rather than SRCU, because RCU is almost always - faster and easier to use than is SRCU. +14. SRCU (srcu_read_lock(), srcu_read_unlock(), srcu_dereference(), + synchronize_srcu(), and synchronize_srcu_expedited()) may only + be invoked from process context. Unlike other forms of RCU, it + -is- permissible to block in an SRCU read-side critical section + (demarked by srcu_read_lock() and srcu_read_unlock()), hence the + "SRCU": "sleepable RCU". Please note that if you don't need + to sleep in read-side critical sections, you should be using + RCU rather than SRCU, because RCU is almost always faster and + easier to use than is SRCU. Also unlike other forms of RCU, explicit initialization and cleanup is required via init_srcu_struct() and cleanup_srcu_struct(). These are passed a "struct srcu_struct" that defines the scope of a given SRCU domain. Once initialized, the srcu_struct is passed to srcu_read_lock(), srcu_read_unlock() - and synchronize_srcu(). A given synchronize_srcu() waits only - for SRCU read-side critical sections governed by srcu_read_lock() - and srcu_read_unlock() calls that have been passd the same - srcu_struct. This property is what makes sleeping read-side - critical sections tolerable -- a given subsystem delays only - its own updates, not those of other subsystems using SRCU. - Therefore, SRCU is less prone to OOM the system than RCU would - be if RCU's read-side critical sections were permitted to - sleep. + synchronize_srcu(), and synchronize_srcu_expedited(). A given + synchronize_srcu() waits only for SRCU read-side critical + sections governed by srcu_read_lock() and srcu_read_unlock() + calls that have been passed the same srcu_struct. This property + is what makes sleeping read-side critical sections tolerable -- + a given subsystem delays only its own updates, not those of other + subsystems using SRCU. Therefore, SRCU is less prone to OOM the + system than RCU would be if RCU's read-side critical sections + were permitted to sleep. The ability to sleep in read-side critical sections does not come for free. First, corresponding srcu_read_lock() and @@ -311,12 +355,12 @@ over a rather long period of time, but improvements are always welcome! destructive operation, and -only- -then- invoke call_rcu(), synchronize_rcu(), or friends. - Because these primitives only wait for pre-existing readers, - it is the caller's responsibility to guarantee safety to - any subsequent readers. + Because these primitives only wait for pre-existing readers, it + is the caller's responsibility to guarantee that any subsequent + readers will execute safely. -16. The various RCU read-side primitives do -not- contain memory - barriers. The CPU (and in some cases, the compiler) is free - to reorder code into and out of RCU read-side critical sections. - It is the responsibility of the RCU update-side primitives to - deal with this. +16. The various RCU read-side primitives do -not- necessarily contain + memory barriers. You should therefore plan for the CPU + and the compiler to freely reorder code into and out of RCU + read-side critical sections. It is the responsibility of the + RCU update-side primitives to deal with this. diff --git a/Documentation/RCU/lockdep.txt b/Documentation/RCU/lockdep.txt new file mode 100644 index 00000000000..fe24b58627b --- /dev/null +++ b/Documentation/RCU/lockdep.txt @@ -0,0 +1,67 @@ +RCU and lockdep checking + +All flavors of RCU have lockdep checking available, so that lockdep is +aware of when each task enters and leaves any flavor of RCU read-side +critical section. Each flavor of RCU is tracked separately (but note +that this is not the case in 2.6.32 and earlier). This allows lockdep's +tracking to include RCU state, which can sometimes help when debugging +deadlocks and the like. + +In addition, RCU provides the following primitives that check lockdep's +state: + + rcu_read_lock_held() for normal RCU. + rcu_read_lock_bh_held() for RCU-bh. + rcu_read_lock_sched_held() for RCU-sched. + srcu_read_lock_held() for SRCU. + +These functions are conservative, and will therefore return 1 if they +aren't certain (for example, if CONFIG_DEBUG_LOCK_ALLOC is not set). +This prevents things like WARN_ON(!rcu_read_lock_held()) from giving false +positives when lockdep is disabled. + +In addition, a separate kernel config parameter CONFIG_PROVE_RCU enables +checking of rcu_dereference() primitives: + + rcu_dereference(p): + Check for RCU read-side critical section. + rcu_dereference_bh(p): + Check for RCU-bh read-side critical section. + rcu_dereference_sched(p): + Check for RCU-sched read-side critical section. + srcu_dereference(p, sp): + Check for SRCU read-side critical section. + rcu_dereference_check(p, c): + Use explicit check expression "c". + rcu_dereference_raw(p) + Don't check. (Use sparingly, if at all.) + +The rcu_dereference_check() check expression can be any boolean +expression, but would normally include one of the rcu_read_lock_held() +family of functions and a lockdep expression. However, any boolean +expression can be used. For a moderately ornate example, consider +the following: + + file = rcu_dereference_check(fdt->fd[fd], + rcu_read_lock_held() || + lockdep_is_held(&files->file_lock) || + atomic_read(&files->count) == 1); + +This expression picks up the pointer "fdt->fd[fd]" in an RCU-safe manner, +and, if CONFIG_PROVE_RCU is configured, verifies that this expression +is used in: + +1. An RCU read-side critical section, or +2. with files->file_lock held, or +3. on an unshared files_struct. + +In case (1), the pointer is picked up in an RCU-safe manner for vanilla +RCU read-side critical sections, in case (2) the ->file_lock prevents +any change from taking place, and finally, in case (3) the current task +is the only task accessing the file_struct, again preventing any change +from taking place. + +There are currently only "universal" versions of the rcu_assign_pointer() +and RCU list-/tree-traversal primitives, which do not (yet) check for +being in an RCU read-side critical section. In the future, separate +versions of these primitives might be created. diff --git a/Documentation/RCU/rcu.txt b/Documentation/RCU/rcu.txt index 2a23523ce47..31852705b58 100644 --- a/Documentation/RCU/rcu.txt +++ b/Documentation/RCU/rcu.txt @@ -75,6 +75,8 @@ o I hear that RCU is patented? What is with that? search for the string "Patent" in RTFP.txt to find them. Of these, one was allowed to lapse by the assignee, and the others have been contributed to the Linux kernel under GPL. + There are now also LGPL implementations of user-level RCU + available (http://lttng.org/?q=node/18). o I hear that RCU needs work in order to support realtime kernels? @@ -91,48 +93,4 @@ o Where can I find more information on RCU? o What are all these files in this directory? - - NMI-RCU.txt - - Describes how to use RCU to implement dynamic - NMI handlers, which can be revectored on the fly, - without rebooting. - - RTFP.txt - - List of RCU-related publications and web sites. - - UP.txt - - Discussion of RCU usage in UP kernels. - - arrayRCU.txt - - Describes how to use RCU to protect arrays, with - resizeable arrays whose elements reference other - data structures being of the most interest. - - checklist.txt - - Lists things to check for when inspecting code that - uses RCU. - - listRCU.txt - - Describes how to use RCU to protect linked lists. - This is the simplest and most common use of RCU - in the Linux kernel. - - rcu.txt - - You are reading it! - - rcuref.txt - - Describes how to combine use of reference counts - with RCU. - - whatisRCU.txt - - Overview of how the RCU implementation works. Along - the way, presents a conceptual view of RCU. + See 00-INDEX for the list. diff --git a/Documentation/RCU/stallwarn.txt b/Documentation/RCU/stallwarn.txt new file mode 100644 index 00000000000..1423d2570d7 --- /dev/null +++ b/Documentation/RCU/stallwarn.txt @@ -0,0 +1,58 @@ +Using RCU's CPU Stall Detector + +The CONFIG_RCU_CPU_STALL_DETECTOR kernel config parameter enables +RCU's CPU stall detector, which detects conditions that unduly delay +RCU grace periods. The stall detector's idea of what constitutes +"unduly delayed" is controlled by a pair of C preprocessor macros: + +RCU_SECONDS_TILL_STALL_CHECK + + This macro defines the period of time that RCU will wait from + the beginning of a grace period until it issues an RCU CPU + stall warning. It is normally ten seconds. + +RCU_SECONDS_TILL_STALL_RECHECK + + This macro defines the period of time that RCU will wait after + issuing a stall warning until it issues another stall warning. + It is normally set to thirty seconds. + +RCU_STALL_RAT_DELAY + + The CPU stall detector tries to make the offending CPU rat on itself, + as this often gives better-quality stack traces. However, if + the offending CPU does not detect its own stall in the number + of jiffies specified by RCU_STALL_RAT_DELAY, then other CPUs will + complain. This is normally set to two jiffies. + +The following problems can result in an RCU CPU stall warning: + +o A CPU looping in an RCU read-side critical section. + +o A CPU looping with interrupts disabled. + +o A CPU looping with preemption disabled. + +o For !CONFIG_PREEMPT kernels, a CPU looping anywhere in the kernel + without invoking schedule(). + +o A bug in the RCU implementation. + +o A hardware failure. This is quite unlikely, but has occurred + at least once in a former life. A CPU failed in a running system, + becoming unresponsive, but not causing an immediate crash. + This resulted in a series of RCU CPU stall warnings, eventually + leading the realization that the CPU had failed. + +The RCU, RCU-sched, and RCU-bh implementations have CPU stall warning. +SRCU does not do so directly, but its calls to synchronize_sched() will +result in RCU-sched detecting any CPU stalls that might be occurring. + +To diagnose the cause of the stall, inspect the stack traces. The offending +function will usually be near the top of the stack. If you have a series +of stall warnings from a single extended stall, comparing the stack traces +can often help determine where the stall is occurring, which will usually +be in the function nearest the top of the stack that stays the same from +trace to trace. + +RCU bugs can often be debugged with the help of CONFIG_RCU_TRACE. diff --git a/Documentation/RCU/torture.txt b/Documentation/RCU/torture.txt index 9dba3bb90e6..0e50bc2aa1e 100644 --- a/Documentation/RCU/torture.txt +++ b/Documentation/RCU/torture.txt @@ -30,6 +30,18 @@ MODULE PARAMETERS This module has the following parameters: +fqs_duration Duration (in microseconds) of artificially induced bursts + of force_quiescent_state() invocations. In RCU + implementations having force_quiescent_state(), these + bursts help force races between forcing a given grace + period and that grace period ending on its own. + +fqs_holdoff Holdoff time (in microseconds) between consecutive calls + to force_quiescent_state() within a burst. + +fqs_stutter Wait time (in seconds) between consecutive bursts + of calls to force_quiescent_state(). + irqreaders Says to invoke RCU readers from irq level. This is currently done via timers. Defaults to "1" for variants of RCU that permit this. (Or, more accurately, variants of RCU that do diff --git a/Documentation/RCU/whatisRCU.txt b/Documentation/RCU/whatisRCU.txt index d542ca243b8..1dc00ee9716 100644 --- a/Documentation/RCU/whatisRCU.txt +++ b/Documentation/RCU/whatisRCU.txt @@ -323,14 +323,17 @@ used as follows: Defer Protect a. synchronize_rcu() rcu_read_lock() / rcu_read_unlock() - call_rcu() + call_rcu() rcu_dereference() b. call_rcu_bh() rcu_read_lock_bh() / rcu_read_unlock_bh() + rcu_dereference_bh() -c. synchronize_sched() preempt_disable() / preempt_enable() +c. synchronize_sched() rcu_read_lock_sched() / rcu_read_unlock_sched() + preempt_disable() / preempt_enable() local_irq_save() / local_irq_restore() hardirq enter / hardirq exit NMI enter / NMI exit + rcu_dereference_sched() These three mechanisms are used as follows: @@ -780,9 +783,8 @@ Linux-kernel source code, but it helps to have a full list of the APIs, since there does not appear to be a way to categorize them in docbook. Here is the list, by category. -RCU pointer/list traversal: +RCU list traversal: - rcu_dereference list_for_each_entry_rcu hlist_for_each_entry_rcu hlist_nulls_for_each_entry_rcu @@ -808,7 +810,7 @@ RCU: Critical sections Grace period Barrier rcu_read_lock synchronize_net rcu_barrier rcu_read_unlock synchronize_rcu - synchronize_rcu_expedited + rcu_dereference synchronize_rcu_expedited call_rcu @@ -816,7 +818,7 @@ bh: Critical sections Grace period Barrier rcu_read_lock_bh call_rcu_bh rcu_barrier_bh rcu_read_unlock_bh synchronize_rcu_bh - synchronize_rcu_bh_expedited + rcu_dereference_bh synchronize_rcu_bh_expedited sched: Critical sections Grace period Barrier @@ -825,12 +827,14 @@ sched: Critical sections Grace period Barrier rcu_read_unlock_sched call_rcu_sched [preempt_disable] synchronize_sched_expedited [and friends] + rcu_dereference_sched SRCU: Critical sections Grace period Barrier srcu_read_lock synchronize_srcu N/A srcu_read_unlock synchronize_srcu_expedited + srcu_dereference SRCU: Initialization/cleanup init_srcu_struct diff --git a/Documentation/arm/memory.txt b/Documentation/arm/memory.txt index 9d58c7c5edd..eb0fae18ffb 100644 --- a/Documentation/arm/memory.txt +++ b/Documentation/arm/memory.txt @@ -59,7 +59,11 @@ PAGE_OFFSET high_memory-1 Kernel direct-mapped RAM region. This maps the platforms RAM, and typically maps all platform RAM in a 1:1 relationship. -TASK_SIZE PAGE_OFFSET-1 Kernel module space +PKMAP_BASE PAGE_OFFSET-1 Permanent kernel mappings + One way of mapping HIGHMEM pages into kernel + space. + +MODULES_VADDR MODULES_END-1 Kernel module space Kernel modules inserted via insmod are placed here using dynamic mappings. diff --git a/Documentation/block/00-INDEX b/Documentation/block/00-INDEX index 961a0513f8c..a406286f6f3 100644 --- a/Documentation/block/00-INDEX +++ b/Documentation/block/00-INDEX @@ -1,7 +1,5 @@ 00-INDEX - This file -as-iosched.txt - - Anticipatory IO scheduler barrier.txt - I/O Barriers biodoc.txt diff --git a/Documentation/block/as-iosched.txt b/Documentation/block/as-iosched.txt deleted file mode 100644 index 738b72be128..00000000000 --- a/Documentation/block/as-iosched.txt +++ /dev/null @@ -1,172 +0,0 @@ -Anticipatory IO scheduler -------------------------- -Nick Piggin <piggin@cyberone.com.au> 13 Sep 2003 - -Attention! Database servers, especially those using "TCQ" disks should -investigate performance with the 'deadline' IO scheduler. Any system with high -disk performance requirements should do so, in fact. - -If you see unusual performance characteristics of your disk systems, or you -see big performance regressions versus the deadline scheduler, please email -me. Database users don't bother unless you're willing to test a lot of patches -from me ;) its a known issue. - -Also, users with hardware RAID controllers, doing striping, may find -highly variable performance results with using the as-iosched. The -as-iosched anticipatory implementation is based on the notion that a disk -device has only one physical seeking head. A striped RAID controller -actually has a head for each physical device in the logical RAID device. - -However, setting the antic_expire (see tunable parameters below) produces -very similar behavior to the deadline IO scheduler. - -Selecting IO schedulers ------------------------ -Refer to Documentation/block/switching-sched.txt for information on -selecting an io scheduler on a per-device basis. - -Anticipatory IO scheduler Policies ----------------------------------- -The as-iosched implementation implements several layers of policies -to determine when an IO request is dispatched to the disk controller. -Here are the policies outlined, in order of application. - -1. one-way Elevator algorithm. - -The elevator algorithm is similar to that used in deadline scheduler, with -the addition that it allows limited backward movement of the elevator -(i.e. seeks backwards). A seek backwards can occur when choosing between -two IO requests where one is behind the elevator's current position, and -the other is in front of the elevator's position. If the seek distance to -the request in back of the elevator is less than half the seek distance to -the request in front of the elevator, then the request in back can be chosen. -Backward seeks are also limited to a maximum of MAXBACK (1024*1024) sectors. -This favors forward movement of the elevator, while allowing opportunistic -"short" backward seeks. - -2. FIFO expiration times for reads and for writes. - -This is again very similar to the deadline IO scheduler. The expiration -times for requests on these lists is tunable using the parameters read_expire -and write_expire discussed below. When a read or a write expires in this way, -the IO scheduler will interrupt its current elevator sweep or read anticipation -to service the expired request. - -3. Read and write request batching - -A batch is a collection of read requests or a collection of write -requests. The as scheduler alternates dispatching read and write batches -to the driver. In the case a read batch, the scheduler submits read -requests to the driver as long as there are read requests to submit, and -the read batch time limit has not been exceeded (read_batch_expire). -The read batch time limit begins counting down only when there are -competing write requests pending. - -In the case of a write batch, the scheduler submits write requests to -the driver as long as there are write requests available, and the -write batch time limit has not been exceeded (write_batch_expire). -However, the length of write batches will be gradually shortened -when read batches frequently exceed their time limit. - -When changing between batch types, the scheduler waits for all requests -from the previous batch to complete before scheduling requests for the -next batch. - -The read and write fifo expiration times described in policy 2 above -are checked only when in scheduling IO of a batch for the corresponding -(read/write) type. So for example, the read FIFO timeout values are -tested only during read batches. Likewise, the write FIFO timeout -values are tested only during write batches. For this reason, -it is generally not recommended for the read batch time -to be longer than the write expiration time, nor for the write batch -time to exceed the read expiration time (see tunable parameters below). - -When the IO scheduler changes from a read to a write batch, -it begins the elevator from the request that is on the head of the -write expiration FIFO. Likewise, when changing from a write batch to -a read batch, scheduler begins the elevator from the first entry -on the read expiration FIFO. - -4. Read anticipation. - -Read anticipation occurs only when scheduling a read batch. -This implementation of read anticipation allows only one read request -to be dispatched to the disk controller at a time. In -contrast, many write requests may be dispatched to the disk controller -at a time during a write batch. It is this characteristic that can make -the anticipatory scheduler perform anomalously with controllers supporting -TCQ, or with hardware striped RAID devices. Setting the antic_expire -queue parameter (see below) to zero disables this behavior, and the -anticipatory scheduler behaves essentially like the deadline scheduler. - -When read anticipation is enabled (antic_expire is not zero), reads -are dispatched to the disk controller one at a time. -At the end of each read request, the IO scheduler examines its next -candidate read request from its sorted read list. If that next request -is from the same process as the request that just completed, -or if the next request in the queue is "very close" to the -just completed request, it is dispatched immediately. Otherwise, -statistics (average think time, average seek distance) on the process -that submitted the just completed request are examined. If it seems -likely that that process will submit another request soon, and that -request is likely to be near the just completed request, then the IO -scheduler will stop dispatching more read requests for up to (antic_expire) -milliseconds, hoping that process will submit a new request near the one -that just completed. If such a request is made, then it is dispatched -immediately. If the antic_expire wait time expires, then the IO scheduler -will dispatch the next read request from the sorted read queue. - -To decide whether an anticipatory wait is worthwhile, the scheduler -maintains statistics for each process that can be used to compute -mean "think time" (the time between read requests), and mean seek -distance for that process. One observation is that these statistics -are associated with each process, but those statistics are not associated -with a specific IO device. So for example, if a process is doing IO -on several file systems on separate devices, the statistics will be -a combination of IO behavior from all those devices. - - -Tuning the anticipatory IO scheduler ------------------------------------- -When using 'as', the anticipatory IO scheduler there are 5 parameters under -/sys/block/*/queue/iosched/. All are units of milliseconds. - -The parameters are: -* read_expire - Controls how long until a read request becomes "expired". It also controls the - interval between which expired requests are served, so set to 50, a request - might take anywhere < 100ms to be serviced _if_ it is the next on the - expired list. Obviously request expiration strategies won't make the disk - go faster. The result basically equates to the timeslice a single reader - gets in the presence of other IO. 100*((seek time / read_expire) + 1) is - very roughly the % streaming read efficiency your disk should get with - multiple readers. - -* read_batch_expire - Controls how much time a batch of reads is given before pending writes are - served. A higher value is more efficient. This might be set below read_expire - if writes are to be given higher priority than reads, but reads are to be - as efficient as possible when there are no writes. Generally though, it - should be some multiple of read_expire. - -* write_expire, and -* write_batch_expire are equivalent to the above, for writes. - -* antic_expire - Controls the maximum amount of time we can anticipate a good read (one - with a short seek distance from the most recently completed request) before - giving up. Many other factors may cause anticipation to be stopped early, - or some processes will not be "anticipated" at all. Should be a bit higher - for big seek time devices though not a linear correspondence - most - processes have only a few ms thinktime. - -In addition to the tunables above there is a read-only file named est_time -which, when read, will show: - - - The probability of a task exiting without a cooperating task - submitting an anticipated IO. - - - The current mean think time. - - - The seek distance used to determine if an incoming IO is better. - diff --git a/Documentation/block/biodoc.txt b/Documentation/block/biodoc.txt index 8d2158a1c6a..6fab97ea7e6 100644 --- a/Documentation/block/biodoc.txt +++ b/Documentation/block/biodoc.txt @@ -186,7 +186,7 @@ a virtual address mapping (unlike the earlier scheme of virtual address do not have a corresponding kernel virtual address space mapping) and low-memory pages. -Note: Please refer to Documentation/DMA-mapping.txt for a discussion +Note: Please refer to Documentation/PCI/PCI-DMA-mapping.txt for a discussion on PCI high mem DMA aspects and mapping of scatter gather lists, and support for 64 bit PCI. diff --git a/Documentation/block/queue-sysfs.txt b/Documentation/block/queue-sysfs.txt index e164403f60e..f65274081c8 100644 --- a/Documentation/block/queue-sysfs.txt +++ b/Documentation/block/queue-sysfs.txt @@ -25,11 +25,11 @@ size allowed by the hardware. nomerges (RW) ------------- -This enables the user to disable the lookup logic involved with IO merging -requests in the block layer. Merging may still occur through a direct -1-hit cache, since that comes for (almost) free. The IO scheduler will not -waste cycles doing tree/hash lookups for merges if nomerges is 1. Defaults -to 0, enabling all merges. +This enables the user to disable the lookup logic involved with IO +merging requests in the block layer. By default (0) all merges are +enabled. When set to 1 only simple one-hit merges will be tried. When +set to 2 no merge algorithms will be tried (including one-hit or more +complex tree/hash lookups). nr_requests (RW) ---------------- diff --git a/Documentation/cachetlb.txt b/Documentation/cachetlb.txt index da42ab414c4..2b5f823abd0 100644 --- a/Documentation/cachetlb.txt +++ b/Documentation/cachetlb.txt @@ -88,12 +88,12 @@ changes occur: This is used primarily during fault processing. 5) void update_mmu_cache(struct vm_area_struct *vma, - unsigned long address, pte_t pte) + unsigned long address, pte_t *ptep) At the end of every page fault, this routine is invoked to tell the architecture specific code that a translation - described by "pte" now exists at virtual address "address" - for address space "vma->vm_mm", in the software page tables. + now exists at virtual address "address" for address space + "vma->vm_mm", in the software page tables. A port may use this information in any way it so chooses. For example, it could use this event to pre-load TLB @@ -377,3 +377,27 @@ maps this page at its virtual address. All the functionality of flush_icache_page can be implemented in flush_dcache_page and update_mmu_cache. In 2.7 the hope is to remove this interface completely. + +The final category of APIs is for I/O to deliberately aliased address +ranges inside the kernel. Such aliases are set up by use of the +vmap/vmalloc API. Since kernel I/O goes via physical pages, the I/O +subsystem assumes that the user mapping and kernel offset mapping are +the only aliases. This isn't true for vmap aliases, so anything in +the kernel trying to do I/O to vmap areas must manually manage +coherency. It must do this by flushing the vmap range before doing +I/O and invalidating it after the I/O returns. + + void flush_kernel_vmap_range(void *vaddr, int size) + flushes the kernel cache for a given virtual address range in + the vmap area. This is to make sure that any data the kernel + modified in the vmap range is made visible to the physical + page. The design is to make this area safe to perform I/O on. + Note that this API does *not* also flush the offset map alias + of the area. + + void invalidate_kernel_vmap_range(void *vaddr, int size) invalidates + the cache for a given virtual address range in the vmap area + which prevents the processor from making the cache stale by + speculatively reading data while the I/O was occurring to the + physical pages. This is only necessary for data reads into the + vmap area. diff --git a/Documentation/cpu-freq/governors.txt b/Documentation/cpu-freq/governors.txt index aed082f49d0..737988fca64 100644 --- a/Documentation/cpu-freq/governors.txt +++ b/Documentation/cpu-freq/governors.txt @@ -145,8 +145,8 @@ show_sampling_rate_max: THIS INTERFACE IS DEPRECATED, DON'T USE IT. up_threshold: defines what the average CPU usage between the samplings of 'sampling_rate' needs to be for the kernel to make a decision on whether it should increase the frequency. For example when it is set -to its default value of '80' it means that between the checking -intervals the CPU needs to be on average more than 80% in use to then +to its default value of '95' it means that between the checking +intervals the CPU needs to be on average more than 95% in use to then decide that the CPU frequency needs to be increased. ignore_nice_load: this parameter takes a value of '0' or '1'. When diff --git a/Documentation/dontdiff b/Documentation/dontdiff index 3ad6acead94..d9bcffd5943 100644 --- a/Documentation/dontdiff +++ b/Documentation/dontdiff @@ -69,7 +69,6 @@ av_permissions.h bbootsect bin2c binkernel.spec -binoffset bootsect bounds.h bsetup diff --git a/Documentation/driver-model/driver.txt b/Documentation/driver-model/driver.txt index 60120fb3b96..d2cd6fb8ba9 100644 --- a/Documentation/driver-model/driver.txt +++ b/Documentation/driver-model/driver.txt @@ -226,5 +226,5 @@ struct driver_attribute driver_attr_debug; This can then be used to add and remove the attribute from the driver's directory using: -int driver_create_file(struct device_driver *, struct driver_attribute *); -void driver_remove_file(struct device_driver *, struct driver_attribute *); +int driver_create_file(struct device_driver *, const struct driver_attribute *); +void driver_remove_file(struct device_driver *, const struct driver_attribute *); diff --git a/Documentation/dvb/get_dvb_firmware b/Documentation/dvb/get_dvb_firmware index 14b7b5a3bcb..239cbdbf4d1 100644 --- a/Documentation/dvb/get_dvb_firmware +++ b/Documentation/dvb/get_dvb_firmware @@ -26,7 +26,7 @@ use IO::Handle; "dec3000s", "vp7041", "dibusb", "nxt2002", "nxt2004", "or51211", "or51132_qam", "or51132_vsb", "bluebird", "opera1", "cx231xx", "cx18", "cx23885", "pvrusb2", "mpc718", - "af9015"); + "af9015", "ngene"); # Check args syntax() if (scalar(@ARGV) != 1); @@ -39,7 +39,7 @@ for ($i=0; $i < scalar(@components); $i++) { die $@ if $@; print STDERR <<EOF; Firmware(s) $outfile extracted successfully. -Now copy it(they) to either /usr/lib/hotplug/firmware or /lib/firmware +Now copy it(them) to either /usr/lib/hotplug/firmware or /lib/firmware (depending on configuration of firmware hotplug). EOF exit(0); @@ -549,6 +549,24 @@ sub af9015 { close INFILE; } +sub ngene { + my $url = "http://www.digitaldevices.de/download/"; + my $file1 = "ngene_15.fw"; + my $hash1 = "d798d5a757121174f0dbc5f2833c0c85"; + my $file2 = "ngene_17.fw"; + my $hash2 = "26b687136e127b8ac24b81e0eeafc20b"; + + checkstandard(); + + wgetfile($file1, $url . $file1); + verify($file1, $hash1); + + wgetfile($file2, $url . $file2); + verify($file2, $hash2); + + "$file1, $file2"; +} + # --------------------------------------------------------------- # Utilities @@ -667,6 +685,7 @@ sub delzero{ sub syntax() { print STDERR "syntax: get_dvb_firmware <component>\n"; print STDERR "Supported components:\n"; + @components = sort @components; for($i=0; $i < scalar(@components); $i++) { print STDERR "\t" . $components[$i] . "\n"; } diff --git a/Documentation/fault-injection/fault-injection.txt b/Documentation/fault-injection/fault-injection.txt index 07930564079..7be15e44d48 100644 --- a/Documentation/fault-injection/fault-injection.txt +++ b/Documentation/fault-injection/fault-injection.txt @@ -143,8 +143,8 @@ o provide a way to configure fault attributes failslab, fail_page_alloc, and fail_make_request use this way. Helper functions: - init_fault_attr_entries(entries, attr, name); - void cleanup_fault_attr_entries(entries); + init_fault_attr_dentries(entries, attr, name); + void cleanup_fault_attr_dentries(entries); - module parameters diff --git a/Documentation/feature-removal-schedule.txt b/Documentation/feature-removal-schedule.txt index 870d190fe61..73ef30dbe61 100644 --- a/Documentation/feature-removal-schedule.txt +++ b/Documentation/feature-removal-schedule.txt @@ -6,21 +6,6 @@ be removed from this file. --------------------------- -What: USER_SCHED -When: 2.6.34 - -Why: USER_SCHED was implemented as a proof of concept for group scheduling. - The effect of USER_SCHED can already be achieved from userspace with - the help of libcgroup. The removal of USER_SCHED will also simplify - the scheduler code with the removal of one major ifdef. There are also - issues USER_SCHED has with USER_NS. A decision was taken not to fix - those and instead remove USER_SCHED. Also new group scheduling - features will not be implemented for USER_SCHED. - -Who: Dhaval Giani <dhaval@linux.vnet.ibm.com> - ---------------------------- - What: PRISM54 When: 2.6.34 @@ -64,6 +49,17 @@ Who: Robin Getz <rgetz@blackfin.uclinux.org> & Matt Mackall <mpm@selenic.com> --------------------------- +What: Deprecated snapshot ioctls +When: 2.6.36 + +Why: The ioctls in kernel/power/user.c were marked as deprecated long time + ago. Now they notify users about that so that they need to replace + their userspace. After some more time, remove them completely. + +Who: Jiri Slaby <jirislaby@gmail.com> + +--------------------------- + What: The ieee80211_regdom module parameter When: March 2010 / desktop catchup @@ -88,27 +84,6 @@ Who: Luis R. Rodriguez <lrodriguez@atheros.com> --------------------------- -What: CONFIG_WIRELESS_OLD_REGULATORY - old static regulatory information -When: March 2010 / desktop catchup - -Why: The old regulatory infrastructure has been replaced with a new one - which does not require statically defined regulatory domains. We do - not want to keep static regulatory domains in the kernel due to the - the dynamic nature of regulatory law and localization. We kept around - the old static definitions for the regulatory domains of: - - * US - * JP - * EU - - and used by default the US when CONFIG_WIRELESS_OLD_REGULATORY was - set. We will remove this option once the standard Linux desktop catches - up with the new userspace APIs we have implemented. - -Who: Luis R. Rodriguez <lrodriguez@atheros.com> - ---------------------------- - What: dev->power.power_state When: July 2007 Why: Broken design for runtime control over driver power states, confusing @@ -493,3 +468,85 @@ Why: These two features use non-standard interfaces. There are the Who: Corentin Chary <corentin.chary@gmail.com> ---------------------------- + +What: usbvideo quickcam_messenger driver +When: 2.6.35 +Files: drivers/media/video/usbvideo/quickcam_messenger.[ch] +Why: obsolete v4l1 driver replaced by gspca_stv06xx +Who: Hans de Goede <hdegoede@redhat.com> + +---------------------------- + +What: ov511 v4l1 driver +When: 2.6.35 +Files: drivers/media/video/ov511.[ch] +Why: obsolete v4l1 driver replaced by gspca_ov519 +Who: Hans de Goede <hdegoede@redhat.com> + +---------------------------- + +What: w9968cf v4l1 driver +When: 2.6.35 +Files: drivers/media/video/w9968cf*.[ch] +Why: obsolete v4l1 driver replaced by gspca_ov519 +Who: Hans de Goede <hdegoede@redhat.com> + +---------------------------- + +What: ovcamchip sensor framework +When: 2.6.35 +Files: drivers/media/video/ovcamchip/* +Why: Only used by obsoleted v4l1 drivers +Who: Hans de Goede <hdegoede@redhat.com> + +---------------------------- + +What: stv680 v4l1 driver +When: 2.6.35 +Files: drivers/media/video/stv680.[ch] +Why: obsolete v4l1 driver replaced by gspca_stv0680 +Who: Hans de Goede <hdegoede@redhat.com> + +---------------------------- + +What: zc0301 v4l driver +When: 2.6.35 +Files: drivers/media/video/zc0301/* +Why: Duplicate functionality with the gspca_zc3xx driver, zc0301 only + supports 2 USB-ID's (because it only supports a limited set of + sensors) wich are also supported by the gspca_zc3xx driver + (which supports 53 USB-ID's in total) +Who: Hans de Goede <hdegoede@redhat.com> + +---------------------------- + +What: corgikbd, spitzkbd, tosakbd driver +When: 2.6.35 +Files: drivers/input/keyboard/{corgi,spitz,tosa}kbd.c +Why: We now have a generic GPIO based matrix keyboard driver that + are fully capable of handling all the keys on these devices. + The original drivers manipulate the GPIO registers directly + and so are difficult to maintain. +Who: Eric Miao <eric.y.miao@gmail.com> + +---------------------------- + +What: corgi_ssp and corgi_ts driver +When: 2.6.35 +Files: arch/arm/mach-pxa/corgi_ssp.c, drivers/input/touchscreen/corgi_ts.c +Why: The corgi touchscreen is now deprecated in favour of the generic + ads7846.c driver. The noise reduction technique used in corgi_ts.c, + that's to wait till vsync before ADC sampling, is also integrated into + ads7846 driver now. Provided that the original driver is not generic + and is difficult to maintain, it will be removed later. +Who: Eric Miao <eric.y.miao@gmail.com> + +---------------------------- + +What: capifs +When: February 2011 +Files: drivers/isdn/capi/capifs.* +Why: udev fully replaces this special file system that only contains CAPI + NCCI TTY device nodes. User space (pppdcapiplugin) works without + noticing the difference. +Who: Jan Kiszka <jan.kiszka@web.de> diff --git a/Documentation/filesystems/dentry-locking.txt b/Documentation/filesystems/dentry-locking.txt index 4c0c575a401..79334ed5daa 100644 --- a/Documentation/filesystems/dentry-locking.txt +++ b/Documentation/filesystems/dentry-locking.txt @@ -62,7 +62,8 @@ changes are : 2. Insertion of a dentry into the hash table is done using hlist_add_head_rcu() which take care of ordering the writes - the writes to the dentry must be visible before the dentry is - inserted. This works in conjunction with hlist_for_each_rcu() while + inserted. This works in conjunction with hlist_for_each_rcu(), + which has since been replaced by hlist_for_each_entry_rcu(), while walking the hash chain. The only requirement is that all initialization to the dentry must be done before hlist_add_head_rcu() since we don't have dcache_lock protection diff --git a/Documentation/filesystems/ext4.txt b/Documentation/filesystems/ext4.txt index af6885c3c82..e1def1786e5 100644 --- a/Documentation/filesystems/ext4.txt +++ b/Documentation/filesystems/ext4.txt @@ -196,7 +196,7 @@ nobarrier This also requires an IO stack which can support also be used to enable or disable barriers, for consistency with other ext4 mount options. -inode_readahead=n This tuning parameter controls the maximum +inode_readahead_blks=n This tuning parameter controls the maximum number of inode table blocks that ext4's inode table readahead algorithm will pre-read into the buffer cache. The default value is 32 blocks. diff --git a/Documentation/filesystems/nilfs2.txt b/Documentation/filesystems/nilfs2.txt index 4949fcaa6b6..cf6d0d85ca8 100644 --- a/Documentation/filesystems/nilfs2.txt +++ b/Documentation/filesystems/nilfs2.txt @@ -28,7 +28,7 @@ described in the man pages included in the package. Project web page: http://www.nilfs.org/en/ Download page: http://www.nilfs.org/en/download.html Git tree web page: http://www.nilfs.org/git/ -NILFS mailing lists: http://www.nilfs.org/mailman/listinfo/users +List info: http://vger.kernel.org/vger-lists.html#linux-nilfs Caveats ======= @@ -74,6 +74,9 @@ norecovery Disable recovery of the filesystem on mount. This disables every write access on the device for read-only mounts or snapshots. This option will fail for r/w mounts on an unclean volume. +discard Issue discard/TRIM commands to the underlying block + device when blocks are freed. This is useful for SSD + devices and sparse/thinly-provisioned LUNs. NILFS2 usage ============ diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt index 220cc6376ef..0d07513a67a 100644 --- a/Documentation/filesystems/proc.txt +++ b/Documentation/filesystems/proc.txt @@ -177,7 +177,6 @@ read the file /proc/PID/status: CapBnd: ffffffffffffffff voluntary_ctxt_switches: 0 nonvoluntary_ctxt_switches: 1 - Stack usage: 12 kB This shows you nearly the same information you would get if you viewed it with the ps command. In fact, ps uses the proc file system to obtain its @@ -231,7 +230,6 @@ Table 1-2: Contents of the statm files (as of 2.6.30-rc7) Mems_allowed_list Same as previous, but in "list format" voluntary_ctxt_switches number of voluntary context switches nonvoluntary_ctxt_switches number of non voluntary context switches - Stack usage: stack usage high water mark (round up to page size) .............................................................................. Table 1-3: Contents of the statm files (as of 2.6.8-rc3) diff --git a/Documentation/filesystems/sysfs.txt b/Documentation/filesystems/sysfs.txt index b245d524d56..931c806642c 100644 --- a/Documentation/filesystems/sysfs.txt +++ b/Documentation/filesystems/sysfs.txt @@ -91,8 +91,8 @@ struct device_attribute { const char *buf, size_t count); }; -int device_create_file(struct device *, struct device_attribute *); -void device_remove_file(struct device *, struct device_attribute *); +int device_create_file(struct device *, const struct device_attribute *); +void device_remove_file(struct device *, const struct device_attribute *); It also defines this helper for defining device attributes: @@ -316,8 +316,8 @@ DEVICE_ATTR(_name, _mode, _show, _store); Creation/Removal: -int device_create_file(struct device *device, struct device_attribute * attr); -void device_remove_file(struct device * dev, struct device_attribute * attr); +int device_create_file(struct device *dev, const struct device_attribute * attr); +void device_remove_file(struct device *dev, const struct device_attribute * attr); - bus drivers (include/linux/device.h) @@ -358,7 +358,7 @@ DRIVER_ATTR(_name, _mode, _show, _store) Creation/Removal: -int driver_create_file(struct device_driver *, struct driver_attribute *); -void driver_remove_file(struct device_driver *, struct driver_attribute *); +int driver_create_file(struct device_driver *, const struct driver_attribute *); +void driver_remove_file(struct device_driver *, const struct driver_attribute *); diff --git a/Documentation/hwmon/amc6821 b/Documentation/hwmon/amc6821 new file mode 100644 index 00000000000..ced8359c50f --- /dev/null +++ b/Documentation/hwmon/amc6821 @@ -0,0 +1,102 @@ +Kernel driver amc6821 +===================== + +Supported chips: + Texas Instruments AMC6821 + Prefix: 'amc6821' + Addresses scanned: 0x18, 0x19, 0x1a, 0x2c, 0x2d, 0x2e, 0x4c, 0x4d, 0x4e + Datasheet: http://focus.ti.com/docs/prod/folders/print/amc6821.html + +Authors: + Tomaz Mertelj <tomaz.mertelj@guest.arnes.si> + + +Description +----------- + +This driver implements support for the Texas Instruments amc6821 chip. +The chip has one on-chip and one remote temperature sensor and one pwm fan +regulator. +The pwm can be controlled either from software or automatically. + +The driver provides the following sensor accesses in sysfs: + +temp1_input ro on-chip temperature +temp1_min rw " +temp1_max rw " +temp1_crit rw " +temp1_min_alarm ro " +temp1_max_alarm ro " +temp1_crit_alarm ro " + +temp2_input ro remote temperature +temp2_min rw " +temp2_max rw " +temp2_crit rw " +temp2_min_alarm ro " +temp2_max_alarm ro " +temp2_crit_alarm ro " +temp2_fault ro " + +fan1_input ro tachometer speed +fan1_min rw " +fan1_max rw " +fan1_fault ro " +fan1_div rw Fan divisor can be either 2 or 4. + +pwm1 rw pwm1 +pwm1_enable rw regulator mode, 1=open loop, 2=fan controlled + by remote temperature, 3=fan controlled by + combination of the on-chip temperature and + remote-sensor temperature, +pwm1_auto_channels_temp ro 1 if pwm_enable==2, 3 if pwm_enable==3 +pwm1_auto_point1_pwm ro Hardwired to 0, shared for both + temperature channels. +pwm1_auto_point2_pwm rw This value is shared for both temperature + channels. +pwm1_auto_point3_pwm rw Hardwired to 255, shared for both + temperature channels. + +temp1_auto_point1_temp ro Hardwired to temp2_auto_point1_temp + which is rw. Below this temperature fan stops. +temp1_auto_point2_temp rw The low-temperature limit of the proportional + range. Below this temperature + pwm1 = pwm1_auto_point2_pwm. It can go from + 0 degree C to 124 degree C in steps of + 4 degree C. Read it out after writing to get + the actual value. +temp1_auto_point3_temp rw Above this temperature fan runs at maximum + speed. It can go from temp1_auto_point2_temp. + It can only have certain discrete values + which depend on temp1_auto_point2_temp and + pwm1_auto_point2_pwm. Read it out after + writing to get the actual value. + +temp2_auto_point1_temp rw Must be between 0 degree C and 63 degree C and + it defines the passive cooling temperature. + Below this temperature the fan stops in + the closed loop mode. +temp2_auto_point2_temp rw The low-temperature limit of the proportional + range. Below this temperature + pwm1 = pwm1_auto_point2_pwm. It can go from + 0 degree C to 124 degree C in steps + of 4 degree C. + +temp2_auto_point3_temp rw Above this temperature fan runs at maximum + speed. It can only have certain discrete + values which depend on temp2_auto_point2_temp + and pwm1_auto_point2_pwm. Read it out after + writing to get actual value. + + +Module parameters +----------------- + +If your board has a BIOS that initializes the amc6821 correctly, you should +load the module with: init=0. + +If your board BIOS doesn't initialize the chip, or you want +different settings, you can set the following parameters: +init=1, +pwminv: 0 default pwm output, 1 inverts pwm output. + diff --git a/Documentation/hwmon/k10temp b/Documentation/hwmon/k10temp index a7a18d453a5..6526eee525a 100644 --- a/Documentation/hwmon/k10temp +++ b/Documentation/hwmon/k10temp @@ -3,8 +3,8 @@ Kernel driver k10temp Supported chips: * AMD Family 10h processors: - Socket F: Quad-Core/Six-Core/Embedded Opteron - Socket AM2+: Opteron, Phenom (II) X3/X4 + Socket F: Quad-Core/Six-Core/Embedded Opteron (but see below) + Socket AM2+: Quad-Core Opteron, Phenom (II) X3/X4, Athlon X2 (but see below) Socket AM3: Quad-Core Opteron, Athlon/Phenom II X2/X3/X4, Sempron II Socket S1G3: Athlon II, Sempron, Turion II * AMD Family 11h processors: @@ -36,10 +36,15 @@ Description This driver permits reading of the internal temperature sensor of AMD Family 10h and 11h processors. -All these processors have a sensor, but on older revisions of Family 10h -processors, the sensor may return inconsistent values (erratum 319). The -driver will refuse to load on these revisions unless you specify the -"force=1" module parameter. +All these processors have a sensor, but on those for Socket F or AM2+, +the sensor may return inconsistent values (erratum 319). The driver +will refuse to load on these revisions unless you specify the "force=1" +module parameter. + +Due to technical reasons, the driver can detect only the mainboard's +socket type, not the processor's actual capabilities. Therefore, if you +are using an AM3 processor on an AM2+ mainboard, you can safely use the +"force=1" parameter. There is one temperature measurement value, available as temp1_input in sysfs. It is measured in degrees Celsius with a resolution of 1/8th degree. diff --git a/Documentation/i2c/busses/i2c-i801 b/Documentation/i2c/busses/i2c-i801 index 81c0c59a60e..e1bb5b26169 100644 --- a/Documentation/i2c/busses/i2c-i801 +++ b/Documentation/i2c/busses/i2c-i801 @@ -15,7 +15,8 @@ Supported adapters: * Intel 82801I (ICH9) * Intel EP80579 (Tolapai) * Intel 82801JI (ICH10) - * Intel PCH + * Intel 3400/5 Series (PCH) + * Intel Cougar Point (PCH) Datasheets: Publicly available at the Intel website Authors: diff --git a/Documentation/i2c/busses/i2c-parport b/Documentation/i2c/busses/i2c-parport index dceaba1ad93..2461c7b53b2 100644 --- a/Documentation/i2c/busses/i2c-parport +++ b/Documentation/i2c/busses/i2c-parport @@ -29,6 +29,9 @@ can be easily added when needed. Earlier kernels defaulted to type=0 (Philips). But now, if the type parameter is missing, the driver will simply fail to initialize. +SMBus alert support is available on adapters which have this line properly +connected to the parallel port's interrupt pin. + Building your own adapter ------------------------- diff --git a/Documentation/i2c/busses/i2c-parport-light b/Documentation/i2c/busses/i2c-parport-light index 28743647852..bdc9cbb2e0f 100644 --- a/Documentation/i2c/busses/i2c-parport-light +++ b/Documentation/i2c/busses/i2c-parport-light @@ -9,3 +9,14 @@ parport handling is not an option. The drawback is a reduced portability and the impossibility to daisy-chain other parallel port devices. Please see i2c-parport for documentation. + +Module parameters: + +* type: type of adapter (see i2c-parport or modinfo) + +* base: base I/O address + Default is 0x378 which is fairly common for parallel ports, at least on PC. + +* irq: optional IRQ + This must be passed if you want SMBus alert support, assuming your adapter + actually supports this. diff --git a/Documentation/i2c/smbus-protocol b/Documentation/i2c/smbus-protocol index 9df47441f0e..7c19d1a2bea 100644 --- a/Documentation/i2c/smbus-protocol +++ b/Documentation/i2c/smbus-protocol @@ -185,6 +185,22 @@ the protocol. All ARP communications use slave address 0x61 and require PEC checksums. +SMBus Alert +=========== + +SMBus Alert was introduced in Revision 1.0 of the specification. + +The SMBus alert protocol allows several SMBus slave devices to share a +single interrupt pin on the SMBus master, while still allowing the master +to know which slave triggered the interrupt. + +This is implemented the following way in the Linux kernel: +* I2C bus drivers which support SMBus alert should call + i2c_setup_smbus_alert() to setup SMBus alert support. +* I2C drivers for devices which can trigger SMBus alerts should implement + the optional alert() callback. + + I2C Block Transactions ====================== diff --git a/Documentation/i2c/writing-clients b/Documentation/i2c/writing-clients index 0a74603eb67..3219ee0dbfe 100644 --- a/Documentation/i2c/writing-clients +++ b/Documentation/i2c/writing-clients @@ -318,8 +318,9 @@ Plain I2C communication These routines read and write some bytes from/to a client. The client contains the i2c address, so you do not have to include it. The second parameter contains the bytes to read/write, the third the number of bytes -to read/write (must be less than the length of the buffer.) Returned is -the actual number of bytes read/written. +to read/write (must be less than the length of the buffer, also should be +less than 64k since msg.len is u16.) Returned is the actual number of bytes +read/written. int i2c_transfer(struct i2c_adapter *adap, struct i2c_msg *msg, int num); diff --git a/Documentation/input/multi-touch-protocol.txt b/Documentation/input/multi-touch-protocol.txt index a12ea3b586e..8490480ce43 100644 --- a/Documentation/input/multi-touch-protocol.txt +++ b/Documentation/input/multi-touch-protocol.txt @@ -27,12 +27,30 @@ set of events/packets. A set of ABS_MT events with the desired properties is defined. The events are divided into categories, to allow for partial implementation. The -minimum set consists of ABS_MT_TOUCH_MAJOR, ABS_MT_POSITION_X and -ABS_MT_POSITION_Y, which allows for multiple fingers to be tracked. If the -device supports it, the ABS_MT_WIDTH_MAJOR may be used to provide the size -of the approaching finger. Anisotropy and direction may be specified with -ABS_MT_TOUCH_MINOR, ABS_MT_WIDTH_MINOR and ABS_MT_ORIENTATION. The -ABS_MT_TOOL_TYPE may be used to specify whether the touching tool is a +minimum set consists of ABS_MT_POSITION_X and ABS_MT_POSITION_Y, which +allows for multiple fingers to be tracked. If the device supports it, the +ABS_MT_TOUCH_MAJOR and ABS_MT_WIDTH_MAJOR may be used to provide the size +of the contact area and approaching finger, respectively. + +The TOUCH and WIDTH parameters have a geometrical interpretation; imagine +looking through a window at someone gently holding a finger against the +glass. You will see two regions, one inner region consisting of the part +of the finger actually touching the glass, and one outer region formed by +the perimeter of the finger. The diameter of the inner region is the +ABS_MT_TOUCH_MAJOR, the diameter of the outer region is +ABS_MT_WIDTH_MAJOR. Now imagine the person pressing the finger harder +against the glass. The inner region will increase, and in general, the +ratio ABS_MT_TOUCH_MAJOR / ABS_MT_WIDTH_MAJOR, which is always smaller than +unity, is related to the finger pressure. For pressure-based devices, +ABS_MT_PRESSURE may be used to provide the pressure on the contact area +instead. + +In addition to the MAJOR parameters, the oval shape of the finger can be +described by adding the MINOR parameters, such that MAJOR and MINOR are the +major and minor axis of an ellipse. Finally, the orientation of the oval +shape can be describe with the ORIENTATION parameter. + +The ABS_MT_TOOL_TYPE may be used to specify whether the touching tool is a finger or a pen or something else. Devices with more granular information may specify general shapes as blobs, i.e., as a sequence of rectangular shapes grouped together by an ABS_MT_BLOB_ID. Finally, for the few devices @@ -42,11 +60,9 @@ report finger tracking from hardware [5]. Here is what a minimal event sequence for a two-finger touch would look like: - ABS_MT_TOUCH_MAJOR ABS_MT_POSITION_X ABS_MT_POSITION_Y SYN_MT_REPORT - ABS_MT_TOUCH_MAJOR ABS_MT_POSITION_X ABS_MT_POSITION_Y SYN_MT_REPORT @@ -87,6 +103,12 @@ the contact. The ratio ABS_MT_TOUCH_MAJOR / ABS_MT_WIDTH_MAJOR approximates the notion of pressure. The fingers of the hand and the palm all have different characteristic widths [1]. +ABS_MT_PRESSURE + +The pressure, in arbitrary units, on the contact area. May be used instead +of TOUCH and WIDTH for pressure-based devices or any device with a spatial +signal intensity distribution. + ABS_MT_ORIENTATION The orientation of the ellipse. The value should describe a signed quarter @@ -170,6 +192,16 @@ There are a few devices that support trackingID in hardware. User space can make use of these native identifiers to reduce bandwidth and cpu usage. +Gestures +-------- + +In the specific application of creating gesture events, the TOUCH and WIDTH +parameters can be used to, e.g., approximate finger pressure or distinguish +between index finger and thumb. With the addition of the MINOR parameters, +one can also distinguish between a sweeping finger and a pointing finger, +and with ORIENTATION, one can detect twisting of fingers. + + Notes ----- diff --git a/Documentation/input/sentelic.txt b/Documentation/input/sentelic.txt index f7160a2fb6a..b35affd5c64 100644 --- a/Documentation/input/sentelic.txt +++ b/Documentation/input/sentelic.txt @@ -1,5 +1,5 @@ -Copyright (C) 2002-2008 Sentelic Corporation. -Last update: Oct-31-2008 +Copyright (C) 2002-2010 Sentelic Corporation. +Last update: Jan-13-2010 ============================================================================== * Finger Sensing Pad Intellimouse Mode(scrolling wheel, 4th and 5th buttons) @@ -44,7 +44,7 @@ B) MSID 6: Horizontal and Vertical scrolling. Packet 1 Bit 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0 BYTE |---------------|BYTE |---------------|BYTE|---------------|BYTE|---------------| - 1 |Y|X|y|x|1|M|R|L| 2 |X|X|X|X|X|X|X|X| 3 |Y|Y|Y|Y|Y|Y|Y|Y| 4 | | |B|F|l|r|u|d| + 1 |Y|X|y|x|1|M|R|L| 2 |X|X|X|X|X|X|X|X| 3 |Y|Y|Y|Y|Y|Y|Y|Y| 4 | | |B|F|r|l|u|d| |---------------| |---------------| |---------------| |---------------| Byte 1: Bit7 => Y overflow @@ -59,15 +59,15 @@ Byte 2: X Movement(9-bit 2's complement integers) Byte 3: Y Movement(9-bit 2's complement integers) Byte 4: Bit0 => the Vertical scrolling movement downward. Bit1 => the Vertical scrolling movement upward. - Bit2 => the Vertical scrolling movement rightward. - Bit3 => the Vertical scrolling movement leftward. + Bit2 => the Horizontal scrolling movement leftward. + Bit3 => the Horizontal scrolling movement rightward. Bit4 => 1 = 4th mouse button is pressed, Forward one page. 0 = 4th mouse button is not pressed. Bit5 => 1 = 5th mouse button is pressed, Backward one page. 0 = 5th mouse button is not pressed. C) MSID 7: -# FSP uses 2 packets(8 Bytes) data to represent Absolute Position +# FSP uses 2 packets (8 Bytes) to represent Absolute Position. so we have PACKET NUMBER to identify packets. If PACKET NUMBER is 0, the packet is Packet 1. If PACKET NUMBER is 1, the packet is Packet 2. @@ -129,7 +129,7 @@ Byte 3: Message Type => 0x00 (Disabled) Byte 4: Bit7~Bit0 => Don't Care ============================================================================== -* Absolute position for STL3888-A0. +* Absolute position for STL3888-Ax. ============================================================================== Packet 1 (ABSOLUTE POSITION) Bit 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0 @@ -179,14 +179,14 @@ Byte 4: Bit1~Bit0 => Y coordinate (xpos[1:0]) Bit5~Bit4 => y2_g Bit7~Bit6 => x2_g -Notify Packet for STL3888-A0 +Notify Packet for STL3888-Ax Bit 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0 BYTE |---------------|BYTE |---------------|BYTE|---------------|BYTE|---------------| 1 |1|0|1|P|1|M|R|L| 2 |C|C|C|C|C|C|C|C| 3 |0|0|F|F|0|0|0|i| 4 |r|l|d|u|0|0|0|0| |---------------| |---------------| |---------------| |---------------| Byte 1: Bit7~Bit6 => 00, Normal data packet - => 01, Absolute coordination packet + => 01, Absolute coordinates packet => 10, Notify packet Bit5 => 1 Bit4 => when in absolute coordinates mode (valid when EN_PKT_GO is 1): @@ -205,15 +205,106 @@ Byte 4: Bit7 => scroll right button Bit6 => scroll left button Bit5 => scroll down button Bit4 => scroll up button - * Note that if gesture and additional button (Bit4~Bit7) - happen at the same time, the button information will not - be sent. + * Note that if gesture and additional buttoni (Bit4~Bit7) + happen at the same time, the button information will not + be sent. + Bit3~Bit0 => Reserved + +Sample sequence of Multi-finger, Multi-coordinate mode: + + notify packet (valid bit == 1), abs pkt 1, abs pkt 2, abs pkt 1, + abs pkt 2, ..., notify packet (valid bit == 0) + +============================================================================== +* Absolute position for STL3888-B0. +============================================================================== +Packet 1(ABSOLUTE POSITION) + Bit 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0 +BYTE |---------------|BYTE |---------------|BYTE|---------------|BYTE|---------------| + 1 |0|1|V|F|1|0|R|L| 2 |X|X|X|X|X|X|X|X| 3 |Y|Y|Y|Y|Y|Y|Y|Y| 4 |r|l|u|d|X|X|Y|Y| + |---------------| |---------------| |---------------| |---------------| + +Byte 1: Bit7~Bit6 => 00, Normal data packet + => 01, Absolute coordinates packet + => 10, Notify packet + Bit5 => Valid bit, 0 means that the coordinate is invalid or finger up. + When both fingers are up, the last two reports have zero valid + bit. + Bit4 => finger up/down information. 1: finger down, 0: finger up. + Bit3 => 1 + Bit2 => finger index, 0 is the first finger, 1 is the second finger. + Bit1 => Right Button, 1 is pressed, 0 is not pressed. + Bit0 => Left Button, 1 is pressed, 0 is not pressed. +Byte 2: X coordinate (xpos[9:2]) +Byte 3: Y coordinate (ypos[9:2]) +Byte 4: Bit1~Bit0 => Y coordinate (xpos[1:0]) + Bit3~Bit2 => X coordinate (ypos[1:0]) + Bit4 => scroll down button + Bit5 => scroll up button + Bit6 => scroll left button + Bit7 => scroll right button + +Packet 2 (ABSOLUTE POSITION) + Bit 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0 +BYTE |---------------|BYTE |---------------|BYTE|---------------|BYTE|---------------| + 1 |0|1|V|F|1|1|R|L| 2 |X|X|X|X|X|X|X|X| 3 |Y|Y|Y|Y|Y|Y|Y|Y| 4 |r|l|u|d|X|X|Y|Y| + |---------------| |---------------| |---------------| |---------------| + +Byte 1: Bit7~Bit6 => 00, Normal data packet + => 01, Absolute coordination packet + => 10, Notify packet + Bit5 => Valid bit, 0 means that the coordinate is invalid or finger up. + When both fingers are up, the last two reports have zero valid + bit. + Bit4 => finger up/down information. 1: finger down, 0: finger up. + Bit3 => 1 + Bit2 => finger index, 0 is the first finger, 1 is the second finger. + Bit1 => Right Button, 1 is pressed, 0 is not pressed. + Bit0 => Left Button, 1 is pressed, 0 is not pressed. +Byte 2: X coordinate (xpos[9:2]) +Byte 3: Y coordinate (ypos[9:2]) +Byte 4: Bit1~Bit0 => Y coordinate (xpos[1:0]) + Bit3~Bit2 => X coordinate (ypos[1:0]) + Bit4 => scroll down button + Bit5 => scroll up button + Bit6 => scroll left button + Bit7 => scroll right button + +Notify Packet for STL3888-B0 + Bit 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0 +BYTE |---------------|BYTE |---------------|BYTE|---------------|BYTE|---------------| + 1 |1|0|1|P|1|M|R|L| 2 |C|C|C|C|C|C|C|C| 3 |0|0|F|F|0|0|0|i| 4 |r|l|u|d|0|0|0|0| + |---------------| |---------------| |---------------| |---------------| + +Byte 1: Bit7~Bit6 => 00, Normal data packet + => 01, Absolute coordination packet + => 10, Notify packet + Bit5 => 1 + Bit4 => when in absolute coordinate mode (valid when EN_PKT_GO is 1): + 0: left button is generated by the on-pad command + 1: left button is generated by the external button + Bit3 => 1 + Bit2 => Middle Button, 1 is pressed, 0 is not pressed. + Bit1 => Right Button, 1 is pressed, 0 is not pressed. + Bit0 => Left Button, 1 is pressed, 0 is not pressed. +Byte 2: Message Type => 0xB7 (Multi Finger, Multi Coordinate mode) +Byte 3: Bit7~Bit6 => Don't care + Bit5~Bit4 => Number of fingers + Bit3~Bit1 => Reserved + Bit0 => 1: enter gesture mode; 0: leaving gesture mode +Byte 4: Bit7 => scroll right button + Bit6 => scroll left button + Bit5 => scroll up button + Bit4 => scroll down button + * Note that if gesture and additional button(Bit4~Bit7) + happen at the same time, the button information will not + be sent. Bit3~Bit0 => Reserved Sample sequence of Multi-finger, Multi-coordinate mode: notify packet (valid bit == 1), abs pkt 1, abs pkt 2, abs pkt 1, - abs pkt 2, ..., notify packet(valid bit == 0) + abs pkt 2, ..., notify packet (valid bit == 0) ============================================================================== * FSP Enable/Disable packet @@ -409,7 +500,8 @@ offset width default r/w name 0: read only, 1: read/write enable (Note that following registers does not require clock gating being enabled prior to write: 05 06 07 08 09 0c 0f 10 11 12 16 17 18 23 2e - 40 41 42 43.) + 40 41 42 43. In addition to that, this bit must be 1 when gesture + mode is enabled) 0x31 RW on-pad command detection bit7 0 RW on-pad command left button down tag @@ -463,6 +555,10 @@ offset width default r/w name absolute coordinates; otherwise, host only receives packets with relative coordinate.) + bit7 0 RW EN_PS2_F2: PS/2 gesture mode 2nd + finger packet enable + 0: disable, 1: enable + 0x43 RW on-pad control bit0 0 RW on-pad control enable 0: disable, 1: enable diff --git a/Documentation/ioctl/ioctl-number.txt b/Documentation/ioctl/ioctl-number.txt index 947374977ca..35c9b51d20e 100644 --- a/Documentation/ioctl/ioctl-number.txt +++ b/Documentation/ioctl/ioctl-number.txt @@ -56,10 +56,11 @@ Following this convention is good because: (5) When following the convention, the driver code can use generic code to copy the parameters between user and kernel space. -This table lists ioctls visible from user land for Linux/i386. It contains -most drivers up to 2.3.14, but I know I am missing some. +This table lists ioctls visible from user land for Linux/x86. It contains +most drivers up to 2.6.31, but I know I am missing some. There has been +no attempt to list non-X86 architectures or ioctls from drivers/staging/. -Code Seq# Include File Comments +Code Seq#(hex) Include File Comments ======================================================== 0x00 00-1F linux/fs.h conflict! 0x00 00-1F scsi/scsi_ioctl.h conflict! @@ -69,119 +70,227 @@ Code Seq# Include File Comments 0x03 all linux/hdreg.h 0x04 D2-DC linux/umsdos_fs.h Dead since 2.6.11, but don't reuse these. 0x06 all linux/lp.h -0x09 all linux/md.h +0x09 all linux/raid/md_u.h +0x10 00-0F drivers/char/s390/vmcp.h 0x12 all linux/fs.h linux/blkpg.h 0x1b all InfiniBand Subsystem <http://www.openib.org/> 0x20 all drivers/cdrom/cm206.h 0x22 all scsi/sg.h '#' 00-3F IEEE 1394 Subsystem Block for the entire subsystem +'$' 00-0F linux/perf_counter.h, linux/perf_event.h '1' 00-1F <linux/timepps.h> PPS kit from Ulrich Windl <ftp://ftp.de.kernel.org/pub/linux/daemons/ntp/PPS/> +'2' 01-04 linux/i2o.h +'3' 00-0F drivers/s390/char/raw3270.h conflict! +'3' 00-1F linux/suspend_ioctls.h conflict! + and kernel/power/user.c '8' all SNP8023 advanced NIC card <mailto:mcr@solidum.com> -'A' 00-1F linux/apm_bios.h +'@' 00-0F linux/radeonfb.h conflict! +'@' 00-0F drivers/video/aty/aty128fb.c conflict! +'A' 00-1F linux/apm_bios.h conflict! +'A' 00-0F linux/agpgart.h conflict! + and drivers/char/agp/compat_ioctl.h +'A' 00-7F sound/asound.h conflict! +'B' 00-1F linux/cciss_ioctl.h conflict! +'B' 00-0F include/linux/pmu.h conflict! 'B' C0-FF advanced bbus <mailto:maassen@uni-freiburg.de> -'C' all linux/soundcard.h +'C' all linux/soundcard.h conflict! +'C' 01-2F linux/capi.h conflict! +'C' F0-FF drivers/net/wan/cosa.h conflict! 'D' all arch/s390/include/asm/dasd.h -'E' all linux/input.h -'F' all linux/fb.h -'H' all linux/hiddev.h -'I' all linux/isdn.h +'D' 40-5F drivers/scsi/dpt/dtpi_ioctl.h +'D' 05 drivers/scsi/pmcraid.h +'E' all linux/input.h conflict! +'E' 00-0F xen/evtchn.h conflict! +'F' all linux/fb.h conflict! +'F' 01-02 drivers/scsi/pmcraid.h conflict! +'F' 20 drivers/video/fsl-diu-fb.h conflict! +'F' 20 drivers/video/intelfb/intelfb.h conflict! +'F' 20 linux/ivtvfb.h conflict! +'F' 20 linux/matroxfb.h conflict! +'F' 20 drivers/video/aty/atyfb_base.c conflict! +'F' 00-0F video/da8xx-fb.h conflict! +'F' 80-8F linux/arcfb.h conflict! +'F' DD video/sstfb.h conflict! +'G' 00-3F drivers/misc/sgi-gru/grulib.h conflict! +'G' 00-0F linux/gigaset_dev.h conflict! +'H' 00-7F linux/hiddev.h conflict! +'H' 00-0F linux/hidraw.h conflict! +'H' 00-0F sound/asound.h conflict! +'H' 20-40 sound/asound_fm.h conflict! +'H' 80-8F sound/sfnt_info.h conflict! +'H' 10-8F sound/emu10k1.h conflict! +'H' 10-1F sound/sb16_csp.h conflict! +'H' 10-1F sound/hda_hwdep.h conflict! +'H' 40-4F sound/hdspm.h conflict! +'H' 40-4F sound/hdsp.h conflict! +'H' 90 sound/usb/usx2y/usb_stream.h +'H' C0-F0 net/bluetooth/hci.h conflict! +'H' C0-DF net/bluetooth/hidp/hidp.h conflict! +'H' C0-DF net/bluetooth/cmtp/cmtp.h conflict! +'H' C0-DF net/bluetooth/bnep/bnep.h conflict! +'I' all linux/isdn.h conflict! +'I' 00-0F drivers/isdn/divert/isdn_divert.h conflict! +'I' 40-4F linux/mISDNif.h conflict! 'J' 00-1F drivers/scsi/gdth_ioctl.h 'K' all linux/kd.h -'L' 00-1F linux/loop.h -'L' 20-2F driver/usb/misc/vstusb.h +'L' 00-1F linux/loop.h conflict! +'L' 10-1F drivers/scsi/mpt2sas/mpt2sas_ctl.h conflict! 'L' E0-FF linux/ppdd.h encrypted disk device driver <http://linux01.gwdg.de/~alatham/ppdd.html> -'M' all linux/soundcard.h +'M' all linux/soundcard.h conflict! +'M' 01-16 mtd/mtd-abi.h conflict! + and drivers/mtd/mtdchar.c +'M' 01-03 drivers/scsi/megaraid/megaraid_sas.h +'M' 00-0F drivers/video/fsl-diu-fb.h conflict! 'N' 00-1F drivers/usb/scanner.h -'O' 00-02 include/mtd/ubi-user.h UBI -'P' all linux/soundcard.h +'O' 00-06 mtd/ubi-user.h UBI +'P' all linux/soundcard.h conflict! +'P' 60-6F sound/sscape_ioctl.h conflict! +'P' 00-0F drivers/usb/class/usblp.c conflict! 'Q' all linux/soundcard.h -'R' 00-1F linux/random.h +'R' 00-1F linux/random.h conflict! +'R' 01 linux/rfkill.h conflict! +'R' 01-0F media/rds.h conflict! +'R' C0-DF net/bluetooth/rfcomm.h 'S' all linux/cdrom.h conflict! 'S' 80-81 scsi/scsi_ioctl.h conflict! 'S' 82-FF scsi/scsi.h conflict! +'S' 00-7F sound/asequencer.h conflict! 'T' all linux/soundcard.h conflict! +'T' 00-AF sound/asound.h conflict! 'T' all arch/x86/include/asm/ioctls.h conflict! -'U' 00-EF linux/drivers/usb/usb.h -'V' all linux/vt.h +'T' C0-DF linux/if_tun.h conflict! +'U' all sound/asound.h conflict! +'U' 00-0F drivers/media/video/uvc/uvcvideo.h conflict! +'U' 00-CF linux/uinput.h conflict! +'U' 00-EF linux/usbdevice_fs.h +'U' C0-CF drivers/bluetooth/hci_uart.h +'V' all linux/vt.h conflict! +'V' all linux/videodev2.h conflict! +'V' C0 linux/ivtvfb.h conflict! +'V' C0 linux/ivtv.h conflict! +'V' C0 media/davinci/vpfe_capture.h conflict! +'V' C0 media/si4713.h conflict! +'V' C0-CF drivers/media/video/mxb.h conflict! 'W' 00-1F linux/watchdog.h conflict! 'W' 00-1F linux/wanrouter.h conflict! -'X' all linux/xfs_fs.h +'W' 00-3F sound/asound.h conflict! +'X' all fs/xfs/xfs_fs.h conflict! + and fs/xfs/linux-2.6/xfs_ioctl32.h + and include/linux/falloc.h + and linux/fs.h +'X' all fs/ocfs2/ocfs_fs.h conflict! +'X' 01 linux/pktcdvd.h conflict! 'Y' all linux/cyclades.h -'[' 00-07 linux/usb/usbtmc.h USB Test and Measurement Devices +'Z' 14-15 drivers/message/fusion/mptctl.h +'[' 00-07 linux/usb/tmc.h USB Test and Measurement Devices <mailto:gregkh@suse.de> -'a' all ATM on linux +'a' all linux/atm*.h, linux/sonet.h ATM on linux <http://lrcwww.epfl.ch/linux-atm/magic.html> -'b' 00-FF bit3 vme host bridge +'b' 00-FF conflict! bit3 vme host bridge <mailto:natalia@nikhefk.nikhef.nl> +'b' 00-0F media/bt819.h conflict! +'c' all linux/cm4000_cs.h conflict! 'c' 00-7F linux/comstats.h conflict! 'c' 00-7F linux/coda.h conflict! -'c' 80-9F arch/s390/include/asm/chsc.h -'c' A0-AF arch/x86/include/asm/msr.h +'c' 00-1F linux/chio.h conflict! +'c' 80-9F arch/s390/include/asm/chsc.h conflict! +'c' A0-AF arch/x86/include/asm/msr.h conflict! 'd' 00-FF linux/char/drm/drm/h conflict! +'d' 02-40 pcmcia/ds.h conflict! +'d' 10-3F drivers/media/video/dabusb.h conflict! +'d' C0-CF drivers/media/video/saa7191.h conflict! 'd' F0-FF linux/digi1.h 'e' all linux/digi1.h conflict! -'e' 00-1F net/irda/irtty.h conflict! -'f' 00-1F linux/ext2_fs.h -'h' 00-7F Charon filesystem +'e' 00-1F drivers/net/irda/irtty-sir.h conflict! +'f' 00-1F linux/ext2_fs.h conflict! +'f' 00-1F linux/ext3_fs.h conflict! +'f' 00-0F fs/jfs/jfs_dinode.h conflict! +'f' 00-0F fs/ext4/ext4.h conflict! +'f' 00-0F linux/fs.h conflict! +'f' 00-0F fs/ocfs2/ocfs2_fs.h conflict! +'g' 00-0F linux/usb/gadgetfs.h +'g' 20-2F linux/usb/g_printer.h +'h' 00-7F conflict! Charon filesystem <mailto:zapman@interlan.net> -'i' 00-3F linux/i2o.h +'h' 00-1F linux/hpet.h conflict! +'i' 00-3F linux/i2o-dev.h conflict! +'i' 0B-1F linux/ipmi.h conflict! +'i' 80-8F linux/i8k.h 'j' 00-3F linux/joystick.h +'k' 00-0F linux/spi/spidev.h conflict! +'k' 00-05 video/kyro.h conflict! 'l' 00-3F linux/tcfs_fs.h transparent cryptographic file system <http://mikonos.dia.unisa.it/tcfs> 'l' 40-7F linux/udf_fs_i.h in development: <http://sourceforge.net/projects/linux-udf/> -'m' 00-09 linux/mmtimer.h +'m' 00-09 linux/mmtimer.h conflict! 'm' all linux/mtio.h conflict! 'm' all linux/soundcard.h conflict! 'm' all linux/synclink.h conflict! +'m' 00-19 drivers/message/fusion/mptctl.h conflict! +'m' 00 drivers/scsi/megaraid/megaraid_ioctl.h conflict! 'm' 00-1F net/irda/irmod.h conflict! -'n' 00-7F linux/ncp_fs.h +'n' 00-7F linux/ncp_fs.h and fs/ncpfs/ioctl.c 'n' 80-8F linux/nilfs2_fs.h NILFS2 -'n' E0-FF video/matrox.h matroxfb +'n' E0-FF linux/matroxfb.h matroxfb 'o' 00-1F fs/ocfs2/ocfs2_fs.h OCFS2 -'o' 00-03 include/mtd/ubi-user.h conflict! (OCFS2 and UBI overlaps) -'o' 40-41 include/mtd/ubi-user.h UBI -'o' 01-A1 include/linux/dvb/*.h DVB +'o' 00-03 mtd/ubi-user.h conflict! (OCFS2 and UBI overlaps) +'o' 40-41 mtd/ubi-user.h UBI +'o' 01-A1 linux/dvb/*.h DVB 'p' 00-0F linux/phantom.h conflict! (OpenHaptics needs this) +'p' 00-1F linux/rtc.h conflict! 'p' 00-3F linux/mc146818rtc.h conflict! 'p' 40-7F linux/nvram.h -'p' 80-9F user-space parport +'p' 80-9F linux/ppdev.h user-space parport <mailto:tim@cyberelk.net> -'p' a1-a4 linux/pps.h LinuxPPS +'p' A1-A4 linux/pps.h LinuxPPS <mailto:giometti@linux.it> 'q' 00-1F linux/serio.h -'q' 80-FF Internet PhoneJACK, Internet LineJACK - <http://www.quicknet.net> -'r' 00-1F linux/msdos_fs.h +'q' 80-FF linux/telephony.h Internet PhoneJACK, Internet LineJACK + linux/ixjuser.h <http://www.quicknet.net> +'r' 00-1F linux/msdos_fs.h and fs/fat/dir.c 's' all linux/cdk.h 't' 00-7F linux/if_ppp.h 't' 80-8F linux/isdn_ppp.h +'t' 90 linux/toshiba.h 'u' 00-1F linux/smb_fs.h -'v' 00-1F linux/ext2_fs.h conflict! 'v' all linux/videodev.h conflict! +'v' 00-1F linux/ext2_fs.h conflict! +'v' 00-1F linux/fs.h conflict! +'v' 00-0F linux/sonypi.h conflict! +'v' C0-CF drivers/media/video/ov511.h conflict! +'v' C0-DF media/pwc-ioctl.h conflict! +'v' C0-FF linux/meye.h conflict! +'v' C0-CF drivers/media/video/zoran/zoran.h conflict! +'v' D0-DF drivers/media/video/cpia2/cpia2dev.h conflict! 'w' all CERN SCI driver 'y' 00-1F packet based user level communications <mailto:zapman@interlan.net> -'z' 00-3F CAN bus card +'z' 00-3F CAN bus card conflict! <mailto:hdstich@connectu.ulm.circular.de> -'z' 40-7F CAN bus card +'z' 40-7F CAN bus card conflict! <mailto:oe@port.de> +'z' 10-4F drivers/s390/crypto/zcrypt_api.h conflict! 0x80 00-1F linux/fb.h 0x81 00-1F linux/videotext.h +0x88 00-3F media/ovcamchip.h 0x89 00-06 arch/x86/include/asm/sockios.h 0x89 0B-DF linux/sockios.h 0x89 E0-EF linux/sockios.h SIOCPROTOPRIVATE range +0x89 E0-EF linux/dn.h PROTOPRIVATE range 0x89 F0-FF linux/sockios.h SIOCDEVPRIVATE range 0x8B all linux/wireless.h 0x8C 00-3F WiNRADiO driver <http://www.proximity.com.au/~brian/winradio/> 0x90 00 drivers/cdrom/sbpcd.h +0x92 00-0F drivers/usb/mon/mon_bin.c 0x93 60-7F linux/auto_fs.h +0x94 all fs/btrfs/ioctl.h 0x99 00-0F 537-Addinboard driver <mailto:buk@buks.ipn.de> 0xA0 all linux/sdp/sdp.h Industrial Device Project @@ -192,17 +301,22 @@ Code Seq# Include File Comments 0xAB 00-1F linux/nbd.h 0xAC 00-1F linux/raw.h 0xAD 00 Netfilter device in development: - <mailto:rusty@rustcorp.com.au> + <mailto:rusty@rustcorp.com.au> 0xAE all linux/kvm.h Kernel-based Virtual Machine <mailto:kvm@vger.kernel.org> 0xB0 all RATIO devices in development: <mailto:vgo@ratio.de> 0xB1 00-1F PPPoX <mailto:mostrows@styx.uwaterloo.ca> +0xC0 00-0F linux/usb/iowarrior.h 0xCB 00-1F CBM serial IEC bus in development: <mailto:michael.klein@puffin.lb.shuttle.de> +0xCD 01 linux/reiserfs_fs.h +0xCF 02 fs/cifs/ioctl.c +0xDB 00-0F drivers/char/mwave/mwavepub.h 0xDD 00-3F ZFCP device driver see drivers/s390/scsi/ <mailto:aherrman@de.ibm.com> -0xF3 00-3F video/sisfb.h sisfb (in development) +0xF3 00-3F drivers/usb/misc/sisusbvga/sisusb.h sisfb (in development) <mailto:thomas@winischhofer.net> 0xF4 00-1F video/mbxfb.h mbxfb <mailto:raph@8d.com> +0xFD all linux/dm-ioctl.h diff --git a/Documentation/isdn/INTERFACE.CAPI b/Documentation/isdn/INTERFACE.CAPI index 5fe8de5cc72..f172091fb7c 100644 --- a/Documentation/isdn/INTERFACE.CAPI +++ b/Documentation/isdn/INTERFACE.CAPI @@ -149,10 +149,11 @@ char *(*procinfo)(struct capi_ctr *ctrlr) pointer to a callback function returning the entry for the device in the CAPI controller info table, /proc/capi/controller -read_proc_t *ctr_read_proc - pointer to the read_proc callback function for the device's proc file - system entry, /proc/capi/controllers/<n>; will be called with a - pointer to the device's capi_ctr structure as the last (data) argument +const struct file_operations *proc_fops + pointers to callback functions for the device's proc file + system entry, /proc/capi/controllers/<n>; pointer to the device's + capi_ctr structure is available from struct proc_dir_entry::data + which is available from struct inode. Note: Callback functions except send_message() are never called in interrupt context. diff --git a/Documentation/isdn/README.gigaset b/Documentation/isdn/README.gigaset index 794941fc949..e472df84232 100644 --- a/Documentation/isdn/README.gigaset +++ b/Documentation/isdn/README.gigaset @@ -292,10 +292,10 @@ GigaSet 307x Device Driver to /etc/modprobe.d/gigaset, /etc/modprobe.conf.local or a similar file. Problem: - Your isdn script aborts with a message about isdnlog. + The isdnlog program emits error messages or just doesn't work. Solution: - Try deactivating (or commenting out) isdnlog. This driver does not - support it. + Isdnlog supports only the HiSax driver. Do not attempt to use it with + other drivers such as Gigaset. Problem: You have two or more DECT data adapters (M101/M105) and only the @@ -321,8 +321,8 @@ GigaSet 307x Device Driver writing an appropriate value to /sys/module/gigaset/parameters/debug, e.g. echo 0 > /sys/module/gigaset/parameters/debug switches off debugging output completely, - echo 0x10a020 > /sys/module/gigaset/parameters/debug - enables the standard set of debugging output messages. These values are + echo 0x302020 > /sys/module/gigaset/parameters/debug + enables a reasonable set of debugging output messages. These values are bit patterns where every bit controls a certain type of debugging output. See the constants DEBUG_* in the source file gigaset.h for details. diff --git a/Documentation/kernel-doc-nano-HOWTO.txt b/Documentation/kernel-doc-nano-HOWTO.txt index 348b9e5e28f..27a52b35d55 100644 --- a/Documentation/kernel-doc-nano-HOWTO.txt +++ b/Documentation/kernel-doc-nano-HOWTO.txt @@ -214,11 +214,13 @@ The format of the block comment is like this: * (section header: (section description)? )* (*)?*/ -The short function description ***cannot be multiline***, but the other -descriptions can be (and they can contain blank lines). If you continue -that initial short description onto a second line, that second line will -appear further down at the beginning of the description section, which is -almost certainly not what you had in mind. +All "description" text can span multiple lines, although the +function_name & its short description are traditionally on a single line. +Description text may also contain blank lines (i.e., lines that contain +only a "*"). + +"section header:" names must be unique per function (or struct, +union, typedef, enum). Avoid putting a spurious blank line after the function name, or else the description will be repeated! diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index 5ba4d9dff11..d80930d58da 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -54,6 +54,7 @@ parameter is applicable: IMA Integrity measurement architecture is enabled. IOSCHED More than one I/O scheduler is enabled. IP_PNP IP DHCP, BOOTP, or RARP is enabled. + IPV6 IPv6 support is enabled. ISAPNP ISA PnP code is enabled. ISDN Appropriate ISDN support is enabled. JOY Appropriate joystick support is enabled. @@ -199,6 +200,10 @@ and is between 256 and 4096 characters. It is defined in the file acpi_display_output=video See above. + acpi_early_pdc_eval [HW,ACPI] Evaluate processor _PDC methods + early. Needed on some platforms to properly + initialize the EC. + acpi_irq_balance [HW,ACPI] ACPI will balance active IRQs default in APIC mode @@ -240,7 +245,7 @@ and is between 256 and 4096 characters. It is defined in the file acpi_sleep= [HW,ACPI] Sleep options Format: { s3_bios, s3_mode, s3_beep, s4_nohwsig, - old_ordering, s4_nonvs } + old_ordering, s4_nonvs, sci_force_enable } See Documentation/power/video.txt for information on s3_bios and s3_mode. s3_beep is for debugging; it makes the PC's speaker beep @@ -253,6 +258,9 @@ and is between 256 and 4096 characters. It is defined in the file of _PTS is used by default). s4_nonvs prevents the kernel from saving/restoring the ACPI NVS memory during hibernation. + sci_force_enable causes the kernel to set SCI_EN directly + on resume from S1/S3 (which is against the ACPI spec, + but some broken systems don't work without it). acpi_use_timer_override [HW,ACPI] Use timer override. For some broken Nvidia NF5 boards @@ -308,6 +316,11 @@ and is between 256 and 4096 characters. It is defined in the file aic79xx= [HW,SCSI] See Documentation/scsi/aic79xx.txt. + alignment= [KNL,ARM] + Allow the default userspace alignment fault handler + behaviour to be specified. Bit 0 enables warnings, + bit 1 enables fixups, and bit 2 sends a segfault. + amd_iommu= [HW,X86-84] Pass parameters to the AMD IOMMU driver in the system. Possible values are: @@ -344,6 +357,9 @@ and is between 256 and 4096 characters. It is defined in the file Change the amount of debugging information output when initialising the APIC and IO-APIC components. + autoconf= [IPV6] + See Documentation/networking/ipv6.txt. + show_lapic= [APIC,X86] Advanced Programmable Interrupt Controller Limit apic dumping. The parameter defines the maximal number of local apics being dumped. Also it is possible @@ -626,6 +642,12 @@ and is between 256 and 4096 characters. It is defined in the file See drivers/char/README.epca and Documentation/serial/digiepca.txt. + disable= [IPV6] + See Documentation/networking/ipv6.txt. + + disable_ipv6= [IPV6] + See Documentation/networking/ipv6.txt. + disable_mtrr_cleanup [X86] The kernel tries to adjust MTRR layout from continuous to discrete, to make X server driver able to add WB @@ -1726,6 +1748,9 @@ and is between 256 and 4096 characters. It is defined in the file nomfgpt [X86-32] Disable Multi-Function General Purpose Timer usage (for AMD Geode machines). + nopat [X86] Disable PAT (page attribute table extension of + pagetables) support. + norandmaps Don't use address space randomization. Equivalent to echo 0 > /proc/sys/kernel/randomize_va_space @@ -1769,6 +1794,12 @@ and is between 256 and 4096 characters. It is defined in the file purges which is reported from either PAL_VM_SUMMARY or SAL PALO. + nr_cpus= [SMP] Maximum number of processors that an SMP kernel + could support. nr_cpus=n : n >= 1 limits the kernel to + supporting 'n' processors. Later in runtime you can not + use hotplug cpu feature to put more cpu back to online. + just like you compile the kernel NR_CPUS=n + nr_uarts= [SERIAL] maximum number of UARTs to be registered. numa_zonelist_order= [KNL, BOOT] Select zonelist order for NUMA. @@ -1936,8 +1967,12 @@ and is between 256 and 4096 characters. It is defined in the file IRQ routing is enabled. noacpi [X86] Do not use ACPI for IRQ routing or for PCI scanning. - use_crs [X86] Use _CRS for PCI resource - allocation. + use_crs [X86] Use PCI host bridge window information + from ACPI. On BIOSes from 2008 or later, this + is enabled by default. If you need to use this, + please report a bug. + nocrs [X86] Ignore PCI host bridge windows from ACPI. + If you need to use this, please report a bug. routeirq Do IRQ routing for all PCI devices. This is normally done in pci_enable_device(), so this option is a temporary workaround @@ -1986,6 +2021,14 @@ and is between 256 and 4096 characters. It is defined in the file force Enable ASPM even on devices that claim not to support it. WARNING: Forcing ASPM on may cause system lockups. + pcie_pme= [PCIE,PM] Native PCIe PME signaling options: + off Do not use native PCIe PME signaling. + force Use native PCIe PME signaling even if the BIOS refuses + to allow the kernel to control the relevant PCIe config + registers. + nomsi Do not use MSI for native PCIe PME signaling (this makes + all PCIe root ports use INTx for everything). + pcmv= [HW,PCMCIA] BadgePAD 4 pd. [PARIDE] @@ -2691,6 +2734,13 @@ and is between 256 and 4096 characters. It is defined in the file medium is write-protected). Example: quirks=0419:aaf5:rl,0421:0433:rc + userpte= + [X86] Flags controlling user PTE allocations. + + nohigh = do not allocate PTE pages in + HIGHMEM regardless of setting + of CONFIG_HIGHPTE. + vdso= [X86,SH] vdso=2: enable compat VDSO (default with COMPAT_VDSO) vdso=1: enable VDSO (default) diff --git a/Documentation/kvm/api.txt b/Documentation/kvm/api.txt index e1a11416102..2811e452f75 100644 --- a/Documentation/kvm/api.txt +++ b/Documentation/kvm/api.txt @@ -685,7 +685,7 @@ struct kvm_vcpu_events { __u8 pad; } nmi; __u32 sipi_vector; - __u32 flags; /* must be zero */ + __u32 flags; }; 4.30 KVM_SET_VCPU_EVENTS @@ -701,6 +701,14 @@ vcpu. See KVM_GET_VCPU_EVENTS for the data structure. +Fields that may be modified asynchronously by running VCPUs can be excluded +from the update. These fields are nmi.pending and sipi_vector. Keep the +corresponding bits in the flags field cleared to suppress overwriting the +current in-kernel state. The bits are: + +KVM_VCPUEVENT_VALID_NMI_PENDING - transfer nmi.pending to the kernel +KVM_VCPUEVENT_VALID_SIPI_VECTOR - transfer sipi_vector + 5. The kvm_run structure diff --git a/Documentation/laptops/thinkpad-acpi.txt b/Documentation/laptops/thinkpad-acpi.txt index 169091f75e6..39c0a09d010 100644 --- a/Documentation/laptops/thinkpad-acpi.txt +++ b/Documentation/laptops/thinkpad-acpi.txt @@ -650,6 +650,10 @@ LCD, CRT or DVI (if available). The following commands are available: echo expand_toggle > /proc/acpi/ibm/video echo video_switch > /proc/acpi/ibm/video +NOTE: Access to this feature is restricted to processes owning the +CAP_SYS_ADMIN capability for safety reasons, as it can interact badly +enough with some versions of X.org to crash it. + Each video output device can be enabled or disabled individually. Reading /proc/acpi/ibm/video shows the status of each device. @@ -1092,8 +1096,8 @@ WARNING: its level up and down at every change. -Volume control --------------- +Volume control (Console Audio control) +-------------------------------------- procfs: /proc/acpi/ibm/volume ALSA: "ThinkPad Console Audio Control", default ID: "ThinkPadEC" @@ -1110,9 +1114,53 @@ the desktop environment to just provide on-screen-display feedback. Software volume control should be done only in the main AC97/HDA mixer. -This feature allows volume control on ThinkPad models with a digital -volume knob (when available, not all models have it), as well as -mute/unmute control. The available commands are: + +About the ThinkPad Console Audio control: + +ThinkPads have a built-in amplifier and muting circuit that drives the +console headphone and speakers. This circuit is after the main AC97 +or HDA mixer in the audio path, and under exclusive control of the +firmware. + +ThinkPads have three special hotkeys to interact with the console +audio control: volume up, volume down and mute. + +It is worth noting that the normal way the mute function works (on +ThinkPads that do not have a "mute LED") is: + +1. Press mute to mute. It will *always* mute, you can press it as + many times as you want, and the sound will remain mute. + +2. Press either volume key to unmute the ThinkPad (it will _not_ + change the volume, it will just unmute). + +This is a very superior design when compared to the cheap software-only +mute-toggle solution found on normal consumer laptops: you can be +absolutely sure the ThinkPad will not make noise if you press the mute +button, no matter the previous state. + +The IBM ThinkPads, and the earlier Lenovo ThinkPads have variable-gain +amplifiers driving the speakers and headphone output, and the firmware +also handles volume control for the headphone and speakers on these +ThinkPads without any help from the operating system (this volume +control stage exists after the main AC97 or HDA mixer in the audio +path). + +The newer Lenovo models only have firmware mute control, and depend on +the main HDA mixer to do volume control (which is done by the operating +system). In this case, the volume keys are filtered out for unmute +key press (there are some firmware bugs in this area) and delivered as +normal key presses to the operating system (thinkpad-acpi is not +involved). + + +The ThinkPad-ACPI volume control: + +The preferred way to interact with the Console Audio control is the +ALSA interface. + +The legacy procfs interface allows one to read the current state, +and if volume control is enabled, accepts the following commands: echo up >/proc/acpi/ibm/volume echo down >/proc/acpi/ibm/volume @@ -1121,12 +1169,10 @@ mute/unmute control. The available commands are: echo 'level <level>' >/proc/acpi/ibm/volume The <level> number range is 0 to 14 although not all of them may be -distinct. The unmute the volume after the mute command, use either the +distinct. To unmute the volume after the mute command, use either the up or down command (the level command will not unmute the volume), or the unmute command. -The current volume level and mute state is shown in the file. - You can use the volume_capabilities parameter to tell the driver whether your thinkpad has volume control or mute-only control: volume_capabilities=1 for mixers with mute and volume control, diff --git a/Documentation/lguest/lguest.c b/Documentation/lguest/lguest.c index 42208511b5c..3119f5db75b 100644 --- a/Documentation/lguest/lguest.c +++ b/Documentation/lguest/lguest.c @@ -34,7 +34,6 @@ #include <sys/uio.h> #include <termios.h> #include <getopt.h> -#include <zlib.h> #include <assert.h> #include <sched.h> #include <limits.h> diff --git a/Documentation/networking/00-INDEX b/Documentation/networking/00-INDEX index 50189bf07d5..fe5c099b8fc 100644 --- a/Documentation/networking/00-INDEX +++ b/Documentation/networking/00-INDEX @@ -32,6 +32,8 @@ cs89x0.txt - the Crystal LAN (CS8900/20-based) Ethernet ISA adapter driver cxacru.txt - Conexant AccessRunner USB ADSL Modem +cxacru-cf.py + - Conexant AccessRunner USB ADSL Modem configuration file parser de4x5.txt - the Digital EtherWORKS DE4?? and DE5?? PCI Ethernet driver decnet.txt diff --git a/Documentation/networking/3c509.txt b/Documentation/networking/3c509.txt index 0643e3b7168..3c45d5dcd63 100644 --- a/Documentation/networking/3c509.txt +++ b/Documentation/networking/3c509.txt @@ -48,11 +48,11 @@ for LILO parameters for doing this: This configures the first found 3c509 card for IRQ 10, base I/O 0x310, and transceiver type 3 (10base2). The flag "0x3c509" must be set to avoid conflicts with other card types when overriding the I/O address. When the driver is -loaded as a module, only the IRQ and transceiver setting may be overridden. -For example, setting two cards to 10base2/IRQ10 and AUI/IRQ11 is done by using -the xcvr and irq module options: +loaded as a module, only the IRQ may be overridden. For example, +setting two cards to IRQ10 and IRQ11 is done by using the irq module +option: - options 3c509 xcvr=3,1 irq=10,11 + options 3c509 irq=10,11 (2) Full-duplex mode @@ -77,6 +77,8 @@ operation. itself full-duplex capable. This is almost certainly one of two things: a full- duplex-capable Ethernet switch (*not* a hub), or a full-duplex-capable NIC on another system that's connected directly to the 3c509B via a crossover cable. + +Full-duplex mode can be enabled using 'ethtool'. /////Extremely important caution concerning full-duplex mode///// Understand that the 3c509B's hardware's full-duplex support is much more @@ -113,6 +115,8 @@ This insured that merely upgrading the driver from an earlier version would never automatically enable full-duplex mode in an existing installation; it must always be explicitly enabled via one of these code in order to be activated. + +The transceiver type can be changed using 'ethtool'. (4a) Interpretation of error messages and common problems diff --git a/Documentation/networking/cxacru-cf.py b/Documentation/networking/cxacru-cf.py new file mode 100644 index 00000000000..b41d298398c --- /dev/null +++ b/Documentation/networking/cxacru-cf.py @@ -0,0 +1,48 @@ +#!/usr/bin/env python +# Copyright 2009 Simon Arlott +# +# This program is free software; you can redistribute it and/or modify it +# under the terms of the GNU General Public License as published by the Free +# Software Foundation; either version 2 of the License, or (at your option) +# any later version. +# +# This program is distributed in the hope that it will be useful, but WITHOUT +# ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or +# FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for +# more details. +# +# You should have received a copy of the GNU General Public License along with +# this program; if not, write to the Free Software Foundation, Inc., 59 +# Temple Place - Suite 330, Boston, MA 02111-1307, USA. +# +# Usage: cxacru-cf.py < cxacru-cf.bin +# Output: values string suitable for the sysfs adsl_config attribute +# +# Warning: cxacru-cf.bin with MD5 hash cdbac2689969d5ed5d4850f117702110 +# contains mis-aligned values which will stop the modem from being able +# to make a connection. If the first and last two bytes are removed then +# the values become valid, but the modulation will be forced to ANSI +# T1.413 only which may not be appropriate. +# +# The original binary format is a packed list of le32 values. + +import sys +import struct + +i = 0 +while True: + buf = sys.stdin.read(4) + + if len(buf) == 0: + break + elif len(buf) != 4: + sys.stdout.write("\n") + sys.stderr.write("Error: read {0} not 4 bytes\n".format(len(buf))) + sys.exit(1) + + if i > 0: + sys.stdout.write(" ") + sys.stdout.write("{0:x}={1}".format(i, struct.unpack("<I", buf)[0])) + i += 1 + +sys.stdout.write("\n") diff --git a/Documentation/networking/cxacru.txt b/Documentation/networking/cxacru.txt index b074681a963..2cce04457b4 100644 --- a/Documentation/networking/cxacru.txt +++ b/Documentation/networking/cxacru.txt @@ -4,6 +4,12 @@ While it is capable of managing/maintaining the ADSL connection without the module loaded, the device will sometimes stop responding after unloading the driver and it is necessary to unplug/remove power to the device to fix this. +Note: support for cxacru-cf.bin has been removed. It was not loaded correctly +so it had no effect on the device configuration. Fixing it could have stopped +existing devices working when an invalid configuration is supplied. + +There is a script cxacru-cf.py to convert an existing file to the sysfs form. + Detected devices will appear as ATM devices named "cxacru". In /sys/class/atm/ these are directories named cxacruN where N is the device number. A symlink named device points to the USB interface device's directory which contains @@ -15,6 +21,15 @@ several sysfs attribute files for retrieving device statistics: * adsl_headend_environment Information about the remote headend. +* adsl_config + Configuration writing interface. + Write parameters in hexadecimal format <index>=<value>, + separated by whitespace, e.g.: + "1=0 a=5" + Up to 7 parameters at a time will be sent and the modem will restart + the ADSL connection when any value is set. These are logged for future + reference. + * downstream_attenuation (dB) * downstream_bits_per_frame * downstream_rate (kbps) @@ -61,6 +76,7 @@ several sysfs attribute files for retrieving device statistics: * mac_address * modulation + "" (when not connected) "ANSI T1.413" "ITU-T G.992.1 (G.DMT)" "ITU-T G.992.2 (G.LITE)" diff --git a/Documentation/networking/dccp.txt b/Documentation/networking/dccp.txt index b132e4a3cf0..a62fdf7a6bf 100644 --- a/Documentation/networking/dccp.txt +++ b/Documentation/networking/dccp.txt @@ -58,8 +58,10 @@ DCCP_SOCKOPT_GET_CUR_MPS is read-only and retrieves the current maximum packet size (application payload size) in bytes, see RFC 4340, section 14. DCCP_SOCKOPT_AVAILABLE_CCIDS is also read-only and returns the list of CCIDs -supported by the endpoint (see include/linux/dccp.h for symbolic constants). -The caller needs to provide a sufficiently large (> 2) array of type uint8_t. +supported by the endpoint. The option value is an array of type uint8_t whose +size is passed as option length. The minimum array size is 4 elements, the +value returned in the optlen argument always reflects the true number of +built-in CCIDs. DCCP_SOCKOPT_CCID is write-only and sets both the TX and RX CCIDs at the same time, combining the operation of the next two socket options. This option is diff --git a/Documentation/networking/ip-sysctl.txt b/Documentation/networking/ip-sysctl.txt index 006b39dec87..8b72c88ba21 100644 --- a/Documentation/networking/ip-sysctl.txt +++ b/Documentation/networking/ip-sysctl.txt @@ -487,6 +487,30 @@ tcp_dma_copybreak - INTEGER and CONFIG_NET_DMA is enabled. Default: 4096 +tcp_thin_linear_timeouts - BOOLEAN + Enable dynamic triggering of linear timeouts for thin streams. + If set, a check is performed upon retransmission by timeout to + determine if the stream is thin (less than 4 packets in flight). + As long as the stream is found to be thin, up to 6 linear + timeouts may be performed before exponential backoff mode is + initiated. This improves retransmission latency for + non-aggressive thin streams, often found to be time-dependent. + For more information on thin streams, see + Documentation/networking/tcp-thin.txt + Default: 0 + +tcp_thin_dupack - BOOLEAN + Enable dynamic triggering of retransmissions after one dupACK + for thin streams. If set, a check is performed upon reception + of a dupACK to determine if the stream is thin (less than 4 + packets in flight). As long as the stream is found to be thin, + data is retransmitted on the first received dupACK. This + improves retransmission latency for non-aggressive thin + streams, often found to be time-dependent. + For more information on thin streams, see + Documentation/networking/tcp-thin.txt + Default: 0 + UDP variables: udp_mem - vector of 3 INTEGERs: min, pressure, max @@ -692,6 +716,25 @@ proxy_arp - BOOLEAN conf/{all,interface}/proxy_arp is set to TRUE, it will be disabled otherwise +proxy_arp_pvlan - BOOLEAN + Private VLAN proxy arp. + Basically allow proxy arp replies back to the same interface + (from which the ARP request/solicitation was received). + + This is done to support (ethernet) switch features, like RFC + 3069, where the individual ports are NOT allowed to + communicate with each other, but they are allowed to talk to + the upstream router. As described in RFC 3069, it is possible + to allow these hosts to communicate through the upstream + router by proxy_arp'ing. Don't need to be used together with + proxy_arp. + + This technology is known by different names: + In RFC 3069 it is called VLAN Aggregation. + Cisco and Allied Telesyn call it Private VLAN. + Hewlett-Packard call it Source-Port filtering or port-isolation. + Ericsson call it MAC-Forced Forwarding (RFC Draft). + shared_media - BOOLEAN Send(router) or accept(host) RFC1620 shared media redirects. Overrides ip_secure_redirects. @@ -833,9 +876,18 @@ arp_notify - BOOLEAN or hardware address changes. arp_accept - BOOLEAN - Define behavior when gratuitous arp replies are received: - 0 - drop gratuitous arp frames - 1 - accept gratuitous arp frames + Define behavior for gratuitous ARP frames who's IP is not + already present in the ARP table: + 0 - don't create new entries in the ARP table + 1 - create new entries in the ARP table + + Both replies and requests type gratuitous arp will trigger the + ARP table to be updated, if this setting is on. + + If the ARP table already contains the IP address of the + gratuitous arp frame, the arp table will be updated regardless + if this setting is on or off. + app_solicit - INTEGER The maximum number of probes to send to the user space ARP daemon @@ -1074,10 +1126,10 @@ regen_max_retry - INTEGER Default: 5 max_addresses - INTEGER - Number of maximum addresses per interface. 0 disables limitation. - It is recommended not set too large value (or 0) because it would - be too easy way to crash kernel to allow to create too much of - autoconfigured addresses. + Maximum number of autoconfigured addresses per interface. Setting + to zero disables the limitation. It is not recommended to set this + value too large (or to zero) because it would be an easy way to + crash the kernel by allowing too many addresses to be created. Default: 16 disable_ipv6 - BOOLEAN diff --git a/Documentation/networking/ixgbevf.txt b/Documentation/networking/ixgbevf.txt new file mode 100755 index 00000000000..19015de6725 --- /dev/null +++ b/Documentation/networking/ixgbevf.txt @@ -0,0 +1,90 @@ +Linux* Base Driver for Intel(R) Network Connection +================================================== + +November 24, 2009 + +Contents +======== + +- In This Release +- Identifying Your Adapter +- Known Issues/Troubleshooting +- Support + +In This Release +=============== + +This file describes the ixgbevf Linux* Base Driver for Intel Network +Connection. + +The ixgbevf driver supports 82599-based virtual function devices that can only +be activated on kernels with CONFIG_PCI_IOV enabled. + +The ixgbevf driver supports virtual functions generated by the ixgbe driver +with a max_vfs value of 1 or greater. + +The guest OS loading the ixgbevf driver must support MSI-X interrupts. + +VLANs: There is a limit of a total of 32 shared VLANs to 1 or more VFs. + +Identifying Your Adapter +======================== + +For more information on how to identify your adapter, go to the Adapter & +Driver ID Guide at: + + http://support.intel.com/support/network/sb/CS-008441.htm + +Known Issues/Troubleshooting +============================ + + Unloading Physical Function (PF) Driver Causes System Reboots When VM is + Running and VF is Loaded on the VM + ------------------------------------------------------------------------ + Do not unload the PF driver (ixgbe) while VFs are assigned to guests. + +Support +======= + +For general information, go to the Intel support website at: + + http://support.intel.com + +or the Intel Wired Networking project hosted by Sourceforge at: + + http://sourceforge.net/projects/e1000 + +If an issue is identified with the released source code on the supported +kernel with a supported adapter, email the specific information related +to the issue to e1000-devel@lists.sf.net + +License +======= + +Intel 10 Gigabit Linux driver. +Copyright(c) 1999 - 2009 Intel Corporation. + +This program is free software; you can redistribute it and/or modify it +under the terms and conditions of the GNU General Public License, +version 2, as published by the Free Software Foundation. + +This program is distributed in the hope it will be useful, but WITHOUT +ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or +FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for +more details. + +You should have received a copy of the GNU General Public License along with +this program; if not, write to the Free Software Foundation, Inc., +51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA. + +The full GNU General Public License is included in this distribution in +the file called "COPYING". + +Trademarks +========== + +Intel, Itanium, and Pentium are trademarks or registered trademarks of +Intel Corporation or its subsidiaries in the United States and other +countries. + +* Other names and brands may be claimed as the property of others. diff --git a/Documentation/networking/packet_mmap.txt b/Documentation/networking/packet_mmap.txt index a22fd85e379..09ab0d29032 100644 --- a/Documentation/networking/packet_mmap.txt +++ b/Documentation/networking/packet_mmap.txt @@ -2,7 +2,7 @@ + ABSTRACT -------------------------------------------------------------------------------- -This file documents the CONFIG_PACKET_MMAP option available with the PACKET +This file documents the mmap() facility available with the PACKET socket interface on 2.4 and 2.6 kernels. This type of sockets is used for capture network traffic with utilities like tcpdump or any other that needs raw access to network interface. @@ -44,7 +44,7 @@ enabled. For transmission, check the MTU (Maximum Transmission Unit) used and supported by devices of your network. -------------------------------------------------------------------------------- -+ How to use CONFIG_PACKET_MMAP to improve capture process ++ How to use mmap() to improve capture process -------------------------------------------------------------------------------- From the user standpoint, you should use the higher level libpcap library, which @@ -64,7 +64,7 @@ the low level details or want to improve libpcap by including PACKET_MMAP support. -------------------------------------------------------------------------------- -+ How to use CONFIG_PACKET_MMAP directly to improve capture process ++ How to use mmap() directly to improve capture process -------------------------------------------------------------------------------- From the system calls stand point, the use of PACKET_MMAP involves @@ -105,7 +105,7 @@ also the mapping of the circular buffer in the user process and the use of this buffer. -------------------------------------------------------------------------------- -+ How to use CONFIG_PACKET_MMAP directly to improve transmission process ++ How to use mmap() directly to improve transmission process -------------------------------------------------------------------------------- Transmission process is similar to capture as shown below. diff --git a/Documentation/networking/regulatory.txt b/Documentation/networking/regulatory.txt index ee31369e9e5..9551622d0a7 100644 --- a/Documentation/networking/regulatory.txt +++ b/Documentation/networking/regulatory.txt @@ -188,3 +188,27 @@ Then in some part of your code after your wiphy has been registered: &mydriver_jp_regdom.reg_rules[i], sizeof(struct ieee80211_reg_rule)); regulatory_struct_hint(rd); + +Statically compiled regulatory database +--------------------------------------- + +In most situations the userland solution using CRDA as described +above is the preferred solution. However in some cases a set of +rules built into the kernel itself may be desirable. To account +for this situation, a configuration option has been provided +(i.e. CONFIG_CFG80211_INTERNAL_REGDB). With this option enabled, +the wireless database information contained in net/wireless/db.txt is +used to generate a data structure encoded in net/wireless/regdb.c. +That option also enables code in net/wireless/reg.c which queries +the data in regdb.c as an alternative to using CRDA. + +The file net/wireless/db.txt should be kept up-to-date with the db.txt +file available in the git repository here: + + git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-regdb.git + +Again, most users in most situations should be using the CRDA package +provided with their distribution, and in most other situations users +should be building and using CRDA on their own rather than using +this option. If you are not absolutely sure that you should be using +CONFIG_CFG80211_INTERNAL_REGDB then _DO_NOT_USE_IT_. diff --git a/Documentation/networking/tcp-thin.txt b/Documentation/networking/tcp-thin.txt new file mode 100644 index 00000000000..151e229980f --- /dev/null +++ b/Documentation/networking/tcp-thin.txt @@ -0,0 +1,47 @@ +Thin-streams and TCP +==================== +A wide range of Internet-based services that use reliable transport +protocols display what we call thin-stream properties. This means +that the application sends data with such a low rate that the +retransmission mechanisms of the transport protocol are not fully +effective. In time-dependent scenarios (like online games, control +systems, stock trading etc.) where the user experience depends +on the data delivery latency, packet loss can be devastating for +the service quality. Extreme latencies are caused by TCP's +dependency on the arrival of new data from the application to trigger +retransmissions effectively through fast retransmit instead of +waiting for long timeouts. + +After analysing a large number of time-dependent interactive +applications, we have seen that they often produce thin streams +and also stay with this traffic pattern throughout its entire +lifespan. The combination of time-dependency and the fact that the +streams provoke high latencies when using TCP is unfortunate. + +In order to reduce application-layer latency when packets are lost, +a set of mechanisms has been made, which address these latency issues +for thin streams. In short, if the kernel detects a thin stream, +the retransmission mechanisms are modified in the following manner: + +1) If the stream is thin, fast retransmit on the first dupACK. +2) If the stream is thin, do not apply exponential backoff. + +These enhancements are applied only if the stream is detected as +thin. This is accomplished by defining a threshold for the number +of packets in flight. If there are less than 4 packets in flight, +fast retransmissions can not be triggered, and the stream is prone +to experience high retransmission latencies. + +Since these mechanisms are targeted at time-dependent applications, +they must be specifically activated by the application using the +TCP_THIN_LINEAR_TIMEOUTS and TCP_THIN_DUPACK IOCTLS or the +tcp_thin_linear_timeouts and tcp_thin_dupack sysctls. Both +modifications are turned off by default. + +References +========== +More information on the modifications, as well as a wide range of +experimental data can be found here: +"Improving latency for interactive, thin-stream applications over +reliable transport" +http://simula.no/research/nd/publications/Simula.nd.477/simula_pdf_file diff --git a/Documentation/pcmcia/locking.txt b/Documentation/pcmcia/locking.txt new file mode 100644 index 00000000000..68f622bc406 --- /dev/null +++ b/Documentation/pcmcia/locking.txt @@ -0,0 +1,118 @@ +This file explains the locking and exclusion scheme used in the PCCARD +and PCMCIA subsystems. + + +A) Overview, Locking Hierarchy: +=============================== + +pcmcia_socket_list_rwsem - protects only the list of sockets +- skt_mutex - serializes card insert / ejection + - ops_mutex - serializes socket operation + + +B) Exclusion +============ + +The following functions and callbacks to struct pcmcia_socket must +be called with "skt_mutex" held: + + socket_detect_change() + send_event() + socket_reset() + socket_shutdown() + socket_setup() + socket_remove() + socket_insert() + socket_early_resume() + socket_late_resume() + socket_resume() + socket_suspend() + + struct pcmcia_callback *callback + +The following functions and callbacks to struct pcmcia_socket must +be called with "ops_mutex" held: + + socket_reset() + socket_setup() + + struct pccard_operations *ops + struct pccard_resource_ops *resource_ops; + +Note that send_event() and struct pcmcia_callback *callback must not be +called with "ops_mutex" held. + + +C) Protection +============= + +1. Global Data: +--------------- +struct list_head pcmcia_socket_list; + +protected by pcmcia_socket_list_rwsem; + + +2. Per-Socket Data: +------------------- +The resource_ops and their data are protected by ops_mutex. + +The "main" struct pcmcia_socket is protected as follows (read-only fields +or single-use fields not mentioned): + +- by pcmcia_socket_list_rwsem: + struct list_head socket_list; + +- by thread_lock: + unsigned int thread_events; + +- by skt_mutex: + u_int suspended_state; + void (*tune_bridge); + struct pcmcia_callback *callback; + int resume_status; + +- by ops_mutex: + socket_state_t socket; + u_int state; + u_short lock_count; + pccard_mem_map cis_mem; + void __iomem *cis_virt; + struct { } irq; + io_window_t io[]; + pccard_mem_map win[]; + struct list_head cis_cache; + size_t fake_cis_len; + u8 *fake_cis; + u_int irq_mask; + void (*zoom_video); + int (*power_hook); + u8 resource...; + struct list_head devices_list; + u8 device_count; + struct pcmcia_state; + + +3. Per PCMCIA-device Data: +-------------------------- + +The "main" struct pcmcia_devie is protected as follows (read-only fields +or single-use fields not mentioned): + + +- by pcmcia_socket->ops_mutex: + struct list_head socket_device_list; + struct config_t *function_config; + u16 _irq:1; + u16 _io:1; + u16 _win:4; + u16 _locked:1; + u16 allow_func_id_match:1; + u16 suspended:1; + u16 _removed:1; + +- by the PCMCIA driver: + io_req_t io; + irq_req_t irq; + config_req_t conf; + window_handle_t win; diff --git a/Documentation/power/runtime_pm.txt b/Documentation/power/runtime_pm.txt index 4a3109b2884..356fd86f4ea 100644 --- a/Documentation/power/runtime_pm.txt +++ b/Documentation/power/runtime_pm.txt @@ -42,80 +42,81 @@ struct dev_pm_ops { ... }; -The ->runtime_suspend() callback is executed by the PM core for the bus type of -the device being suspended. The bus type's callback is then _entirely_ -_responsible_ for handling the device as appropriate, which may, but need not -include executing the device driver's own ->runtime_suspend() callback (from the +The ->runtime_suspend(), ->runtime_resume() and ->runtime_idle() callbacks are +executed by the PM core for either the bus type, or device type (if the bus +type's callback is not defined), or device class (if the bus type's and device +type's callbacks are not defined) of given device. The bus type, device type +and device class callbacks are referred to as subsystem-level callbacks in what +follows. + +The subsystem-level suspend callback is _entirely_ _responsible_ for handling +the suspend of the device as appropriate, which may, but need not include +executing the device driver's own ->runtime_suspend() callback (from the PM core's point of view it is not necessary to implement a ->runtime_suspend() -callback in a device driver as long as the bus type's ->runtime_suspend() knows -what to do to handle the device). +callback in a device driver as long as the subsystem-level suspend callback +knows what to do to handle the device). - * Once the bus type's ->runtime_suspend() callback has completed successfully + * Once the subsystem-level suspend callback has completed successfully for given device, the PM core regards the device as suspended, which need not mean that the device has been put into a low power state. It is supposed to mean, however, that the device will not process data and will - not communicate with the CPU(s) and RAM until its bus type's - ->runtime_resume() callback is executed for it. The run-time PM status of - a device after successful execution of its bus type's ->runtime_suspend() - callback is 'suspended'. - - * If the bus type's ->runtime_suspend() callback returns -EBUSY or -EAGAIN, - the device's run-time PM status is supposed to be 'active', which means that - the device _must_ be fully operational afterwards. - - * If the bus type's ->runtime_suspend() callback returns an error code - different from -EBUSY or -EAGAIN, the PM core regards this as a fatal - error and will refuse to run the helper functions described in Section 4 - for the device, until the status of it is directly set either to 'active' - or to 'suspended' (the PM core provides special helper functions for this - purpose). - -In particular, if the driver requires remote wakeup capability for proper -functioning and device_run_wake() returns 'false' for the device, then -->runtime_suspend() should return -EBUSY. On the other hand, if -device_run_wake() returns 'true' for the device and the device is put -into a low power state during the execution of its bus type's -->runtime_suspend(), it is expected that remote wake-up (i.e. hardware mechanism -allowing the device to request a change of its power state, such as PCI PME) -will be enabled for the device. Generally, remote wake-up should be enabled -for all input devices put into a low power state at run time. - -The ->runtime_resume() callback is executed by the PM core for the bus type of -the device being woken up. The bus type's callback is then _entirely_ -_responsible_ for handling the device as appropriate, which may, but need not -include executing the device driver's own ->runtime_resume() callback (from the -PM core's point of view it is not necessary to implement a ->runtime_resume() -callback in a device driver as long as the bus type's ->runtime_resume() knows -what to do to handle the device). - - * Once the bus type's ->runtime_resume() callback has completed successfully, - the PM core regards the device as fully operational, which means that the - device _must_ be able to complete I/O operations as needed. The run-time - PM status of the device is then 'active'. - - * If the bus type's ->runtime_resume() callback returns an error code, the PM - core regards this as a fatal error and will refuse to run the helper - functions described in Section 4 for the device, until its status is - directly set either to 'active' or to 'suspended' (the PM core provides - special helper functions for this purpose). - -The ->runtime_idle() callback is executed by the PM core for the bus type of -given device whenever the device appears to be idle, which is indicated to the -PM core by two counters, the device's usage counter and the counter of 'active' -children of the device. + not communicate with the CPU(s) and RAM until the subsystem-level resume + callback is executed for it. The run-time PM status of a device after + successful execution of the subsystem-level suspend callback is 'suspended'. + + * If the subsystem-level suspend callback returns -EBUSY or -EAGAIN, + the device's run-time PM status is 'active', which means that the device + _must_ be fully operational afterwards. + + * If the subsystem-level suspend callback returns an error code different + from -EBUSY or -EAGAIN, the PM core regards this as a fatal error and will + refuse to run the helper functions described in Section 4 for the device, + until the status of it is directly set either to 'active', or to 'suspended' + (the PM core provides special helper functions for this purpose). + +In particular, if the driver requires remote wake-up capability (i.e. hardware +mechanism allowing the device to request a change of its power state, such as +PCI PME) for proper functioning and device_run_wake() returns 'false' for the +device, then ->runtime_suspend() should return -EBUSY. On the other hand, if +device_run_wake() returns 'true' for the device and the device is put into a low +power state during the execution of the subsystem-level suspend callback, it is +expected that remote wake-up will be enabled for the device. Generally, remote +wake-up should be enabled for all input devices put into a low power state at +run time. + +The subsystem-level resume callback is _entirely_ _responsible_ for handling the +resume of the device as appropriate, which may, but need not include executing +the device driver's own ->runtime_resume() callback (from the PM core's point of +view it is not necessary to implement a ->runtime_resume() callback in a device +driver as long as the subsystem-level resume callback knows what to do to handle +the device). + + * Once the subsystem-level resume callback has completed successfully, the PM + core regards the device as fully operational, which means that the device + _must_ be able to complete I/O operations as needed. The run-time PM status + of the device is then 'active'. + + * If the subsystem-level resume callback returns an error code, the PM core + regards this as a fatal error and will refuse to run the helper functions + described in Section 4 for the device, until its status is directly set + either to 'active' or to 'suspended' (the PM core provides special helper + functions for this purpose). + +The subsystem-level idle callback is executed by the PM core whenever the device +appears to be idle, which is indicated to the PM core by two counters, the +device's usage counter and the counter of 'active' children of the device. * If any of these counters is decreased using a helper function provided by the PM core and it turns out to be equal to zero, the other counter is checked. If that counter also is equal to zero, the PM core executes the - device bus type's ->runtime_idle() callback (with the device as an - argument). + subsystem-level idle callback with the device as an argument. -The action performed by a bus type's ->runtime_idle() callback is totally -dependent on the bus type in question, but the expected and recommended action -is to check if the device can be suspended (i.e. if all of the conditions -necessary for suspending the device are satisfied) and to queue up a suspend -request for the device in that case. The value returned by this callback is -ignored by the PM core. +The action performed by a subsystem-level idle callback is totally dependent on +the subsystem in question, but the expected and recommended action is to check +if the device can be suspended (i.e. if all of the conditions necessary for +suspending the device are satisfied) and to queue up a suspend request for the +device in that case. The value returned by this callback is ignored by the PM +core. The helper functions provided by the PM core, described in Section 4, guarantee that the following constraints are met with respect to the bus type's run-time @@ -238,41 +239,41 @@ drivers/base/power/runtime.c and include/linux/pm_runtime.h: removing the device from device hierarchy int pm_runtime_idle(struct device *dev); - - execute ->runtime_idle() for the device's bus type; returns 0 on success - or error code on failure, where -EINPROGRESS means that ->runtime_idle() - is already being executed + - execute the subsystem-level idle callback for the device; returns 0 on + success or error code on failure, where -EINPROGRESS means that + ->runtime_idle() is already being executed int pm_runtime_suspend(struct device *dev); - - execute ->runtime_suspend() for the device's bus type; returns 0 on + - execute the subsystem-level suspend callback for the device; returns 0 on success, 1 if the device's run-time PM status was already 'suspended', or error code on failure, where -EAGAIN or -EBUSY means it is safe to attempt to suspend the device again in future int pm_runtime_resume(struct device *dev); - - execute ->runtime_resume() for the device's bus type; returns 0 on + - execute the subsystem-leve resume callback for the device; returns 0 on success, 1 if the device's run-time PM status was already 'active' or error code on failure, where -EAGAIN means it may be safe to attempt to resume the device again in future, but 'power.runtime_error' should be checked additionally int pm_request_idle(struct device *dev); - - submit a request to execute ->runtime_idle() for the device's bus type - (the request is represented by a work item in pm_wq); returns 0 on success - or error code if the request has not been queued up + - submit a request to execute the subsystem-level idle callback for the + device (the request is represented by a work item in pm_wq); returns 0 on + success or error code if the request has not been queued up int pm_schedule_suspend(struct device *dev, unsigned int delay); - - schedule the execution of ->runtime_suspend() for the device's bus type - in future, where 'delay' is the time to wait before queuing up a suspend - work item in pm_wq, in milliseconds (if 'delay' is zero, the work item is - queued up immediately); returns 0 on success, 1 if the device's PM + - schedule the execution of the subsystem-level suspend callback for the + device in future, where 'delay' is the time to wait before queuing up a + suspend work item in pm_wq, in milliseconds (if 'delay' is zero, the work + item is queued up immediately); returns 0 on success, 1 if the device's PM run-time status was already 'suspended', or error code if the request hasn't been scheduled (or queued up if 'delay' is 0); if the execution of ->runtime_suspend() is already scheduled and not yet expired, the new value of 'delay' will be used as the time to wait int pm_request_resume(struct device *dev); - - submit a request to execute ->runtime_resume() for the device's bus type - (the request is represented by a work item in pm_wq); returns 0 on + - submit a request to execute the subsystem-level resume callback for the + device (the request is represented by a work item in pm_wq); returns 0 on success, 1 if the device's run-time PM status was already 'active', or error code if the request hasn't been queued up @@ -303,12 +304,12 @@ drivers/base/power/runtime.c and include/linux/pm_runtime.h: run-time PM callbacks described in Section 2 int pm_runtime_disable(struct device *dev); - - prevent the run-time PM helper functions from running the device bus - type's run-time PM callbacks, make sure that all of the pending run-time - PM operations on the device are either completed or canceled; returns - 1 if there was a resume request pending and it was necessary to execute - ->runtime_resume() for the device's bus type to satisfy that request, - otherwise 0 is returned + - prevent the run-time PM helper functions from running subsystem-level + run-time PM callbacks for the device, make sure that all of the pending + run-time PM operations on the device are either completed or canceled; + returns 1 if there was a resume request pending and it was necessary to + execute the subsystem-level resume callback for the device to satisfy that + request, otherwise 0 is returned void pm_suspend_ignore_children(struct device *dev, bool enable); - set/unset the power.ignore_children flag of the device @@ -378,5 +379,55 @@ pm_runtime_suspend() or pm_runtime_idle() or their asynchronous counterparts, they will fail returning -EAGAIN, because the device's usage counter is incremented by the core before executing ->probe() and ->remove(). Still, it may be desirable to suspend the device as soon as ->probe() or ->remove() has -finished, so the PM core uses pm_runtime_idle_sync() to invoke the device bus -type's ->runtime_idle() callback at that time. +finished, so the PM core uses pm_runtime_idle_sync() to invoke the +subsystem-level idle callback for the device at that time. + +6. Run-time PM and System Sleep + +Run-time PM and system sleep (i.e., system suspend and hibernation, also known +as suspend-to-RAM and suspend-to-disk) interact with each other in a couple of +ways. If a device is active when a system sleep starts, everything is +straightforward. But what should happen if the device is already suspended? + +The device may have different wake-up settings for run-time PM and system sleep. +For example, remote wake-up may be enabled for run-time suspend but disallowed +for system sleep (device_may_wakeup(dev) returns 'false'). When this happens, +the subsystem-level system suspend callback is responsible for changing the +device's wake-up setting (it may leave that to the device driver's system +suspend routine). It may be necessary to resume the device and suspend it again +in order to do so. The same is true if the driver uses different power levels +or other settings for run-time suspend and system sleep. + +During system resume, devices generally should be brought back to full power, +even if they were suspended before the system sleep began. There are several +reasons for this, including: + + * The device might need to switch power levels, wake-up settings, etc. + + * Remote wake-up events might have been lost by the firmware. + + * The device's children may need the device to be at full power in order + to resume themselves. + + * The driver's idea of the device state may not agree with the device's + physical state. This can happen during resume from hibernation. + + * The device might need to be reset. + + * Even though the device was suspended, if its usage counter was > 0 then most + likely it would need a run-time resume in the near future anyway. + + * Always going back to full power is simplest. + +If the device was suspended before the sleep began, then its run-time PM status +will have to be updated to reflect the actual post-system sleep status. The way +to do this is: + + pm_runtime_disable(dev); + pm_runtime_set_active(dev); + pm_runtime_enable(dev); + +The PM core always increments the run-time usage counter before calling the +->prepare() callback and decrements it after calling the ->complete() callback. +Hence disabling run-time PM temporarily like this will not cause any run-time +suspend callbacks to be lost. diff --git a/Documentation/powerpc/dts-bindings/fsl/can.txt b/Documentation/powerpc/dts-bindings/fsl/can.txt new file mode 100644 index 00000000000..2fa4fcd38fd --- /dev/null +++ b/Documentation/powerpc/dts-bindings/fsl/can.txt @@ -0,0 +1,53 @@ +CAN Device Tree Bindings +------------------------ + +(c) 2006-2009 Secret Lab Technologies Ltd +Grant Likely <grant.likely@secretlab.ca> + +fsl,mpc5200-mscan nodes +----------------------- +In addition to the required compatible-, reg- and interrupt-properties, you can +also specify which clock source shall be used for the controller: + +- fsl,mscan-clock-source : a string describing the clock source. Valid values + are: "ip" for ip bus clock + "ref" for reference clock (XTAL) + "ref" is default in case this property is not + present. + +fsl,mpc5121-mscan nodes +----------------------- +In addition to the required compatible-, reg- and interrupt-properties, you can +also specify which clock source and divider shall be used for the controller: + +- fsl,mscan-clock-source : a string describing the clock source. Valid values + are: "ip" for ip bus clock + "ref" for reference clock + "sys" for system clock + If this property is not present, an optimal CAN + clock source and frequency based on the system + clock will be selected. If this is not possible, + the reference clock will be used. + +- fsl,mscan-clock-divider: for the reference and system clock, an additional + clock divider can be specified. By default, a + value of 1 is used. + +Note that the MPC5121 Rev. 1 processor is not supported. + +Examples: + can@1300 { + compatible = "fsl,mpc5121-mscan"; + interrupts = <12 0x8>; + interrupt-parent = <&ipic>; + reg = <0x1300 0x80>; + }; + + can@1380 { + compatible = "fsl,mpc5121-mscan"; + interrupts = <13 0x8>; + interrupt-parent = <&ipic>; + reg = <0x1380 0x80>; + fsl,mscan-clock-source = "ref"; + fsl,mscan-clock-divider = <3>; + }; diff --git a/Documentation/powerpc/dts-bindings/fsl/mpc5121-psc.txt b/Documentation/powerpc/dts-bindings/fsl/mpc5121-psc.txt new file mode 100644 index 00000000000..8832e879891 --- /dev/null +++ b/Documentation/powerpc/dts-bindings/fsl/mpc5121-psc.txt @@ -0,0 +1,70 @@ +MPC5121 PSC Device Tree Bindings + +PSC in UART mode +---------------- + +For PSC in UART mode the needed PSC serial devices +are specified by fsl,mpc5121-psc-uart nodes in the +fsl,mpc5121-immr SoC node. Additionally the PSC FIFO +Controller node fsl,mpc5121-psc-fifo is requered there: + +fsl,mpc5121-psc-uart nodes +-------------------------- + +Required properties : + - compatible : Should contain "fsl,mpc5121-psc-uart" and "fsl,mpc5121-psc" + - cell-index : Index of the PSC in hardware + - reg : Offset and length of the register set for the PSC device + - interrupts : <a b> where a is the interrupt number of the + PSC FIFO Controller and b is a field that represents an + encoding of the sense and level information for the interrupt. + - interrupt-parent : the phandle for the interrupt controller that + services interrupts for this device. + +Recommended properties : + - fsl,rx-fifo-size : the size of the RX fifo slice (a multiple of 4) + - fsl,tx-fifo-size : the size of the TX fifo slice (a multiple of 4) + + +fsl,mpc5121-psc-fifo node +------------------------- + +Required properties : + - compatible : Should be "fsl,mpc5121-psc-fifo" + - reg : Offset and length of the register set for the PSC + FIFO Controller + - interrupts : <a b> where a is the interrupt number of the + PSC FIFO Controller and b is a field that represents an + encoding of the sense and level information for the interrupt. + - interrupt-parent : the phandle for the interrupt controller that + services interrupts for this device. + + +Example for a board using PSC0 and PSC1 devices in serial mode: + +serial@11000 { + compatible = "fsl,mpc5121-psc-uart", "fsl,mpc5121-psc"; + cell-index = <0>; + reg = <0x11000 0x100>; + interrupts = <40 0x8>; + interrupt-parent = < &ipic >; + fsl,rx-fifo-size = <16>; + fsl,tx-fifo-size = <16>; +}; + +serial@11100 { + compatible = "fsl,mpc5121-psc-uart", "fsl,mpc5121-psc"; + cell-index = <1>; + reg = <0x11100 0x100>; + interrupts = <40 0x8>; + interrupt-parent = < &ipic >; + fsl,rx-fifo-size = <16>; + fsl,tx-fifo-size = <16>; +}; + +pscfifo@11f00 { + compatible = "fsl,mpc5121-psc-fifo"; + reg = <0x11f00 0x100>; + interrupts = <40 0x8>; + interrupt-parent = < &ipic >; +}; diff --git a/Documentation/powerpc/dts-bindings/fsl/mpc5200.txt b/Documentation/powerpc/dts-bindings/fsl/mpc5200.txt index 5c6602dbfdc..4ccb2cd5df9 100644 --- a/Documentation/powerpc/dts-bindings/fsl/mpc5200.txt +++ b/Documentation/powerpc/dts-bindings/fsl/mpc5200.txt @@ -195,11 +195,4 @@ External interrupts: fsl,mpc5200-mscan nodes ----------------------- -In addition to the required compatible-, reg- and interrupt-properites, you can -also specify which clock source shall be used for the controller: - -- fsl,mscan-clock-source- a string describing the clock source. Valid values - are: "ip" for ip bus clock - "ref" for reference clock (XTAL) - "ref" is default in case this property is not - present. +See file can.txt in this directory. diff --git a/Documentation/powerpc/dts-bindings/fsl/mpic.txt b/Documentation/powerpc/dts-bindings/fsl/mpic.txt new file mode 100644 index 00000000000..71e39cf3215 --- /dev/null +++ b/Documentation/powerpc/dts-bindings/fsl/mpic.txt @@ -0,0 +1,42 @@ +* OpenPIC and its interrupt numbers on Freescale's e500/e600 cores + +The OpenPIC specification does not specify which interrupt source has to +become which interrupt number. This is up to the software implementation +of the interrupt controller. The only requirement is that every +interrupt source has to have an unique interrupt number / vector number. +To accomplish this the current implementation assigns the number zero to +the first source, the number one to the second source and so on until +all interrupt sources have their unique number. +Usually the assigned vector number equals the interrupt number mentioned +in the documentation for a given core / CPU. This is however not true +for the e500 cores (MPC85XX CPUs) where the documentation distinguishes +between internal and external interrupt sources and starts counting at +zero for both of them. + +So what to write for external interrupt source X or internal interrupt +source Y into the device tree? Here is an example: + +The memory map for the interrupt controller in the MPC8544[0] shows, +that the first interrupt source starts at 0x5_0000 (PIC Register Address +Map-Interrupt Source Configuration Registers). This source becomes the +number zero therefore: + External interrupt 0 = interrupt number 0 + External interrupt 1 = interrupt number 1 + External interrupt 2 = interrupt number 2 + ... +Every interrupt number allocates 0x20 bytes register space. So to get +its number it is sufficient to shift the lower 16bits to right by five. +So for the external interrupt 10 we have: + 0x0140 >> 5 = 10 + +After the external sources, the internal sources follow. The in core I2C +controller on the MPC8544 for instance has the internal source number +27. Oo obtain its interrupt number we take the lower 16bits of its memory +address (0x5_0560) and shift it right: + 0x0560 >> 5 = 43 + +Therefore the I2C device node for the MPC8544 CPU has to have the +interrupt number 43 specified in the device tree. + +[0] MPC8544E PowerQUICCTM III, Integrated Host Processor Family Reference Manual + MPC8544ERM Rev. 1 10/2007 diff --git a/Documentation/powerpc/dts-bindings/fsl/spi.txt b/Documentation/powerpc/dts-bindings/fsl/spi.txt index e7d9a344c4f..80510c018ee 100644 --- a/Documentation/powerpc/dts-bindings/fsl/spi.txt +++ b/Documentation/powerpc/dts-bindings/fsl/spi.txt @@ -13,6 +13,11 @@ Required properties: - interrupt-parent : the phandle for the interrupt controller that services interrupts for this device. +Optional properties: +- gpios : specifies the gpio pins to be used for chipselects. + The gpios will be referred to as reg = <index> in the SPI child nodes. + If unspecified, a single SPI device without a chip select can be used. + Example: spi@4c0 { cell-index = <0>; @@ -21,4 +26,6 @@ Example: interrupts = <82 0>; interrupt-parent = <700>; mode = "cpu"; + gpios = <&gpio 18 1 // device reg=<0> + &gpio 19 1>; // device reg=<1> }; diff --git a/Documentation/powerpc/ptrace.txt b/Documentation/powerpc/ptrace.txt new file mode 100644 index 00000000000..f4a5499b7bc --- /dev/null +++ b/Documentation/powerpc/ptrace.txt @@ -0,0 +1,134 @@ +GDB intends to support the following hardware debug features of BookE +processors: + +4 hardware breakpoints (IAC) +2 hardware watchpoints (read, write and read-write) (DAC) +2 value conditions for the hardware watchpoints (DVC) + +For that, we need to extend ptrace so that GDB can query and set these +resources. Since we're extending, we're trying to create an interface +that's extendable and that covers both BookE and server processors, so +that GDB doesn't need to special-case each of them. We added the +following 3 new ptrace requests. + +1. PTRACE_PPC_GETHWDEBUGINFO + +Query for GDB to discover the hardware debug features. The main info to +be returned here is the minimum alignment for the hardware watchpoints. +BookE processors don't have restrictions here, but server processors have +an 8-byte alignment restriction for hardware watchpoints. We'd like to avoid +adding special cases to GDB based on what it sees in AUXV. + +Since we're at it, we added other useful info that the kernel can return to +GDB: this query will return the number of hardware breakpoints, hardware +watchpoints and whether it supports a range of addresses and a condition. +The query will fill the following structure provided by the requesting process: + +struct ppc_debug_info { + unit32_t version; + unit32_t num_instruction_bps; + unit32_t num_data_bps; + unit32_t num_condition_regs; + unit32_t data_bp_alignment; + unit32_t sizeof_condition; /* size of the DVC register */ + uint64_t features; /* bitmask of the individual flags */ +}; + +features will have bits indicating whether there is support for: + +#define PPC_DEBUG_FEATURE_INSN_BP_RANGE 0x1 +#define PPC_DEBUG_FEATURE_INSN_BP_MASK 0x2 +#define PPC_DEBUG_FEATURE_DATA_BP_RANGE 0x4 +#define PPC_DEBUG_FEATURE_DATA_BP_MASK 0x8 + +2. PTRACE_SETHWDEBUG + +Sets a hardware breakpoint or watchpoint, according to the provided structure: + +struct ppc_hw_breakpoint { + uint32_t version; +#define PPC_BREAKPOINT_TRIGGER_EXECUTE 0x1 +#define PPC_BREAKPOINT_TRIGGER_READ 0x2 +#define PPC_BREAKPOINT_TRIGGER_WRITE 0x4 + uint32_t trigger_type; /* only some combinations allowed */ +#define PPC_BREAKPOINT_MODE_EXACT 0x0 +#define PPC_BREAKPOINT_MODE_RANGE_INCLUSIVE 0x1 +#define PPC_BREAKPOINT_MODE_RANGE_EXCLUSIVE 0x2 +#define PPC_BREAKPOINT_MODE_MASK 0x3 + uint32_t addr_mode; /* address match mode */ + +#define PPC_BREAKPOINT_CONDITION_MODE 0x3 +#define PPC_BREAKPOINT_CONDITION_NONE 0x0 +#define PPC_BREAKPOINT_CONDITION_AND 0x1 +#define PPC_BREAKPOINT_CONDITION_EXACT 0x1 /* different name for the same thing as above */ +#define PPC_BREAKPOINT_CONDITION_OR 0x2 +#define PPC_BREAKPOINT_CONDITION_AND_OR 0x3 +#define PPC_BREAKPOINT_CONDITION_BE_ALL 0x00ff0000 /* byte enable bits */ +#define PPC_BREAKPOINT_CONDITION_BE(n) (1<<((n)+16)) + uint32_t condition_mode; /* break/watchpoint condition flags */ + + uint64_t addr; + uint64_t addr2; + uint64_t condition_value; +}; + +A request specifies one event, not necessarily just one register to be set. +For instance, if the request is for a watchpoint with a condition, both the +DAC and DVC registers will be set in the same request. + +With this GDB can ask for all kinds of hardware breakpoints and watchpoints +that the BookE supports. COMEFROM breakpoints available in server processors +are not contemplated, but that is out of the scope of this work. + +ptrace will return an integer (handle) uniquely identifying the breakpoint or +watchpoint just created. This integer will be used in the PTRACE_DELHWDEBUG +request to ask for its removal. Return -ENOSPC if the requested breakpoint +can't be allocated on the registers. + +Some examples of using the structure to: + +- set a breakpoint in the first breakpoint register + + p.version = PPC_DEBUG_CURRENT_VERSION; + p.trigger_type = PPC_BREAKPOINT_TRIGGER_EXECUTE; + p.addr_mode = PPC_BREAKPOINT_MODE_EXACT; + p.condition_mode = PPC_BREAKPOINT_CONDITION_NONE; + p.addr = (uint64_t) address; + p.addr2 = 0; + p.condition_value = 0; + +- set a watchpoint which triggers on reads in the second watchpoint register + + p.version = PPC_DEBUG_CURRENT_VERSION; + p.trigger_type = PPC_BREAKPOINT_TRIGGER_READ; + p.addr_mode = PPC_BREAKPOINT_MODE_EXACT; + p.condition_mode = PPC_BREAKPOINT_CONDITION_NONE; + p.addr = (uint64_t) address; + p.addr2 = 0; + p.condition_value = 0; + +- set a watchpoint which triggers only with a specific value + + p.version = PPC_DEBUG_CURRENT_VERSION; + p.trigger_type = PPC_BREAKPOINT_TRIGGER_READ; + p.addr_mode = PPC_BREAKPOINT_MODE_EXACT; + p.condition_mode = PPC_BREAKPOINT_CONDITION_AND | PPC_BREAKPOINT_CONDITION_BE_ALL; + p.addr = (uint64_t) address; + p.addr2 = 0; + p.condition_value = (uint64_t) condition; + +- set a ranged hardware breakpoint + + p.version = PPC_DEBUG_CURRENT_VERSION; + p.trigger_type = PPC_BREAKPOINT_TRIGGER_EXECUTE; + p.addr_mode = PPC_BREAKPOINT_MODE_RANGE_INCLUSIVE; + p.condition_mode = PPC_BREAKPOINT_CONDITION_NONE; + p.addr = (uint64_t) begin_range; + p.addr2 = (uint64_t) end_range; + p.condition_value = 0; + +3. PTRACE_DELHWDEBUG + +Takes an integer which identifies an existing breakpoint or watchpoint +(i.e., the value returned from PTRACE_SETHWDEBUG), and deletes the +corresponding breakpoint or watchpoint.. diff --git a/Documentation/s390/CommonIO b/Documentation/s390/CommonIO index 339207d11d9..d378cba6645 100644 --- a/Documentation/s390/CommonIO +++ b/Documentation/s390/CommonIO @@ -87,6 +87,12 @@ Command line parameters compatibility, by the device number in hexadecimal (0xabcd or abcd). Device numbers given as 0xabcd will be interpreted as 0.0.abcd. +* /proc/cio_settle + + A write request to this file is blocked until all queued cio actions are + handled. This will allow userspace to wait for pending work affecting + device availability after changing cio_ignore or the hardware configuration. + * For some of the information present in the /proc filesystem in 2.4 (namely, /proc/subchannels and /proc/chpids), see driver-model.txt. Information formerly in /proc/irq_count is now in /proc/interrupts. diff --git a/Documentation/s390/driver-model.txt b/Documentation/s390/driver-model.txt index bde473df748..ed265cf54cd 100644 --- a/Documentation/s390/driver-model.txt +++ b/Documentation/s390/driver-model.txt @@ -223,8 +223,8 @@ touched by the driver - it should use the ccwgroup device's driver_data for its private data. To implement a ccwgroup driver, please refer to include/asm/ccwgroup.h. Keep in -mind that most drivers will need to implement both a ccwgroup and a ccw driver -(unless you have a meta ccw driver, like cu3088 for lcs and ctc). +mind that most drivers will need to implement both a ccwgroup and a ccw +driver. 2. Channel paths diff --git a/Documentation/scsi/ChangeLog.megaraid_sas b/Documentation/scsi/ChangeLog.megaraid_sas index 17ffa060771..30023568805 100644 --- a/Documentation/scsi/ChangeLog.megaraid_sas +++ b/Documentation/scsi/ChangeLog.megaraid_sas @@ -1,3 +1,19 @@ +1 Release Date : Thur. Oct 29, 2009 09:12:45 PST 2009 - + (emaild-id:megaraidlinux@lsi.com) + Bo Yang + +2 Current Version : 00.00.04.17.1-rc1 +3 Older Version : 00.00.04.12 + +1. Add the pad_0 in mfi frame structure to 0 to fix the + context value larger than 32bit value issue. + +2. Add the logic drive list to the driver. Driver will + keep the logic drive list internal after driver load. + +3. driver fixed the device update issue after get the AEN + PD delete/ADD, LD add/delete from FW. + 1 Release Date : Tues. July 28, 2009 10:12:45 PST 2009 - (emaild-id:megaraidlinux@lsi.com) Bo Yang diff --git a/Documentation/sound/alsa/ALSA-Configuration.txt b/Documentation/sound/alsa/ALSA-Configuration.txt index 8923597bd2b..33df82e3a39 100644 --- a/Documentation/sound/alsa/ALSA-Configuration.txt +++ b/Documentation/sound/alsa/ALSA-Configuration.txt @@ -482,6 +482,9 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. reference_rate - reference sample rate, 44100 or 48000 (default) multiple - multiple to ref. sample rate, 1 or 2 (default) + subsystem - override the PCI SSID for probing; the value + consists of SSVID << 16 | SSDID. The default is + zero, which means no override. This module supports multiple cards. @@ -1123,6 +1126,21 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. This module supports multiple cards, autoprobe and ISA PnP. + Module snd-jazz16 + ------------------- + + Module for Media Vision Jazz16 chipset. The chipset consists of 3 chips: + MVD1216 + MVA416 + MVA514. + + port - port # for SB DSP chip (0x210,0x220,0x230,0x240,0x250,0x260) + irq - IRQ # for SB DSP chip (3,5,7,9,10,15) + dma8 - DMA # for SB DSP chip (1,3) + dma16 - DMA # for SB DSP chip (5,7) + mpu_port - MPU-401 port # (0x300,0x310,0x320,0x330) + mpu_irq - MPU-401 irq # (2,3,5,7) + + This module supports multiple cards. + Module snd-korg1212 ------------------- @@ -1791,6 +1809,13 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. The power-management is supported. + Module snd-ua101 + ---------------- + + Module for the Edirol UA-101 audio/MIDI interface. + + This module supports multiple devices, autoprobe and hotplugging. + Module snd-usb-audio -------------------- @@ -1923,7 +1948,7 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. ------------------- Module for sound cards based on the Asus AV100/AV200 chips, - i.e., Xonar D1, DX, D2, D2X, HDAV1.3 (Deluxe), Essence ST + i.e., Xonar D1, DX, D2, D2X, DS, HDAV1.3 (Deluxe), Essence ST (Deluxe) and Essence STX. This module supports autoprobe and multiple cards. diff --git a/Documentation/sound/alsa/HD-Audio-Models.txt b/Documentation/sound/alsa/HD-Audio-Models.txt index e93affff3af..1d38b0dfba9 100644 --- a/Documentation/sound/alsa/HD-Audio-Models.txt +++ b/Documentation/sound/alsa/HD-Audio-Models.txt @@ -124,6 +124,8 @@ ALC882/883/885/888/889 asus-a7m ASUS A7M macpro MacPro support mb5 Macbook 5,1 + macmini3 Macmini 3,1 + mba21 Macbook Air 2,1 mbp3 Macbook Pro rev3 imac24 iMac 24'' with jack detection imac91 iMac 9,1 @@ -279,13 +281,16 @@ Conexant 5051 laptop Basic Laptop config (default) hp HP Spartan laptop hp-dv6736 HP dv6736 + hp-f700 HP Compaq Presario F700 lenovo-x200 Lenovo X200 laptop + toshiba Toshiba Satellite M300 Conexant 5066 ============= laptop Basic Laptop config (default) dell-laptop Dell laptops olpc-xo-1_5 OLPC XO 1.5 + ideapad Lenovo IdeaPad U150 STAC9200 ======== @@ -403,4 +408,5 @@ STAC9872 Cirrus Logic CS4206/4207 ======================== mbp55 MacBook Pro 5,5 + imac27 IMac 27 Inch auto BIOS setup (default) diff --git a/Documentation/sound/alsa/HD-Audio.txt b/Documentation/sound/alsa/HD-Audio.txt index 6325bec06a7..f4dd3bf99d1 100644 --- a/Documentation/sound/alsa/HD-Audio.txt +++ b/Documentation/sound/alsa/HD-Audio.txt @@ -452,6 +452,33 @@ Similarly, the lines after `[verb]` are parsed as `init_verbs` sysfs entries, and the lines after `[hint]` are parsed as `hints` sysfs entries, respectively. +Another example to override the codec vendor id from 0x12345678 to +0xdeadbeef is like below: +------------------------------------------------------------------------ + [codec] + 0x12345678 0xabcd1234 2 + + [vendor_id] + 0xdeadbeef +------------------------------------------------------------------------ + +In the similar way, you can override the codec subsystem_id via +`[subsystem_id]`, the revision id via `[revision_id]` line. +Also, the codec chip name can be rewritten via `[chip_name]` line. +------------------------------------------------------------------------ + [codec] + 0x12345678 0xabcd1234 2 + + [subsystem_id] + 0xffff1111 + + [revision_id] + 0x10 + + [chip_name] + My-own NEWS-0002 +------------------------------------------------------------------------ + The hd-audio driver reads the file via request_firmware(). Thus, a patch file has to be located on the appropriate firmware path, typically, /lib/firmware. For example, when you pass the option diff --git a/Documentation/sound/alsa/Procfile.txt b/Documentation/sound/alsa/Procfile.txt index 719a819f8cc..07301de12cc 100644 --- a/Documentation/sound/alsa/Procfile.txt +++ b/Documentation/sound/alsa/Procfile.txt @@ -95,7 +95,7 @@ card*/pcm*/xrun_debug It takes an integer value, can be changed by writing to this file, such as - # cat 5 > /proc/asound/card0/pcm0p/xrun_debug + # echo 5 > /proc/asound/card0/pcm0p/xrun_debug The value consists of the following bit flags: bit 0 = Enable XRUN/jiffies debug messages diff --git a/Documentation/stable_kernel_rules.txt b/Documentation/stable_kernel_rules.txt index a452227361b..5effa5bd993 100644 --- a/Documentation/stable_kernel_rules.txt +++ b/Documentation/stable_kernel_rules.txt @@ -26,13 +26,33 @@ Procedure for submitting patches to the -stable tree: - Send the patch, after verifying that it follows the above rules, to stable@kernel.org. + - To have the patch automatically included in the stable tree, add the + the tag + Cc: stable@kernel.org + in the sign-off area. Once the patch is merged it will be applied to + the stable tree without anything else needing to be done by the author + or subsystem maintainer. + - If the patch requires other patches as prerequisites which can be + cherry-picked than this can be specified in the following format in + the sign-off area: + + Cc: <stable@kernel.org> # .32.x: a1f84a3: sched: Check for idle + Cc: <stable@kernel.org> # .32.x: 1b9508f: sched: Rate-limit newidle + Cc: <stable@kernel.org> # .32.x: fd21073: sched: Fix affinity logic + Cc: <stable@kernel.org> # .32.x + Signed-off-by: Ingo Molnar <mingo@elte.hu> + + The tag sequence has the meaning of: + git cherry-pick a1f84a3 + git cherry-pick 1b9508f + git cherry-pick fd21073 + git cherry-pick <this commit> + - The sender will receive an ACK when the patch has been accepted into the queue, or a NAK if the patch is rejected. This response might take a few days, according to the developer's schedules. - If accepted, the patch will be added to the -stable queue, for review by other developers and by the relevant subsystem maintainer. - - If the stable@kernel.org address is added to a patch, when it goes into - Linus's tree it will automatically be emailed to the stable team. - Security patches should not be sent to this alias, but instead to the documented security@kernel.org address. diff --git a/Documentation/trace/events-kmem.txt b/Documentation/trace/events-kmem.txt index 6ef2a8652e1..aa82ee4a5a8 100644 --- a/Documentation/trace/events-kmem.txt +++ b/Documentation/trace/events-kmem.txt @@ -1,7 +1,7 @@ Subsystem Trace Points: kmem -The tracing system kmem captures events related to object and page allocation -within the kernel. Broadly speaking there are four major subheadings. +The kmem tracing system captures events related to object and page allocation +within the kernel. Broadly speaking there are five major subheadings. o Slab allocation of small objects of unknown type (kmalloc) o Slab allocation of small objects of known type @@ -9,7 +9,7 @@ within the kernel. Broadly speaking there are four major subheadings. o Per-CPU Allocator Activity o External Fragmentation -This document will describe what each of the tracepoints are and why they +This document describes what each of the tracepoints is and why they might be useful. 1. Slab allocation of small objects of unknown type @@ -34,7 +34,7 @@ kmem_cache_free call_site=%lx ptr=%p These events are similar in usage to the kmalloc-related events except that it is likely easier to pin the event down to a specific cache. At the time of writing, no information is available on what slab is being allocated from, -but the call_site can usually be used to extrapolate that information +but the call_site can usually be used to extrapolate that information. 3. Page allocation ================== @@ -80,9 +80,9 @@ event indicating whether it is for a percpu_refill or not. When the per-CPU list is too full, a number of pages are freed, each one which triggers a mm_page_pcpu_drain event. -The individual nature of the events are so that pages can be tracked +The individual nature of the events is so that pages can be tracked between allocation and freeing. A number of drain or refill pages that occur -consecutively imply the zone->lock being taken once. Large amounts of PCP +consecutively imply the zone->lock being taken once. Large amounts of per-CPU refills and drains could imply an imbalance between CPUs where too much work is being concentrated in one place. It could also indicate that the per-CPU lists should be a larger size. Finally, large amounts of refills on one CPU @@ -102,6 +102,6 @@ is important. Large numbers of this event implies that memory is fragmenting and high-order allocations will start failing at some time in the future. One -means of reducing the occurange of this event is to increase the size of +means of reducing the occurrence of this event is to increase the size of min_free_kbytes in increments of 3*pageblock_size*nr_online_nodes where pageblock_size is usually the size of the default hugepage size. diff --git a/Documentation/trace/ftrace-design.txt b/Documentation/trace/ftrace-design.txt index 641a1ef2a7f..f1f81afee8a 100644 --- a/Documentation/trace/ftrace-design.txt +++ b/Documentation/trace/ftrace-design.txt @@ -1,5 +1,6 @@ function tracer guts ==================== + By Mike Frysinger Introduction ------------ @@ -53,14 +54,14 @@ size of the mcount call that is embedded in the function). For example, if the function foo() calls bar(), when the bar() function calls mcount(), the arguments mcount() will pass to the tracer are: "frompc" - the address bar() will use to return to foo() - "selfpc" - the address bar() (with _mcount() size adjustment) + "selfpc" - the address bar() (with mcount() size adjustment) Also keep in mind that this mcount function will be called *a lot*, so optimizing for the default case of no tracer will help the smooth running of your system when tracing is disabled. So the start of the mcount function is -typically the bare min with checking things before returning. That also means -the code flow should usually kept linear (i.e. no branching in the nop case). -This is of course an optimization and not a hard requirement. +typically the bare minimum with checking things before returning. That also +means the code flow should usually be kept linear (i.e. no branching in the nop +case). This is of course an optimization and not a hard requirement. Here is some pseudo code that should help (these functions should actually be implemented in assembly): @@ -131,10 +132,10 @@ some functions to save (hijack) and restore the return address. The mcount function should check the function pointers ftrace_graph_return (compare to ftrace_stub) and ftrace_graph_entry (compare to -ftrace_graph_entry_stub). If either of those are not set to the relevant stub +ftrace_graph_entry_stub). If either of those is not set to the relevant stub function, call the arch-specific function ftrace_graph_caller which in turn calls the arch-specific function prepare_ftrace_return. Neither of these -function names are strictly required, but you should use them anyways to stay +function names is strictly required, but you should use them anyway to stay consistent across the architecture ports -- easier to compare & contrast things. @@ -144,7 +145,7 @@ but the first argument should be a pointer to the "frompc". Typically this is located on the stack. This allows the function to hijack the return address temporarily to have it point to the arch-specific function return_to_handler. That function will simply call the common ftrace_return_to_handler function and -that will return the original return address with which, you can return to the +that will return the original return address with which you can return to the original call site. Here is the updated mcount pseudo code: @@ -173,14 +174,16 @@ void ftrace_graph_caller(void) unsigned long *frompc = &...; unsigned long selfpc = <return address> - MCOUNT_INSN_SIZE; - prepare_ftrace_return(frompc, selfpc); + /* passing frame pointer up is optional -- see below */ + prepare_ftrace_return(frompc, selfpc, frame_pointer); /* restore all state needed by the ABI */ } #endif -For information on how to implement prepare_ftrace_return(), simply look at -the x86 version. The only architecture-specific piece in it is the setup of +For information on how to implement prepare_ftrace_return(), simply look at the +x86 version (the frame pointer passing is optional; see the next section for +more information). The only architecture-specific piece in it is the setup of the fault recovery table (the asm(...) code). The rest should be the same across architectures. @@ -205,6 +208,23 @@ void return_to_handler(void) #endif +HAVE_FUNCTION_GRAPH_FP_TEST +--------------------------- + +An arch may pass in a unique value (frame pointer) to both the entering and +exiting of a function. On exit, the value is compared and if it does not +match, then it will panic the kernel. This is largely a sanity check for bad +code generation with gcc. If gcc for your port sanely updates the frame +pointer under different opitmization levels, then ignore this option. + +However, adding support for it isn't terribly difficult. In your assembly code +that calls prepare_ftrace_return(), pass the frame pointer as the 3rd argument. +Then in the C version of that function, do what the x86 port does and pass it +along to ftrace_push_return_trace() instead of a stub value of 0. + +Similarly, when you call ftrace_return_to_handler(), pass it the frame pointer. + + HAVE_FTRACE_NMI_ENTER --------------------- @@ -218,11 +238,10 @@ HAVE_SYSCALL_TRACEPOINTS You need very few things to get the syscalls tracing in an arch. +- Support HAVE_ARCH_TRACEHOOK (see arch/Kconfig). - Have a NR_syscalls variable in <asm/unistd.h> that provides the number of syscalls supported by the arch. -- Implement arch_syscall_addr() that resolves a syscall address from a - syscall number. -- Support the TIF_SYSCALL_TRACEPOINT thread flags +- Support the TIF_SYSCALL_TRACEPOINT thread flags. - Put the trace_sys_enter() and trace_sys_exit() tracepoints calls from ptrace in the ptrace syscalls tracing path. - Tag this arch as HAVE_SYSCALL_TRACEPOINTS. diff --git a/Documentation/trace/ftrace.txt b/Documentation/trace/ftrace.txt index 8179692fbb9..bab3040da54 100644 --- a/Documentation/trace/ftrace.txt +++ b/Documentation/trace/ftrace.txt @@ -1625,7 +1625,7 @@ If I am only interested in sys_nanosleep and hrtimer_interrupt: # echo sys_nanosleep hrtimer_interrupt \ > set_ftrace_filter - # echo ftrace > current_tracer + # echo function > current_tracer # echo 1 > tracing_enabled # usleep 1 # echo 0 > tracing_enabled diff --git a/Documentation/trace/kprobetrace.txt b/Documentation/trace/kprobetrace.txt index 47aabeebbdf..a9100b28eb8 100644 --- a/Documentation/trace/kprobetrace.txt +++ b/Documentation/trace/kprobetrace.txt @@ -24,6 +24,7 @@ Synopsis of kprobe_events ------------------------- p[:[GRP/]EVENT] SYMBOL[+offs]|MEMADDR [FETCHARGS] : Set a probe r[:[GRP/]EVENT] SYMBOL[+0] [FETCHARGS] : Set a return probe + -:[GRP/]EVENT : Clear a probe GRP : Group name. If omitted, use "kprobes" for it. EVENT : Event name. If omitted, the event name is generated @@ -37,15 +38,12 @@ Synopsis of kprobe_events @SYM[+|-offs] : Fetch memory at SYM +|- offs (SYM should be a data symbol) $stackN : Fetch Nth entry of stack (N >= 0) $stack : Fetch stack address. - $argN : Fetch function argument. (N >= 0)(*) - $retval : Fetch return value.(**) - +|-offs(FETCHARG) : Fetch memory at FETCHARG +|- offs address.(***) + $retval : Fetch return value.(*) + +|-offs(FETCHARG) : Fetch memory at FETCHARG +|- offs address.(**) NAME=FETCHARG: Set NAME as the argument name of FETCHARG. - (*) aN may not correct on asmlinkaged functions and at the middle of - function body. - (**) only for return probe. - (***) this is useful for fetching a field of data structures. + (*) only for return probe. + (**) this is useful for fetching a field of data structures. Per-Probe Event Filtering @@ -82,13 +80,16 @@ Usage examples To add a probe as a new event, write a new definition to kprobe_events as below. - echo p:myprobe do_sys_open dfd=$arg0 filename=$arg1 flags=$arg2 mode=$arg3 > /sys/kernel/debug/tracing/kprobe_events + echo 'p:myprobe do_sys_open dfd=%ax filename=%dx flags=%cx mode=+4($stack)' > /sys/kernel/debug/tracing/kprobe_events This sets a kprobe on the top of do_sys_open() function with recording -1st to 4th arguments as "myprobe" event. As this example shows, users can -choose more familiar names for each arguments. +1st to 4th arguments as "myprobe" event. Note, which register/stack entry is +assigned to each function argument depends on arch-specific ABI. If you unsure +the ABI, please try to use probe subcommand of perf-tools (you can find it +under tools/perf/). +As this example shows, users can choose more familiar names for each arguments. - echo r:myretprobe do_sys_open $retval >> /sys/kernel/debug/tracing/kprobe_events + echo 'r:myretprobe do_sys_open $retval' >> /sys/kernel/debug/tracing/kprobe_events This sets a kretprobe on the return point of do_sys_open() function with recording return value as "myretprobe" event. @@ -97,23 +98,24 @@ recording return value as "myretprobe" event. cat /sys/kernel/debug/tracing/events/kprobes/myprobe/format name: myprobe -ID: 75 +ID: 780 format: - field:unsigned short common_type; offset:0; size:2; - field:unsigned char common_flags; offset:2; size:1; - field:unsigned char common_preempt_count; offset:3; size:1; - field:int common_pid; offset:4; size:4; - field:int common_tgid; offset:8; size:4; + field:unsigned short common_type; offset:0; size:2; signed:0; + field:unsigned char common_flags; offset:2; size:1; signed:0; + field:unsigned char common_preempt_count; offset:3; size:1;signed:0; + field:int common_pid; offset:4; size:4; signed:1; + field:int common_lock_depth; offset:8; size:4; signed:1; - field: unsigned long ip; offset:16;tsize:8; - field: int nargs; offset:24;tsize:4; - field: unsigned long dfd; offset:32;tsize:8; - field: unsigned long filename; offset:40;tsize:8; - field: unsigned long flags; offset:48;tsize:8; - field: unsigned long mode; offset:56;tsize:8; + field:unsigned long __probe_ip; offset:12; size:4; signed:0; + field:int __probe_nargs; offset:16; size:4; signed:1; + field:unsigned long dfd; offset:20; size:4; signed:0; + field:unsigned long filename; offset:24; size:4; signed:0; + field:unsigned long flags; offset:28; size:4; signed:0; + field:unsigned long mode; offset:32; size:4; signed:0; -print fmt: "(%lx) dfd=%lx filename=%lx flags=%lx mode=%lx", REC->ip, REC->dfd, REC->filename, REC->flags, REC->mode +print fmt: "(%lx) dfd=%lx filename=%lx flags=%lx mode=%lx", REC->__probe_ip, +REC->dfd, REC->filename, REC->flags, REC->mode You can see that the event has 4 arguments as in the expressions you specified. @@ -121,6 +123,12 @@ print fmt: "(%lx) dfd=%lx filename=%lx flags=%lx mode=%lx", REC->ip, REC->dfd, R This clears all probe points. + Or, + + echo -:myprobe >> kprobe_events + + This clears probe points selectively. + Right after definition, each event is disabled by default. For tracing these events, you need to enable it. @@ -146,4 +154,3 @@ events, you need to enable it. returns from SYMBOL(e.g. "sys_open+0x1b/0x1d <- do_sys_open" means kernel returns from do_sys_open to sys_open+0x1b). - diff --git a/Documentation/trace/mmiotrace.txt b/Documentation/trace/mmiotrace.txt index 162effbfbde..664e7386d89 100644 --- a/Documentation/trace/mmiotrace.txt +++ b/Documentation/trace/mmiotrace.txt @@ -44,7 +44,8 @@ Check for lost events. Usage ----- -Make sure debugfs is mounted to /sys/kernel/debug. If not, (requires root privileges) +Make sure debugfs is mounted to /sys/kernel/debug. +If not (requires root privileges): $ mount -t debugfs debugfs /sys/kernel/debug Check that the driver you are about to trace is not loaded. @@ -91,7 +92,7 @@ $ dmesg > dmesg.txt $ tar zcf pciid-nick-mmiotrace.tar.gz mydump.txt lspci.txt dmesg.txt and then send the .tar.gz file. The trace compresses considerably. Replace "pciid" and "nick" with the PCI ID or model name of your piece of hardware -under investigation and your nick name. +under investigation and your nickname. How Mmiotrace Works @@ -100,7 +101,7 @@ How Mmiotrace Works Access to hardware IO-memory is gained by mapping addresses from PCI bus by calling one of the ioremap_*() functions. Mmiotrace is hooked into the __ioremap() function and gets called whenever a mapping is created. Mapping is -an event that is recorded into the trace log. Note, that ISA range mappings +an event that is recorded into the trace log. Note that ISA range mappings are not caught, since the mapping always exists and is returned directly. MMIO accesses are recorded via page faults. Just before __ioremap() returns, @@ -122,11 +123,11 @@ Trace Log Format ---------------- The raw log is text and easily filtered with e.g. grep and awk. One record is -one line in the log. A record starts with a keyword, followed by keyword -dependant arguments. Arguments are separated by a space, or continue until the +one line in the log. A record starts with a keyword, followed by keyword- +dependent arguments. Arguments are separated by a space, or continue until the end of line. The format for version 20070824 is as follows: -Explanation Keyword Space separated arguments +Explanation Keyword Space-separated arguments --------------------------------------------------------------------------- read event R width, timestamp, map id, physical, value, PC, PID @@ -136,7 +137,7 @@ iounmap event UNMAP timestamp, map id, PC, PID marker MARK timestamp, text version VERSION the string "20070824" info for reader LSPCI one line from lspci -v -PCI address map PCIDEV space separated /proc/bus/pci/devices data +PCI address map PCIDEV space-separated /proc/bus/pci/devices data unk. opcode UNKNOWN timestamp, map id, physical, data, PC, PID Timestamp is in seconds with decimals. Physical is a PCI bus address, virtual diff --git a/Documentation/trace/ring-buffer-design.txt b/Documentation/trace/ring-buffer-design.txt index 5b1d23d604c..d299ff31df5 100644 --- a/Documentation/trace/ring-buffer-design.txt +++ b/Documentation/trace/ring-buffer-design.txt @@ -33,9 +33,9 @@ head_page - a pointer to the page that the reader will use next tail_page - a pointer to the page that will be written to next -commit_page - a pointer to the page with the last finished non nested write. +commit_page - a pointer to the page with the last finished non-nested write. -cmpxchg - hardware assisted atomic transaction that performs the following: +cmpxchg - hardware-assisted atomic transaction that performs the following: A = B iff previous A == C @@ -52,15 +52,15 @@ The Generic Ring Buffer The ring buffer can be used in either an overwrite mode or in producer/consumer mode. -Producer/consumer mode is where the producer were to fill up the +Producer/consumer mode is where if the producer were to fill up the buffer before the consumer could free up anything, the producer will stop writing to the buffer. This will lose most recent events. -Overwrite mode is where the produce were to fill up the buffer +Overwrite mode is where if the producer were to fill up the buffer before the consumer could free up anything, the producer will overwrite the older data. This will lose the oldest events. -No two writers can write at the same time (on the same per cpu buffer), +No two writers can write at the same time (on the same per-cpu buffer), but a writer may interrupt another writer, but it must finish writing before the previous writer may continue. This is very important to the algorithm. The writers act like a "stack". The way interrupts works @@ -79,16 +79,16 @@ the interrupt doing a write as well. Readers can happen at any time. But no two readers may run at the same time, nor can a reader preempt/interrupt another reader. A reader -can not preempt/interrupt a writer, but it may read/consume from the +cannot preempt/interrupt a writer, but it may read/consume from the buffer at the same time as a writer is writing, but the reader must be on another processor to do so. A reader may read on its own processor and can be preempted by a writer. -A writer can preempt a reader, but a reader can not preempt a writer. +A writer can preempt a reader, but a reader cannot preempt a writer. But a reader can read the buffer at the same time (on another processor) as a writer. -The ring buffer is made up of a list of pages held together by a link list. +The ring buffer is made up of a list of pages held together by a linked list. At initialization a reader page is allocated for the reader that is not part of the ring buffer. @@ -102,7 +102,7 @@ the head page. The reader has its own page to use. At start up time, this page is allocated but is not attached to the list. When the reader wants -to read from the buffer, if its page is empty (like it is on start up) +to read from the buffer, if its page is empty (like it is on start-up), it will swap its page with the head_page. The old reader page will become part of the ring buffer and the head_page will be removed. The page after the inserted page (old reader_page) will become the @@ -206,7 +206,7 @@ The main pointers: commit page - the page that last finished a write. -The commit page only is updated by the outer most writer in the +The commit page only is updated by the outermost writer in the writer stack. A writer that preempts another writer will not move the commit page. @@ -281,7 +281,7 @@ with the previous write. The commit pointer points to the last write location that was committed without preempting another write. When a write that preempted another write is committed, it only becomes a pending commit -and will not be a full commit till all writes have been committed. +and will not be a full commit until all writes have been committed. The commit page points to the page that has the last full commit. The tail page points to the page with the last write (before @@ -292,7 +292,7 @@ be several pages ahead. If the tail page catches up to the commit page then no more writes may take place (regardless of the mode of the ring buffer: overwrite and produce/consumer). -The order of pages are: +The order of pages is: head page commit page @@ -311,7 +311,7 @@ Possible scenario: There is a special case that the head page is after either the commit page and possibly the tail page. That is when the commit (and tail) page has been swapped with the reader page. This is because the head page is always -part of the ring buffer, but the reader page is not. When ever there +part of the ring buffer, but the reader page is not. Whenever there has been less than a full page that has been committed inside the ring buffer, and a reader swaps out a page, it will be swapping out the commit page. @@ -338,7 +338,7 @@ and a reader swaps out a page, it will be swapping out the commit page. In this case, the head page will not move when the tail and commit move back into the ring buffer. -The reader can not swap a page into the ring buffer if the commit page +The reader cannot swap a page into the ring buffer if the commit page is still on that page. If the read meets the last commit (real commit not pending or reserved), then there is nothing more to read. The buffer is considered empty until another full commit finishes. @@ -395,7 +395,7 @@ The main idea behind the lockless algorithm is to combine the moving of the head_page pointer with the swapping of pages with the reader. State flags are placed inside the pointer to the page. To do this, each page must be aligned in memory by 4 bytes. This will allow the 2 -least significant bits of the address to be used as flags. Since +least significant bits of the address to be used as flags, since they will always be zero for the address. To get the address, simply mask out the flags. @@ -460,7 +460,7 @@ When the reader tries to swap the page with the ring buffer, it will also use cmpxchg. If the flag bit in the pointer to the head page does not have the HEADER flag set, the compare will fail and the reader will need to look for the new head page and try again. -Note, the flag UPDATE and HEADER are never set at the same time. +Note, the flags UPDATE and HEADER are never set at the same time. The reader swaps the reader page as follows: @@ -539,7 +539,7 @@ updated to the reader page. | +-----------------------------+ | +------------------------------------+ -Another important point. The page that the reader page points back to +Another important point: The page that the reader page points back to by its previous pointer (the one that now points to the new head page) never points back to the reader page. That is because the reader page is not part of the ring buffer. Traversing the ring buffer via the next pointers @@ -572,7 +572,7 @@ not be able to swap the head page from the buffer, nor will it be able to move the head page, until the writer is finished with the move. This eliminates any races that the reader can have on the writer. The reader -must spin, and this is why the reader can not preempt the writer. +must spin, and this is why the reader cannot preempt the writer. tail page | @@ -659,9 +659,9 @@ before pushing the head page. If it is, then it can be assumed that the tail page wrapped the buffer, and we must drop new writes. This is not a race condition, because the commit page can only be moved -by the outter most writer (the writer that was preempted). +by the outermost writer (the writer that was preempted). This means that the commit will not move while a writer is moving the -tail page. The reader can not swap the reader page if it is also being +tail page. The reader cannot swap the reader page if it is also being used as the commit page. The reader can simply check that the commit is off the reader page. Once the commit page leaves the reader page it will never go back on it unless a reader does another swap with the @@ -733,7 +733,7 @@ The write converts the head page pointer to UPDATE. --->| |<---| |<---| |<---| |<--- +---+ +---+ +---+ +---+ -But if a nested writer preempts here. It will see that the next +But if a nested writer preempts here, it will see that the next page is a head page, but it is also nested. It will detect that it is nested and will save that information. The detection is the fact that it sees the UPDATE flag instead of a HEADER or NORMAL @@ -761,7 +761,7 @@ to NORMAL. --->| |<---| |<---| |<---| |<--- +---+ +---+ +---+ +---+ -After the nested writer finishes, the outer most writer will convert +After the nested writer finishes, the outermost writer will convert the UPDATE pointer to NORMAL. @@ -812,7 +812,7 @@ head page. +---+ +---+ +---+ +---+ The nested writer moves the tail page forward. But does not set the old -update page to NORMAL because it is not the outer most writer. +update page to NORMAL because it is not the outermost writer. tail page | @@ -892,7 +892,7 @@ It will return to the first writer. --->| |<---| |<---| |<---| |<--- +---+ +---+ +---+ +---+ -The first writer can not know atomically test if the tail page moved +The first writer cannot know atomically if the tail page moved while it updates the HEAD page. It will then update the head page to what it thinks is the new head page. @@ -923,9 +923,9 @@ if the tail page is either where it use to be or on the next page: --->| |<---| |<---| |<---| |<--- +---+ +---+ +---+ +---+ -If tail page != A and tail page does not equal B, then it must reset the -pointer back to NORMAL. The fact that it only needs to worry about -nested writers, it only needs to check this after setting the HEAD page. +If tail page != A and tail page != B, then it must reset the pointer +back to NORMAL. The fact that it only needs to worry about nested +writers means that it only needs to check this after setting the HEAD page. (first writer) @@ -939,7 +939,7 @@ nested writers, it only needs to check this after setting the HEAD page. +---+ +---+ +---+ +---+ Now the writer can update the head page. This is also why the head page must -remain in UPDATE and only reset by the outer most writer. This prevents +remain in UPDATE and only reset by the outermost writer. This prevents the reader from seeing the incorrect head page. diff --git a/Documentation/trace/tracepoint-analysis.txt b/Documentation/trace/tracepoint-analysis.txt index 5eb4e487e66..87bee3c129b 100644 --- a/Documentation/trace/tracepoint-analysis.txt +++ b/Documentation/trace/tracepoint-analysis.txt @@ -10,8 +10,8 @@ Tracepoints (see Documentation/trace/tracepoints.txt) can be used without creating custom kernel modules to register probe functions using the event tracing infrastructure. -Simplistically, tracepoints will represent an important event that when can -be taken in conjunction with other tracepoints to build a "Big Picture" of +Simplistically, tracepoints represent important events that can be +taken in conjunction with other tracepoints to build a "Big Picture" of what is going on within the system. There are a large number of methods for gathering and interpreting these events. Lacking any current Best Practises, this document describes some of the methods that can be used. @@ -33,12 +33,12 @@ calling will give a fair indication of the number of events available. -2.2 PCL +2.2 PCL (Performance Counters for Linux) ------- -Discovery and enumeration of all counters and events, including tracepoints +Discovery and enumeration of all counters and events, including tracepoints, are available with the perf tool. Getting a list of available events is a -simple case of +simple case of: $ perf list 2>&1 | grep Tracepoint ext4:ext4_free_inode [Tracepoint event] @@ -49,19 +49,19 @@ simple case of [ .... remaining output snipped .... ] -2. Enabling Events +3. Enabling Events ================== -2.1 System-Wide Event Enabling +3.1 System-Wide Event Enabling ------------------------------ See Documentation/trace/events.txt for a proper description on how events can be enabled system-wide. A short example of enabling all events related -to page allocation would look something like +to page allocation would look something like: $ for i in `find /sys/kernel/debug/tracing/events -name "enable" | grep mm_`; do echo 1 > $i; done -2.2 System-Wide Event Enabling with SystemTap +3.2 System-Wide Event Enabling with SystemTap --------------------------------------------- In SystemTap, tracepoints are accessible using the kernel.trace() function @@ -86,7 +86,7 @@ were allocating the pages. print_count() } -2.3 System-Wide Event Enabling with PCL +3.3 System-Wide Event Enabling with PCL --------------------------------------- By specifying the -a switch and analysing sleep, the system-wide events @@ -107,16 +107,16 @@ for a duration of time can be examined. Similarly, one could execute a shell and exit it as desired to get a report at that point. -2.4 Local Event Enabling +3.4 Local Event Enabling ------------------------ Documentation/trace/ftrace.txt describes how to enable events on a per-thread basis using set_ftrace_pid. -2.5 Local Event Enablement with PCL +3.5 Local Event Enablement with PCL ----------------------------------- -Events can be activate and tracked for the duration of a process on a local +Events can be activated and tracked for the duration of a process on a local basis using PCL such as follows. $ perf stat -e kmem:mm_page_alloc -e kmem:mm_page_free_direct \ @@ -131,18 +131,18 @@ basis using PCL such as follows. 0.973913387 seconds time elapsed -3. Event Filtering +4. Event Filtering ================== Documentation/trace/ftrace.txt covers in-depth how to filter events in ftrace. Obviously using grep and awk of trace_pipe is an option as well as any script reading trace_pipe. -4. Analysing Event Variances with PCL +5. Analysing Event Variances with PCL ===================================== Any workload can exhibit variances between runs and it can be important -to know what the standard deviation in. By and large, this is left to the +to know what the standard deviation is. By and large, this is left to the performance analyst to do it by hand. In the event that the discrete event occurrences are useful to the performance analyst, then perf can be used. @@ -166,7 +166,7 @@ In the event that some higher-level event is required that depends on some aggregation of discrete events, then a script would need to be developed. Using --repeat, it is also possible to view how events are fluctuating over -time on a system wide basis using -a and sleep. +time on a system-wide basis using -a and sleep. $ perf stat -e kmem:mm_page_alloc -e kmem:mm_page_free_direct \ -e kmem:mm_pagevec_free \ @@ -180,7 +180,7 @@ time on a system wide basis using -a and sleep. 1.002251757 seconds time elapsed ( +- 0.005% ) -5. Higher-Level Analysis with Helper Scripts +6. Higher-Level Analysis with Helper Scripts ============================================ When events are enabled the events that are triggering can be read from @@ -190,11 +190,11 @@ be gathered on-line as appropriate. Examples of post-processing might include o Reading information from /proc for the PID that triggered the event o Deriving a higher-level event from a series of lower-level events. - o Calculate latencies between two events + o Calculating latencies between two events Documentation/trace/postprocess/trace-pagealloc-postprocess.pl is an example script that can read trace_pipe from STDIN or a copy of a trace. When used -on-line, it can be interrupted once to generate a report without existing +on-line, it can be interrupted once to generate a report without exiting and twice to exit. Simplistically, the script just reads STDIN and counts up events but it @@ -212,12 +212,12 @@ also can do more such as processes, the parent process responsible for creating all the helpers can be identified -6. Lower-Level Analysis with PCL +7. Lower-Level Analysis with PCL ================================ -There may also be a requirement to identify what functions with a program +There may also be a requirement to identify what functions within a program were generating events within the kernel. To begin this sort of analysis, the -data must be recorded. At the time of writing, this required root +data must be recorded. At the time of writing, this required root: $ perf record -c 1 \ -e kmem:mm_page_alloc -e kmem:mm_page_free_direct \ @@ -253,11 +253,11 @@ perf report. # (For more details, try: perf report --sort comm,dso,symbol) # -According to this, the vast majority of events occured triggered on events -within the VDSO. With simple binaries, this will often be the case so lets +According to this, the vast majority of events triggered on events +within the VDSO. With simple binaries, this will often be the case so let's take a slightly different example. In the course of writing this, it was -noticed that X was generating an insane amount of page allocations so lets look -at it +noticed that X was generating an insane amount of page allocations so let's look +at it: $ perf record -c 1 -f \ -e kmem:mm_page_alloc -e kmem:mm_page_free_direct \ @@ -280,8 +280,8 @@ This was interrupted after a few seconds and # (For more details, try: perf report --sort comm,dso,symbol) # -So, almost half of the events are occuring in a library. To get an idea which -symbol. +So, almost half of the events are occurring in a library. To get an idea which +symbol: $ perf report --sort comm,dso,symbol # Samples: 27666 @@ -297,7 +297,7 @@ symbol. 0.01% Xorg /opt/gfx-test/lib/libpixman-1.so.0.13.1 [.] get_fast_path 0.00% Xorg [kernel] [k] ftrace_trace_userstack -To see where within the function pixmanFillsse2 things are going wrong +To see where within the function pixmanFillsse2 things are going wrong: $ perf annotate pixmanFillsse2 [ ... ] diff --git a/Documentation/usb/error-codes.txt b/Documentation/usb/error-codes.txt index 9cf83e8c27b..d83703ea74b 100644 --- a/Documentation/usb/error-codes.txt +++ b/Documentation/usb/error-codes.txt @@ -41,8 +41,8 @@ USB-specific: -EFBIG Host controller driver can't schedule that many ISO frames. --EPIPE Specified endpoint is stalled. For non-control endpoints, - reset this status with usb_clear_halt(). +-EPIPE The pipe type specified in the URB doesn't match the + endpoint's actual type. -EMSGSIZE (a) endpoint maxpacket size is zero; it is not usable in the current interface altsetting. @@ -60,6 +60,8 @@ USB-specific: -EHOSTUNREACH URB was rejected because the device is suspended. +-ENOEXEC A control URB doesn't contain a Setup packet. + ************************************************************************** * Error codes returned by in urb->status * diff --git a/Documentation/usb/power-management.txt b/Documentation/usb/power-management.txt index c7c1dc2f801..2790ad48cfc 100644 --- a/Documentation/usb/power-management.txt +++ b/Documentation/usb/power-management.txt @@ -2,7 +2,7 @@ Alan Stern <stern@rowland.harvard.edu> - November 10, 2009 + December 11, 2009 @@ -29,9 +29,9 @@ covered to some extent (see Documentation/power/*.txt for more information about system PM). Note: Dynamic PM support for USB is present only if the kernel was -built with CONFIG_USB_SUSPEND enabled. System PM support is present -only if the kernel was built with CONFIG_SUSPEND or CONFIG_HIBERNATION -enabled. +built with CONFIG_USB_SUSPEND enabled (which depends on +CONFIG_PM_RUNTIME). System PM support is present only if the kernel +was built with CONFIG_SUSPEND or CONFIG_HIBERNATION enabled. What is Remote Wakeup? @@ -71,12 +71,10 @@ being accessed through sysfs, then it definitely is idle. Forms of dynamic PM ------------------- -Dynamic suspends can occur in two ways: manual and automatic. -"Manual" means that the user has told the kernel to suspend a device, -whereas "automatic" means that the kernel has decided all by itself to -suspend a device. Automatic suspend is called "autosuspend" for -short. In general, a device won't be autosuspended unless it has been -idle for some minimum period of time, the so-called idle-delay time. +Dynamic suspends occur when the kernel decides to suspend an idle +device. This is called "autosuspend" for short. In general, a device +won't be autosuspended unless it has been idle for some minimum period +of time, the so-called idle-delay time. Of course, nothing the kernel does on its own initiative should prevent the computer or its devices from working properly. If a @@ -96,10 +94,11 @@ idle. We can categorize power management events in two broad classes: external and internal. External events are those triggered by some agent outside the USB stack: system suspend/resume (triggered by -userspace), manual dynamic suspend/resume (also triggered by -userspace), and remote wakeup (triggered by the device). Internal -events are those triggered within the USB stack: autosuspend and -autoresume. +userspace), manual dynamic resume (also triggered by userspace), and +remote wakeup (triggered by the device). Internal events are those +triggered within the USB stack: autosuspend and autoresume. Note that +all dynamic suspend events are internal; external agents are not +allowed to issue dynamic suspends. The user interface for dynamic PM @@ -145,9 +144,9 @@ relevant attribute files are: wakeup, level, and autosuspend. number of seconds the device should remain idle before the kernel will autosuspend it (the idle-delay time). The default is 2. 0 means to autosuspend as soon as - the device becomes idle, and -1 means never to - autosuspend. You can write a number to the file to - change the autosuspend idle-delay time. + the device becomes idle, and negative values mean + never to autosuspend. You can write a number to the + file to change the autosuspend idle-delay time. Writing "-1" to power/autosuspend and writing "on" to power/level do essentially the same thing -- they both prevent the device from being @@ -230,6 +229,11 @@ necessary operations by hand or add them to a udev script. You can also change the idle-delay time; 2 seconds is not the best choice for every device. +If a driver knows that its device has proper suspend/resume support, +it can enable autosuspend all by itself. For example, the video +driver for a laptop's webcam might do this, since these devices are +rarely used and so should normally be autosuspended. + Sometimes it turns out that even when a device does work okay with autosuspend there are still problems. For example, there are experimental patches adding autosuspend support to the usbhid driver, @@ -322,69 +326,81 @@ driver does so by calling these six functions: void usb_autopm_get_interface_no_resume(struct usb_interface *intf); void usb_autopm_put_interface_no_suspend(struct usb_interface *intf); -The functions work by maintaining a counter in the usb_interface -structure. When intf->pm_usage_count is > 0 then the interface is -deemed to be busy, and the kernel will not autosuspend the interface's -device. When intf->pm_usage_count is <= 0 then the interface is -considered to be idle, and the kernel may autosuspend the device. +The functions work by maintaining a usage counter in the +usb_interface's embedded device structure. When the counter is > 0 +then the interface is deemed to be busy, and the kernel will not +autosuspend the interface's device. When the usage counter is = 0 +then the interface is considered to be idle, and the kernel may +autosuspend the device. -(There is a similar pm_usage_count field in struct usb_device, +(There is a similar usage counter field in struct usb_device, associated with the device itself rather than any of its interfaces. -This field is used only by the USB core.) - -Drivers must not modify intf->pm_usage_count directly; its value -should be changed only be using the functions listed above. Drivers -are responsible for insuring that the overall change to pm_usage_count -during their lifetime balances out to 0 (it may be necessary for the -disconnect method to call usb_autopm_put_interface() one or more times -to fulfill this requirement). The first two routines use the PM mutex -in struct usb_device for mutual exclusion; drivers using the async -routines are responsible for their own synchronization and mutual -exclusion. - - usb_autopm_get_interface() increments pm_usage_count and - attempts an autoresume if the new value is > 0 and the - device is suspended. - - usb_autopm_put_interface() decrements pm_usage_count and - attempts an autosuspend if the new value is <= 0 and the - device isn't suspended. +This counter is used only by the USB core.) + +Drivers need not be concerned about balancing changes to the usage +counter; the USB core will undo any remaining "get"s when a driver +is unbound from its interface. As a corollary, drivers must not call +any of the usb_autopm_* functions after their diconnect() routine has +returned. + +Drivers using the async routines are responsible for their own +synchronization and mutual exclusion. + + usb_autopm_get_interface() increments the usage counter and + does an autoresume if the device is suspended. If the + autoresume fails, the counter is decremented back. + + usb_autopm_put_interface() decrements the usage counter and + attempts an autosuspend if the new value is = 0. usb_autopm_get_interface_async() and usb_autopm_put_interface_async() do almost the same things as - their non-async counterparts. The differences are: they do - not acquire the PM mutex, and they use a workqueue to do their + their non-async counterparts. The big difference is that they + use a workqueue to do the resume or suspend part of their jobs. As a result they can be called in an atomic context, such as an URB's completion handler, but when they return the - device will not generally not yet be in the desired state. + device will generally not yet be in the desired state. usb_autopm_get_interface_no_resume() and usb_autopm_put_interface_no_suspend() merely increment or - decrement the pm_usage_count value; they do not attempt to - carry out an autoresume or an autosuspend. Hence they can be - called in an atomic context. + decrement the usage counter; they do not attempt to carry out + an autoresume or an autosuspend. Hence they can be called in + an atomic context. -The conventional usage pattern is that a driver calls +The simplest usage pattern is that a driver calls usb_autopm_get_interface() in its open routine and -usb_autopm_put_interface() in its close or release routine. But -other patterns are possible. +usb_autopm_put_interface() in its close or release routine. But other +patterns are possible. The autosuspend attempts mentioned above will often fail for one reason or another. For example, the power/level attribute might be set to "on", or another interface in the same device might not be idle. This is perfectly normal. If the reason for failure was that -the device hasn't been idle for long enough, a delayed workqueue -routine is automatically set up to carry out the operation when the -autosuspend idle-delay has expired. +the device hasn't been idle for long enough, a timer is scheduled to +carry out the operation automatically when the autosuspend idle-delay +has expired. -Autoresume attempts also can fail. This will happen if power/level is -set to "suspend" or if the device doesn't manage to resume properly. -Unlike autosuspend, there's no delay for an autoresume. +Autoresume attempts also can fail, although failure would mean that +the device is no longer present or operating properly. Unlike +autosuspend, there's no idle-delay for an autoresume. Other parts of the driver interface ----------------------------------- +Drivers can enable autosuspend for their devices by calling + + usb_enable_autosuspend(struct usb_device *udev); + +in their probe() routine, if they know that the device is capable of +suspending and resuming correctly. This is exactly equivalent to +writing "auto" to the device's power/level attribute. Likewise, +drivers can disable autosuspend by calling + + usb_disable_autosuspend(struct usb_device *udev); + +This is exactly the same as writing "on" to the power/level attribute. + Sometimes a driver needs to make sure that remote wakeup is enabled during autosuspend. For example, there's not much point autosuspending a keyboard if the user can't cause the keyboard to do a @@ -396,26 +412,27 @@ though, setting this flag won't cause the kernel to autoresume it. Normally a driver would set this flag in its probe method, at which time the device is guaranteed not to be autosuspended.) -The synchronous usb_autopm_* routines have to run in a sleepable -process context; they must not be called from an interrupt handler or -while holding a spinlock. In fact, the entire autosuspend mechanism -is not well geared toward interrupt-driven operation. However there -is one thing a driver can do in an interrupt handler: +If a driver does its I/O asynchronously in interrupt context, it +should call usb_autopm_get_interface_async() before starting output and +usb_autopm_put_interface_async() when the output queue drains. When +it receives an input event, it should call usb_mark_last_busy(struct usb_device *udev); -This sets udev->last_busy to the current time. udev->last_busy is the -field used for idle-delay calculations; updating it will cause any -pending autosuspend to be moved back. The usb_autopm_* routines will -also set the last_busy field to the current time. - -Calling urb_mark_last_busy() from within an URB completion handler is -subject to races: The kernel may have just finished deciding the -device has been idle for long enough but not yet gotten around to -calling the driver's suspend method. The driver would have to be -responsible for synchronizing its suspend method with its URB -completion handler and causing the autosuspend to fail with -EBUSY if -an URB had completed too recently. +in the event handler. This sets udev->last_busy to the current time. +udev->last_busy is the field used for idle-delay calculations; +updating it will cause any pending autosuspend to be moved back. Most +of the usb_autopm_* routines will also set the last_busy field to the +current time. + +Asynchronous operation is always subject to races. For example, a +driver may call one of the usb_autopm_*_interface_async() routines at +a time when the core has just finished deciding the device has been +idle for long enough but not yet gotten around to calling the driver's +suspend method. The suspend method must be responsible for +synchronizing with the output request routine and the URB completion +handler; it should cause autosuspends to fail with -EBUSY if the +driver needs to use the device. External suspend calls should never be allowed to fail in this way, only autosuspend calls. The driver can tell them apart by checking @@ -423,75 +440,23 @@ the PM_EVENT_AUTO bit in the message.event argument to the suspend method; this bit will be set for internal PM events (autosuspend) and clear for external PM events. -Many of the ingredients in the autosuspend framework are oriented -towards interfaces: The usb_interface structure contains the -pm_usage_cnt field, and the usb_autopm_* routines take an interface -pointer as their argument. But somewhat confusingly, a few of the -pieces (i.e., usb_mark_last_busy()) use the usb_device structure -instead. Drivers need to keep this straight; they can call -interface_to_usbdev() to find the device structure for a given -interface. - - Locking requirements - -------------------- + Mutual exclusion + ---------------- -All three suspend/resume methods are always called while holding the -usb_device's PM mutex. For external events -- but not necessarily for -autosuspend or autoresume -- the device semaphore (udev->dev.sem) will -also be held. This implies that external suspend/resume events are -mutually exclusive with calls to probe, disconnect, pre_reset, and -post_reset; the USB core guarantees that this is true of internal -suspend/resume events as well. +For external events -- but not necessarily for autosuspend or +autoresume -- the device semaphore (udev->dev.sem) will be held when a +suspend or resume method is called. This implies that external +suspend/resume events are mutually exclusive with calls to probe, +disconnect, pre_reset, and post_reset; the USB core guarantees that +this is true of autosuspend/autoresume events as well. If a driver wants to block all suspend/resume calls during some -critical section, it can simply acquire udev->pm_mutex. Note that -calls to resume may be triggered indirectly. Block IO due to memory -allocations can make the vm subsystem resume a device. Thus while -holding this lock you must not allocate memory with GFP_KERNEL or -GFP_NOFS. - -Alternatively, if the critical section might call some of the -usb_autopm_* routines, the driver can avoid deadlock by doing: - - down(&udev->dev.sem); - rc = usb_autopm_get_interface(intf); - -and at the end of the critical section: - - if (!rc) - usb_autopm_put_interface(intf); - up(&udev->dev.sem); - -Holding the device semaphore will block all external PM calls, and the -usb_autopm_get_interface() will prevent any internal PM calls, even if -it fails. (Exercise: Why?) - -The rules for locking order are: - - Never acquire any device semaphore while holding any PM mutex. - - Never acquire udev->pm_mutex while holding the PM mutex for - a device that isn't a descendant of udev. - -In other words, PM mutexes should only be acquired going up the device -tree, and they should be acquired only after locking all the device -semaphores you need to hold. These rules don't matter to drivers very -much; they usually affect just the USB core. - -Still, drivers do need to be careful. For example, many drivers use a -private mutex to synchronize their normal I/O activities with their -disconnect method. Now if the driver supports autosuspend then it -must call usb_autopm_put_interface() from somewhere -- maybe from its -close method. It should make the call while holding the private mutex, -since a driver shouldn't call any of the usb_autopm_* functions for an -interface from which it has been unbound. - -But the usb_autpm_* routines always acquire the device's PM mutex, and -consequently the locking order has to be: private mutex first, PM -mutex second. Since the suspend method is always called with the PM -mutex held, it mustn't try to acquire the private mutex. It has to -synchronize with the driver's I/O activities in some other way. +critical section, the best way is to lock the device and call +usb_autopm_get_interface() (and do the reverse at the end of the +critical section). Holding the device semaphore will block all +external PM calls, and the usb_autopm_get_interface() will prevent any +internal PM calls, even if it fails. (Exercise: Why?) Interaction between dynamic PM and system PM @@ -500,22 +465,11 @@ synchronize with the driver's I/O activities in some other way. Dynamic power management and system power management can interact in a couple of ways. -Firstly, a device may already be manually suspended or autosuspended -when a system suspend occurs. Since system suspends are supposed to -be as transparent as possible, the device should remain suspended -following the system resume. The 2.6.23 kernel obeys this principle -for manually suspended devices but not for autosuspended devices; they -do get resumed when the system wakes up. (Presumably they will be -autosuspended again after their idle-delay time expires.) In later -kernels this behavior will be fixed. - -(There is an exception. If a device would undergo a reset-resume -instead of a normal resume, and the device is enabled for remote -wakeup, then the reset-resume takes place even if the device was -already suspended when the system suspend began. The justification is -that a reset-resume is a kind of remote-wakeup event. Or to put it -another way, a device which needs a reset won't be able to generate -normal remote-wakeup signals, so it ought to be resumed immediately.) +Firstly, a device may already be autosuspended when a system suspend +occurs. Since system suspends are supposed to be as transparent as +possible, the device should remain suspended following the system +resume. But this theory may not work out well in practice; over time +the kernel's behavior in this regard has changed. Secondly, a dynamic power-management event may occur as a system suspend is underway. The window for this is short, since system @@ -527,13 +481,3 @@ succeed, it may still remain active and thus cause the system to resume as soon as the system suspend is complete. Or the remote wakeup may fail and get lost. Which outcome occurs depends on timing and on the hardware and firmware design. - -More interestingly, a device might undergo a manual resume or -autoresume during system suspend. With current kernels this shouldn't -happen, because manual resumes must be initiated by userspace and -autoresumes happen in response to I/O requests, but all user processes -and I/O should be quiescent during a system suspend -- thanks to the -freezer. However there are plans to do away with the freezer, which -would mean these things would become possible. If and when this comes -about, the USB core will carefully arrange matters so that either type -of resume will block until the entire system has resumed. diff --git a/Documentation/vgaarbiter.txt b/Documentation/vgaarbiter.txt index 987f9b0a5ec..43a9b0694fd 100644 --- a/Documentation/vgaarbiter.txt +++ b/Documentation/vgaarbiter.txt @@ -103,7 +103,7 @@ I.2 libpciaccess ---------------- To use the vga arbiter char device it was implemented an API inside the -libpciaccess library. One fieldd was added to struct pci_device (each device +libpciaccess library. One field was added to struct pci_device (each device on the system): /* the type of resource decoded by the device */ diff --git a/Documentation/video4linux/CARDLIST.cx23885 b/Documentation/video4linux/CARDLIST.cx23885 index 7539e8fa1ff..16ca030e118 100644 --- a/Documentation/video4linux/CARDLIST.cx23885 +++ b/Documentation/video4linux/CARDLIST.cx23885 @@ -26,3 +26,4 @@ 25 -> Compro VideoMate E800 [1858:e800] 26 -> Hauppauge WinTV-HVR1290 [0070:8551] 27 -> Mygica X8558 PRO DMB-TH [14f1:8578] + 28 -> LEADTEK WinFast PxTV1200 [107d:6f22] diff --git a/Documentation/video4linux/CARDLIST.saa7134 b/Documentation/video4linux/CARDLIST.saa7134 index fce1e7eb047..b4a767060ed 100644 --- a/Documentation/video4linux/CARDLIST.saa7134 +++ b/Documentation/video4linux/CARDLIST.saa7134 @@ -174,3 +174,4 @@ 173 -> Zolid Hybrid TV Tuner PCI [1131:2004] 174 -> Asus Europa Hybrid OEM [1043:4847] 175 -> Leadtek Winfast DTV1000S [107d:6655] +176 -> Beholder BeholdTV 505 RDS [0000:5051] diff --git a/Documentation/video4linux/CARDLIST.tuner b/Documentation/video4linux/CARDLIST.tuner index e0d298fe883..9b2e0dd6017 100644 --- a/Documentation/video4linux/CARDLIST.tuner +++ b/Documentation/video4linux/CARDLIST.tuner @@ -81,3 +81,4 @@ tuner=80 - Philips FQ1216LME MK3 PAL/SECAM w/active loopthrough tuner=81 - Partsnic (Daewoo) PTI-5NF05 tuner=82 - Philips CU1216L tuner=83 - NXP TDA18271 +tuner=84 - Sony BTF-Pxn01Z diff --git a/Documentation/video4linux/README.tlg2300 b/Documentation/video4linux/README.tlg2300 new file mode 100644 index 00000000000..416ccb93d8c --- /dev/null +++ b/Documentation/video4linux/README.tlg2300 @@ -0,0 +1,47 @@ +tlg2300 release notes +==================== + +This is a v4l2/dvb device driver for the tlg2300 chip. + + +current status +============== + +video + - support mmap and read().(no overlay) + +audio + - The driver will register a ALSA card for the audio input. + +vbi + - Works for almost TV norms. + +dvb-t + - works for DVB-T + +FM + - Works for radio. + +--------------------------------------------------------------------------- +TESTED APPLICATIONS: + +-VLC1.0.4 test the video and dvb. The GUI is friendly to use. + +-Mplayer test the video. + +-Mplayer test the FM. The mplayer should be compiled with --enable-radio and + --enable-radio-capture. + The command runs as this(The alsa audio registers to card 1): + #mplayer radio://103.7/capture/ -radio adevice=hw=1,0:arate=48000 \ + -rawaudio rate=48000:channels=2 + +--------------------------------------------------------------------------- +KNOWN PROBLEMS: +about preemphasis: + You can set the preemphasis for radio by the following command: + #v4l2-ctl -d /dev/radio0 --set-ctrl=pre_emphasis_settings=1 + + "pre_emphasis_settings=1" means that you select the 50us. If you want + to select the 75us, please use "pre_emphasis_settings=2" + + diff --git a/Documentation/video4linux/gspca.txt b/Documentation/video4linux/gspca.txt index 1800a62cf13..181b9e6fd98 100644 --- a/Documentation/video4linux/gspca.txt +++ b/Documentation/video4linux/gspca.txt @@ -42,6 +42,7 @@ ov519 041e:4064 Creative Live! VISTA VF0420 ov519 041e:4067 Creative Live! Cam Video IM (VF0350) ov519 041e:4068 Creative Live! VISTA VF0470 spca561 0458:7004 Genius VideoCAM Express V2 +sn9c2028 0458:7005 Genius Smart 300, version 2 sunplus 0458:7006 Genius Dsc 1.3 Smart zc3xx 0458:7007 Genius VideoCam V2 zc3xx 0458:700c Genius VideoCam V3 @@ -109,6 +110,7 @@ sunplus 04a5:3003 Benq DC 1300 sunplus 04a5:3008 Benq DC 1500 sunplus 04a5:300a Benq DC 3410 spca500 04a5:300c Benq DC 1016 +benq 04a5:3035 Benq DC E300 finepix 04cb:0104 Fujifilm FinePix 4800 finepix 04cb:0109 Fujifilm FinePix A202 finepix 04cb:010b Fujifilm FinePix A203 @@ -142,6 +144,7 @@ sunplus 04fc:5360 Sunplus Generic spca500 04fc:7333 PalmPixDC85 sunplus 04fc:ffff Pure DigitalDakota spca501 0506:00df 3Com HomeConnect Lite +sunplus 052b:1507 Megapixel 5 Pretec DC-1007 sunplus 052b:1513 Megapix V4 sunplus 052b:1803 MegaImage VI tv8532 0545:808b Veo Stingray @@ -151,6 +154,7 @@ sunplus 0546:3191 Polaroid Ion 80 sunplus 0546:3273 Polaroid PDC2030 ov519 054c:0154 Sonny toy4 ov519 054c:0155 Sonny toy5 +cpia1 0553:0002 CPIA CPiA (version1) based cameras zc3xx 055f:c005 Mustek Wcam300A spca500 055f:c200 Mustek Gsmart 300 sunplus 055f:c211 Kowa Bs888e Microcamera @@ -188,8 +192,7 @@ spca500 06bd:0404 Agfa CL20 spca500 06be:0800 Optimedia sunplus 06d6:0031 Trust 610 LCD PowerC@m Zoom spca506 06e1:a190 ADS Instant VCD -ov534 06f8:3002 Hercules Blog Webcam -ov534 06f8:3003 Hercules Dualpix HD Weblog +ov534_9 06f8:3003 Hercules Dualpix HD Weblog sonixj 06f8:3004 Hercules Classic Silver sonixj 06f8:3008 Hercules Deluxe Optical Glass pac7302 06f8:3009 Hercules Classic Link @@ -204,6 +207,7 @@ sunplus 0733:2221 Mercury Digital Pro 3.1p sunplus 0733:3261 Concord 3045 spca536a sunplus 0733:3281 Cyberpix S550V spca506 0734:043b 3DeMon USB Capture aka +cpia1 0813:0001 QX3 camera ov519 0813:0002 Dual Mode USB Camera Plus spca500 084d:0003 D-Link DSC-350 spca500 08ca:0103 Aiptek PocketDV @@ -225,7 +229,8 @@ sunplus 08ca:2050 Medion MD 41437 sunplus 08ca:2060 Aiptek PocketDV5300 tv8532 0923:010f ICM532 cams mars 093a:050f Mars-Semi Pc-Camera -mr97310a 093a:010f Sakar Digital no. 77379 +mr97310a 093a:010e All known CIF cams with this ID +mr97310a 093a:010f All known VGA cams with this ID pac207 093a:2460 Qtec Webcam 100 pac207 093a:2461 HP Webcam pac207 093a:2463 Philips SPC 220 NC @@ -302,6 +307,7 @@ sonixj 0c45:613b Surfer SN-206 sonixj 0c45:613c Sonix Pccam168 sonixj 0c45:6143 Sonix Pccam168 sonixj 0c45:6148 Digitus DA-70811/ZSMC USB PC Camera ZS211/Microdia +sonixj 0c45:614a Frontech E-Ccam (JIL-2225) sn9c20x 0c45:6240 PC Camera (SN9C201 + MT9M001) sn9c20x 0c45:6242 PC Camera (SN9C201 + MT9M111) sn9c20x 0c45:6248 PC Camera (SN9C201 + OV9655) @@ -324,6 +330,10 @@ sn9c20x 0c45:62b0 PC Camera (SN9C202 + MT9V011/MT9V111/MT9V112) sn9c20x 0c45:62b3 PC Camera (SN9C202 + OV9655) sn9c20x 0c45:62bb PC Camera (SN9C202 + OV7660) sn9c20x 0c45:62bc PC Camera (SN9C202 + HV7131R) +sn9c2028 0c45:8001 Wild Planet Digital Spy Camera +sn9c2028 0c45:8003 Sakar #11199, #6637x, #67480 keychain cams +sn9c2028 0c45:8008 Mini-Shotz ms-350 +sn9c2028 0c45:800a Vivitar Vivicam 3350B sunplus 0d64:0303 Sunplus FashionCam DXG ov519 0e96:c001 TRUST 380 USB2 SPACEC@M etoms 102c:6151 Qcam Sangha CIF @@ -341,10 +351,11 @@ spca501 1776:501c Arowana 300K CMOS Camera t613 17a1:0128 TASCORP JPEG Webcam, NGS Cyclops vc032x 17ef:4802 Lenovo Vc0323+MI1310_SOC pac207 2001:f115 D-Link DSB-C120 -sq905c 2770:9050 sq905c -sq905c 2770:905c DualCamera -sq905 2770:9120 Argus Digital Camera DC1512 -sq905c 2770:913d sq905c +sq905c 2770:9050 Disney pix micro (CIF) +sq905c 2770:9052 Disney pix micro 2 (VGA) +sq905c 2770:905c All 11 known cameras with this ID +sq905 2770:9120 All 24 known cameras with this ID +sq905c 2770:913d All 4 known cameras with this ID spca500 2899:012c Toptro Industrial ov519 8020:ef04 ov519 spca508 8086:0110 Intel Easy PC Camera diff --git a/Documentation/video4linux/v4l2-framework.txt b/Documentation/video4linux/v4l2-framework.txt index 74d677c8b03..5155700c206 100644 --- a/Documentation/video4linux/v4l2-framework.txt +++ b/Documentation/video4linux/v4l2-framework.txt @@ -599,99 +599,13 @@ video_device::minor fields. video buffer helper functions ----------------------------- -The v4l2 core API provides a standard method for dealing with video -buffers. Those methods allow a driver to implement read(), mmap() and -overlay() on a consistent way. - -There are currently methods for using video buffers on devices that -supports DMA with scatter/gather method (videobuf-dma-sg), DMA with -linear access (videobuf-dma-contig), and vmalloced buffers, mostly -used on USB drivers (videobuf-vmalloc). - -Any driver using videobuf should provide operations (callbacks) for -four handlers: - -ops->buf_setup - calculates the size of the video buffers and avoid they - to waste more than some maximum limit of RAM; -ops->buf_prepare - fills the video buffer structs and calls - videobuf_iolock() to alloc and prepare mmaped memory; -ops->buf_queue - advices the driver that another buffer were - requested (by read() or by QBUF); -ops->buf_release - frees any buffer that were allocated. - -In order to use it, the driver need to have a code (generally called at -interrupt context) that will properly handle the buffer request lists, -announcing that a new buffer were filled. - -The irq handling code should handle the videobuf task lists, in order -to advice videobuf that a new frame were filled, in order to honor to a -request. The code is generally like this one: - if (list_empty(&dma_q->active)) - return; - - buf = list_entry(dma_q->active.next, struct vbuffer, vb.queue); - - if (!waitqueue_active(&buf->vb.done)) - return; - - /* Some logic to handle the buf may be needed here */ - - list_del(&buf->vb.queue); - do_gettimeofday(&buf->vb.ts); - wake_up(&buf->vb.done); - -Those are the videobuffer functions used on drivers, implemented on -videobuf-core: - -- Videobuf init functions - videobuf_queue_sg_init() - Initializes the videobuf infrastructure. This function should be - called before any other videobuf function on drivers that uses DMA - Scatter/Gather buffers. - - videobuf_queue_dma_contig_init - Initializes the videobuf infrastructure. This function should be - called before any other videobuf function on drivers that need DMA - contiguous buffers. - - videobuf_queue_vmalloc_init() - Initializes the videobuf infrastructure. This function should be - called before any other videobuf function on USB (and other drivers) - that need a vmalloced type of videobuf. - -- videobuf_iolock() - Prepares the videobuf memory for the proper method (read, mmap, overlay). - -- videobuf_queue_is_busy() - Checks if a videobuf is streaming. - -- videobuf_queue_cancel() - Stops video handling. - -- videobuf_mmap_free() - frees mmap buffers. - -- videobuf_stop() - Stops video handling, ends mmap and frees mmap and other buffers. - -- V4L2 api functions. Those functions correspond to VIDIOC_foo ioctls: - videobuf_reqbufs(), videobuf_querybuf(), videobuf_qbuf(), - videobuf_dqbuf(), videobuf_streamon(), videobuf_streamoff(). - -- V4L1 api function (corresponds to VIDIOCMBUF ioctl): - videobuf_cgmbuf() - This function is used to provide backward compatibility with V4L1 - API. - -- Some help functions for read()/poll() operations: - videobuf_read_stream() - For continuous stream read() - videobuf_read_one() - For snapshot read() - videobuf_poll_stream() - polling help function - -The better way to understand it is to take a look at vivi driver. One -of the main reasons for vivi is to be a videobuf usage example. the -vivi_thread_tick() does the task that the IRQ callback would do on PCI -drivers (or the irq callback on USB). +The v4l2 core API provides a set of standard methods (called "videobuf") +for dealing with video buffers. Those methods allow a driver to implement +read(), mmap() and overlay() in a consistent way. There are currently +methods for using video buffers on devices that supports DMA with +scatter/gather method (videobuf-dma-sg), DMA with linear access +(videobuf-dma-contig), and vmalloced buffers, mostly used on USB drivers +(videobuf-vmalloc). + +Please see Documentation/video4linux/videobuf for more information on how +to use the videobuf layer. diff --git a/Documentation/video4linux/videobuf b/Documentation/video4linux/videobuf new file mode 100644 index 00000000000..17a1f9abf26 --- /dev/null +++ b/Documentation/video4linux/videobuf @@ -0,0 +1,360 @@ +An introduction to the videobuf layer +Jonathan Corbet <corbet@lwn.net> +Current as of 2.6.33 + +The videobuf layer functions as a sort of glue layer between a V4L2 driver +and user space. It handles the allocation and management of buffers for +the storage of video frames. There is a set of functions which can be used +to implement many of the standard POSIX I/O system calls, including read(), +poll(), and, happily, mmap(). Another set of functions can be used to +implement the bulk of the V4L2 ioctl() calls related to streaming I/O, +including buffer allocation, queueing and dequeueing, and streaming +control. Using videobuf imposes a few design decisions on the driver +author, but the payback comes in the form of reduced code in the driver and +a consistent implementation of the V4L2 user-space API. + +Buffer types + +Not all video devices use the same kind of buffers. In fact, there are (at +least) three common variations: + + - Buffers which are scattered in both the physical and (kernel) virtual + address spaces. (Almost) all user-space buffers are like this, but it + makes great sense to allocate kernel-space buffers this way as well when + it is possible. Unfortunately, it is not always possible; working with + this kind of buffer normally requires hardware which can do + scatter/gather DMA operations. + + - Buffers which are physically scattered, but which are virtually + contiguous; buffers allocated with vmalloc(), in other words. These + buffers are just as hard to use for DMA operations, but they can be + useful in situations where DMA is not available but virtually-contiguous + buffers are convenient. + + - Buffers which are physically contiguous. Allocation of this kind of + buffer can be unreliable on fragmented systems, but simpler DMA + controllers cannot deal with anything else. + +Videobuf can work with all three types of buffers, but the driver author +must pick one at the outset and design the driver around that decision. + +[It's worth noting that there's a fourth kind of buffer: "overlay" buffers +which are located within the system's video memory. The overlay +functionality is considered to be deprecated for most use, but it still +shows up occasionally in system-on-chip drivers where the performance +benefits merit the use of this technique. Overlay buffers can be handled +as a form of scattered buffer, but there are very few implementations in +the kernel and a description of this technique is currently beyond the +scope of this document.] + +Data structures, callbacks, and initialization + +Depending on which type of buffers are being used, the driver should +include one of the following files: + + <media/videobuf-dma-sg.h> /* Physically scattered */ + <media/videobuf-vmalloc.h> /* vmalloc() buffers */ + <media/videobuf-dma-contig.h> /* Physically contiguous */ + +The driver's data structure describing a V4L2 device should include a +struct videobuf_queue instance for the management of the buffer queue, +along with a list_head for the queue of available buffers. There will also +need to be an interrupt-safe spinlock which is used to protect (at least) +the queue. + +The next step is to write four simple callbacks to help videobuf deal with +the management of buffers: + + struct videobuf_queue_ops { + int (*buf_setup)(struct videobuf_queue *q, + unsigned int *count, unsigned int *size); + int (*buf_prepare)(struct videobuf_queue *q, + struct videobuf_buffer *vb, + enum v4l2_field field); + void (*buf_queue)(struct videobuf_queue *q, + struct videobuf_buffer *vb); + void (*buf_release)(struct videobuf_queue *q, + struct videobuf_buffer *vb); + }; + +buf_setup() is called early in the I/O process, when streaming is being +initiated; its purpose is to tell videobuf about the I/O stream. The count +parameter will be a suggested number of buffers to use; the driver should +check it for rationality and adjust it if need be. As a practical rule, a +minimum of two buffers are needed for proper streaming, and there is +usually a maximum (which cannot exceed 32) which makes sense for each +device. The size parameter should be set to the expected (maximum) size +for each frame of data. + +Each buffer (in the form of a struct videobuf_buffer pointer) will be +passed to buf_prepare(), which should set the buffer's size, width, height, +and field fields properly. If the buffer's state field is +VIDEOBUF_NEEDS_INIT, the driver should pass it to: + + int videobuf_iolock(struct videobuf_queue* q, struct videobuf_buffer *vb, + struct v4l2_framebuffer *fbuf); + +Among other things, this call will usually allocate memory for the buffer. +Finally, the buf_prepare() function should set the buffer's state to +VIDEOBUF_PREPARED. + +When a buffer is queued for I/O, it is passed to buf_queue(), which should +put it onto the driver's list of available buffers and set its state to +VIDEOBUF_QUEUED. Note that this function is called with the queue spinlock +held; if it tries to acquire it as well things will come to a screeching +halt. Yes, this is the voice of experience. Note also that videobuf may +wait on the first buffer in the queue; placing other buffers in front of it +could again gum up the works. So use list_add_tail() to enqueue buffers. + +Finally, buf_release() is called when a buffer is no longer intended to be +used. The driver should ensure that there is no I/O active on the buffer, +then pass it to the appropriate free routine(s): + + /* Scatter/gather drivers */ + int videobuf_dma_unmap(struct videobuf_queue *q, + struct videobuf_dmabuf *dma); + int videobuf_dma_free(struct videobuf_dmabuf *dma); + + /* vmalloc drivers */ + void videobuf_vmalloc_free (struct videobuf_buffer *buf); + + /* Contiguous drivers */ + void videobuf_dma_contig_free(struct videobuf_queue *q, + struct videobuf_buffer *buf); + +One way to ensure that a buffer is no longer under I/O is to pass it to: + + int videobuf_waiton(struct videobuf_buffer *vb, int non_blocking, int intr); + +Here, vb is the buffer, non_blocking indicates whether non-blocking I/O +should be used (it should be zero in the buf_release() case), and intr +controls whether an interruptible wait is used. + +File operations + +At this point, much of the work is done; much of the rest is slipping +videobuf calls into the implementation of the other driver callbacks. The +first step is in the open() function, which must initialize the +videobuf queue. The function to use depends on the type of buffer used: + + void videobuf_queue_sg_init(struct videobuf_queue *q, + struct videobuf_queue_ops *ops, + struct device *dev, + spinlock_t *irqlock, + enum v4l2_buf_type type, + enum v4l2_field field, + unsigned int msize, + void *priv); + + void videobuf_queue_vmalloc_init(struct videobuf_queue *q, + struct videobuf_queue_ops *ops, + struct device *dev, + spinlock_t *irqlock, + enum v4l2_buf_type type, + enum v4l2_field field, + unsigned int msize, + void *priv); + + void videobuf_queue_dma_contig_init(struct videobuf_queue *q, + struct videobuf_queue_ops *ops, + struct device *dev, + spinlock_t *irqlock, + enum v4l2_buf_type type, + enum v4l2_field field, + unsigned int msize, + void *priv); + +In each case, the parameters are the same: q is the queue structure for the +device, ops is the set of callbacks as described above, dev is the device +structure for this video device, irqlock is an interrupt-safe spinlock to +protect access to the data structures, type is the buffer type used by the +device (cameras will use V4L2_BUF_TYPE_VIDEO_CAPTURE, for example), field +describes which field is being captured (often V4L2_FIELD_NONE for +progressive devices), msize is the size of any containing structure used +around struct videobuf_buffer, and priv is a private data pointer which +shows up in the priv_data field of struct videobuf_queue. Note that these +are void functions which, evidently, are immune to failure. + +V4L2 capture drivers can be written to support either of two APIs: the +read() system call and the rather more complicated streaming mechanism. As +a general rule, it is necessary to support both to ensure that all +applications have a chance of working with the device. Videobuf makes it +easy to do that with the same code. To implement read(), the driver need +only make a call to one of: + + ssize_t videobuf_read_one(struct videobuf_queue *q, + char __user *data, size_t count, + loff_t *ppos, int nonblocking); + + ssize_t videobuf_read_stream(struct videobuf_queue *q, + char __user *data, size_t count, + loff_t *ppos, int vbihack, int nonblocking); + +Either one of these functions will read frame data into data, returning the +amount actually read; the difference is that videobuf_read_one() will only +read a single frame, while videobuf_read_stream() will read multiple frames +if they are needed to satisfy the count requested by the application. A +typical driver read() implementation will start the capture engine, call +one of the above functions, then stop the engine before returning (though a +smarter implementation might leave the engine running for a little while in +anticipation of another read() call happening in the near future). + +The poll() function can usually be implemented with a direct call to: + + unsigned int videobuf_poll_stream(struct file *file, + struct videobuf_queue *q, + poll_table *wait); + +Note that the actual wait queue eventually used will be the one associated +with the first available buffer. + +When streaming I/O is done to kernel-space buffers, the driver must support +the mmap() system call to enable user space to access the data. In many +V4L2 drivers, the often-complex mmap() implementation simplifies to a +single call to: + + int videobuf_mmap_mapper(struct videobuf_queue *q, + struct vm_area_struct *vma); + +Everything else is handled by the videobuf code. + +The release() function requires two separate videobuf calls: + + void videobuf_stop(struct videobuf_queue *q); + int videobuf_mmap_free(struct videobuf_queue *q); + +The call to videobuf_stop() terminates any I/O in progress - though it is +still up to the driver to stop the capture engine. The call to +videobuf_mmap_free() will ensure that all buffers have been unmapped; if +so, they will all be passed to the buf_release() callback. If buffers +remain mapped, videobuf_mmap_free() returns an error code instead. The +purpose is clearly to cause the closing of the file descriptor to fail if +buffers are still mapped, but every driver in the 2.6.32 kernel cheerfully +ignores its return value. + +ioctl() operations + +The V4L2 API includes a very long list of driver callbacks to respond to +the many ioctl() commands made available to user space. A number of these +- those associated with streaming I/O - turn almost directly into videobuf +calls. The relevant helper functions are: + + int videobuf_reqbufs(struct videobuf_queue *q, + struct v4l2_requestbuffers *req); + int videobuf_querybuf(struct videobuf_queue *q, struct v4l2_buffer *b); + int videobuf_qbuf(struct videobuf_queue *q, struct v4l2_buffer *b); + int videobuf_dqbuf(struct videobuf_queue *q, struct v4l2_buffer *b, + int nonblocking); + int videobuf_streamon(struct videobuf_queue *q); + int videobuf_streamoff(struct videobuf_queue *q); + int videobuf_cgmbuf(struct videobuf_queue *q, struct video_mbuf *mbuf, + int count); + +So, for example, a VIDIOC_REQBUFS call turns into a call to the driver's +vidioc_reqbufs() callback which, in turn, usually only needs to locate the +proper struct videobuf_queue pointer and pass it to videobuf_reqbufs(). +These support functions can replace a great deal of buffer management +boilerplate in a lot of V4L2 drivers. + +The vidioc_streamon() and vidioc_streamoff() functions will be a bit more +complex, of course, since they will also need to deal with starting and +stopping the capture engine. videobuf_cgmbuf(), called from the driver's +vidiocgmbuf() function, only exists if the V4L1 compatibility module has +been selected with CONFIG_VIDEO_V4L1_COMPAT, so its use must be surrounded +with #ifdef directives. + +Buffer allocation + +Thus far, we have talked about buffers, but have not looked at how they are +allocated. The scatter/gather case is the most complex on this front. For +allocation, the driver can leave buffer allocation entirely up to the +videobuf layer; in this case, buffers will be allocated as anonymous +user-space pages and will be very scattered indeed. If the application is +using user-space buffers, no allocation is needed; the videobuf layer will +take care of calling get_user_pages() and filling in the scatterlist array. + +If the driver needs to do its own memory allocation, it should be done in +the vidioc_reqbufs() function, *after* calling videobuf_reqbufs(). The +first step is a call to: + + struct videobuf_dmabuf *videobuf_to_dma(struct videobuf_buffer *buf); + +The returned videobuf_dmabuf structure (defined in +<media/videobuf-dma-sg.h>) includes a couple of relevant fields: + + struct scatterlist *sglist; + int sglen; + +The driver must allocate an appropriately-sized scatterlist array and +populate it with pointers to the pieces of the allocated buffer; sglen +should be set to the length of the array. + +Drivers using the vmalloc() method need not (and cannot) concern themselves +with buffer allocation at all; videobuf will handle those details. The +same is normally true of contiguous-DMA drivers as well; videobuf will +allocate the buffers (with dma_alloc_coherent()) when it sees fit. That +means that these drivers may be trying to do high-order allocations at any +time, an operation which is not always guaranteed to work. Some drivers +play tricks by allocating DMA space at system boot time; videobuf does not +currently play well with those drivers. + +As of 2.6.31, contiguous-DMA drivers can work with a user-supplied buffer, +as long as that buffer is physically contiguous. Normal user-space +allocations will not meet that criterion, but buffers obtained from other +kernel drivers, or those contained within huge pages, will work with these +drivers. + +Filling the buffers + +The final part of a videobuf implementation has no direct callback - it's +the portion of the code which actually puts frame data into the buffers, +usually in response to interrupts from the device. For all types of +drivers, this process works approximately as follows: + + - Obtain the next available buffer and make sure that somebody is actually + waiting for it. + + - Get a pointer to the memory and put video data there. + + - Mark the buffer as done and wake up the process waiting for it. + +Step (1) above is done by looking at the driver-managed list_head structure +- the one which is filled in the buf_queue() callback. Because starting +the engine and enqueueing buffers are done in separate steps, it's possible +for the engine to be running without any buffers available - in the +vmalloc() case especially. So the driver should be prepared for the list +to be empty. It is equally possible that nobody is yet interested in the +buffer; the driver should not remove it from the list or fill it until a +process is waiting on it. That test can be done by examining the buffer's +done field (a wait_queue_head_t structure) with waitqueue_active(). + +A buffer's state should be set to VIDEOBUF_ACTIVE before being mapped for +DMA; that ensures that the videobuf layer will not try to do anything with +it while the device is transferring data. + +For scatter/gather drivers, the needed memory pointers will be found in the +scatterlist structure described above. Drivers using the vmalloc() method +can get a memory pointer with: + + void *videobuf_to_vmalloc(struct videobuf_buffer *buf); + +For contiguous DMA drivers, the function to use is: + + dma_addr_t videobuf_to_dma_contig(struct videobuf_buffer *buf); + +The contiguous DMA API goes out of its way to hide the kernel-space address +of the DMA buffer from drivers. + +The final step is to set the size field of the relevant videobuf_buffer +structure to the actual size of the captured image, set state to +VIDEOBUF_DONE, then call wake_up() on the done queue. At this point, the +buffer is owned by the videobuf layer and the driver should not touch it +again. + +Developers who are interested in more information can go into the relevant +header files; there are a few low-level functions declared there which have +not been talked about here. Also worthwhile is the vivi driver +(drivers/media/video/vivi.c), which is maintained as an example of how V4L2 +drivers should be written. Vivi only uses the vmalloc() API, but it's good +enough to get started with. Note also that all of these calls are exported +GPL-only, so they will not be available to non-GPL kernel modules. diff --git a/Documentation/vm/slub.txt b/Documentation/vm/slub.txt index b37300edf27..07375e73981 100644 --- a/Documentation/vm/slub.txt +++ b/Documentation/vm/slub.txt @@ -41,6 +41,7 @@ Possible debug options are P Poisoning (object and padding) U User tracking (free and alloc) T Trace (please only use on single slabs) + A Toggle failslab filter mark for the cache O Switch debugging off for caches that would have caused higher minimum slab orders - Switch all debugging off (useful if the kernel is diff --git a/Documentation/x86/x86_64/boot-options.txt b/Documentation/x86/x86_64/boot-options.txt index 29a6ff8bc7d..7fbbaf85f5b 100644 --- a/Documentation/x86/x86_64/boot-options.txt +++ b/Documentation/x86/x86_64/boot-options.txt @@ -166,19 +166,13 @@ NUMA numa=noacpi Don't parse the SRAT table for NUMA setup - numa=fake=CMDLINE - If a number, fakes CMDLINE nodes and ignores NUMA setup of the - actual machine. Otherwise, system memory is configured - depending on the sizes and coefficients listed. For example: - numa=fake=2*512,1024,4*256,*128 - gives two 512M nodes, a 1024M node, four 256M nodes, and the - rest split into 128M chunks. If the last character of CMDLINE - is a *, the remaining memory is divided up equally among its - coefficient: - numa=fake=2*512,2* - gives two 512M nodes and the rest split into two nodes. - Otherwise, the remaining system RAM is allocated to an - additional node. + numa=fake=<size>[MG] + If given as a memory unit, fills all system RAM with nodes of + size interleaved over physical nodes. + + numa=fake=<N> + If given as an integer, fills all system RAM with N fake nodes + interleaved over physical nodes. ACPI |