Diffstat (limited to 'Documentation')
27 files changed, 1470 insertions, 219 deletions
diff --git a/Documentation/Changes b/Documentation/Changes index 27232be26e1..783ddc3ce4e 100644 --- a/Documentation/Changes +++ b/Documentation/Changes @@ -65,7 +65,7 @@ o isdn4k-utils 3.1pre1 # isdnctrl 2>&1|grep version o nfs-utils 1.0.5 # showmount --version o procps 3.2.0 # ps --version o oprofile 0.9 # oprofiled --version -o udev 058 # udevinfo -V +o udev 071 # udevinfo -V Kernel compilation ================== diff --git a/Documentation/DocBook/kernel-api.tmpl b/Documentation/DocBook/kernel-api.tmpl index d650ce36485..4d9b66d8b4d 100644 --- a/Documentation/DocBook/kernel-api.tmpl +++ b/Documentation/DocBook/kernel-api.tmpl @@ -286,7 +286,9 @@ X!Edrivers/pci/search.c --> !Edrivers/pci/msi.c !Edrivers/pci/bus.c -!Edrivers/pci/hotplug.c +<!-- FIXME: Removed for now since no structured comments in source +X!Edrivers/pci/hotplug.c +--> !Edrivers/pci/probe.c !Edrivers/pci/rom.c </sect1> diff --git a/Documentation/DocBook/libata.tmpl b/Documentation/DocBook/libata.tmpl index 375ae760dc1..d260d92089a 100644 --- a/Documentation/DocBook/libata.tmpl +++ b/Documentation/DocBook/libata.tmpl @@ -415,6 +415,362 @@ and other resources, etc. </sect1> </chapter> + <chapter id="libataEH"> + <title>Error handling</title> + + <para> + This chapter describes how errors are handled under libata. + Readers are advised to read SCSI EH + (Documentation/scsi/scsi_eh.txt) and ATA exceptions doc first. + </para> + + <sect1><title>Origins of commands</title> + <para> + In libata, a command is represented with struct ata_queued_cmd + or qc. qc's are preallocated during port initialization and + repetitively used for command executions. Currently only one + qc is allocated per port but yet-to-be-merged NCQ branch + allocates one for each tag and maps each qc to NCQ tag 1-to-1. + </para> + <para> + libata commands can originate from two sources - libata itself + and SCSI midlayer. libata internal commands are used for + initialization and error handling. 
All normal blk requests + and commands for SCSI emulation are passed as SCSI commands + through queuecommand callback of SCSI host template. + </para> + </sect1> + + <sect1><title>How commands are issued</title> + + <variablelist> + + <varlistentry><term>Internal commands</term> + <listitem> + <para> + First, qc is allocated and initialized using + ata_qc_new_init(). Although ata_qc_new_init() doesn't + implement any wait or retry mechanism when qc is not + available, internal commands are currently issued only during + initialization and error recovery, so no other command is + active and allocation is guaranteed to succeed. + </para> + <para> + Once allocated qc's taskfile is initialized for the command to + be executed. qc currently has two mechanisms to notify + completion. One is via qc->complete_fn() callback and the + other is completion qc->waiting. qc->complete_fn() callback + is the asynchronous path used by normal SCSI translated + commands and qc->waiting is the synchronous (issuer sleeps in + process context) path used by internal commands. + </para> + <para> + Once initialization is complete, host_set lock is acquired + and the qc is issued. + </para> + </listitem> + </varlistentry> + + <varlistentry><term>SCSI commands</term> + <listitem> + <para> + All libata drivers use ata_scsi_queuecmd() as + hostt->queuecommand callback. scmds can either be simulated + or translated. No qc is involved in processing a simulated + scmd. The result is computed right away and the scmd is + completed. + </para> + <para> + For a translated scmd, ata_qc_new_init() is invoked to + allocate a qc and the scmd is translated into the qc. SCSI + midlayer's completion notification function pointer is stored + into qc->scsidone. + </para> + <para> + qc->complete_fn() callback is used for completion + notification. ATA commands use ata_scsi_qc_complete() while + ATAPI commands use atapi_qc_complete(). 
Both functions end up + calling qc->scsidone to notify upper layer when the qc is + finished. After translation is completed, the qc is issued + with ata_qc_issue(). + </para> + <para> + Note that SCSI midlayer invokes hostt->queuecommand while + holding host_set lock, so all above occur while holding + host_set lock. + </para> + </listitem> + </varlistentry> + + </variablelist> + </sect1> + + <sect1><title>How commands are processed</title> + <para> + Depending on which protocol and which controller are used, + commands are processed differently. For the purpose of + discussion, a controller which uses taskfile interface and all + standard callbacks is assumed. + </para> + <para> + Currently 6 ATA command protocols are used. They can be + sorted into the following four categories according to how + they are processed. + </para> + + <variablelist> + <varlistentry><term>ATA NO DATA or DMA</term> + <listitem> + <para> + ATA_PROT_NODATA and ATA_PROT_DMA fall into this category. + These types of commands don't require any software + intervention once issued. Device will raise interrupt on + completion. + </para> + </listitem> + </varlistentry> + + <varlistentry><term>ATA PIO</term> + <listitem> + <para> + ATA_PROT_PIO is in this category. libata currently + implements PIO with polling. ATA_NIEN bit is set to turn + off interrupt and pio_task on ata_wq performs polling and + IO. + </para> + </listitem> + </varlistentry> + + <varlistentry><term>ATAPI NODATA or DMA</term> + <listitem> + <para> + ATA_PROT_ATAPI_NODATA and ATA_PROT_ATAPI_DMA are in this + category. packet_task is used to poll BSY bit after + issuing PACKET command. Once BSY is turned off by the + device, packet_task transfers CDB and hands off processing + to interrupt handler. + </para> + </listitem> + </varlistentry> + + <varlistentry><term>ATAPI PIO</term> + <listitem> + <para> + ATA_PROT_ATAPI is in this category. ATA_NIEN bit is set + and, as in ATAPI NODATA or DMA, packet_task submits cdb. 
+ However, after submitting cdb, further processing (data + transfer) is handed off to pio_task. + </para> + </listitem> + </varlistentry> + </variablelist> + </sect1> + + <sect1><title>How commands are completed</title> + <para> + Once issued, all qc's are either completed with + ata_qc_complete() or time out. For commands which are handled + by interrupts, ata_host_intr() invokes ata_qc_complete(), and, + for PIO tasks, pio_task invokes ata_qc_complete(). In error + cases, packet_task may also complete commands. + </para> + <para> + ata_qc_complete() does the following. + </para> + + <orderedlist> + + <listitem> + <para> + DMA memory is unmapped. + </para> + </listitem> + + <listitem> + <para> + ATA_QCFLAG_ACTIVE is cleared from qc->flags. + </para> + </listitem> + + <listitem> + <para> + qc->complete_fn() callback is invoked. If the return value of + the callback is not zero, completion is short circuited and + ata_qc_complete() returns. + </para> + </listitem> + + <listitem> + <para> + __ata_qc_complete() is called, which does + <orderedlist> + + <listitem> + <para> + qc->flags is cleared to zero. + </para> + </listitem> + + <listitem> + <para> + ap->active_tag and qc->tag are poisoned. + </para> + </listitem> + + <listitem> + <para> + qc->waiting is cleared & completed (in that order). + </para> + </listitem> + + <listitem> + <para> + qc is deallocated by clearing appropriate bit in ap->qactive. + </para> + </listitem> + + </orderedlist> + </para> + </listitem> + + </orderedlist> + + <para> + So, it basically notifies upper layer and deallocates qc. One + exception is short-circuit path in #3 which is used by + atapi_qc_complete(). + </para> + <para> + For all non-ATAPI commands, whether it fails or not, almost + the same code path is taken and very little error handling + takes place. A qc is completed with success status if it + succeeded, with failed status otherwise.
+ </para> + <para> + However, failed ATAPI commands require more handling as + REQUEST SENSE is needed to acquire sense data. If an ATAPI + command fails, ata_qc_complete() is invoked with error status, + which in turn invokes atapi_qc_complete() via + qc->complete_fn() callback. + </para> + <para> + This makes atapi_qc_complete() set scmd->result to + SAM_STAT_CHECK_CONDITION, complete the scmd and return 1. As + the sense data is empty but scmd->result is CHECK CONDITION, + SCSI midlayer will invoke EH for the scmd, and returning 1 + makes ata_qc_complete() return without deallocating the qc. + This leads us to ata_scsi_error() with partially completed qc. + </para> + + </sect1> + + <sect1><title>ata_scsi_error()</title> + <para> + ata_scsi_error() is the current hostt->eh_strategy_handler() + for libata. As discussed above, this will be entered in two + cases - timeout and ATAPI error completion. This function + calls low level libata driver's eng_timeout() callback, the + standard callback for which is ata_eng_timeout(). It checks + if a qc is active and calls ata_qc_timeout() on the qc if so. + Actual error handling occurs in ata_qc_timeout(). + </para> + <para> + If EH is invoked for timeout, ata_qc_timeout() stops BMDMA and + completes the qc. Note that as we're currently in EH, we + cannot call scsi_done. As described in SCSI EH doc, a + recovered scmd should be either retried with + scsi_queue_insert() or finished with scsi_finish_command(). + Here, we override qc->scsidone with scsi_finish_command() and + call ata_qc_complete(). + </para> + <para> + If EH is invoked due to a failed ATAPI qc, the qc here is + completed but not deallocated. The purpose of this + half-completion is to use the qc as placeholder to make EH + code reach this place. This is a bit hackish, but it works. + </para> + <para> + Once control reaches here, the qc is deallocated by invoking + __ata_qc_complete() explicitly. Then, internal qc for REQUEST + SENSE is issued.
Once sense data is acquired, scmd is + finished by directly invoking scsi_finish_command() on the + scmd. Note that as we already have completed and deallocated + the qc which was associated with the scmd, we don't need + to/cannot call ata_qc_complete() again. + </para> + + </sect1> + + <sect1><title>Problems with the current EH</title> + + <itemizedlist> + + <listitem> + <para> + Error representation is too crude. Currently any and all + error conditions are represented with ATA STATUS and ERROR + registers. Errors which aren't ATA device errors are treated + as ATA device errors by setting ATA_ERR bit. Better error + descriptor which can properly represent ATA and other + errors/exceptions is needed. + </para> + </listitem> + + <listitem> + <para> + When handling timeouts, no action is taken to make device + forget about the timed out command and ready for new commands. + </para> + </listitem> + + <listitem> + <para> + EH handling via ata_scsi_error() is not properly protected + from usual command processing. On EH entrance, the device is + not in quiescent state. Timed out commands may succeed or + fail any time. pio_task and atapi_task may still be running. + </para> + </listitem> + + <listitem> + <para> + Too weak error recovery. Devices / controllers causing HSM + mismatch errors and other errors quite often require reset to + return to known state. Also, advanced error handling is + necessary to support features like NCQ and hotplug. + </para> + </listitem> + + <listitem> + <para> + ATA errors are directly handled in the interrupt handler and + PIO errors in pio_task. This is problematic for advanced + error handling for the following reasons. + </para> + <para> + First, advanced error handling often requires context and + internal qc execution. + </para> + <para> + Second, even a simple failure (say, CRC error) needs + information gathering and could trigger complex error handling + (say, resetting & reconfiguring). 
Having multiple code + paths to gather information, enter EH and trigger actions + makes life painful. + </para> + <para> + Third, scattered EH code makes implementing low level drivers + difficult. Low level drivers override libata callbacks. If + EH is scattered over several places, each affected callbacks + should perform its part of error handling. This can be error + prone and painful. + </para> + </listitem> + + </itemizedlist> + </sect1> + </chapter> + <chapter id="libataExt"> <title>libata Library</title> !Edrivers/scsi/libata-core.c @@ -431,6 +787,722 @@ and other resources, etc. !Idrivers/scsi/libata-scsi.c </chapter> + <chapter id="ataExceptions"> + <title>ATA errors & exceptions</title> + + <para> + This chapter tries to identify what error/exception conditions exist + for ATA/ATAPI devices and describe how they should be handled in + implementation-neutral way. + </para> + + <para> + The term 'error' is used to describe conditions where either an + explicit error condition is reported from device or a command has + timed out. + </para> + + <para> + The term 'exception' is either used to describe exceptional + conditions which are not errors (say, power or hotplug events), or + to describe both errors and non-error exceptional conditions. Where + explicit distinction between error and exception is necessary, the + term 'non-error exception' is used. + </para> + + <sect1 id="excat"> + <title>Exception categories</title> + <para> + Exceptions are described primarily with respect to legacy + taskfile + bus master IDE interface. If a controller provides + other better mechanism for error reporting, mapping those into + categories described below shouldn't be difficult. + </para> + + <para> + In the following sections, two recovery actions - reset and + reconfiguring transport - are mentioned. These are described + further in <xref linkend="exrec"/>. 
+ </para> + + <sect2 id="excatHSMviolation"> + <title>HSM violation</title> + <para> + This error is indicated when STATUS value doesn't match HSM + requirement during issuing or execution of any ATA/ATAPI command. + </para> + + <itemizedlist> + <title>Examples</title> + + <listitem> + <para> + ATA_STATUS doesn't contain !BSY && DRDY && !DRQ while trying + to issue a command. + </para> + </listitem> + + <listitem> + <para> + !BSY && !DRQ during PIO data transfer. + </para> + </listitem> + + <listitem> + <para> + DRQ on command completion. + </para> + </listitem> + + <listitem> + <para> + !BSY && ERR after CDB transfer starts but before the + last byte of CDB is transferred. ATA/ATAPI standard states + that "The device shall not terminate the PACKET command + with an error before the last byte of the command packet has + been written" in the error outputs description of PACKET + command and the state diagram doesn't include such + transitions. + </para> + </listitem> + + </itemizedlist> + + <para> + In these cases, HSM is violated and not much information + regarding the error can be acquired from STATUS or ERROR + register. IOW, this error can be anything - driver bug, + faulty device, controller and/or cable. + </para> + + <para> + As HSM is violated, reset is necessary to restore known state. + Reconfiguring transport for lower speed might be helpful too + as transmission errors sometimes cause this kind of error. + </para> + </sect2> + + <sect2 id="excatDevErr"> + <title>ATA/ATAPI device error (non-NCQ / non-CHECK CONDITION)</title> + + <para> + These are errors detected and reported by ATA/ATAPI devices + indicating device problems. For this type of errors, STATUS + and ERROR register values are valid and describe error + condition. Note that some of ATA bus errors are detected by + ATA/ATAPI devices and reported using the same mechanism as + device errors. Those cases are described later in this + section.
+ </para> + + <para> + For ATA commands, this type of error is indicated by !BSY + && ERR during command execution and on completion. + </para> + + <para>For ATAPI commands,</para> + + <itemizedlist> + + <listitem> + <para> + !BSY && ERR && ABRT right after issuing PACKET + indicates that PACKET command is not supported and falls in + this category. + </para> + </listitem> + + <listitem> + <para> + !BSY && ERR(==CHK) && !ABRT after the last + byte of CDB is transferred indicates CHECK CONDITION and + doesn't fall in this category. + </para> + </listitem> + + <listitem> + <para> + !BSY && ERR(==CHK) && ABRT after the last byte + of CDB is transferred *probably* indicates CHECK CONDITION and + doesn't fall in this category. + </para> + </listitem> + + </itemizedlist> + + <para> + Of errors detected as above, the following are not ATA/ATAPI + device errors but ATA bus errors and should be handled + according to <xref linkend="excatATAbusErr"/>. + </para> + + <variablelist> + + <varlistentry> + <term>CRC error during data transfer</term> + <listitem> + <para> + This is indicated by ICRC bit in the ERROR register and + means that corruption occurred during data transfer. Up to + ATA/ATAPI-7, the standard specifies that this bit is only + applicable to UDMA transfers but ATA/ATAPI-8 draft revision + 1f says that the bit may be applicable to multiword DMA and + PIO. + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term>ABRT error during data transfer or on completion</term> + <listitem> + <para> + Up to ATA/ATAPI-7, the standard specifies that ABRT could be + set on ICRC errors and on cases where a device is not able + to complete a command. Combined with the fact that MWDMA + and PIO transfer errors aren't allowed to use ICRC bit up to + ATA/ATAPI-7, it seems to imply that ABRT bit alone could + indicate transfer errors. + </para> + <para> + However, ATA/ATAPI-8 draft revision 1f removes the part + that ICRC errors can turn on ABRT.
So, this is kind of + gray area. Some heuristics are needed here. + </para> + </listitem> + </varlistentry> + + </variablelist> + + <para> + ATA/ATAPI device errors can be further categorized as follows. + </para> + + <variablelist> + + <varlistentry> + <term>Media errors</term> + <listitem> + <para> + This is indicated by UNC bit in the ERROR register. ATA + devices reports UNC error only after certain number of + retries cannot recover the data, so there's nothing much + else to do other than notifying upper layer. + </para> + <para> + READ and WRITE commands report CHS or LBA of the first + failed sector but ATA/ATAPI standard specifies that the + amount of transferred data on error completion is + indeterminate, so we cannot assume that sectors preceding + the failed sector have been transferred and thus cannot + complete those sectors successfully as SCSI does. + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term>Media changed / media change requested error</term> + <listitem> + <para> + <<TODO: fill here>> + </para> + </listitem> + </varlistentry> + + <varlistentry><term>Address error</term> + <listitem> + <para> + This is indicated by IDNF bit in the ERROR register. + Report to upper layer. + </para> + </listitem> + </varlistentry> + + <varlistentry><term>Other errors</term> + <listitem> + <para> + This can be invalid command or parameter indicated by ABRT + ERROR bit or some other error condition. Note that ABRT + bit can indicate a lot of things including ICRC and Address + errors. Heuristics needed. + </para> + </listitem> + </varlistentry> + + </variablelist> + + <para> + Depending on commands, not all STATUS/ERROR bits are + applicable. These non-applicable bits are marked with + "na" in the output descriptions but upto ATA/ATAPI-7 + no definition of "na" can be found. However, + ATA/ATAPI-8 draft revision 1f describes "N/A" as + follows. 
+ </para> + + <blockquote> + <variablelist> + <varlistentry><term>3.2.3.3a N/A</term> + <listitem> + <para> + A keyword the indicates a field has no defined value in + this standard and should not be checked by the host or + device. N/A fields should be cleared to zero. + </para> + </listitem> + </varlistentry> + </variablelist> + </blockquote> + + <para> + So, it seems reasonable to assume that "na" bits are + cleared to zero by devices and thus need no explicit masking. + </para> + + </sect2> + + <sect2 id="excatATAPIcc"> + <title>ATAPI device CHECK CONDITION</title> + + <para> + ATAPI device CHECK CONDITION error is indicated by set CHK bit + (ERR bit) in the STATUS register after the last byte of CDB is + transferred for a PACKET command. For this kind of errors, + sense data should be acquired to gather information regarding + the errors. REQUEST SENSE packet command should be used to + acquire sense data. + </para> + + <para> + Once sense data is acquired, this type of errors can be + handled similarly to other SCSI errors. Note that sense data + may indicate ATA bus error (e.g. Sense Key 04h HARDWARE ERROR + && ASC/ASCQ 47h/00h SCSI PARITY ERROR). In such + cases, the error should be considered as an ATA bus error and + handled according to <xref linkend="excatATAbusErr"/>. + </para> + + </sect2> + + <sect2 id="excatNCQerr"> + <title>ATA device error (NCQ)</title> + + <para> + NCQ command error is indicated by cleared BSY and set ERR bit + during NCQ command phase (one or more NCQ commands + outstanding). Although STATUS and ERROR registers will + contain valid values describing the error, READ LOG EXT is + required to clear the error condition, determine which command + has failed and acquire more information. + </para> + + <para> + READ LOG EXT Log Page 10h reports which tag has failed and + taskfile register values describing the error.
With this + information the failed command can be handled as a normal ATA + command error as in <xref linkend="excatDevErr"/> and all + other in-flight commands must be retried. Note that this + retry should not be counted - it's likely that commands + retried this way would have completed normally if it were not + for the failed command. + </para> + + <para> + Note that ATA bus errors can be reported as ATA device NCQ + errors. This should be handled as described in <xref + linkend="excatATAbusErr"/>. + </para> + + <para> + If READ LOG EXT Log Page 10h fails or reports NQ, we're + thoroughly screwed. This condition should be treated + according to <xref linkend="excatHSMviolation"/>. + </para> + + </sect2> + + <sect2 id="excatATAbusErr"> + <title>ATA bus error</title> + + <para> + ATA bus error means that data corruption occurred during + transmission over ATA bus (SATA or PATA). This type of errors + can be indicated by + </para> + + <itemizedlist> + + <listitem> + <para> + ICRC or ABRT error as described in <xref linkend="excatDevErr"/>. + </para> + </listitem> + + <listitem> + <para> + Controller-specific error completion with error information + indicating transmission error. + </para> + </listitem> + + <listitem> + <para> + On some controllers, command timeout. In this case, there may + be a mechanism to determine that the timeout is due to + transmission error. + </para> + </listitem> + + <listitem> + <para> + Unknown/random errors, timeouts and all sorts of weirdities. + </para> + </listitem> + + </itemizedlist> + + <para> + As described above, transmission errors can cause wide variety + of symptoms ranging from device ICRC error to random device + lockup, and, for many cases, there is no way to tell if an + error condition is due to transmission error or not; + therefore, it's necessary to employ some kind of heuristic + when dealing with errors and timeouts. 
For example, + encountering repetitive ABRT errors for known supported + command is likely to indicate ATA bus error. + </para> + + <para> + Once it's determined that ATA bus errors have possibly + occurred, lowering ATA bus transmission speed is one of + actions which may alleviate the problem. See <xref + linkend="exrecReconf"/> for more information. + </para> + + </sect2> + + <sect2 id="excatPCIbusErr"> + <title>PCI bus error</title> + + <para> + Data corruption or other failures during transmission over PCI + (or other system bus). For standard BMDMA, this is indicated + by Error bit in the BMDMA Status register. This type of + errors must be logged as it indicates something is very wrong + with the system. Resetting host controller is recommended. + </para> + + </sect2> + + <sect2 id="excatLateCompletion"> + <title>Late completion</title> + + <para> + This occurs when timeout occurs and the timeout handler finds + out that the timed out command has completed successfully or + with error. This is usually caused by lost interrupts. This + type of errors must be logged. Resetting host controller is + recommended. + </para> + + </sect2> + + <sect2 id="excatUnknown"> + <title>Unknown error (timeout)</title> + + <para> + This is when timeout occurs and the command is still + processing or the host and device are in unknown state. When + this occurs, HSM could be in any valid or invalid state. To + bring the device to known state and make it forget about the + timed out command, resetting is necessary. The timed out + command may be retried. + </para> + + <para> + Timeouts can also be caused by transmission errors. Refer to + <xref linkend="excatATAbusErr"/> for more details. 
+ </para> + + </sect2> + + <sect2 id="excatHoplugPM"> + <title>Hotplug and power management exceptions</title> + + <para> + <<TODO: fill here>> + </para> + + </sect2> + + </sect1> + + <sect1 id="exrec"> + <title>EH recovery actions</title> + + <para> + This section discusses several important recovery actions. + </para> + + <sect2 id="exrecClr"> + <title>Clearing error condition</title> + + <para> + Many controllers require their error registers to be cleared by + error handler. Different controllers may have different + requirements. + </para> + + <para> + For SATA, it's strongly recommended to clear at least SError + register during error handling. + </para> + </sect2> + + <sect2 id="exrecRst"> + <title>Reset</title> + + <para> + During EH, resetting is necessary in the following cases. + </para> + + <itemizedlist> + + <listitem> + <para> + HSM is in unknown or invalid state + </para> + </listitem> + + <listitem> + <para> + HBA is in unknown or invalid state + </para> + </listitem> + + <listitem> + <para> + EH needs to make HBA/device forget about in-flight commands + </para> + </listitem> + + <listitem> + <para> + HBA/device behaves weirdly + </para> + </listitem> + + </itemizedlist> + + <para> + Resetting during EH might be a good idea regardless of error + condition to improve EH robustness. Whether to reset both or + either one of HBA and device depends on situation but the + following scheme is recommended. + </para> + + <itemizedlist> + + <listitem> + <para> + When it's known that HBA is in ready state but ATA/ATAPI + device is in unknown state, reset only device. + </para> + </listitem> + + <listitem> + <para> + If HBA is in unknown state, reset both HBA and device. + </para> + </listitem> + + </itemizedlist> + + <para> + HBA resetting is implementation specific. For a controller + complying to taskfile/BMDMA PCI IDE, stopping active DMA + transaction may be sufficient iff BMDMA state is the only HBA + context.
But even mostly taskfile/BMDMA PCI IDE complying + controllers may have implementation specific requirements and + mechanism to reset themselves. This must be addressed by + specific drivers. + </para> + + <para> + OTOH, ATA/ATAPI standard describes in detail ways to reset + ATA/ATAPI devices. + </para> + + <variablelist> + + <varlistentry><term>PATA hardware reset</term> + <listitem> + <para> + This is hardware initiated device reset signalled with + asserted PATA RESET- signal. There is no standard way to + initiate hardware reset from software although some + hardware provides registers that allow driver to directly + tweak the RESET- signal. + </para> + </listitem> + </varlistentry> + + <varlistentry><term>Software reset</term> + <listitem> + <para> + This is achieved by turning CONTROL SRST bit on for at + least 5us. Both PATA and SATA support it but, in case of + SATA, this may require controller-specific support as the + second Register FIS to clear SRST should be transmitted + while BSY bit is still set. Note that on PATA, this resets + both master and slave devices on a channel. + </para> + </listitem> + </varlistentry> + + <varlistentry><term>EXECUTE DEVICE DIAGNOSTIC command</term> + <listitem> + <para> + Although ATA/ATAPI standard doesn't describe exactly, EDD + implies some level of resetting, possibly similar level + with software reset. Host-side EDD protocol can be handled + with normal command processing and most SATA controllers + should be able to handle EDD's just like other commands. + As in software reset, EDD affects both devices on a PATA + bus. + </para> + <para> + Although EDD does reset devices, this doesn't suit error + handling as EDD cannot be issued while BSY is set and it's + unclear how it will act when device is in unknown/weird + state. 
+ </para> + </listitem> + </varlistentry> + + <varlistentry><term>ATAPI DEVICE RESET command</term> + <listitem> + <para> + This is very similar to software reset except that reset + can be restricted to the selected device without affecting + the other device sharing the cable. + </para> + </listitem> + </varlistentry> + + <varlistentry><term>SATA phy reset</term> + <listitem> + <para> + This is the preferred way of resetting a SATA device. In + effect, it's identical to PATA hardware reset. Note that + this can be done with the standard SCR Control register. + As such, it's usually easier to implement than software + reset. + </para> + </listitem> + </varlistentry> + + </variablelist> + + <para> + One more thing to consider when resetting devices is that + resetting clears certain configuration parameters and they + need to be set to their previous or newly adjusted values + after reset. + </para> + + <para> + Parameters affected are. + </para> + + <itemizedlist> + + <listitem> + <para> + CHS set up with INITIALIZE DEVICE PARAMETERS (seldomly used) + </para> + </listitem> + + <listitem> + <para> + Parameters set with SET FEATURES including transfer mode setting + </para> + </listitem> + + <listitem> + <para> + Block count set with SET MULTIPLE MODE + </para> + </listitem> + + <listitem> + <para> + Other parameters (SET MAX, MEDIA LOCK...) + </para> + </listitem> + + </itemizedlist> + + <para> + ATA/ATAPI standard specifies that some parameters must be + maintained across hardware or software reset, but doesn't + strictly specify all of them. Always reconfiguring needed + parameters after reset is required for robustness. Note that + this also applies when resuming from deep sleep (power-off). + </para> + + <para> + Also, ATA/ATAPI standard requires that IDENTIFY DEVICE / + IDENTIFY PACKET DEVICE is issued after any configuration + parameter is updated or a hardware reset and the result used + for further operation. 
OS driver is required to implement + revalidation mechanism to support this. + </para> + + </sect2> + + <sect2 id="exrecReconf"> + <title>Reconfigure transport</title> + + <para> + For both PATA and SATA, a lot of corners are cut for cheap + connectors, cables or controllers and it's quite common to see + high transmission error rate. This can be mitigated by + lowering transmission speed. + </para> + + <para> + The following is a possible scheme Jeff Garzik suggested. + </para> + + <blockquote> + <para> + If more than $N (3?) transmission errors happen in 15 minutes, + </para> + <itemizedlist> + <listitem> + <para> + if SATA, decrease SATA PHY speed. if speed cannot be decreased, + </para> + </listitem> + <listitem> + <para> + decrease UDMA xfer speed. if at UDMA0, switch to PIO4, + </para> + </listitem> + <listitem> + <para> + decrease PIO xfer speed. if at PIO3, complain, but continue + </para> + </listitem> + </itemizedlist> + </blockquote> + + </sect2> + + </sect1> + + </chapter> + <chapter id="PiixInt"> <title>ata_piix Internals</title> !Idrivers/scsi/ata_piix.c diff --git a/Documentation/DocBook/usb.tmpl b/Documentation/DocBook/usb.tmpl index 705c442c7bf..15ce0f21e5e 100644 --- a/Documentation/DocBook/usb.tmpl +++ b/Documentation/DocBook/usb.tmpl @@ -291,7 +291,7 @@ !Edrivers/usb/core/hcd.c !Edrivers/usb/core/hcd-pci.c -!Edrivers/usb/core/buffer.c +!Idrivers/usb/core/buffer.c </chapter> <chapter> diff --git a/Documentation/DocBook/writing_usb_driver.tmpl b/Documentation/DocBook/writing_usb_driver.tmpl index 51f3bfb6fb6..008a341234d 100644 --- a/Documentation/DocBook/writing_usb_driver.tmpl +++ b/Documentation/DocBook/writing_usb_driver.tmpl @@ -345,8 +345,7 @@ if (!retval) { <programlisting> static inline void skel_delete (struct usb_skel *dev) { - if (dev->bulk_in_buffer != NULL) - kfree (dev->bulk_in_buffer); + kfree (dev->bulk_in_buffer); if (dev->bulk_out_buffer != NULL) usb_buffer_free (dev->udev, dev->bulk_out_size, dev->bulk_out_buffer, diff --git 
a/Documentation/block/biodoc.txt b/Documentation/block/biodoc.txt index 6dd274d7e1c..2d65c218216 100644 --- a/Documentation/block/biodoc.txt +++ b/Documentation/block/biodoc.txt @@ -906,9 +906,20 @@ Aside: 4. The I/O scheduler -I/O schedulers are now per queue. They should be runtime switchable and modular -but aren't yet. Jens has most bits to do this, but the sysfs implementation is -missing. +I/O scheduler, a.k.a. elevator, is implemented in two layers. Generic dispatch +queue and specific I/O schedulers. Unless stated otherwise, elevator is used +to refer to both parts and I/O scheduler to specific I/O schedulers. + +Block layer implements generic dispatch queue in ll_rw_blk.c and elevator.c. +The generic dispatch queue is responsible for properly ordering barrier +requests, requeueing, handling non-fs requests and all other subtleties. + +Specific I/O schedulers are responsible for ordering normal filesystem +requests. They can also choose to delay certain requests to improve +throughput or whatever purpose. As the plural form indicates, there are +multiple I/O schedulers. They can be built as modules but at least one should +be built inside the kernel. Each queue can choose different one and can also +change to another one dynamically. A block layer call to the i/o scheduler follows the convention elv_xxx(). This calls elevator_xxx_fn in the elevator switch (drivers/block/elevator.c). Oh, @@ -921,44 +932,36 @@ keeping work. The functions an elevator may implement are: (* are mandatory) elevator_merge_fn called to query requests for merge with a bio -elevator_merge_req_fn " " " with another request +elevator_merge_req_fn called when two requests get merged. the one + which gets merged into the other one will be + never seen by I/O scheduler again. IOW, after + being merged, the request is gone. elevator_merged_fn called when a request in the scheduler has been involved in a merge. 
It is used in the deadline scheduler for example, to reposition the request if its sorting order has changed. -*elevator_next_req_fn returns the next scheduled request, or NULL - if there are none (or none are ready). +elevator_dispatch_fn fills the dispatch queue with ready requests. + I/O schedulers are free to postpone requests by + not filling the dispatch queue unless @force + is non-zero. Once dispatched, I/O schedulers + are not allowed to manipulate the requests - + they belong to generic dispatch queue. -*elevator_add_req_fn called to add a new request into the scheduler +elevator_add_req_fn called to add a new request into the scheduler elevator_queue_empty_fn returns true if the merge queue is empty. Drivers shouldn't use this, but rather check if elv_next_request is NULL (without losing the request if one exists!) -elevator_remove_req_fn This is called when a driver claims ownership of - the target request - it now belongs to the - driver. It must not be modified or merged. - Drivers must not lose the request! A subsequent - call of elevator_next_req_fn must return the - _next_ request. - -elevator_requeue_req_fn called to add a request to the scheduler. This - is used when the request has alrnadebeen - returned by elv_next_request, but hasn't - completed. If this is not implemented then - elevator_add_req_fn is called instead. - elevator_former_req_fn elevator_latter_req_fn These return the request before or after the one specified in disk sort order. Used by the block layer to find merge possibilities. -elevator_completed_req_fn called when a request is completed. This might - come about due to being merged with another or - when the device completes the request. +elevator_completed_req_fn called when a request is completed. 
elevator_may_queue_fn returns true if the scheduler wants to allow the current context to queue a new request even if @@ -967,13 +970,33 @@ elevator_may_queue_fn returns true if the scheduler wants to allow the elevator_set_req_fn elevator_put_req_fn Must be used to allocate and free any elevator - specific storate for a request. + specific storage for a request. + +elevator_activate_req_fn Called when device driver first sees a request. + I/O schedulers can use this callback to + determine when actual execution of a request + starts. +elevator_deactivate_req_fn Called when device driver decides to delay + a request by requeueing it. elevator_init_fn elevator_exit_fn Allocate and free any elevator specific storage for a queue. -4.2 I/O scheduler implementation +4.2 Request flows seen by I/O schedulers +All requests seen by I/O schedulers strictly follow one of the following three +flows. + + set_req_fn -> + + i. add_req_fn -> (merged_fn ->)* -> dispatch_fn -> activate_req_fn -> + (deactivate_req_fn -> activate_req_fn ->)* -> completed_req_fn + ii. add_req_fn -> (merged_fn ->)* -> merge_req_fn + iii. [none] + + -> put_req_fn + +4.3 I/O scheduler implementation The generic i/o scheduler algorithm attempts to sort/merge/batch requests for optimal disk scan and request servicing performance (based on generic principles and device capabilities), optimized for: @@ -993,18 +1016,7 @@ request in sort order to prevent binary tree lookups. This arrangement is not a generic block layer characteristic however, so elevators may implement queues as they please. -ii. Last merge hint -The last merge hint is part of the generic queue layer. I/O schedulers must do -some management on it. For the most part, the most important thing is to make -sure q->last_merge is cleared (set to NULL) when the request on it is no longer -a candidate for merging (for example if it has been sent to the driver). - -The last merge performed is cached as a hint for the subsequent request.
If -sequential data is being submitted, the hint is used to perform merges without -any scanning. This is not sufficient when there are multiple processes doing -I/O though, so a "merge hash" is used by some schedulers. - -iii. Merge hash +ii. Merge hash AS and deadline use a hash table indexed by the last sector of a request. This enables merging code to quickly look up "back merge" candidates, even when multiple I/O streams are being performed at once on one disk. @@ -1013,29 +1025,8 @@ multiple I/O streams are being performed at once on one disk. are far less common than "back merges" due to the nature of most I/O patterns. Front merges are handled by the binary trees in AS and deadline schedulers. -iv. Handling barrier cases -A request with flags REQ_HARDBARRIER or REQ_SOFTBARRIER must not be ordered -around. That is, they must be processed after all older requests, and before -any newer ones. This includes merges! - -In AS and deadline schedulers, barriers have the effect of flushing the reorder -queue. The performance cost of this will vary from nothing to a lot depending -on i/o patterns and device characteristics. Obviously they won't improve -performance, so their use should be kept to a minimum. - -v. Handling insertion position directives -A request may be inserted with a position directive. The directives are one of -ELEVATOR_INSERT_BACK, ELEVATOR_INSERT_FRONT, ELEVATOR_INSERT_SORT. - -ELEVATOR_INSERT_SORT is a general directive for non-barrier requests. -ELEVATOR_INSERT_BACK is used to insert a barrier to the back of the queue. -ELEVATOR_INSERT_FRONT is used to insert a barrier to the front of the queue, and -overrides the ordering requested by any previous barriers. In practice this is -harmless and required, because it is used for SCSI requeueing. This does not -require flushing the reorder queue, so does not impose a performance penalty. - -vi. Plugging the queue to batch requests in anticipation of opportunities for - merge/sort optimizations +iii. 
Plugging the queue to batch requests in anticipation of opportunities for + merge/sort optimizations This is just the same as in 2.4 so far, though per-device unplugging support is anticipated for 2.5. Also with a priority-based i/o scheduler, @@ -1069,7 +1060,7 @@ Aside: blk_kick_queue() to unplug a specific queue (right away ?) or optionally, all queues, is in the plan. -4.3 I/O contexts +4.4 I/O contexts I/O contexts provide a dynamically allocated per process data area. They may be used in I/O schedulers, and in the block layer (could be used for IO statis, priorities for example). See *io_context in drivers/block/ll_rw_blk.c, and diff --git a/Documentation/cachetlb.txt b/Documentation/cachetlb.txt index e132fb1163b..7eb715e07ed 100644 --- a/Documentation/cachetlb.txt +++ b/Documentation/cachetlb.txt @@ -49,9 +49,6 @@ changes occur: page table operations such as what happens during fork, and exec. - Platform developers note that generic code will always - invoke this interface without mm->page_table_lock held. - 3) void flush_tlb_range(struct vm_area_struct *vma, unsigned long start, unsigned long end) @@ -72,9 +69,6 @@ changes occur: call flush_tlb_page (see below) for each entry which may be modified. - Platform developers note that generic code will always - invoke this interface with mm->page_table_lock held. - 4) void flush_tlb_page(struct vm_area_struct *vma, unsigned long addr) This time we need to remove the PAGE_SIZE sized translation @@ -93,9 +87,6 @@ changes occur: This is used primarily during fault processing. - Platform developers note that generic code will always - invoke this interface with mm->page_table_lock held. 
- 5) void flush_tlb_pgtables(struct mm_struct *mm, unsigned long start, unsigned long end) diff --git a/Documentation/driver-model/driver.txt b/Documentation/driver-model/driver.txt index fabaca1ab1b..59806c9761f 100644 --- a/Documentation/driver-model/driver.txt +++ b/Documentation/driver-model/driver.txt @@ -14,8 +14,8 @@ struct device_driver { int (*probe) (struct device * dev); int (*remove) (struct device * dev); - int (*suspend) (struct device * dev, pm_message_t state, u32 level); - int (*resume) (struct device * dev, u32 level); + int (*suspend) (struct device * dev, pm_message_t state); + int (*resume) (struct device * dev); }; @@ -194,69 +194,13 @@ device; i.e. anything in the device's driver_data field. If the device is still present, it should quiesce the device and place it into a supported low-power state. - int (*suspend) (struct device * dev, pm_message_t state, u32 level); + int (*suspend) (struct device * dev, pm_message_t state); -suspend is called to put the device in a low power state. There are -several stages to successfully suspending a device, which is denoted in -the @level parameter. Breaking the suspend transition into several -stages affords the platform flexibility in performing device power -management based on the requirements of the system and the -user-defined policy. +suspend is called to put the device in a low power state. -SUSPEND_NOTIFY notifies the device that a suspend transition is about -to happen. This happens on system power state transitions to verify -that all devices can successfully suspend. + int (*resume) (struct device * dev); -A driver may choose to fail on this call, which should cause the -entire suspend transition to fail. A driver should fail only if it -knows that the device will not be able to be resumed properly when the -system wakes up again. It could also fail if it somehow determines it -is in the middle of an operation too important to stop. 
- -SUSPEND_DISABLE tells the device to stop I/O transactions. When it -stops transactions, or what it should do with unfinished transactions -is a policy of the driver. After this call, the driver should not -accept any other I/O requests. - -SUSPEND_SAVE_STATE tells the device to save the context of the -hardware. This includes any bus-specific hardware state and -device-specific hardware state. A pointer to this saved state can be -stored in the device's saved_state field. - -SUSPEND_POWER_DOWN tells the driver to place the device in the low -power state requested. - -Whether suspend is called with a given level is a policy of the -platform. Some levels may be omitted; drivers must not assume the -reception of any level. However, all levels must be called in the -order above; i.e. notification will always come before disabling; -disabling the device will come before suspending the device. - -All calls are made with interrupts enabled, except for the -SUSPEND_POWER_DOWN level. - - int (*resume) (struct device * dev, u32 level); - -Resume is used to bring a device back from a low power state. Like the -suspend transition, it happens in several stages. - -RESUME_POWER_ON tells the driver to set the power state to the state -before the suspend call (The device could have already been in a low -power state before the suspend call to put in a lower power state). - -RESUME_RESTORE_STATE tells the driver to restore the state saved by -the SUSPEND_SAVE_STATE suspend call. - -RESUME_ENABLE tells the driver to start accepting I/O transactions -again. Depending on driver policy, the device may already have pending -I/O requests. - -RESUME_POWER_ON is called with interrupts disabled. The other resume -levels are called with interrupts enabled. - -As with the various suspend stages, the driver must not assume that -any other resume calls have been or will be made. Each call should be -self-contained and not dependent on any external state. 
+Resume is used to bring a device back from a low power state. Attributes diff --git a/Documentation/driver-model/porting.txt b/Documentation/driver-model/porting.txt index ff2fef2107f..98b233cb8b3 100644 --- a/Documentation/driver-model/porting.txt +++ b/Documentation/driver-model/porting.txt @@ -350,7 +350,7 @@ When a driver is registered, the bus's list of devices is iterated over. bus->match() is called for each device that is not already claimed by a driver. -When a device is successfully bound to a device, device->driver is +When a device is successfully bound to a driver, device->driver is set, the device is added to a per-driver list of devices, and a symlink is created in the driver's sysfs directory that points to the device's physical directory: diff --git a/Documentation/hwmon/it87 b/Documentation/hwmon/it87 index 0d0195040d8..7f42e441c64 100644 --- a/Documentation/hwmon/it87 +++ b/Documentation/hwmon/it87 @@ -4,18 +4,18 @@ Kernel driver it87 Supported chips: * IT8705F Prefix: 'it87' - Addresses scanned: from Super I/O config space, or default ISA 0x290 (8 I/O ports) + Addresses scanned: from Super I/O config space (8 I/O ports) Datasheet: Publicly available at the ITE website http://www.ite.com.tw/ * IT8712F Prefix: 'it8712' Addresses scanned: I2C 0x28 - 0x2f - from Super I/O config space, or default ISA 0x290 (8 I/O ports) + from Super I/O config space (8 I/O ports) Datasheet: Publicly available at the ITE website http://www.ite.com.tw/ * SiS950 [clone of IT8705F] - Prefix: 'sis950' - Addresses scanned: from Super I/O config space, or default ISA 0x290 (8 I/O ports) + Prefix: 'it87' + Addresses scanned: from Super I/O config space (8 I/O ports) Datasheet: No longer be available Author: Christophe Gauthron <chrisg@0-in.com> diff --git a/Documentation/hwmon/lm90 b/Documentation/hwmon/lm90 index 2c4cf39471f..438cb24cee5 100644 --- a/Documentation/hwmon/lm90 +++ b/Documentation/hwmon/lm90 @@ -24,14 +24,14 @@ Supported chips: 
http://www.national.com/pf/LM/LM86.html * Analog Devices ADM1032 Prefix: 'adm1032' - Addresses scanned: I2C 0x4c + Addresses scanned: I2C 0x4c and 0x4d Datasheet: Publicly available at the Analog Devices website - http://products.analog.com/products/info.asp?product=ADM1032 + http://www.analog.com/en/prod/0,2877,ADM1032,00.html * Analog Devices ADT7461 Prefix: 'adt7461' - Addresses scanned: I2C 0x4c + Addresses scanned: I2C 0x4c and 0x4d Datasheet: Publicly available at the Analog Devices website - http://products.analog.com/products/info.asp?product=ADT7461 + http://www.analog.com/en/prod/0,2877,ADT7461,00.html Note: Only if in ADM1032 compatibility mode * Maxim MAX6657 Prefix: 'max6657' @@ -71,8 +71,8 @@ increased resolution of the remote temperature measurement. The different chipsets of the family are not strictly identical, although very similar. This driver doesn't handle any specific feature for now, -but could if there ever was a need for it. For reference, here comes a -non-exhaustive list of specific features: +with the exception of SMBus PEC. For reference, here comes a non-exhaustive +list of specific features: LM90: * Filter and alert configuration register at 0xBF. @@ -91,6 +91,7 @@ ADM1032: * Conversion averaging. * Up to 64 conversions/s. * ALERT is triggered by open remote sensor. + * SMBus PEC support for Write Byte and Receive Byte transactions. ADT7461 * Extended temperature range (breaks compatibility) @@ -119,3 +120,37 @@ The lm90 driver will not update its values more frequently than every other second; reading them more often will do no harm, but will return 'old' values. +PEC Support +----------- + +The ADM1032 is the only chip of the family which supports PEC. It does +not support PEC on all transactions though, so some care must be taken. + +When reading a register value, the PEC byte is computed and sent by the +ADM1032 chip. 
However, in the case of a combined transaction (SMBus Read +Byte), the ADM1032 computes the CRC value over only the second half of +the message rather than its entirety, because it thinks the first half +of the message belongs to a different transaction. As a result, the CRC +value differs from what the SMBus master expects, and all reads fail. + +For this reason, the lm90 driver will enable PEC for the ADM1032 only if +the bus supports the SMBus Send Byte and Receive Byte transaction types. +These transactions will be used to read register values, instead of +SMBus Read Byte, and PEC will work properly. + +Additionally, the ADM1032 doesn't support SMBus Send Byte with PEC. +Instead, it will try to write the PEC value to the register (because the +SMBus Send Byte transaction with PEC is similar to a Write Byte transaction +without PEC), which is not what we want. Thus, PEC is explicitly disabled +on SMBus Send Byte transactions in the lm90 driver. + +PEC on byte data transactions represents a significant increase in bandwidth +usage (+33% for writes, +25% for reads) in normal conditions. With the need +to use two SMBus transactions for reads, this overhead jumps to +50%. Worse, +two transactions will typically mean twice as much delay waiting for +transaction completion, effectively doubling the register cache refresh time. +I guess reliability comes at a price, but it's quite expensive this time. + +So, as not everyone might enjoy the slowdown, PEC can be disabled through +sysfs. Just write 0 to the "pec" file and PEC will be disabled. Write 1 +to that file to enable PEC again.
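Illustration for the lm90 PEC notes above. This is a minimal userspace C sketch, not code from the lm90 driver, of why a PEC computed over only the second half of an SMBus Read Byte message cannot match what the master expects. It relies on the SMBus PEC byte being a CRC-8 with polynomial x^8 + x^2 + x + 1 (0x07) and an initial value of 0. The helper names, the 7-bit address 0x4c, and the register and data values used with them are assumptions chosen for the example only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * CRC-8 as used for SMBus PEC: polynomial 0x07, MSB first, no final XOR.
 * 'crc' is the running value; pass 0 to start a new transaction.
 */
static uint8_t pec_crc8(uint8_t crc, const uint8_t *buf, size_t len)
{
	size_t i;
	int bit;

	assert(buf != NULL || len == 0);
	for (i = 0; i < len; i++) {
		crc ^= buf[i];
		for (bit = 0; bit < 8; bit++) {
			if (crc & 0x80)
				crc = (uint8_t)((crc << 1) ^ 0x07);
			else
				crc = (uint8_t)(crc << 1);
		}
	}
	return crc;
}

/* PEC the master expects for Read Byte: all four bytes of the message. */
static uint8_t pec_read_byte_expected(uint8_t addr7, uint8_t reg, uint8_t data)
{
	const uint8_t msg[4] = {
		(uint8_t)(addr7 << 1),		/* address + write bit */
		reg,				/* command (register) byte */
		(uint8_t)((addr7 << 1) | 1),	/* repeated start: address + read bit */
		data,				/* byte returned by the chip */
	};

	return pec_crc8(0, msg, sizeof(msg));
}

/* PEC covering only the second half, as described for the ADM1032. */
static uint8_t pec_read_byte_second_half(uint8_t addr7, uint8_t data)
{
	const uint8_t msg[2] = {
		(uint8_t)((addr7 << 1) | 1),	/* address + read bit */
		data,
	};

	return pec_crc8(0, msg, sizeof(msg));
}
```

For the sample values, pec_read_byte_expected(0x4c, 0x01, 0x2a) and pec_read_byte_second_half(0x4c, 0x2a) differ, which is the mismatch described above. The second-half value covers the same bytes as a standalone Receive Byte transaction, which is presumably why reading through Send Byte followed by Receive Byte gives a PEC the master accepts.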
diff --git a/Documentation/hwmon/smsc47b397 b/Documentation/hwmon/smsc47b397 index da9d80c9643..20682f15ae4 100644 --- a/Documentation/hwmon/smsc47b397 +++ b/Documentation/hwmon/smsc47b397 @@ -3,6 +3,7 @@ Kernel driver smsc47b397 Supported chips: * SMSC LPC47B397-NC + * SMSC SCH5307-NS Prefix: 'smsc47b397' Addresses scanned: none, address read from Super I/O config space Datasheet: In this file @@ -12,11 +13,14 @@ Authors: Mark M. Hoffman <mhoffman@lightlink.com> November 23, 2004 -The following specification describes the SMSC LPC47B397-NC sensor chip +The following specification describes the SMSC LPC47B397-NC[1] sensor chip (for which there is no public datasheet available). This document was provided by Craig Kelly (In-Store Broadcast Network) and edited/corrected by Mark M. Hoffman <mhoffman@lightlink.com>. +[1] And SMSC SCH5307-NS, which has a different device ID but is otherwise +compatible. + * * * * * Methods for detecting the HP SIO and reading the thermal data on a dc7100. @@ -127,7 +131,7 @@ OUT DX,AL The registers of interest for identifying the SIO on the dc7100 are Device ID (0x20) and Device Rev (0x21). -The Device ID will read 0X6F +The Device ID will read 0x6F (for SCH5307-NS, 0x81) The Device Rev currently reads 0x01 Obtaining the HWM Base Address. diff --git a/Documentation/hwmon/smsc47m1 b/Documentation/hwmon/smsc47m1 index 34e6478c142..c15bbe68264 100644 --- a/Documentation/hwmon/smsc47m1 +++ b/Documentation/hwmon/smsc47m1 @@ -12,6 +12,10 @@ Supported chips: http://www.smsc.com/main/datasheets/47m14x.pdf http://www.smsc.com/main/tools/discontinued/47m15x.pdf http://www.smsc.com/main/datasheets/47m192.pdf + * SMSC LPC47M997 + Addresses scanned: none, address read from Super I/O config space + Prefix: 'smsc47m1' + Datasheet: none Authors: Mark D. Studebaker <mdsxyz123@yahoo.com>, @@ -30,6 +34,9 @@ The 47M15x and 47M192 chips contain a full 'hardware monitoring block' in addition to the fan monitoring and control. 
The hardware monitoring block is not supported by the driver. +No documentation is available for the 47M997, but it has the same device +ID as the 47M15x and 47M192 chips and seems to be compatible. + Fan rotation speeds are reported in RPM (rotations per minute). An alarm is triggered if the rotation speed has dropped below a programmable limit. Fan readings can be divided by a programmable divider (1, 2, 4 or 8) to give diff --git a/Documentation/hwmon/sysfs-interface b/Documentation/hwmon/sysfs-interface index 346400519d0..764cdc5480e 100644 --- a/Documentation/hwmon/sysfs-interface +++ b/Documentation/hwmon/sysfs-interface @@ -272,3 +272,6 @@ beep_mask Bitmask for beep. eeprom Raw EEPROM data in binary form. Read only. + +pec Enable or disable PEC (SMBus only) + Read/Write diff --git a/Documentation/hwmon/via686a b/Documentation/hwmon/via686a index b82014cb7c5..a936fb3824b 100644 --- a/Documentation/hwmon/via686a +++ b/Documentation/hwmon/via686a @@ -18,8 +18,9 @@ Authors: Module Parameters ----------------- -force_addr=0xaddr Set the I/O base address. Useful for Asus A7V boards - that don't set the address in the BIOS. Does not do a +force_addr=0xaddr Set the I/O base address. Useful for boards that + don't set the address in the BIOS. Look for a BIOS + upgrade before resorting to this. Does not do a PCI force; the via686a must still be present in lspci. Don't use this unless the driver complains that the base address is not set. @@ -63,3 +64,15 @@ miss once-only alarms. The driver only updates its values each 1.5 seconds; reading it more often will do no harm, but will return 'old' values. + +Known Issues +------------ + +This driver handles sensors integrated in some VIA south bridges. It is +possible that a motherboard maker used a VT82C686A/B chip as part of a +product design but was not interested in its hardware monitoring features, +in which case the sensor inputs will not be wired. 
This is the case of +the Asus K7V, A7V and A7V133 motherboards, to name only a few of them. +So, if you need the force_addr parameter, and end up with values which +don't seem to make any sense, don't look any further: your chip is simply +not wired for hardware monitoring. diff --git a/Documentation/i2c/busses/i2c-i810 b/Documentation/i2c/busses/i2c-i810 index 0544eb33288..83c3b9743c3 100644 --- a/Documentation/i2c/busses/i2c-i810 +++ b/Documentation/i2c/busses/i2c-i810 @@ -2,6 +2,7 @@ Kernel driver i2c-i810 Supported adapters: * Intel 82810, 82810-DC100, 82810E, and 82815 (GMCH) + * Intel 82845G (GMCH) Authors: Frodo Looijaard <frodol@dds.nl>, diff --git a/Documentation/i2c/busses/i2c-viapro b/Documentation/i2c/busses/i2c-viapro index 702f5ac68c0..9363b8bd610 100644 --- a/Documentation/i2c/busses/i2c-viapro +++ b/Documentation/i2c/busses/i2c-viapro @@ -4,17 +4,18 @@ Supported adapters: * VIA Technologies, Inc. VT82C596A/B Datasheet: Sometimes available at the VIA website - * VIA Technologies, Inc. VT82C686A/B + * VIA Technologies, Inc. VT82C686A/B Datasheet: Sometimes available at the VIA website * VIA Technologies, Inc. VT8231, VT8233, VT8233A, VT8235, VT8237 Datasheet: available on request from Via Authors: - Frodo Looijaard <frodol@dds.nl>, - Philip Edelbrock <phil@netroedge.com>, - Kyösti Mälkki <kmalkki@cc.hut.fi>, - Mark D. Studebaker <mdsxyz123@yahoo.com> + Frodo Looijaard <frodol@dds.nl>, + Philip Edelbrock <phil@netroedge.com>, + Kyösti Mälkki <kmalkki@cc.hut.fi>, + Mark D. Studebaker <mdsxyz123@yahoo.com>, + Jean Delvare <khali@linux-fr.org> Module Parameters ----------------- @@ -28,20 +29,22 @@ Description ----------- i2c-viapro is a true SMBus host driver for motherboards with one of the -supported VIA southbridges. +supported VIA south bridges. 
Your lspci -n listing must show one of these : - device 1106:3050 (VT82C596 function 3) - device 1106:3051 (VT82C596 function 3) + device 1106:3050 (VT82C596A function 3) + device 1106:3051 (VT82C596B function 3) device 1106:3057 (VT82C686 function 4) device 1106:3074 (VT8233) device 1106:3147 (VT8233A) - device 1106:8235 (VT8231) - devide 1106:3177 (VT8235) - devide 1106:3227 (VT8237) + device 1106:8235 (VT8231 function 4) + device 1106:3177 (VT8235) + device 1106:3227 (VT8237R) If none of these show up, you should look in the BIOS for settings like enable ACPI / SMBus or even USB. - +Except for the oldest chips (VT82C596A/B, VT82C686A and most probably +VT8231), this driver supports I2C block transactions. Such transactions +are mainly useful to read from and write to EEPROMs. diff --git a/Documentation/i2c/chips/x1205 b/Documentation/i2c/chips/x1205 new file mode 100644 index 00000000000..09407c991fe --- /dev/null +++ b/Documentation/i2c/chips/x1205 @@ -0,0 +1,38 @@ +Kernel driver x1205 +=================== + +Supported chips: + * Xicor X1205 RTC + Prefix: 'x1205' + Addresses scanned: none + Datasheet: http://www.intersil.com/cda/deviceinfo/0,1477,X1205,00.html + +Authors: + Karen Spearel <kas11@tampabay.rr.com>, + Alessandro Zummo <a.zummo@towertech.it> + +Description +----------- + +This module aims to provide complete access to the Xicor X1205 RTC. +Recently Xicor has merged with Intersil, but the chip is +still sold under the Xicor brand. + +This chip is located at address 0x6f and uses 2-byte register addressing. +Two bytes need to be written to read a single register, while most +other chips just require one and take the second one as the data +to be written. To prevent corrupting unknown chips, the user must +explicitly set the probe parameter. + +example: + +modprobe x1205 probe=0,0x6f + +The module supports one more option, hctosys, which is used to set the +software clock from the x1205.
On systems where the x1205 is the +only hardware rtc, this parameter could be used to achieve a correct +date/time earlier in the system boot sequence. + +example: + +modprobe x1205 probe=0,0x6f hctosys=1 diff --git a/Documentation/i2c/functionality b/Documentation/i2c/functionality index 41ffefbdc60..60cca249e45 100644 --- a/Documentation/i2c/functionality +++ b/Documentation/i2c/functionality @@ -17,9 +17,10 @@ For the most up-to-date list of functionality constants, please check I2C_FUNC_I2C Plain i2c-level commands (Pure SMBus adapters typically can not do these) I2C_FUNC_10BIT_ADDR Handles the 10-bit address extensions - I2C_FUNC_PROTOCOL_MANGLING Knows about the I2C_M_REV_DIR_ADDR, - I2C_M_REV_DIR_ADDR and I2C_M_REV_DIR_NOSTART - flags (which modify the i2c protocol!) + I2C_FUNC_PROTOCOL_MANGLING Knows about the I2C_M_IGNORE_NAK, + I2C_M_REV_DIR_ADDR, I2C_M_NOSTART and + I2C_M_NO_RD_ACK flags (which modify the + I2C protocol!) I2C_FUNC_SMBUS_QUICK Handles the SMBus write_quick command I2C_FUNC_SMBUS_READ_BYTE Handles the SMBus read_byte command I2C_FUNC_SMBUS_WRITE_BYTE Handles the SMBus write_byte command diff --git a/Documentation/i2c/porting-clients b/Documentation/i2c/porting-clients index 4849dfd6961..184fac2377a 100644 --- a/Documentation/i2c/porting-clients +++ b/Documentation/i2c/porting-clients @@ -82,7 +82,7 @@ Technical changes: exit and exit_free. For i2c+isa drivers, labels should be named ERROR0, ERROR1 and ERROR2. Don't forget to properly set err before jumping to error labels. By the way, labels should be left-aligned. - Use memset to fill the client and data area with 0x00. + Use kzalloc instead of kmalloc. Use i2c_set_clientdata to set the client data (as opposed to a direct access to client->data). Use strlcpy instead of strcpy to copy the client name. 
diff --git a/Documentation/i2c/writing-clients b/Documentation/i2c/writing-clients index 077275722a7..e94d9c6cc52 100644 --- a/Documentation/i2c/writing-clients +++ b/Documentation/i2c/writing-clients @@ -33,8 +33,8 @@ static struct i2c_driver foo_driver = { .command = &foo_command /* may be NULL */ } -The name can be chosen freely, and may be upto 40 characters long. Please -use something descriptive here. +The name field must match the driver name, including the case. It must not +contain spaces, and may be up to 31 characters long. Don't worry about the flags field; just put I2C_DF_NOTIFY into it. This means that your driver will be notified when new adapters are found. @@ -43,9 +43,6 @@ This is almost always what you want. All other fields are for call-back functions which will be explained below. -There use to be two additional fields in this structure, inc_use et dec_use, -for module usage count, but these fields were obsoleted and removed. - Extra client data ================= @@ -58,6 +55,7 @@ be very useful. An example structure is below. struct foo_data { + struct i2c_client client; struct semaphore lock; /* For ISA access in `sensors' drivers. */ int sysctl_id; /* To keep the /proc directory entry for `sensors' drivers. */ @@ -310,22 +308,15 @@ For now, you can ignore the `flags' parameter. It is there for future use. client structure, even though we cannot fill it completely yet. But it allows us to access several i2c functions safely */ - /* Note that we reserve some space for foo_data too. If you don't - need it, remove it. We do it here to help to lessen memory - fragmentation. */ - if (! (new_client = kmalloc(sizeof(struct i2c_client) + - sizeof(struct foo_data), - GFP_KERNEL))) { + if (!(data = kzalloc(sizeof(struct foo_data), GFP_KERNEL))) { err = -ENOMEM; goto ERROR0; } - /* This is tricky, but it will set the data to the right value. 
*/ - client->data = new_client + 1; - data = (struct foo_data *) (client->data); + new_client = &data->client; + i2c_set_clientdata(new_client, data); new_client->addr = address; - new_client->data = data; new_client->adapter = adapter; new_client->driver = &foo_driver; new_client->flags = 0; @@ -451,7 +442,7 @@ much simpler than the attachment code, fortunately! release_region(client->addr,LM78_EXTENT); /* HYBRID SENSORS CHIP ONLY END */ - kfree(client); /* Frees client data too, if allocated at the same time */ + kfree(data); return 0; } @@ -576,12 +567,12 @@ SMBus communication extern s32 i2c_smbus_write_block_data(struct i2c_client * client, u8 command, u8 length, u8 *values); + extern s32 i2c_smbus_read_i2c_block_data(struct i2c_client * client, + u8 command, u8 *values); These ones were removed in Linux 2.6.10 because they had no users, but could be added back later if needed: - extern s32 i2c_smbus_read_i2c_block_data(struct i2c_client * client, - u8 command, u8 *values); extern s32 i2c_smbus_read_block_data(struct i2c_client * client, u8 command, u8 *values); extern s32 i2c_smbus_write_i2c_block_data(struct i2c_client * client, diff --git a/Documentation/input/yealink.txt b/Documentation/input/yealink.txt index 85f095a7ad0..0962c5c948b 100644 --- a/Documentation/input/yealink.txt +++ b/Documentation/input/yealink.txt @@ -2,7 +2,6 @@ Driver documentation for yealink usb-p1k phones 0. Status ~~~~~~~~~ - The p1k is a relatively cheap usb 1.1 phone with: - keyboard full support, yealink.ko / input event API - LCD full support, yealink.ko / sysfs API @@ -17,9 +16,8 @@ For vendor documentation see http://www.yealink.com 1. Compilation (stand alone version) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - Currently only kernel 2.6.x.y versions are supported. -In order to build the yealink.ko module do: +In order to build the yealink.ko module do make @@ -28,6 +26,21 @@ the Makefile is pointing to the location where your kernel sources are located, default /usr/src/linux. 
+1.1 Troubleshooting +~~~~~~~~~~~~~~~~~~~ +Q: Module yealink compiled and installed without any problem but phone + is not initialized and does not react to any actions. +A: If you see something like: + hiddev0: USB HID v1.00 Device [Yealink Network Technology Ltd. VOIP USB Phone + in dmesg, it means that the hid driver has grabbed the device first. Try to + load module yealink before any other usb hid driver. Please see the + instructions provided by your distribution on module configuration. + +Q: Phone is working now (displays version and accepts keypad input) but I can't + find the sysfs files. +A: The sysfs files are located on the particular usb endpoint. On most + distributions you can do: "find /sys/ -name get_icons" for a hint. + 2. keyboard features ~~~~~~~~~~~~~~~~~~~~ diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index 971589a9752..5dffcfefc3c 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -1460,8 +1460,6 @@ running once the system is up. stifb= [HW] Format: bpp:<bpp1>[:<bpp2>[:<bpp3>...]] - stram_swap= [HW,M68k] - swiotlb= [IA-64] Number of I/O TLB slabs switches= [HW,M68k] @@ -1517,8 +1515,6 @@ running once the system is up. uart6850= [HW,OSS] Format: <io>,<irq> - usb-handoff [HW] Enable early USB BIOS -> OS handoff - usbhid.mousepoll= [USBHID] The interval which mice are to be polled at. diff --git a/Documentation/m68k/kernel-options.txt b/Documentation/m68k/kernel-options.txt index e191baad830..d5d3f064f55 100644 --- a/Documentation/m68k/kernel-options.txt +++ b/Documentation/m68k/kernel-options.txt @@ -626,7 +626,7 @@ ignored (others aren't affected). can be performed in optimal order. Not all SCSI devices support tagged queuing (:-(). -4.6 switches= +4.5 switches= ------------- Syntax: switches=<list of switches> @@ -661,28 +661,6 @@ correctly. earlier initialization ("ov_"-less) takes precedence. But the switching-off on reset still happens in this case. 
-4.5) stram_swap= ----------------- - -Syntax: stram_swap=<do_swap>[,<max_swap>] - - This option is available only if the kernel has been compiled with -CONFIG_STRAM_SWAP enabled. Normally, the kernel then determines -dynamically whether to actually use ST-RAM as swap space. (Currently, -the fraction of ST-RAM must be less or equal 1/3 of total memory to -enable this swapping.) You can override the kernel's decision by -specifying this option. 1 for <do_swap> means always enable the swap, -even if you have less alternate RAM. 0 stands for never swap to -ST-RAM, even if it's small enough compared to the rest of memory. - - If ST-RAM swapping is enabled, the kernel usually uses all free -ST-RAM as swap "device". If the kernel resides in ST-RAM, the region -allocated by it is obviously never used for swapping :-) You can also -limit this amount by specifying the second parameter, <max_swap>, if -you want to use parts of ST-RAM as normal system memory. <max_swap> is -in kBytes and the number should be a multiple of 4 (otherwise: rounded -down). - 5) Options for Amiga Only: ========================== diff --git a/Documentation/mips/AU1xxx_IDE.README b/Documentation/mips/AU1xxx_IDE.README new file mode 100644 index 00000000000..a7e4c4ea356 --- /dev/null +++ b/Documentation/mips/AU1xxx_IDE.README @@ -0,0 +1,168 @@ +README for MIPS AU1XXX IDE driver - Released 2005-07-15 + +ABOUT +----- +This file describes 'drivers/ide/mips/au1xxx-ide.c', its related files and the +services they provide. + +If you are short on patience and just want to know how to add your hard disc to +the white or black list, go to the 'ADD NEW HARD DISC TO WHITE OR BLACK LIST' +section.
+ + +LICENSE +------- + +Copyright (c) 2003-2005 AMD, Personal Connectivity Solutions + +This program is free software; you can redistribute it and/or modify it under +the terms of the GNU General Public License as published by the Free Software +Foundation; either version 2 of the License, or (at your option) any later +version. + +THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, +INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND +FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR +BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR +CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF +SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN +CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) +ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE +POSSIBILITY OF SUCH DAMAGE. + +You should have received a copy of the GNU General Public License along with +this program; if not, write to the Free Software Foundation, Inc., +675 Mass Ave, Cambridge, MA 02139, USA. + +Note: for more information, please refer "AMD Alchemy Au1200/Au1550 IDE + Interface and Linux Device Driver" Application Note. 
+ + +FILES, CONFIGS AND COMPATIBILITY +-------------------------------- + +Two files are introduced: + + a) 'include/asm-mips/mach-au1x00/au1xxx_ide.h' + contains: struct _auide_hwif + struct drive_list_entry dma_white_list + struct drive_list_entry dma_black_list + timing parameters for PIO mode 0/1/2/3/4 + timing parameters for MWDMA 0/1/2 + + b) 'drivers/ide/mips/au1xxx-ide.c' + contains the functionality of the AU1XXX IDE driver + +Four config variables are introduced: + + CONFIG_BLK_DEV_IDE_AU1XXX_PIO_DBDMA - enable the PIO+DBDMA mode + CONFIG_BLK_DEV_IDE_AU1XXX_MDMA2_DBDMA - enable the MWDMA mode + CONFIG_BLK_DEV_IDE_AU1XXX_BURSTABLE_ON - set Burstable FIFO in DBDMA + controller + CONFIG_BLK_DEV_IDE_AU1XXX_SEQTS_PER_RQ - maximum transfer size + per descriptor + +If MWDMA is enabled and the connected hard disc is not on the white list, the +kernel switches to a "safe mwdma mode" at boot time. In this mode the IDE +performance is substantially slower than in full-speed mwdma. In this case +please add your hard disc to the white list (follow the instructions in the 'ADD NEW +HARD DISC TO WHITE OR BLACK LIST' section). + + +SUPPORTED IDE MODES +------------------- + +The AU1XXX IDE driver supports all PIO modes - PIO mode 0/1/2/3/4 - and all +MWDMA modes - MWDMA 0/1/2 -. There is no support for SWDMA or UDMA modes. + +To change the PIO mode use the program hdparm with option -p, e.g. +'hdparm -p0 [device]' for PIO mode 0. To enable the MWDMA mode use the option +-X, e.g. 'hdparm -X32 [device]' for MWDMA mode 0.
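The -X32 value quoted above follows the numbering described in the hdparm manual page, where 8+n selects PIO mode n and 32+n selects multiword DMA mode n. A minimal shell sketch of that arithmetic, restricted to the modes this README lists as supported, could look like the following. The function name xfer_mode_value is hypothetical; it is not part of hdparm or of the au1xxx driver.

```shell
# Hypothetical helper (not part of hdparm or the au1xxx driver):
# print the numeric argument for 'hdparm -X' for a given mode name.
# Only the modes the README lists as supported are accepted.
xfer_mode_value() {
    case "$1" in
        pio[0-4])   echo $(( 8 + ${1#pio} )) ;;     # PIO mode n   -> 8 + n
        mwdma[0-2]) echo $(( 32 + ${1#mwdma} )) ;;  # MWDMA mode n -> 32 + n
        *)          echo "unsupported mode: $1" >&2; return 1 ;;
    esac
}
```

For example, 'hdparm -X"$(xfer_mode_value mwdma2)" [device]' would request MWDMA mode 2, since 32 + 2 = 34.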
+ + +PERFORMANCE CONFIGURATIONS +-------------------------- + +If the system does not need USB support, enable the following kernel configs: + +CONFIG_IDE=y +CONFIG_BLK_DEV_IDE=y +CONFIG_IDE_GENERIC=y +CONFIG_BLK_DEV_IDEPCI=y +CONFIG_BLK_DEV_GENERIC=y +CONFIG_BLK_DEV_IDEDMA_PCI=y +CONFIG_IDEDMA_PCI_AUTO=y +CONFIG_BLK_DEV_IDE_AU1XXX=y +CONFIG_BLK_DEV_IDE_AU1XXX_MDMA2_DBDMA=y +CONFIG_BLK_DEV_IDE_AU1XXX_BURSTABLE_ON=y +CONFIG_BLK_DEV_IDE_AU1XXX_SEQTS_PER_RQ=128 +CONFIG_BLK_DEV_IDEDMA=y +CONFIG_IDEDMA_AUTO=y + +If the system needs USB support, enable the following kernel configs for +high IDE to USB throughput: + +CONFIG_BLK_DEV_IDEDISK=y +CONFIG_IDE_GENERIC=y +CONFIG_BLK_DEV_IDEPCI=y +CONFIG_BLK_DEV_GENERIC=y +CONFIG_BLK_DEV_IDEDMA_PCI=y +CONFIG_IDEDMA_PCI_AUTO=y +CONFIG_BLK_DEV_IDE_AU1XXX=y +CONFIG_BLK_DEV_IDE_AU1XXX_MDMA2_DBDMA=y +CONFIG_BLK_DEV_IDE_AU1XXX_SEQTS_PER_RQ=128 +CONFIG_BLK_DEV_IDEDMA=y +CONFIG_IDEDMA_AUTO=y + + +ADD NEW HARD DISC TO WHITE OR BLACK LIST +---------------------------------------- + +Step 1 : detect the model name of your hard disc + + a) connect your hard disc to the AU1XXX + + b) boot your kernel and get the hard disc model. + + Example boot log: + + --snipped-- + Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2 + ide: Assuming 50MHz system bus speed for PIO modes; override with idebus=xx + Au1xxx IDE(builtin) configured for MWDMA2 + Probing IDE interface ide0... + hda: Maxtor 6E040L0, ATA DISK drive + ide0 at 0xac800000-0xac800007,0xac8001c0 on irq 64 + hda: max request size: 64KiB + hda: 80293248 sectors (41110 MB) w/2048KiB Cache, CHS=65535/16/63, (U)DMA + --snipped-- + + In this example the model name is 'Maxtor 6E040L0'. + +Step 2 : edit 'include/asm-mips/mach-au1x00/au1xxx_ide.h' + + Add your hard disc to the dma_white_list or dma_black_list structure. + +Step 3 : Recompile the kernel + + Enable MWDMA support in the kernel configuration. Recompile the kernel and + reboot.
+ +Step 4 : Tests + + If you have added a hard disc to the white list, please run some stress tests + for verification. + + +ACKNOWLEDGMENTS +--------------- + +This driver would not have been possible without the kernel 2.4.x AU1XXX +IDE driver from AMD as its base. + +Additional input also from: +Matthias Lenk <matthias.lenk@amd.com> + +Happy hacking! +Enrico Walther <enrico.walther@amd.com> diff --git a/Documentation/networking/bonding.txt b/Documentation/networking/bonding.txt index a55f0f95b17..b0fe41da007 100644 --- a/Documentation/networking/bonding.txt +++ b/Documentation/networking/bonding.txt @@ -777,7 +777,7 @@ doing so is the same as described in the "Configuring Multiple Bonds Manually" section, below. NOTE: It has been observed that some Red Hat supplied kernels -are apparently unable to rename modules at load time (the "-obonding1" +are apparently unable to rename modules at load time (the "-o bond1" part). Attempts to pass that option to modprobe will produce an "Operation not permitted" error. This has been reported on some Fedora Core kernels, and has been seen on RHEL 4 as well. On kernels @@ -883,7 +883,8 @@ the above does not work, and the second bonding instance never sees its options. In that case, the second options line can be substituted as follows: -install bonding1 /sbin/modprobe bonding -obond1 mode=balance-alb miimon=50 +install bond1 /sbin/modprobe --ignore-install bonding -o bond1 \ + mode=balance-alb miimon=50 This may be repeated any number of times, specifying a new and unique name in place of bond1 for each subsequent instance. diff --git a/Documentation/networking/ip-sysctl.txt b/Documentation/networking/ip-sysctl.txt index b433c8a27e2..65895bb5141 100644 --- a/Documentation/networking/ip-sysctl.txt +++ b/Documentation/networking/ip-sysctl.txt @@ -309,7 +309,7 @@ tcp_tso_win_divisor - INTEGER can be consumed by a single TSO frame. The setting of this parameter is a choice between burstiness and building larger TSO frames.
- Default: 8 + Default: 3 tcp_frto - BOOLEAN Enables F-RTO, an enhanced recovery algorithm for TCP retransmission |