author     Linus Torvalds <torvalds@linux-foundation.org>  2008-07-20 17:43:29 -0700
committer  Linus Torvalds <torvalds@linux-foundation.org>  2008-07-20 17:43:29 -0700
commit     db6d8c7a4027b48d797b369a53f8470aaeed7063
tree       e140c104a89abc2154e1f41a7db8ebecbb6fa0b4 /Documentation/networking
parent     3a533374283aea50eab3976d8a6d30532175f009
parent     fb65a7c091529bfffb1262515252c0d0f6241c5c
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (1232 commits)
  iucv: Fix bad merging.
  net_sched: Add size table for qdiscs
  net_sched: Add accessor function for packet length for qdiscs
  net_sched: Add qdisc_enqueue wrapper
  highmem: Export totalhigh_pages.
  ipv6 mcast: Omit redundant address family checks in ip6_mc_source().
  net: Use standard structures for generic socket address structures.
  ipv6 netns: Make several "global" sysctl variables namespace aware.
  netns: Use net_eq() to compare net-namespaces for optimization.
  ipv6: remove unused macros from net/ipv6.h
  ipv6: remove unused parameter from ip6_ra_control
  tcp: fix kernel panic with listening_get_next
  tcp: Remove redundant checks when setting eff_sacks
  tcp: options clean up
  tcp: Fix MD5 signatures for non-linear skbs
  sctp: Update sctp global memory limit allocations.
  sctp: remove unnecessary byteshifting, calculate directly in big-endian
  sctp: Allow only 1 listening socket with SO_REUSEADDR
  sctp: Do not leak memory on multiple listen() calls
  sctp: Support ipv6only AF_INET6 sockets.
  ...
Diffstat (limited to 'Documentation/networking')
-rw-r--r--  Documentation/networking/bonding.txt                          | 110
-rw-r--r--  Documentation/networking/dm9000.txt                           | 167
-rw-r--r--  Documentation/networking/ip-sysctl.txt                        |  21
-rw-r--r--  Documentation/networking/ixgb.txt                             | 419
-rw-r--r--  Documentation/networking/mac80211_hwsim/README                |  67
-rw-r--r--  Documentation/networking/mac80211_hwsim/hostapd.conf          |  11
-rw-r--r--  Documentation/networking/mac80211_hwsim/wpa_supplicant.conf   |  10
-rw-r--r--  Documentation/networking/multiqueue.txt                       |  90
-rw-r--r--  Documentation/networking/s2io.txt                             |   7
9 files changed, 675 insertions, 227 deletions
diff --git a/Documentation/networking/bonding.txt b/Documentation/networking/bonding.txt
index a0cda062bc3..7fa7fe71d7a 100644
--- a/Documentation/networking/bonding.txt
+++ b/Documentation/networking/bonding.txt
@@ -289,35 +289,73 @@ downdelay
fail_over_mac
Specifies whether active-backup mode should set all slaves to
- the same MAC address (the traditional behavior), or, when
- enabled, change the bond's MAC address when changing the
- active interface (i.e., fail over the MAC address itself).
-
- Fail over MAC is useful for devices that cannot ever alter
- their MAC address, or for devices that refuse incoming
- broadcasts with their own source MAC (which interferes with
- the ARP monitor).
-
- The down side of fail over MAC is that every device on the
- network must be updated via gratuitous ARP, vs. just updating
- a switch or set of switches (which often takes place for any
- traffic, not just ARP traffic, if the switch snoops incoming
- traffic to update its tables) for the traditional method. If
- the gratuitous ARP is lost, communication may be disrupted.
-
- When fail over MAC is used in conjuction with the mii monitor,
- devices which assert link up prior to being able to actually
- transmit and receive are particularly susecptible to loss of
- the gratuitous ARP, and an appropriate updelay setting may be
- required.
-
- A value of 0 disables fail over MAC, and is the default. A
- value of 1 enables fail over MAC. This option is enabled
- automatically if the first slave added cannot change its MAC
- address. This option may be modified via sysfs only when no
- slaves are present in the bond.
-
- This option was added in bonding version 3.2.0.
+ the same MAC address at enslavement (the traditional
+ behavior), or, when enabled, perform special handling of the
+ bond's MAC address in accordance with the selected policy.
+
+ Possible values are:
+
+ none or 0
+
+ This setting disables fail_over_mac, and causes
+ bonding to set all slaves of an active-backup bond to
+ the same MAC address at enslavement time. This is the
+ default.
+
+ active or 1
+
+ The "active" fail_over_mac policy indicates that the
+ MAC address of the bond should always be the MAC
+ address of the currently active slave. The MAC
+ address of the slaves is not changed; instead, the MAC
+ address of the bond changes during a failover.
+
+ This policy is useful for devices that cannot ever
+ alter their MAC address, or for devices that refuse
+ incoming broadcasts with their own source MAC (which
+ interferes with the ARP monitor).
+
+ The down side of this policy is that every device on
+ the network must be updated via gratuitous ARP,
+ vs. just updating a switch or set of switches (which
+ often takes place for any traffic, not just ARP
+ traffic, if the switch snoops incoming traffic to
+ update its tables) for the traditional method. If the
+ gratuitous ARP is lost, communication may be
+ disrupted.
+
+ When this policy is used in conjunction with the mii
+ monitor, devices which assert link up prior to being
+ able to actually transmit and receive are particularly
+ susceptible to loss of the gratuitous ARP, and an
+ appropriate updelay setting may be required.
+
+ follow or 2
+
+ The "follow" fail_over_mac policy causes the MAC
+ address of the bond to be selected normally (normally
+ the MAC address of the first slave added to the bond).
+ However, the second and subsequent slaves are not set
+ to this MAC address while they are in a backup role; a
+ slave is programmed with the bond's MAC address at
+ failover time (and the formerly active slave receives
+ the newly active slave's MAC address).
+
+ This policy is useful for multiport devices that
+ either become confused or incur a performance penalty
+ when multiple ports are programmed with the same MAC
+ address.
+
+
+ The default policy is none, unless the first slave cannot
+ change its MAC address, in which case the active policy is
+ selected by default.
+
+ This option may be modified via sysfs only when no slaves are
+ present in the bond.
+
+ This option was added in bonding version 3.2.0. The "follow"
+ policy was added in bonding version 3.3.0.
lacp_rate
@@ -338,7 +376,8 @@ max_bonds
Specifies the number of bonding devices to create for this
instance of the bonding driver. E.g., if max_bonds is 3, and
the bonding driver is not already loaded, then bond0, bond1
- and bond2 will be created. The default value is 1.
+ and bond2 will be created. The default value is 1. Specifying
+ a value of 0 will load bonding, but will not create any devices.
miimon
@@ -501,6 +540,17 @@ mode
swapped with the new curr_active_slave that was
chosen.
+num_grat_arp
+
+ Specifies the number of gratuitous ARPs to be issued after a
+ failover event. One gratuitous ARP is issued immediately after
+ the failover; subsequent ARPs are sent at a rate of one per link
+ monitor interval (arp_interval or miimon, whichever is active).
+
+ The valid range is 0 - 255; the default value is 1. This option
+ affects only the active-backup mode. This option was added for
+ bonding version 3.3.0.
+
primary
A string (eth0, eth2, etc) specifying which slave is the
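
The fail_over_mac text above says the option can be changed through sysfs only while the bond has no slaves. The following is a minimal sketch of that check, not part of the patch. It assumes the bond already exists and that its options are exposed under /sys/class/net/<bond>/bonding/. The function name set_fail_over_mac and the SYSFS_ROOT override are hypothetical, introduced only so the logic can be tried against a scratch directory.

```bash
#!/bin/bash
# Sketch only: set the fail_over_mac policy of an existing bond via sysfs.
# SYSFS_ROOT is a hypothetical override for experimenting against a scratch
# directory; on a real system it is left at its default of /sys.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

set_fail_over_mac() {
    local bond="$1" policy="$2"
    local dir="$SYSFS_ROOT/class/net/$bond/bonding"

    # Accept the policy names and numeric values listed in bonding.txt.
    case "$policy" in
        none|active|follow|0|1|2) ;;
        *) echo "unknown fail_over_mac policy: $policy" >&2; return 2 ;;
    esac

    # The option may be modified only when no slaves are present in the bond.
    if [ -n "$(tr -d '[:space:]' < "$dir/slaves")" ]; then
        echo "$bond still has slaves; release them first" >&2
        return 1
    fi

    echo "$policy" > "$dir/fail_over_mac"
}
```

A call such as "set_fail_over_mac bond0 follow" (run as root) would then select the follow policy on a slave-less bond0. The same options can instead be given when the module is loaded, for example "modprobe bonding mode=active-backup miimon=100 fail_over_mac=follow num_grat_arp=3"; the values in that command are illustrative, not recommendations.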
diff --git a/Documentation/networking/dm9000.txt b/Documentation/networking/dm9000.txt
new file mode 100644
index 00000000000..65df3dea556
--- /dev/null
+++ b/Documentation/networking/dm9000.txt
@@ -0,0 +1,167 @@
+DM9000 Network driver
+=====================
+
+Copyright 2008 Simtec Electronics,
+ Ben Dooks <ben@simtec.co.uk> <ben-linux@fluff.org>
+
+
+Introduction
+------------
+
+This file describes how to use the DM9000 platform-device based network driver
+that is contained in the files drivers/net/dm9000.c and drivers/net/dm9000.h.
+
+The driver supports three DM9000 variants: the DM9000E, which is the first
+chip supported, as well as the newer DM9000A and DM9000B devices. It is currently
+maintained and tested by Ben Dooks, who should be CC'd on any patches for this
+driver.
+
+
+Defining the platform device
+----------------------------
+
+The minimum set of resources attached to the platform device is as follows:
+
+ 1) The physical address of the address register
+ 2) The physical address of the data register
+ 3) The IRQ line the device's interrupt pin is connected to.
+
+These resources should be specified in that order, as the ordering of the
+two address regions is important (the driver expects these to be address
+and then data).
+
+An example from arch/arm/mach-s3c2410/mach-bast.c is:
+
+static struct resource bast_dm9k_resource[] = {
+ [0] = {
+ .start = S3C2410_CS5 + BAST_PA_DM9000,
+ .end = S3C2410_CS5 + BAST_PA_DM9000 + 3,
+ .flags = IORESOURCE_MEM,
+ },
+ [1] = {
+ .start = S3C2410_CS5 + BAST_PA_DM9000 + 0x40,
+ .end = S3C2410_CS5 + BAST_PA_DM9000 + 0x40 + 0x3f,
+ .flags = IORESOURCE_MEM,
+ },
+ [2] = {
+ .start = IRQ_DM9000,
+ .end = IRQ_DM9000,
+ .flags = IORESOURCE_IRQ | IORESOURCE_IRQ_HIGHLEVEL,
+ }
+};
+
+static struct platform_device bast_device_dm9k = {
+ .name = "dm9000",
+ .id = 0,
+ .num_resources = ARRAY_SIZE(bast_dm9k_resource),
+ .resource = bast_dm9k_resource,
+};
+
+Note the setting of the IRQ trigger flag in bast_dm9k_resource[2].flags,
+as this will generate a warning if it is not present. The trigger from
+the flags field will be passed to request_irq() when registering the IRQ
+handler to ensure that the IRQ is set up correctly.
+
+This shows a typical platform device, without the optional configuration
+platform data supplied. The next example uses the same resources, but adds
+the optional platform data to pass extra configuration data:
+
+static struct dm9000_plat_data bast_dm9k_platdata = {
+ .flags = DM9000_PLATF_16BITONLY,
+};
+
+static struct platform_device bast_device_dm9k = {
+ .name = "dm9000",
+ .id = 0,
+ .num_resources = ARRAY_SIZE(bast_dm9k_resource),
+ .resource = bast_dm9k_resource,
+ .dev = {
+ .platform_data = &bast_dm9k_platdata,
+ }
+};
+
+The platform data is defined in include/linux/dm9000.h and described below.
+
+
+Platform data
+-------------
+
+Extra platform data for the DM9000 can describe the IO bus width to the
+device, whether or not an external PHY is attached to the device and
+the availability of an external configuration EEPROM.
+
+The flags for the platform data .flags field are as follows:
+
+DM9000_PLATF_8BITONLY
+
+ The IO should be done with 8bit operations.
+
+DM9000_PLATF_16BITONLY
+
+ The IO should be done with 16bit operations.
+
+DM9000_PLATF_32BITONLY
+
+ The IO should be done with 32bit operations.
+
+DM9000_PLATF_EXT_PHY
+
+ The chip is connected to an external PHY.
+
+DM9000_PLATF_NO_EEPROM
+
+ This can be used to signify that the board does not have an
+ EEPROM, or that the EEPROM should be hidden from the user.
+
+DM9000_PLATF_SIMPLE_PHY
+
+ Switch to using the simpler PHY polling method which does not
+ try and read the MII PHY state regularly. This is only available
+ when using the internal PHY. See the section on link state polling
+ for more information.
+
+ The config symbol DM9000_FORCE_SIMPLE_PHY_POLL, Kconfig entry
+ "Force simple NSR based PHY polling" allows this flag to be
+ forced on at build time.
+
+
+PHY Link state polling
+----------------------
+
+The driver keeps track of the link state and informs the network core
+about link (carrier) availability. This is managed by several methods
+depending on the version of the chip and on which PHY is being used.
+
+For the internal PHY, the original (and currently default) method is
+to read the MII state, either when the status changes if we have the
+necessary interrupt support in the chip or every two seconds via a
+periodic timer.
+
+To reduce the overhead for the internal PHY, there is now the option
+of using the DM9000_FORCE_SIMPLE_PHY_POLL config, or DM9000_PLATF_SIMPLE_PHY
+platform data option to read the summary information without the
+expensive MII accesses. This method is faster, but does not print
+as much information.
+
+When using an external PHY, the driver currently has to poll the MII
+link status as there is no method for getting an interrupt on link change.
+
+
+DM9000A / DM9000B
+-----------------
+
+These chips are functionally similar to the DM9000E and are supported easily
+by the same driver. The features are:
+
+ 1) Interrupt on internal PHY state change. This means that the periodic
+ polling of the PHY status may be disabled on these devices when using
+ the internal PHY.
+
+ 2) TCP/UDP checksum offloading, which the driver does not currently support.
+
+
+ethtool
+-------
+
+The driver supports the ethtool interface for access to the driver
+state information, the PHY state and the EEPROM.
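
As an illustration of the ethtool section above (not part of the patch), the sketch below prints one standard ethtool invocation for each of the three areas the text names. The helper name dm9000_ethtool_cmds and the default interface name eth0 are assumptions; substitute the interface your DM9000 is registered as.

```shell
#!/bin/sh
# Sketch only: print the ethtool commands covering the three areas named in
# dm9000.txt. The interface name is an assumption; pass the real one.
dm9000_ethtool_cmds() {
    ifname="${1:-eth0}"
    echo "ethtool -i $ifname"   # driver information: name, version, bus info
    echo "ethtool $ifname"      # PHY state: link detected, speed, duplex
    echo "ethtool -e $ifname"   # EEPROM: raw dump of the configuration EEPROM
}
```

The commands can be run by piping the output to a shell, for example "dm9000_ethtool_cmds eth0 | sh". On a board whose platform data sets DM9000_PLATF_NO_EEPROM, the EEPROM dump should be expected not to work, since that flag hides the EEPROM from the user.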
diff --git a/Documentation/networking/ip-sysctl.txt b/Documentation/networking/ip-sysctl.txt
index 946b66e1b65..d84932650fd 100644
--- a/Documentation/networking/ip-sysctl.txt
+++ b/Documentation/networking/ip-sysctl.txt
@@ -551,8 +551,9 @@ icmp_echo_ignore_broadcasts - BOOLEAN
icmp_ratelimit - INTEGER
Limit the maximal rates for sending ICMP packets whose type matches
icmp_ratemask (see below) to specific targets.
- 0 to disable any limiting, otherwise the maximal rate in jiffies(1)
- Default: 100
+ 0 to disable any limiting,
+ otherwise the minimal space between responses in milliseconds.
+ Default: 1000
icmp_ratemask - INTEGER
Mask made of ICMP types for which rates are being limited.
@@ -1023,11 +1024,23 @@ max_addresses - INTEGER
autoconfigured addresses.
Default: 16
+disable_ipv6 - BOOLEAN
+ Disable IPv6 operation.
+ Default: FALSE (enable IPv6 operation)
+
+accept_dad - INTEGER
+ Whether to accept DAD (Duplicate Address Detection).
+ 0: Disable DAD
+ 1: Enable DAD (default)
+ 2: Enable DAD, and disable IPv6 operation if MAC-based duplicate
+ link-local address has been found.
+
icmp/*:
ratelimit - INTEGER
Limit the maximal rates for sending ICMPv6 packets.
- 0 to disable any limiting, otherwise the maximal rate in jiffies(1)
- Default: 100
+ 0 to disable any limiting,
+ otherwise the minimal space between responses in milliseconds.
+ Default: 1000
IPv6 Update by:
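
The ratelimit hunks above change the unit from a rate to a minimal spacing in milliseconds, which makes the old and new defaults easy to misread. The sketch below is not part of the patch. It converts a spacing value into an approximate upper bound on responses per second to a single target, under the simplifying assumption that bursts are ignored; the function name icmp_max_rate is hypothetical.

```shell
#!/bin/sh
# Sketch only: convert an icmp_ratelimit value (minimal space between
# responses, in milliseconds) into an approximate per-target ceiling of
# responses per second. Uses integer division and ignores any burst allowance.
icmp_max_rate() {
    ms="$1"
    if [ "$ms" -eq 0 ]; then
        echo unlimited           # 0 disables any limiting
    else
        echo $(( 1000 / ms ))    # e.g. the default 1000 ms gives 1 per second
    fi
}
```

The values themselves are set with sysctl, for example "sysctl -w net.ipv4.icmp_ratelimit=1000" and "sysctl -w net.ipv6.icmp.ratelimit=1000". The per-interface options added in the same hunk follow the usual net.ipv6.conf.<interface> layout, for example "sysctl -w net.ipv6.conf.eth0.accept_dad=2", where eth0 is a placeholder interface name.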
diff --git a/Documentation/networking/ixgb.txt b/Documentation/networking/ixgb.txt
index 7c98277777e..a0d0ffb5e58 100644
--- a/Documentation/networking/ixgb.txt
+++ b/Documentation/networking/ixgb.txt
@@ -1,7 +1,7 @@
-Linux* Base Driver for the Intel(R) PRO/10GbE Family of Adapters
-================================================================
+Linux Base Driver for 10 Gigabit Intel(R) Network Connection
+=============================================================
-November 17, 2004
+October 9, 2007
Contents
@@ -9,94 +9,151 @@ Contents
- In This Release
- Identifying Your Adapter
+- Building and Installation
- Command Line Parameters
- Improving Performance
+- Additional Configurations
+- Known Issues/Troubleshooting
- Support
+
In This Release
===============
-This file describes the Linux* Base Driver for the Intel(R) PRO/10GbE Family
-of Adapters, version 1.0.x.
+This file describes the ixgb Linux Base Driver for the 10 Gigabit Intel(R)
+Network Connection. This driver includes support for Itanium(R)2-based
+systems.
+
+For questions related to hardware requirements, refer to the documentation
+supplied with your 10 Gigabit adapter. All hardware requirements listed apply
+to use with Linux.
+
+The following features are available in this kernel:
+ - Native VLANs
+ - Channel Bonding (teaming)
+ - SNMP
+
+Channel Bonding documentation can be found in the Linux kernel source:
+/Documentation/networking/bonding.txt
+
+The driver information previously displayed in the /proc filesystem is not
+supported in this release. Alternatively, you can use ethtool (version 1.6
+or later), lspci, and ifconfig to obtain the same information.
+
+Instructions on updating ethtool can be found in the section "Additional
+Configurations" later in this document.
-For questions related to hardware requirements, refer to the documentation
-supplied with your Intel PRO/10GbE adapter. All hardware requirements listed
-apply to use with Linux.
Identifying Your Adapter
========================
-To verify your Intel adapter is supported, find the board ID number on the
-adapter. Look for a label that has a barcode and a number in the format
-A12345-001.
+The following Intel network adapters are compatible with the drivers in this
+release:
+
+Controller Adapter Name Physical Layer
+---------- ------------ --------------
+82597EX Intel(R) PRO/10GbE LR/SR/CX4 10G Base-LR (1310 nm optical fiber)
+ Server Adapters 10G Base-SR (850 nm optical fiber)
+ 10G Base-CX4(twin-axial copper cabling)
+
+For more information on how to identify your adapter, go to the Adapter &
+Driver ID Guide at:
+
+ http://support.intel.com/support/network/sb/CS-012904.htm
+
+
+Building and Installation
+=========================
+
+select m for "Intel(R) PRO/10GbE support" located at:
+ Location:
+ -> Device Drivers
+ -> Network device support (NETDEVICES [=y])
+ -> Ethernet (10000 Mbit) (NETDEV_10000 [=y])
+1. make modules && make modules_install
+
+2. Load the module:
+
+    modprobe ixgb <parameter>=<value>
+
+ The insmod command can be used if the full
+ path to the driver module is specified. For example:
+
+ insmod /lib/modules/<KERNEL VERSION>/kernel/drivers/net/ixgb/ixgb.ko
+
+ With 2.6 based kernels also make sure that older ixgb drivers are
+ removed from the kernel, before loading the new module:
-Use the above information and the Adapter & Driver ID Guide at:
+ rmmod ixgb; modprobe ixgb
- http://support.intel.com/support/network/adapter/pro100/21397.htm
+3. Assign an IP address to the interface by entering the following, where
+ x is the interface number:
-For the latest Intel network drivers for Linux, go to:
+ ifconfig ethx <IP_address>
+
+4. Verify that the interface works. Enter the following, where <IP_address>
+ is the IP address for another machine on the same subnet as the interface
+ that is being tested:
+
+ ping <IP_address>
- http://downloadfinder.intel.com/scripts-df/support_intel.asp
Command Line Parameters
=======================
-If the driver is built as a module, the following optional parameters are
-used by entering them on the command line with the modprobe or insmod command
-using this syntax:
+If the driver is built as a module, the following optional parameters are
+used by entering them on the command line with the modprobe command using
+this syntax:
modprobe ixgb [<option>=<VAL1>,<VAL2>,...]
- insmod ixgb [<option>=<VAL1>,<VAL2>,...]
+For example, with two 10GbE PCI adapters, entering:
-For example, with two PRO/10GbE PCI adapters, entering:
+ modprobe ixgb TxDescriptors=80,128
- insmod ixgb TxDescriptors=80,128
-
-loads the ixgb driver with 80 TX resources for the first adapter and 128 TX
+loads the ixgb driver with 80 TX resources for the first adapter and 128 TX
resources for the second adapter.
The default value for each parameter is generally the recommended setting,
-unless otherwise noted. Also, if the driver is statically built into the
-kernel, the driver is loaded with the default values for all the parameters.
-Ethtool can be used to change some of the parameters at runtime.
+unless otherwise noted.
FlowControl
Valid Range: 0-3 (0=none, 1=Rx only, 2=Tx only, 3=Rx&Tx)
Default: Read from the EEPROM
- If EEPROM is not detected, default is 3
- This parameter controls the automatic generation(Tx) and response(Rx) to
- Ethernet PAUSE frames.
+ If EEPROM is not detected, default is 1
+ This parameter controls the automatic generation(Tx) and response(Rx) to
+ Ethernet PAUSE frames. There are hardware bugs associated with enabling
+ Tx flow control, so enable it with caution.
RxDescriptors
Valid Range: 64-512
Default Value: 512
- This value is the number of receive descriptors allocated by the driver.
- Increasing this value allows the driver to buffer more incoming packets.
- Each descriptor is 16 bytes. A receive buffer is also allocated for
- each descriptor and can be either 2048, 4056, 8192, or 16384 bytes,
- depending on the MTU setting. When the MTU size is 1500 or less, the
+ This value is the number of receive descriptors allocated by the driver.
+ Increasing this value allows the driver to buffer more incoming packets.
+ Each descriptor is 16 bytes. A receive buffer is also allocated for
+ each descriptor and can be either 2048, 4056, 8192, or 16384 bytes,
+ depending on the MTU setting. When the MTU size is 1500 or less, the
receive buffer size is 2048 bytes. When the MTU is greater than 1500 the
- receive buffer size will be either 4056, 8192, or 16384 bytes. The
+ receive buffer size will be either 4056, 8192, or 16384 bytes. The
maximum MTU size is 16114.
RxIntDelay
Valid Range: 0-65535 (0=off)
-Default Value: 6
- This value delays the generation of receive interrupts in units of
- 0.8192 microseconds. Receive interrupt reduction can improve CPU
- efficiency if properly tuned for specific network traffic. Increasing
- this value adds extra latency to frame reception and can end up
- decreasing the throughput of TCP traffic. If the system is reporting
- dropped receives, this value may be set too high, causing the driver to
+Default Value: 72
+ This value delays the generation of receive interrupts in units of
+ 0.8192 microseconds. Receive interrupt reduction can improve CPU
+ efficiency if properly tuned for specific network traffic. Increasing
+ this value adds extra latency to frame reception and can end up
+ decreasing the throughput of TCP traffic. If the system is reporting
+ dropped receives, this value may be set too high, causing the driver to
run out of available receive descriptors.
TxDescriptors
Valid Range: 64-4096
Default Value: 256
This value is the number of transmit descriptors allocated by the driver.
- Increasing this value allows the driver to queue more transmits. Each
+ Increasing this value allows the driver to queue more transmits. Each
descriptor is 16 bytes.
XsumRX
@@ -105,51 +162,49 @@ Default Value: 1
A value of '1' indicates that the driver should enable IP checksum
offload for received packets (both UDP and TCP) to the adapter hardware.
-XsumTX
-Valid Range: 0-1
-Default Value: 1
- A value of '1' indicates that the driver should enable IP checksum
- offload for transmitted packets (both UDP and TCP) to the adapter
- hardware.
Improving Performance
=====================
-With the Intel PRO/10 GbE adapter, the default Linux configuration will very
-likely limit the total available throughput artificially. There is a set of
-things that when applied together increase the ability of Linux to transmit
-and receive data. The following enhancements were originally acquired from
-settings published at http://www.spec.org/web99 for various submitted results
-using Linux.
+With the 10 Gigabit server adapters, the default Linux configuration will
+very likely limit the total available throughput artificially. There is a set
+of configuration changes that, when applied together, will increase the ability
+of Linux to transmit and receive data. The following enhancements were
+originally acquired from settings published at http://www.spec.org/web99/ for
+various submitted results using Linux.
-NOTE: These changes are only suggestions, and serve as a starting point for
-tuning your network performance.
+NOTE: These changes are only suggestions, and serve as a starting point for
+ tuning your network performance.
The changes are made in three major ways, listed in order of greatest effect:
-- Use ifconfig to modify the mtu (maximum transmission unit) and the txqueuelen
+- Use ifconfig to modify the mtu (maximum transmission unit) and the txqueuelen
parameter.
- Use sysctl to modify /proc parameters (essentially kernel tuning)
-- Use setpci to modify the MMRBC field in PCI-X configuration space to increase
+- Use setpci to modify the MMRBC field in PCI-X configuration space to increase
transmit burst lengths on the bus.
-NOTE: setpci modifies the adapter's configuration registers to allow it to read
-up to 4k bytes at a time (for transmits). However, for some systems the
-behavior after modifying this register may be undefined (possibly errors of some
-kind). A power-cycle, hard reset or explicitly setting the e6 register back to
-22 (setpci -d 8086:1048 e6.b=22) may be required to get back to a stable
-configuration.
+NOTE: setpci modifies the adapter's configuration registers to allow it to read
+up to 4k bytes at a time (for transmits). However, for some systems the
+behavior after modifying this register may be undefined (possibly errors of
+some kind). A power-cycle, hard reset or explicitly setting the e6 register
+back to 22 (setpci -d 8086:1a48 e6.b=22) may be required to get back to a
+stable configuration.
- COPY these lines and paste them into ixgb_perf.sh:
#!/bin/bash
-echo "configuring network performance , edit this file to change the interface"
+echo "configuring network performance , edit this file to change the interface
+or device ID of 10GbE card"
# set mmrbc to 4k reads, modify only Intel 10GbE device IDs
-setpci -d 8086:1048 e6.b=2e
-# set the MTU (max transmission unit) - it requires your switch and clients to change too!
+# replace 1a48 with appropriate 10GbE device's ID installed on the system,
+# if needed.
+setpci -d 8086:1a48 e6.b=2e
+# set the MTU (max transmission unit) - it requires your switch and clients
+# to change as well.
# set the txqueuelen
# your ixgb adapter should be loaded as eth1 for this to work, change if needed
ifconfig eth1 mtu 9000 txqueuelen 1000 up
-# call the sysctl utility to modify /proc/sys entries
-sysctl -p ./sysctl_ixgb.conf
+# call the sysctl utility to modify /proc/sys entries
+sysctl -p ./sysctl_ixgb.conf
- END ixgb_perf.sh
- COPY these lines and paste them into sysctl_ixgb.conf:
@@ -159,54 +214,220 @@ sysctl -p ./sysctl_ixgb.conf
# several network benchmark tests, your mileage may vary
### IPV4 specific settings
-net.ipv4.tcp_timestamps = 0 # turns TCP timestamp support off, default 1, reduces CPU use
-net.ipv4.tcp_sack = 0 # turn SACK support off, default on
-# on systems with a VERY fast bus -> memory interface this is the big gainer
-net.ipv4.tcp_rmem = 10000000 10000000 10000000 # sets min/default/max TCP read buffer, default 4096 87380 174760
-net.ipv4.tcp_wmem = 10000000 10000000 10000000 # sets min/pressure/max TCP write buffer, default 4096 16384 131072
-net.ipv4.tcp_mem = 10000000 10000000 10000000 # sets min/pressure/max TCP buffer space, default 31744 32256 32768
+# turn TCP timestamp support off, default 1, reduces CPU use
+net.ipv4.tcp_timestamps = 0
+# turn SACK support off, default on
+# on systems with a VERY fast bus -> memory interface this is the big gainer
+net.ipv4.tcp_sack = 0
+# set min/default/max TCP read buffer, default 4096 87380 174760
+net.ipv4.tcp_rmem = 10000000 10000000 10000000
+# set min/pressure/max TCP write buffer, default 4096 16384 131072
+net.ipv4.tcp_wmem = 10000000 10000000 10000000
+# set min/pressure/max TCP buffer space, default 31744 32256 32768
+net.ipv4.tcp_mem = 10000000 10000000 10000000
### CORE settings (mostly for socket and UDP effect)
-net.core.rmem_max = 524287 # maximum receive socket buffer size, default 131071
-net.core.wmem_max = 524287 # maximum send socket buffer size, default 131071
-net.core.rmem_default = 524287 # default receive socket buffer size, default 65535
-net.core.wmem_default = 524287 # default send socket buffer size, default 65535
-net.core.optmem_max = 524287 # maximum amount of option memory buffers, default 10240
-net.core.netdev_max_backlog = 300000 # number of unprocessed input packets before kernel starts dropping them, default 300
+# set maximum receive socket buffer size, default 131071
+net.core.rmem_max = 524287
+# set maximum send socket buffer size, default 131071
+net.core.wmem_max = 524287
+# set default receive socket buffer size, default 65535
+net.core.rmem_default = 524287
+# set default send socket buffer size, default 65535
+net.core.wmem_default = 524287
+# set maximum amount of option memory buffers, default 10240
+net.core.optmem_max = 524287
+# set number of unprocessed input packets before kernel starts dropping them; default 300
+net.core.netdev_max_backlog = 300000
- END sysctl_ixgb.conf
-Edit the ixgb_perf.sh script if necessary to change eth1 to whatever interface
-your ixgb driver is using.
+Edit the ixgb_perf.sh script if necessary to change eth1 to whatever interface
+your ixgb driver is using and/or replace '1a48' with appropriate 10GbE device's
+ID installed on the system.
-NOTE: Unless these scripts are added to the boot process, these changes will
-only last only until the next system reboot.
+NOTE: Unless these scripts are added to the boot process, these changes will
+ last only until the next system reboot.
Resolving Slow UDP Traffic
--------------------------
+If your server does not seem to be able to receive UDP traffic as fast as it
+can receive TCP traffic, it could be because Linux, by default, does not set
+the network stack buffers as large as they need to be to support high UDP
+transfer rates. One way to alleviate this problem is to allow more memory to
+be used by the IP stack to store incoming data.
-If your server does not seem to be able to receive UDP traffic as fast as it
-can receive TCP traffic, it could be because Linux, by default, does not set
-the network stack buffers as large as they need to be to support high UDP
-transfer rates. One way to alleviate this problem is to allow more memory to
-be used by the IP stack to store incoming data.
-
-For instance, use the commands:
+For instance, use the commands:
sysctl -w net.core.rmem_max=262143
and
sysctl -w net.core.rmem_default=262143
-to increase the read buffer memory max and default to 262143 (256k - 1) from
-defaults of max=131071 (128k - 1) and default=65535 (64k - 1). These variables
-will increase the amount of memory used by the network stack for receives, and
+to increase the read buffer memory max and default to 262143 (256k - 1) from
+defaults of max=131071 (128k - 1) and default=65535 (64k - 1). These variables
+will increase the amount of memory used by the network stack for receives, and
can be increased significantly more if necessary for your application.
+
+Additional Configurations
+=========================
+
+ Configuring the Driver on Different Distributions
+ -------------------------------------------------
+ Configuring a network driver to load properly when the system is started is
+ distribution dependent. Typically, the configuration process involves adding
+ an alias line to /etc/modprobe.conf as well as editing other system startup
+ scripts and/or configuration files. Many popular Linux distributions ship
+ with tools to make these changes for you. To learn the proper way to
+ configure a network device for your system, refer to your distribution
+ documentation. If during this process you are asked for the driver or module
+ name, the name for the Linux Base Driver for the Intel 10GbE Family of
+ Adapters is ixgb.
+
+ Viewing Link Messages
+ ---------------------
+ Link messages will not be displayed to the console if the distribution is
+ restricting system messages. In order to see network driver link messages on
+ your console, set dmesg to eight by entering the following:
+
+ dmesg -n 8
+
+ NOTE: This setting is not saved across reboots.
+
+
+ Jumbo Frames
+ ------------
+ The driver supports Jumbo Frames for all adapters. Jumbo Frames support is
+ enabled by changing the MTU to a value larger than the default of 1500.
+ The maximum value for the MTU is 16114. Use the ifconfig command to
+ increase the MTU size. For example:
+
+ ifconfig ethx mtu 9000 up
+
+ The maximum MTU setting for Jumbo Frames is 16114. This value coincides
+ with the maximum Jumbo Frames size of 16128.
+
+
+ Ethtool
+ -------
+ The driver utilizes the ethtool interface for driver configuration and
+ diagnostics, as well as displaying statistical information. Ethtool
+ version 1.6 or later is required for this functionality.
+
+ The latest release of ethtool can be found at:
+ http://sourceforge.net/projects/gkernel
+
+ NOTE: Ethtool 1.6 only supports a limited set of ethtool options. Support
+ for a more complete ethtool feature set can be enabled by upgrading
+ to the latest version.
+
+
+ NAPI
+ ----
+
+ NAPI (Rx polling mode) is supported in the ixgb driver. NAPI is enabled
+ or disabled based on the configuration of the kernel. See CONFIG_IXGB_NAPI.
+
+ See www.cyberus.ca/~hadi/usenix-paper.tgz for more information on NAPI.
+
+
+Known Issues/Troubleshooting
+============================
+
+ NOTE: After installing the driver, if your Intel Network Connection is not
+ working, verify in the "In This Release" section of the readme that you have
+ installed the correct driver.
+
+ Intel(R) PRO/10GbE CX4 Server Adapter Cable Interoperability Issue with
+ Fujitsu XENPAK Module in SmartBits Chassis
+ ---------------------------------------------------------------------
+ Excessive CRC errors may be observed if the Intel(R) PRO/10GbE CX4
+ Server adapter is connected to a Fujitsu XENPAK CX4 module in a SmartBits
+ chassis using 15 m/24AWG cable assemblies manufactured by Fujitsu or Leoni.
+ The CRC errors may be received either by the Intel(R) PRO/10GbE CX4
+ Server adapter or the SmartBits. If this situation occurs, using a different
+ cable assembly may resolve the issue.
+
+ CX4 Server Adapter Cable Interoperability Issues with HP Procurve 3400cl
+ Switch Port
+ ------------------------------------------------------------------------
+ Excessive CRC errors may be observed if the Intel(R) PRO/10GbE CX4 Server
+ adapter is connected to an HP Procurve 3400cl switch port using short cables
+ (1 m or shorter). If this situation occurs, using a longer cable may resolve
+ the issue.
+
+ Excessive CRC errors may be observed using Fujitsu 24AWG cable assemblies that
+ are 10 m or longer, or when using a Leoni 15 m/24AWG cable assembly. The CRC
+ errors may be received either by the CX4 Server adapter or at the switch. If
+ this situation occurs, using a different cable assembly may resolve the issue.
+
+
+ Jumbo Frames System Requirement
+ -------------------------------
+ Memory allocation failures have been observed on Linux systems with 64 MB
+ of RAM or less that are running Jumbo Frames. If you are using Jumbo
+ Frames, your system may require more than the advertised minimum
+ requirement of 64 MB of system memory.
+
+
+ Performance Degradation with Jumbo Frames
+ -----------------------------------------
+ Degradation in throughput performance may be observed in some Jumbo Frames
+ environments. If this is observed, increasing the application's socket buffer
+ size and/or increasing the /proc/sys/net/ipv4/tcp_*mem entry values may help.
+ See the specific application manual and /usr/src/linux*/Documentation/
+ networking/ip-sysctl.txt for more details.
+
+
+ Allocating Rx Buffers when Using Jumbo Frames
+ ---------------------------------------------
+ Allocating Rx buffers when using Jumbo Frames on 2.6.x kernels may fail if
+ the available memory is heavily fragmented. This issue may be seen with PCI-X
+ adapters or with packet split disabled. It can be reduced or eliminated by
+ increasing the amount of memory available for receive buffer allocation,
+ which is done by raising /proc/sys/vm/min_free_kbytes.
+
+
+ Multiple Interfaces on Same Ethernet Broadcast Network
+ ------------------------------------------------------
+ Due to the default ARP behavior on Linux, it is not possible to have
+ one system on two IP networks in the same Ethernet broadcast domain
+ (non-partitioned switch) behave as expected. All Ethernet interfaces
+ will respond to IP traffic for any IP address assigned to the system.
+ This results in unbalanced receive traffic.
+
+ If you have multiple interfaces in a server, do either of the following:
+
+ - Turn on ARP filtering by entering:
+ echo 1 > /proc/sys/net/ipv4/conf/all/arp_filter
+
+ - Install the interfaces in separate broadcast domains - either in
+ different switches or in a switch partitioned to VLANs.
+
+
+ UDP Stress Test Dropped Packet Issue
+ --------------------------------------
+ Under a small-packet UDP stress test with the 10GbE driver, the Linux system
+ may drop UDP packets because the socket buffers are full. You may want
+ to change the driver's Flow Control variables to the minimum value for
+ controlling packet reception.
+
+
+ Tx Hangs Possible Under Stress
+ ------------------------------
+ Under stress conditions, if Tx hangs occur, turning off TSO with
+ "ethtool -K eth0 tso off" may resolve the problem.
+
+
Support
=======
-For general information and support, go to the Intel support website at:
+For general information, go to the Intel support website at:
http://support.intel.com
+or the Intel Wired Networking project hosted by Sourceforge at:
+
+ http://sourceforge.net/projects/e1000
+
If an issue is identified with the released source code on the supported
-kernel with a supported adapter, email the specific information related to
-the issue to linux.nics@intel.com.
+kernel with a supported adapter, email the specific information related
+to the issue to e1000-devel@lists.sf.net.
diff --git a/Documentation/networking/mac80211_hwsim/README b/Documentation/networking/mac80211_hwsim/README
new file mode 100644
index 00000000000..2ff8ccb8dc3
--- /dev/null
+++ b/Documentation/networking/mac80211_hwsim/README
@@ -0,0 +1,67 @@
+mac80211_hwsim - software simulator of 802.11 radio(s) for mac80211
+Copyright (c) 2008, Jouni Malinen <j@w1.fi>
+
+This program is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License version 2 as
+published by the Free Software Foundation.
+
+
+Introduction
+
+mac80211_hwsim is a Linux kernel module that can be used to simulate an
+arbitrary number of IEEE 802.11 radios for mac80211. It can be used to
+test most of the mac80211 functionality and user space tools (e.g.,
+hostapd and wpa_supplicant) in a way that matches very closely with
+the normal case of using real WLAN hardware. From the mac80211
+viewpoint, mac80211_hwsim is yet another hardware driver, i.e., no changes
+to mac80211 are needed to use this testing tool.
+
+The main goal for mac80211_hwsim is to make it easier for developers
+to test their code and work with new features to mac80211, hostapd,
+and wpa_supplicant. The simulated radios do not have the limitations
+of real hardware, so it is easy to generate an arbitrary test setup
+and always reproduce the same setup for future tests. In addition,
+since all radio operation is simulated, any channel can be used in
+tests regardless of regulatory rules.
+
+The mac80211_hwsim kernel module has a parameter 'radios' that can be used
+to select how many radios are simulated (default 2). This allows
+configuration of both very simple setups (e.g., just a single access
+point and a station) and large-scale tests (multiple access points with
+hundreds of stations).
+
+mac80211_hwsim works by tracking the current channel of each virtual
+radio and copying all transmitted frames to all other radios that are
+currently enabled and on the same channel as the transmitting
+radio. Software encryption in mac80211 is used so that the frames are
+actually encrypted over the virtual air interface to allow more
+complete testing of encryption.
+
+A global monitoring netdev, hwsim#, is created independent of
+mac80211. This interface can be used to monitor all transmitted frames
+regardless of channel.
+
+
+Simple example
+
+This example shows how to use mac80211_hwsim to simulate two radios:
+one to act as an access point and the other as a station that
+associates with the AP. hostapd and wpa_supplicant are used to take
+care of WPA2-PSK authentication. In addition, hostapd handles the
+access point side of association.
+
+Please note that the current Linux kernel does not enable AP mode, so a
+simple patch is needed to enable AP mode selection:
+http://johannes.sipsolutions.net/patches/kernel/all/LATEST/006-allow-ap-vlan-modes.patch
+
+
+# Build mac80211_hwsim as part of kernel configuration
+
+# Load the module
+modprobe mac80211_hwsim
+
+# Run hostapd (AP) for wlan0
+hostapd hostapd.conf
+
+# Run wpa_supplicant (station) for wlan1
+wpa_supplicant -Dwext -iwlan1 -c wpa_supplicant.conf
diff --git a/Documentation/networking/mac80211_hwsim/hostapd.conf b/Documentation/networking/mac80211_hwsim/hostapd.conf
new file mode 100644
index 00000000000..08cde7e35f2
--- /dev/null
+++ b/Documentation/networking/mac80211_hwsim/hostapd.conf
@@ -0,0 +1,11 @@
+interface=wlan0
+driver=nl80211
+
+hw_mode=g
+channel=1
+ssid=mac80211 test
+
+wpa=2
+wpa_key_mgmt=WPA-PSK
+wpa_pairwise=CCMP
+wpa_passphrase=12345678
diff --git a/Documentation/networking/mac80211_hwsim/wpa_supplicant.conf b/Documentation/networking/mac80211_hwsim/wpa_supplicant.conf
new file mode 100644
index 00000000000..299128cff03
--- /dev/null
+++ b/Documentation/networking/mac80211_hwsim/wpa_supplicant.conf
@@ -0,0 +1,10 @@
+ctrl_interface=/var/run/wpa_supplicant
+
+network={
+ ssid="mac80211 test"
+ psk="12345678"
+ key_mgmt=WPA-PSK
+ proto=WPA2
+ pairwise=CCMP
+ group=CCMP
+}
diff --git a/Documentation/networking/multiqueue.txt b/Documentation/networking/multiqueue.txt
index ea5a42e8f79..d391ea63114 100644
--- a/Documentation/networking/multiqueue.txt
+++ b/Documentation/networking/multiqueue.txt
@@ -3,19 +3,11 @@
===========================================
Section 1: Base driver requirements for implementing multiqueue support
-Section 2: Qdisc support for multiqueue devices
-Section 3: Brief howto using PRIO or RR for multiqueue devices
-
Intro: Kernel support for multiqueue devices
---------------------------------------------------------
-Kernel support for multiqueue devices is only an API that is presented to the
-netdevice layer for base drivers to implement. This feature is part of the
-core networking stack, and all network devices will be running on the
-multiqueue-aware stack. If a base driver only has one queue, then these
-changes are transparent to that driver.
-
+Kernel support for multiqueue devices is always present.
Section 1: Base driver requirements for implementing multiqueue support
-----------------------------------------------------------------------
@@ -32,84 +24,4 @@ netif_{start|stop|wake}_subqueue() functions to manage each queue while the
device is still operational. netdev->queue_lock is still used when the device
comes online or when it's completely shut down (unregister_netdev(), etc.).
-Finally, the base driver should indicate that it is a multiqueue device. The
-feature flag NETIF_F_MULTI_QUEUE should be added to the netdev->features
-bitmap on device initialization. Below is an example from e1000:
-
-#ifdef CONFIG_E1000_MQ
- if ( (adapter->hw.mac.type == e1000_82571) ||
- (adapter->hw.mac.type == e1000_82572) ||
- (adapter->hw.mac.type == e1000_80003es2lan))
- netdev->features |= NETIF_F_MULTI_QUEUE;
-#endif
-
-
-Section 2: Qdisc support for multiqueue devices
------------------------------------------------
-
-Currently two qdiscs support multiqueue devices. A new round-robin qdisc,
-sch_rr, and sch_prio. The qdisc is responsible for classifying the skb's to
-bands and queues, and will store the queue mapping into skb->queue_mapping.
-Use this field in the base driver to determine which queue to send the skb
-to.
-
-sch_rr has been added for hardware that doesn't want scheduling policies from
-software, so it's a straight round-robin qdisc. It uses the same syntax and
-classification priomap that sch_prio uses, so it should be intuitive to
-configure for people who've used sch_prio.
-
-In order to utilitize the multiqueue features of the qdiscs, the network
-device layer needs to enable multiple queue support. This can be done by
-selecting NETDEVICES_MULTIQUEUE under Drivers.
-
-The PRIO qdisc naturally plugs into a multiqueue device. If
-NETDEVICES_MULTIQUEUE is selected, then on qdisc load, the number of
-bands requested is compared to the number of queues on the hardware. If they
-are equal, it sets a one-to-one mapping up between the queues and bands. If
-they're not equal, it will not load the qdisc. This is the same behavior
-for RR. Once the association is made, any skb that is classified will have
-skb->queue_mapping set, which will allow the driver to properly queue skb's
-to multiple queues.
-
-
-Section 3: Brief howto using PRIO and RR for multiqueue devices
----------------------------------------------------------------
-
-The userspace command 'tc,' part of the iproute2 package, is used to configure
-qdiscs. To add the PRIO qdisc to your network device, assuming the device is
-called eth0, run the following command:
-
-# tc qdisc add dev eth0 root handle 1: prio bands 4 multiqueue
-
-This will create 4 bands, 0 being highest priority, and associate those bands
-to the queues on your NIC. Assuming eth0 has 4 Tx queues, the band mapping
-would look like:
-
-band 0 => queue 0
-band 1 => queue 1
-band 2 => queue 2
-band 3 => queue 3
-
-Traffic will begin flowing through each queue if your TOS values are assigning
-traffic across the various bands. For example, ssh traffic will always try to
-go out band 0 based on TOS -> Linux priority conversion (realtime traffic),
-so it will be sent out queue 0. ICMP traffic (pings) fall into the "normal"
-traffic classification, which is band 1. Therefore pings will be send out
-queue 1 on the NIC.
-
-Note the use of the multiqueue keyword. This is only in versions of iproute2
-that support multiqueue networking devices; if this is omitted when loading
-a qdisc onto a multiqueue device, the qdisc will load and operate the same
-if it were loaded onto a single-queue device (i.e. - sends all traffic to
-queue 0).
-
-Another alternative to multiqueue band allocation can be done by using the
-multiqueue option and specify 0 bands. If this is the case, the qdisc will
-allocate the number of bands to equal the number of queues that the device
-reports, and bring the qdisc online.
-
-The behavior of tc filters remains the same, where it will override TOS priority
-classification.
-
-
Author: Peter P. Waskiewicz Jr. <peter.p.waskiewicz.jr@intel.com>
diff --git a/Documentation/networking/s2io.txt b/Documentation/networking/s2io.txt
index 1e28e2ddb90..c3d6b4d5d01 100644
--- a/Documentation/networking/s2io.txt
+++ b/Documentation/networking/s2io.txt
@@ -52,13 +52,10 @@ d. MSI/MSI-X. Can be enabled on platforms which support this feature
(IA64, Xeon) resulting in noticeable performance improvement(upto 7%
on certain platforms).
-e. NAPI. Compile-time option(CONFIG_S2IO_NAPI) for better Rx interrupt
-moderation.
-
-f. Statistics. Comprehensive MAC-level and software statistics displayed
+e. Statistics. Comprehensive MAC-level and software statistics displayed
using "ethtool -S" option.
-g. Multi-FIFO/Ring. Supports up to 8 transmit queues and receive rings,
+f. Multi-FIFO/Ring. Supports up to 8 transmit queues and receive rings,
with multiple steering options.
4. Command line parameters