aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2008-08-13Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (56 commits) netns: Fix crash by making igmp per namespace bnx2x: Version update bnx2x: Checkpatch compliance bnx2x: Spelling mistakes bnx2x: Minor code improvements bnx2x: Driver info bnx2x: 1G LED does not turn off bnx2x: 8073 PHY changes bnx2x: Change GPIO for any port bnx2x: Pause settings bnx2x: Link order with external PHY bnx2x: No LRO without Rx checksum bnx2x: Wrong structure size bnx2x: WoL capability bnx2x: Clearing MAC addresses filters bnx2x: Delay in while loops bnx2x: PBA Table Page Alignment Workaround bnx2x: Self-test false positive bnx2x: Memory allocation bnx2x: HW attention lock ...
2008-08-13Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6: sparc64: Handle stack trace attempts before irqstacks are setup. sparc64: Implement IRQ stacks. sparc: remove include of linux/of_device.h from asm/of_device.h sparc64: Fix recursion in stack overflow detection handling. sparc/drivers: use linux/of_device.h instead of asm/of_device.h sparc64: Don't MAGIC_SYSRQ ifdef smp_fetch_global_regs and support code.
2008-08-13sparc64: Handle stack trace attempts before irqstacks are setup.David S. Miller
Things like lockdep can try to do stack backtraces before the irqstack blocks have been setup. So don't try to match their ranges so early on. Also, remove unused variable in save_stack_trace(). Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-13netns: Fix crash by making igmp per namespaceDaniel Lezcano
This patch makes the multicast socket to be per namespace. When a network namespace is created, other than the init_net and a multicast packet is received, the kernel goes to a hang or a kernel panic. How to reproduce ? * create a child network namespace * create a pair virtual device veth * ip link add type veth * move one side to the pair network device to the child namespace * ip link set netns <childpid> dev veth1 * ping -I veth0 224.0.0.1 The bug appears because the function ip_mc_init_dev does not initialize the different multicast fields as it exits because it is not the init_net. BUG: soft lockup - CPU#0 stuck for 61s! [avahi-daemon:2695] Modules linked in: irq event stamp: 50350 hardirqs last enabled at (50349): [<c03ee949>] _spin_unlock_irqrestore+0x34/0x39 hardirqs last disabled at (50350): [<c03ec639>] schedule+0x9f/0x5ff softirqs last enabled at (45712): [<c0374d4b>] ip_setsockopt+0x8e7/0x909 softirqs last disabled at (45710): [<c03ee682>] _spin_lock_bh+0x8/0x27 Pid: 2695, comm: avahi-daemon Not tainted (2.6.27-rc2-00029-g0872073 #3) EIP: 0060:[<c03ee47c>] EFLAGS: 00000297 CPU: 0 EIP is at __read_lock_failed+0x8/0x10 EAX: c4f38810 EBX: c4f38810 ECX: 00000000 EDX: c04cc22e ESI: fb0000e0 EDI: 00000011 EBP: 0f02000a ESP: c4e3faa0 DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 CR0: 8005003b CR2: 44618a40 CR3: 04e37000 CR4: 000006d0 DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000 DR6: ffff0ff0 DR7: 00000400 [<c02311f8>] ? _raw_read_lock+0x23/0x25 [<c0390666>] ? ip_check_mc+0x1c/0x83 [<c036d478>] ? ip_route_input+0x229/0xe92 [<c022e2e4>] ? trace_hardirqs_on_thunk+0xc/0x10 [<c0104c9c>] ? do_IRQ+0x69/0x7d [<c0102e64>] ? restore_nocheck_notrace+0x0/0xe [<c036fdba>] ? ip_rcv+0x227/0x505 [<c0358764>] ? netif_receive_skb+0xfe/0x2b3 [<c03588d2>] ? netif_receive_skb+0x26c/0x2b3 [<c035af31>] ? process_backlog+0x73/0xbd [<c035a8cd>] ? net_rx_action+0xc1/0x1ae [<c01218a8>] ? __do_softirq+0x7b/0xef [<c0121953>] ? do_softirq+0x37/0x4d [<c035b50d>] ? dev_queue_xmit+0x3d4/0x40b [<c0122037>] ? local_bh_enable+0x96/0xab [<c035b50d>] ? dev_queue_xmit+0x3d4/0x40b [<c012181e>] ? _local_bh_enable+0x79/0x88 [<c035fcb8>] ? neigh_resolve_output+0x20f/0x239 [<c0373118>] ? ip_finish_output+0x1df/0x209 [<c0373364>] ? ip_dev_loopback_xmit+0x62/0x66 [<c0371db5>] ? ip_local_out+0x15/0x17 [<c0372013>] ? ip_push_pending_frames+0x25c/0x2bb [<c03891b8>] ? udp_push_pending_frames+0x2bb/0x30e [<c038a189>] ? udp_sendmsg+0x413/0x51d [<c038a1a9>] ? udp_sendmsg+0x433/0x51d [<c038f927>] ? inet_sendmsg+0x35/0x3f [<c034f092>] ? sock_sendmsg+0xb8/0xd1 [<c012d554>] ? autoremove_wake_function+0x0/0x2b [<c022e6de>] ? copy_from_user+0x32/0x5e [<c022e6de>] ? copy_from_user+0x32/0x5e [<c034f238>] ? sys_sendmsg+0x18d/0x1f0 [<c0175e90>] ? pipe_write+0x3cb/0x3d7 [<c0170347>] ? do_sync_write+0xbe/0x105 [<c012d554>] ? autoremove_wake_function+0x0/0x2b [<c03503b2>] ? sys_socketcall+0x176/0x1b0 [<c01085ea>] ? syscall_trace_enter+0x6c/0x7b [<c0102e1a>] ? syscall_call+0x7/0xb Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-13bnx2x: Version updateEilon Greenstein
Version update Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-13bnx2x: Checkpatch complianceEilon Greenstein
Checkpatch compliance The latest version of checkpatch found the following style errors in the code Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-13bnx2x: Spelling mistakesEilon Greenstein
Spelling mistakes Spelling has to L's in it... Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-13bnx2x: Minor code improvementsEilon Greenstein
Minor code improvements Small changes to make the code a little bit more efficient and mostly more readable: - Using unified macros for EMAC_RD/WR which looks like normal REG_RD/WR - Removing the NIG_WR since it did nothing and was only confusing - On bnx2x_panic_dump, print only the used parts of the rings - define parameters only on the branch they are needed and not at the beginning of the function - using NETIF_MSG_INTR and not private BNX2X_MSG_SP for debug prints Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-13bnx2x: Driver infoEilon Greenstein
Driver info The internal FW which is downloaded by the driver should not be displayed - it is only causing confusion and it is redundant since it can be concluded from the driver version. Display only FW which is burned on the board nvram Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-13bnx2x: 1G LED does not turn offEilon Greenstein
1G LED does not turn off The 1G LED was not switched to off when the link was lost Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-13bnx2x: 8073 PHY changesYaniv Rosner
8073 PHY changes The initial support we had for this PHY needs some serious changing. The major change is that this PHY should be initialized only when the first function is loaded and not for each function. The official SPI-ROM of this PHY was released and it requires some changes in the initialization code as well Signed-off-by: Yaniv Rosner <yanivr@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-13bnx2x: Change GPIO for any portEilon Greenstein
Change GPIO for any port The set GPIO function should receive the port index to allow changing the GPIO of another port. This is needed for the common init phase (one the first driver is loaded for the chip) Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-13bnx2x: Pause settingsYaniv Rosner
Pause settings - 1G pause was not working due to missing write to the emac block (TX_MODE_FLOW_EN) - The flow control should use the negotiated result (after autoneg) so we should save both the requested autoneg and the result - The HW credits with flow control at 1G speed were not optimized and caused low throughput - It is recommended to turn off flow control if the MTU is bigger than 5000B due to internal buffers size Signed-off-by: Yaniv Rosner <yanivr@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-13bnx2x: Link order with external PHYYaniv Rosner
Link order with external PHY When external PHY exists (second chip with the PHY to translate to another physical medium) the link with the eternal PHY and the network should be established before setting the link between the 5771x and the PHY. This is the right order and it is important when using autoneg - the link to the network should use the autoneg and the link between the two chips should be forced to the network result. Signed-off-by: Yaniv Rosner <yanivr@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-13bnx2x: No LRO without Rx checksumVladislav Zolotarov
No LRO without Rx checksum Disabling LRO when Rx checksum is disabled Signed-off-by: Vladislav Zolotarov <vladz@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-13bnx2x: Wrong structure sizeYitchak Gertner
Wrong structure size The wrong structure was used in the sizeof to clear (luckily both structures have the same size in this version...) Signed-off-by: Yitchak Gertner <gertner@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-13bnx2x: WoL capabilityEilon Greenstein
WoL capability All designs reported WoL capability regardless of HW limitations - check if this device is actually capable of WoL Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-13bnx2x: Clearing MAC addresses filtersYitchak Gertner
Clearing MAC addresses filters When the driver unloads, it should clear the MAC addresses filters in the HW - this prevents packets from entering the chip when the driver is re-loaded before initializing the right filters Signed-off-by: Yitchak Gertner <gertner@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-13bnx2x: Delay in while loopsYitchak Gertner
Delay in while loops The delay in the loop should be after the change. This has very little effect (can save one delay) but it is the right thing to do Signed-off-by: Yitchak Gertner <gertner@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-13bnx2x: PBA Table Page Alignment WorkaroundEilon Greenstein
PBA Table Page Alignment Workaround The PBA table starts on the middle of the page and that's causing very low performance with virtualization. The solution is not to update via the BAR directly but via chip access to the same memory Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-13bnx2x: Self-test false positiveYitchak Gertner
Self-test false positive - The memory test should use a mask according to the chip type - In the register test, check the port only once and not inside the for loop (not causing a failure - just ugly) Signed-off-by: Yitchak Gertner <gertner@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-13bnx2x: Memory allocationEilon Greenstein
Memory allocation - The CQE ring was allocated to the max size even for a chip that does not support it. Fixed to allocate according to the chip type to save memory - The rx_page_ring was not freed on driver unload Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-13bnx2x: HW attention lockEilon Greenstein
HW attention lock Making sure that only one function will handle the HW attention. This makes the device parameter aeu_mask redundant so it is removed Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-13bnx2x: HW lock mechanismYitchak Gertner
HW lock mechanism Enhancing the HW lock to work per function and not only per port - this is needed for the next patch that protects races over HW attention detection between the different functions. At this chance, changing the functions names to be more inline with the current naming convention Signed-off-by: Yitchak Gertner <gertner@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-13bnx2x: Load/Unload under trafficVladislav Zolotarov
Load/Unload under traffic Few issues were found when loading and unloading under traffic: - When receiving Tx interrupt call netif_wake_queue if the queue is stopped but the state is open - Check that interrupts are enabled before doing anything else on the msix_fp_int function - In nic_load, enable the interrupts only when needed and ready for it - Function stop_leading returns status since it can fail - Add 1ms delay when unloading the driver to validate that there are no open transactions that already started by the FW - Splitting the "has work" function into Tx and Rx so the same function will be used on unload and interrupts - Do not request for WoL if only resetting the device (save the time that it takes the FW to set the link after reset) - Fixing the device reset after iSCSI boot and before driver load - all internal buffers must be cleared before the driver is loaded Signed-off-by: Vladislav Zolotarov <vladz@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-13bnx2x: FW Internal Memory structureEilon Greenstein
FW Internal Memory structure The FW uses data structures on the chip internal memory to aggregate the connections when TPA is enabled. The driver was clearing the wrong offsets and therefore one function could cause another function to loose packets. Changing the initialization of the chip internal memory to clear only the relevant memory for each function which is being loaded Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-13bnx2x: StatisticsYitchak Gertner
Statistics - Making sure that each drop is accounted for in the driver statistics - Clearing the FW statistics when driver is loaded to prevent inconsistency with HW statistics - Once error is detected (bnx2x_panic_dump), stop the statistics before other actions (currently it is stopped last and can corrupt the data) - Adding HW checksum error counter to the statistics - Removing unused variable stats_ticks - Using macros instead of magic numbers to indicate which statistics are shared per port and which are per function Signed-off-by: Yitchak Gertner <gertner@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-13bnx2x: Not dropping packets with L3/L4 checksum errorEilon Greenstein
Not dropping packets with L3/L4 checksum error Those packets should be passed to the OS. The problem is clear in forwarding mode. Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-13bnx2x: FW (bootcode) interface fixesEilon Greenstein
FW (bootcode) interface fixes - Making sure that the device will not cause kernel panic of the bootcode is corrupted or missing - Removing module debug parameter "nomcp" since no one should work without the bootcode (this is a left over from the chip bring up days) - Instead of waiting fix amount of time for bootcode response, sample it every 10ms (usually the answer is ready after less than 10ms) Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-13Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: crypto: padlock - fix VIA PadLock instruction usage with irq_ts_save/restore() crypto: hash - Add missing top-level functions crypto: hash - Fix digest size check for digest type crypto: tcrypt - Fix AEAD chunk testing crypto: talitos - Add handling for SEC 3.x treatment of link table
2008-08-13pkt_sched: Protect gen estimators under est_lock.Jarek Poplawski
gen_kill_estimator() required rtnl_lock() protection, but since it is moved to an RCU callback __qdisc_destroy() let's use est_lock instead. Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-13pkt_sched: Fix queue quiescence testing in dev_deactivate().David S. Miller
Based upon discussions with Jarek P. and Herbert Xu. First, we're testing the wrong qdisc. We just reset the device queue qdiscs to &noop_qdisc and checking it's state is completely pointless here. We want to wait until the previous qdisc that was sitting at the ->qdisc pointer is not busy any more. And that would be ->qdisc_sleeping. Because of how we propagate the samples qdisc pointer down into qdisc_run and friends via per-cpu ->output_queue and netif_schedule, we have to wait also for the __QDISC_STATE_SCHED bit to clear as well. Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-13Merge git://oss.sgi.com:8090/xfs/linux-2.6Linus Torvalds
* git://oss.sgi.com:8090/xfs/linux-2.6: (45 commits) [XFS] Fix use after free in xfs_log_done(). [XFS] Make xfs_bmap_*_count_leaves void. [XFS] Use KM_NOFS for debug trace buffers [XFS] use KM_MAYFAIL in xfs_mountfs [XFS] refactor xfs_mount_free [XFS] don't call xfs_freesb from xfs_unmountfs [XFS] xfs_unmountfs should return void [XFS] cleanup xfs_mountfs [XFS] move root inode IRELE into xfs_unmountfs [XFS] stop using file_update_time [XFS] optimize xfs_ichgtime [XFS] update timestamp in xfs_ialloc manually [XFS] remove the sema_t from XFS. [XFS] replace dquot flush semaphore with a completion [XFS] replace inode flush semaphore with a completion [XFS] extend completions to provide XFS object flush requirements [XFS] replace the XFS buf iodone semaphore with a completion [XFS] clean up stale references to semaphores [XFS] use get_unaligned_* helpers [XFS] Fix compile failure in xfs_buf_trace() ...
2008-08-13pkt_sched: Fix oops in htb_delete.Jarek Poplawski
Recent changes introduced a bug in htb_delete(): cl->parent->children counter update misses checking cl->parent for NULL, which is used for root classes, so deleting them causes an oops. Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-13Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/teigland/dlm * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/dlm: dlm: rename structs dlm: add missing kfrees
2008-08-13pktgen: prevent pktgen from using bad tx queueAndrew Gallatin
With the new multi-queue transmit code, it is possible to accidentally make pktgen pick a non-existing tx queue simply by using a stale script to drive pktgen. Access to this non-existing tx queue will then trigger a bad memory access and kill the machine. For example, setting "queue_map_max 2" will cause my machine to die when accessing a garbage spinlock in the non-existing tx queue: BUG: spinlock bad magic on CPU#0, kpktgend_0/564 lock: ffff88001ddf6718, .magic: ffffffff, .owner: /-1, .owner_cpu: 0 Pid: 564, comm: kpktgend_0 Not tainted 2.6.27-rc3 #35 Call Trace: [<ffffffff803a1228>] spin_bug+0xa4/0xac [<ffffffff803a1253>] _raw_spin_lock+0x23/0x123 [<ffffffff8055b06f>] _spin_lock_bh+0x17/0x1b [<ffffffff804cb57d>] pktgen_thread_worker+0xa97/0x1002 [<ffffffff8022874d>] ? finish_task_switch+0x38/0x97 [<ffffffff80242077>] ? autoremove_wake_function+0x0/0x36 [<ffffffff80242077>] ? autoremove_wake_function+0x0/0x36 [<ffffffff804caae6>] ? pktgen_thread_worker+0x0/0x1002 [<ffffffff80241a40>] kthread+0x44/0x6d [<ffffffff8020c399>] child_rip+0xa/0x11 [<ffffffff802419fc>] ? kthread+0x0/0x6d [<ffffffff8020c38f>] ? child_rip+0x0/0x11 The attached patch adds some sanity checking to prevent these sorts of configuration errors. Signed-off-by: Andrew Gallatin <gallatin@myri.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-13[h8300] move include/asm-h8300 to arch/h8300/include/asmLinus Torvalds
Done as a script (well, a single "git mv" actually) on request from Yoshinori Sato as a way to avoid a huge diff. Requested-by: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Sam Ravnborg <sam@ravnborg.org> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-08-13dccp: change L/R must have at least one byte in the dccpsf_val fieldArnaldo Carvalho de Melo
Thanks to Eugene Teo for reporting this problem. Signed-off-by: Eugene Teo <eugenete@kernel.sg> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-13xfrm: remove unnecessary variable in xfrm_output_resume() 2nd tryJean-Christophe DUBOIS
Small fix removing an unnecessary intermediate variable. Signed-off-by: Jean-Christophe DUBOIS <jcd@tribudubois.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-13dlm: rename structsDavid Teigland
Add a dlm_ prefix to the struct names in config.c. This resolves a conflict with struct node in particular, when include/linux/node.h happens to be included. Reported-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David Teigland <teigland@redhat.com>
2008-08-13dlm: add missing kfreesDavid Teigland
A couple of unlikely error conditions were missing a kfree on the error exit path. Reported-by: Juha Leppanen <juha_motorsportcom@luukku.com> Signed-off-by: David Teigland <teigland@redhat.com>
2008-08-13crypto: padlock - fix VIA PadLock instruction usage with irq_ts_save/restore()Suresh Siddha
Wolfgang Walter reported this oops on his via C3 using padlock for AES-encryption: ################################################################## BUG: unable to handle kernel NULL pointer dereference at 000001f0 IP: [<c01028c5>] __switch_to+0x30/0x117 *pde = 00000000 Oops: 0002 [#1] PREEMPT Modules linked in: Pid: 2071, comm: sleep Not tainted (2.6.26 #11) EIP: 0060:[<c01028c5>] EFLAGS: 00010002 CPU: 0 EIP is at __switch_to+0x30/0x117 EAX: 00000000 EBX: c0493300 ECX: dc48dd00 EDX: c0493300 ESI: dc48dd00 EDI: c0493530 EBP: c04cff8c ESP: c04cff7c DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068 Process sleep (pid: 2071, ti=c04ce000 task=dc48dd00 task.ti=d2fe6000) Stack: dc48df30 c0493300 00000000 00000000 d2fe7f44 c03b5b43 c04cffc8 00000046 c0131856 0000005a dc472d3c c0493300 c0493470 d983ae00 00002696 00000000 c0239f54 00000000 c04c4000 c04cffd8 c01025fe c04f3740 00049800 c04cffe0 Call Trace: [<c03b5b43>] ? schedule+0x285/0x2ff [<c0131856>] ? pm_qos_requirement+0x3c/0x53 [<c0239f54>] ? acpi_processor_idle+0x0/0x434 [<c01025fe>] ? cpu_idle+0x73/0x7f [<c03a4dcd>] ? rest_init+0x61/0x63 ======================= Wolfgang also found out that adding kernel_fpu_begin() and kernel_fpu_end() around the padlock instructions fix the oops. Suresh wrote: These padlock instructions though don't use/touch SSE registers, but it behaves similar to other SSE instructions. For example, it might cause DNA faults when cr0.ts is set. While this is a spurious DNA trap, it might cause oops with the recent fpu code changes. This is the code sequence that is probably causing this problem: a) new app is getting exec'd and it is somewhere in between start_thread() and flush_old_exec() in the load_xyz_binary() b) At pont "a", task's fpu state (like TS_USEDFPU, used_math() etc) is cleared. c) Now we get an interrupt/softirq which starts using these encrypt/decrypt routines in the network stack. This generates a math fault (as cr0.ts is '1') which sets TS_USEDFPU and restores the math that is in the task's xstate. d) Return to exec code path, which does start_thread() which does free_thread_xstate() and sets xstate pointer to NULL while the TS_USEDFPU is still set. e) At the next context switch from the new exec'd task to another task, we have a scenarios where TS_USEDFPU is set but xstate pointer is null. This can cause an oops during unlazy_fpu() in __switch_to() Now: 1) This should happen with or with out pre-emption. Viro also encountered similar problem with out CONFIG_PREEMPT. 2) kernel_fpu_begin() and kernel_fpu_end() will fix this problem, because kernel_fpu_begin() will manually do a clts() and won't run in to the situation of setting TS_USEDFPU in step "c" above. 3) This was working before the fpu changes, because its a spurious math fault which doesn't corrupt any fpu/sse registers and the task's math state was always in an allocated state. With out the recent lazy fpu allocation changes, while we don't see oops, there is a possible race still present in older kernels(for example, while kernel is using kernel_fpu_begin() in some optimized clear/copy page and an interrupt/softirq happens which uses these padlock instructions generating DNA fault). This is the failing scenario that existed even before the lazy fpu allocation changes: 0. CPU's TS flag is set 1. kernel using FPU in some optimized copy routine and while doing kernel_fpu_begin() takes an interrupt just before doing clts() 2. Takes an interrupt and ipsec uses padlock instruction. And we take a DNA fault as TS flag is still set. 3. We handle the DNA fault and set TS_USEDFPU and clear cr0.ts 4. We complete the padlock routine 5. Go back to step-1, which resumes clts() in kernel_fpu_begin(), finishes the optimized copy routine and does kernel_fpu_end(). At this point, we have cr0.ts again set to '1' but the task's TS_USEFPU is stilll set and not cleared. 6. Now kernel resumes its user operation. And at the next context switch, kernel sees it has do a FP save as TS_USEDFPU is still set and then will do a unlazy_fpu() in __switch_to(). unlazy_fpu() will take a DNA fault, as cr0.ts is '1' and now, because we are in __switch_to(), math_state_restore() will get confused and will restore the next task's FP state and will save it in prev tasks's FP state. Remember, in __switch_to() we are already on the stack of the next task but take a DNA fault for the prev task. This causes the fpu leakage. Fix the padlock instruction usage by calling them inside the context of new routines irq_ts_save/restore(), which clear/restore cr0.ts manually in the interrupt context. This will not generate spurious DNA in the context of the interrupt which will fix the oops encountered and the possible FPU leakage issue. Reported-and-bisected-by: Wolfgang Walter <wolfgang.walter@stwm.de> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2008-08-13crypto: hash - Add missing top-level functionsHerbert Xu
The top-level functions init/update/final were missing for ahash. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2008-08-13crypto: hash - Fix digest size check for digest typeHerbert Xu
The changeset ca786dc738f4f583b57b1bba7a335b5e8233f4b0 crypto: hash - Fixed digest size check missed one spot for the digest type. This patch corrects that error. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2008-08-13crypto: tcrypt - Fix AEAD chunk testingHerbert Xu
My changeset 4b22f0ddb6564210c9ded7ba25b2a1007733e784 crypto: tcrpyt - Remove unnecessary kmap/kunmap calls introduced a typo that broke AEAD chunk testing. In particular, axbuf should really be xbuf. There is also an issue with testing the last segment when encrypting. The additional part produced by AEAD wasn't tested. Similarly, on decryption the additional part of the AEAD input is mistaken for corruption. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2008-08-13crypto: talitos - Add handling for SEC 3.x treatment of link tableLee Nipper
Later SEC revision requires the link table (used for scatter/gather) to have an extra entry to account for the total length in descriptor [4], which contains cipher Input and ICV. This only applies to decrypt, not encrypt. Without this change, on 837x, a gather return/length error results when a decryption uses a link table to gather the fragments. This is observed by doing a ping with size of 1447 or larger with AES, or a ping with size 1455 or larger with 3des. So, add check for SEC compatible "fsl,3.0" for using extra link table entry. Signed-off-by: Lee Nipper <lee.nipper@freescale.com> Signed-off-by: Kim Phillips <kim.phillips@freescale.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2008-08-13net-sched: fix Action flushing return codeJamal Hadi Salim
Flushing must consistently return ENOMEM on failure of any allocation Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-13net-sched: Fix actions flushingJamal Hadi Salim
Flushing of actions has been broken since we changed the semantics of netlink parsed tb[X] to mean X is an attribute type. This makes the flushing work. Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-13net/rxrpc: Use an IS_ERR test rather than a NULL testJulien Brunel
In case of error, the function rxrpc_get_transport returns an ERR pointer, but never returns a NULL pointer. So after a call to this function, a NULL test should be replaced by an IS_ERR test. A simplified version of the semantic patch that makes this change is as follows: (http://www.emn.fr/x-info/coccinelle/) // <smpl> @correct_null_test@ expression x,E; statement S1, S2; @@ x = rxrpc_get_transport(...) <... when != x = E if ( ( - x@p2 != NULL + ! IS_ERR ( x ) | - x@p2 == NULL + IS_ERR( x ) ) ) S1 else S2 ...> ? x = E; // </smpl> Signed-off-by: Julien Brunel <brunel@diku.dk> Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-13wext: Send name on eventsJamal Hadi Salim
In the minimal the wireless extensions oughta send at least the name in addition to the ifindex. Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca> Signed-off-by: David S. Miller <davem@davemloft.net>