aboutsummaryrefslogtreecommitdiff
path: root/net
AgeCommit message (Collapse)Author
2007-11-14Merge branch 'master' of ā†µLinus Torvalds
master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 * 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: [NET]: rt_check_expire() can take a long time, add a cond_resched() [ISDN] sc: Really, really fix warning [ISDN] sc: Fix sndpkt to have the correct number of arguments [TCP] FRTO: Clear frto_highmark only after process_frto that uses it [NET]: Remove notifier block from chain when register_netdevice_notifier fails [FS_ENET]: Fix module build. [TCP]: Make sure write_queue_from does not begin with NULL ptr [TCP]: Fix size calculation in sk_stream_alloc_pskb [S2IO]: Fixed memory leak when MSI-X vector allocation fails [BONDING]: Fix resource use after free [SYSCTL]: Fix warning for token-ring from sysctl checker [NET] random : secure_tcp_sequence_number should not assume CONFIG_KTIME_SCALAR [IWLWIFI]: Not correctly dealing with hotunplug. [TCP] FRTO: Plug potential LOST-bit leak [TCP] FRTO: Limit snd_cwnd if TCP was application limited [E1000]: Fix schedule while atomic when called from mii-tool. [NETX]: Fix build failure added by 2.6.24 statistics cleanup. [EP93xx_ETH]: Build fix after 2.6.24 NAPI changes. [PKT_SCHED]: Check subqueue status before calling hard_start_xmit
2007-11-14sunrpc/xprtrdma/transport.c: fix use-after-freeAdrian Bunk
Fix an obvious use-after-free spotted by the Coverity checker. Signed-off-by: Adrian Bunk <bunk@kernel.org> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: Neil Brown <neilb@suse.de> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-11-14[NET]: rt_check_expire() can take a long time, add a cond_resched()Eric Dumazet
On commit 39c90ece7565f5c47110c2fa77409d7a9478bd5b: [IPV4]: Convert rt_check_expire() from softirq processing to workqueue. we converted rt_check_expire() from softirq to workqueue, allowing the function to perform all work it was supposed to do. When the IP route cache is big, rt_check_expire() can take a long time to run. (default settings : 20% of the hash table is scanned at each invocation) Adding cond_resched() helps giving cpu to higher priority tasks if necessary. Using a "if (need_resched())" test before calling "cond_resched();" is necessary to avoid spending too much time doing the resched check. (My tests gave a time reduction from 88 ms to 25 ms per rt_check_expire() run on my i686 test machine) Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-14[TCP] FRTO: Clear frto_highmark only after process_frto that uses itIlpo Järvinen
I broke this in commit 3de96471bd7fb76406e975ef6387abe3a0698149: [TCP]: Wrap-safed reordering detection FRTO check tcp_process_frto should always see a valid frto_highmark. An invalid frto_highmark (zero) is very likely what ultimately caused a seqno compare in tcp_frto_enter_loss to do the wrong leading to the LOST-bit leak. Having LOST-bits integry ensured like done after commit 23aeeec365dcf8bc87fae44c533e50d0bb4f23cc: [TCP] FRTO: Plug potential LOST-bit leak won't hurt. It may still be useful in some other, possibly legimate, scenario. Reported by Chazarain Guillaume <guichaz@yahoo.fr>. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-14[NET]: Remove notifier block from chain when register_netdevice_notifier failsPavel Emelyanov
Commit fcc5a03ac42564e9e255c1134dda47442289e466: [NET]: Allow netdev REGISTER/CHANGENAME events to fail makes the register_netdevice_notifier() handle the error from the NETDEV_REGISTER event, sent to the registering block. The bad news is that in this case the notifier block is not removed from the list, but the error is returned to the caller. In case the caller is in module init function and handles this error this can abort the module loading. The notifier block will be then removed from the kernel, but will be left in the list. Oops :( I think that the notifier block should be removed from the chain in case of error, regardless whether this error is handled by the caller or not. In the worst case (the error is _not_ handled) module will not receive the events any longer. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-14[TCP]: Make sure write_queue_from does not begin with NULL ptrIlpo Järvinen
NULL ptr can be returned from tcp_write_queue_head to cached_skb and then assigned to skb if packets_out was zero. Without this, system is vulnerable to a carefully crafted ACKs which obviously is remotely triggerable. Besides, there's very little that needs to be done in sacktag if there weren't any packets outstanding, just skipping the rest doesn't hurt. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-13[TCP] FRTO: Plug potential LOST-bit leakIlpo Järvinen
It might be possible that, in some extreme scenario that I just cannot now construct in my mind, end_seq <= frto_highmark check does not match causing the lost_out and LOST bits become out-of-sync due to clearing and recounting in the loop. This may fix LOST-bit leak reported by Chazarain Guillaume <guichaz@yahoo.fr>. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-13[TCP] FRTO: Limit snd_cwnd if TCP was application limitedIlpo Järvinen
Otherwise TCP might violate packet ordering principles that FRTO is based on. If conventional recovery path is chosen, this won't be significant at all. In practice, any small enough value will be sufficient to provide proper operation for FRTO, yet other users of snd_cwnd might benefit from a "close enough" value. FRTO's formula is now equal to what tcp_enter_cwr() uses. FRTO used to check application limitedness a bit differently but I changed that in commit 575ee7140dabe9b9c4f66f4f867039b97e548867 and as a result checking for application limitedness became completely non-existing. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-13[PKT_SCHED]: Check subqueue status before calling hard_start_xmitPeter P Waskiewicz Jr
The only qdiscs that check subqueue state before dequeue'ing are PRIO and RR. The other qdiscs, including the default pfifo_fast qdisc, will allow traffic bound for subqueue 0 through to hard_start_xmit. The check for netif_queue_stopped() is done above in pkt_sched.h, so it is unnecessary for qdisc_restart(). However, if the underlying driver is multiqueue capable, and only sets queue states on subqueues, this will allow packets to enter the driver when it's currently unable to process packets, resulting in expensive requeues and driver entries. This patch re-adds the check for the subqueue status before calling hard_start_xmit, so we can try and avoid the driver entry when the queues are stopped. Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-13[NETFILTER]: xt_time should not assume CONFIG_KTIME_SCALAREric Dumazet
It is not correct to assume one can get nsec from a ktime directly by using .tv64 field. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-13[NET]: Move unneeded data to initdata section.Denis V. Lunev
This patch reverts Eric's commit 2b008b0a8e96b726c603c5e1a5a7a509b5f61e35 It diets .text & .data section of the kernel if CONFIG_NET_NS is not set. This is safe after list operations cleanup. Signed-of-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-13[NET]: Cleanup pernet operation without CONFIG_NET_NSDenis V. Lunev
If CONFIG_NET_NS is not set, the only namespace is possible. This patch removes list of pernet_operations and cleanups code a bit. This list is not needed if there are no namespaces. We should just call ->init method. Additionally, the ->exit will be called on module unloading only. This case is safe - the code is not discarded. For the in/kernel code, ->exit should never be called. Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-13[NETFILTER]: bridge: fix double POSTROUTING hook invocationPatrick McHardy
Packets routed between bridges have the POST_ROUTING hook invoked twice since bridging mistakes them for bridged packets because they have skb->nf_bridge set. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-13[NETFILTER]: Consolidate nf_sockopt and compat_nf_sockoptPavel Emelyanov
Both lookup the nf_sockopt_ops object to call the get/set callbacks from, but they perform it in a completely similar way. Introduce the helper for finding the ops. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-13[NETFILTER]: nf_nat: fix memset errorLi Zefan
The size passing to memset is the size of a pointer. Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-12[INET]: Use list_head-s in inetpeer.cPavel Emelyanov
The inetpeer.c tracks the LRU list of inet_perr-s, but makes it by hands. Use the list_head-s for this. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-12[IPVS]: Remove unused exports.Adrian Bunk
This patch removes the following unused EXPORT_SYMBOL's: - ip_vs_try_bind_dest - ip_vs_find_dest Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-12[NET]: Unexport sysctl_{r,w}mem_max.Adrian Bunk
sysctl_{r,w}mem_max can now be unexported. Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-12[AF_PACKET]: Fix minor code duplicationUrs Thuermann
Simplify some code by eliminating duplicate if-else clauses in packet_do_bind(). Signed-off-by: Urs Thuermann <urs@isnogud.escape.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-12Merge branch 'pending' of ā†µDavid S. Miller
master.kernel.org:/pub/scm/linux/kernel/git/vxy/lksctp-dev
2007-11-12[NET]: Add the helper kernel_sock_shutdown()Trond Myklebust
...and fix a couple of bugs in the NBD, CIFS and OCFS2 socket handlers. Looking at the sock->op->shutdown() handlers, it looks as if all of them take a SHUT_RD/SHUT_WR/SHUT_RDWR argument instead of the RCV_SHUTDOWN/SEND_SHUTDOWN arguments. Add a helper, and then define the SHUT_* enum to ensure that kernel users of shutdown() don't get confused. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Acked-by: Mark Fasheh <mark.fasheh@oracle.com> Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-12[IPV6]: Add ifindex field to ND user option messages.Pierre Ynard
Userland neighbor discovery options are typically heavily involved with the interface on which thay are received: add a missing ifindex field to the original struct. Thanks to Rémi Denis-Courmont. Signed-off-by: Pierre Ynard <linkfanel@yahoo.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-12Fix memory leak in discard case of sctp_sf_abort_violation()Jesper Juhl
In net/sctp/sm_statefuns.c::sctp_sf_abort_violation() we may leak the storage allocated for 'abort' by returning from the function without using or freeing it. This happens in case "sctp_auth_recv_cid(SCTP_CID_ABORT, asoc)" is true and we jump to the 'discard' label. Spotted by the Coverity checker. The simple fix is to simply move the creation of the "abort chunk" to after the possible jump to the 'discard' label. This way we don't even have to allocate the memory at all in the problem case. Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com> Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
2007-11-10[INET]: Small possible memory leak in FIB rulesDenis V. Lunev
This patch fixes a small memory leak. Default fib rules can be deleted by the user if the rule does not carry FIB_RULE_PERMANENT flag, f.e. by ip rule flush Such a rule will not be freed as the ref-counter has 2 on start and becomes clearly unreachable after removal. Signed-off-by: Denis V. Lunev <den@openvz.org> Acked-by: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-10[NETNS]: init dev_base_lock only onceAlexey Dobriyan
* it already statically initialized * reinitializing live global spinlock every time netns is setup is also wrong Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-10[UNIX]: The unix_nr_socks limit can be exceededPavel Emelyanov
The unix_nr_socks value is limited with the 2 * get_max_files() value, as seen from the unix_create1(). However, the check and the actual increment are separated with the GFP_KERNEL allocation, so this limit can be exceeded under a memory pressure - task may go to sleep freeing the pages and some other task will be allowed to allocate a new sock and so on and so forth. So make the increment before the check (similar thing is done in the sock_kmalloc) and go to kmalloc after this. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-10[AF_UNIX]: Convert socks to unix_socks in scan_inflight, not in callbacksPavel Emelyanov
The scan_inflight() routine scans through the unix sockets and calls some passed callback. The fact is that all these callbacks work with the unix_sock objects, not the sock ones, so make this conversion in the scan_inflight() before calling the callbacks. This removes one unneeded variable from the inc_inflight_move_tail(). Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-10[AF_UNIX]: Make unix_tot_inflight counter non-atomicPavel Emelyanov
This counter is _always_ modified under the unix_gc_lock spinlock, so its atomicity can be provided w/o additional efforts. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-10[AF_PACKET]: Allow multicast traffic to be caught by ORIGDEV when bondedPeter P Waskiewicz Jr
The socket option for packet sockets to return the original ifindex instead of the bonded ifindex will not match multicast traffic. Since this socket option is the most useful for layer 2 traffic and multicast traffic, make the option multicast-aware. Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-10mac80211: fix MAC80211_RCSIMPLE KconfigJohannes Berg
I meant for this to be selectable only with EMBEDDED, not enabled only with EMBEDDED. This does it that way. Sorry. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2007-11-10mac80211: make "decrypt failed" messages conditional upon MAC80211_DEBUGJohn W. Linville
Make "decrypt failed" and "have no key" debugging messages compile conditionally upon CONFIG_MAC80211_DEBUG. They have been useful for finding certain problems in the past, but in many cases they just clutter a user's logs. A typical example is an enviornment where multiple SSIDs are using a single BSSID but with different protection schemes or different keys for each SSID. In such an environment these messages are just noise. Let's just leave them for those interested enough to turn-on debugging. Signed-off-by: John W. Linville <linville@tuxdriver.com>
2007-11-10mac80211: use IW_AUTH_PRIVACY_INVOKED rather than IW_AUTH_KEY_MGMTJohannes Berg
In the long bug-hunt for why dynamic WEP networks didn't work it turned out that mac80211 incorrectly uses IW_AUTH_KEY_MGMT while it should use IW_AUTH_PRIVACY_INVOKED to determine whether to associate to protected networks or not. This patch changes the behaviour to be that way and clarifies the existing code. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Cc: Jouni Malinen <j@w1.fi> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2007-11-10mac80211: remove unused driver opsJohannes Berg
The driver operations set_ieee8021x(), set_port_auth() and set_privacy_invoked() are not used by any drivers, except set_privacy_invoked() they aren't even used by mac80211. Remove them at least until we need to support drivers with mac80211 that require getting this information. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Acked-by: Michael Wu <flamingice@sourmilk.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2007-11-10mac80211: remove ieee80211_common.hJohannes Berg
Robert pointed out that I missed this file when removing the management interface. Do it now. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2007-11-10rfkill: Fix sparse warningMichael Buesch
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2007-11-10rfkill: Use mutex_lock() at register and add sanity checkMichael Buesch
Replace mutex_lock_interruptible() by mutex_lock() in rfkill_register(), as interruptible doesn't make sense there. Add a sanity check for rfkill->type, as that's used for an unchecked dereference in an array and might cause hard to debug crashes if the driver sets this to an invalid value. Signed-off-by: Michael Buesch <mb@bu3sch.de> Signed-off-by: Ivo van Doorn <IvDoorn@gmail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2007-11-10mac80211: allow driver to ask for a rate control algorithmJohannes Berg
This allows a driver to ask for a specific rate control algorithm. The rate control algorithm asked for must be registered and be available as a module or built-in. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2007-11-10mac80211: don't allow registering the same rate control twiceJohannes Berg
Previously, mac80211 would allow registering the same rate control algorithm twice. This is a programming error in the registration and should not happen; additionally the second version could never be selected. Disallow this and warn about it. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2007-11-10rfkill: Use subsys_initcallMichael Buesch
We must use subsys_initcall, because we must initialize before a driver calls rfkill_register(). Signed-off-by: Michael Buesch <mb@bu3sch.de> Signed-off-by: Ivo van Doorn <IvDoorn@gmail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2007-11-10mac80211: make simple rate control algorithm built-inJohannes Berg
Too frequently people do not have module autoloading enabled or fail to install the rate control module correctly, hence their hardware probing fails due to no rate control algorithm being available. This makes the 'simple' algorithm built into the mac80211 module unless EMBEDDED is enabled in which case it can be disabled (eg. if the wanted driver requires another rate control algorithm.) Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Acked-by: Michael Buesch <mb@bu3sch.de> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2007-11-10rfkill: Register LED triggers before registering switchMichael Buesch
Registering the switch triggers a LED event, so we must register LED triggers before the switch. This has a potential to fix a crash, depending on how the device driver initializes the rfkill data structure. Signed-off-by: Michael Buesch <mb@bu3sch.de> Signed-off-by: Ivo van Doorn <IvDoorn@gmail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2007-11-10softmac: fix wext MLME request reason code endiannessJohannes Berg
The MLME request reason code is host-endian and our passing it to the low level functions is host-endian as well since they do the swapping. I noticed that the reason code 768 was sent (0x300) rather than 3 when wpa_supplicant terminates. This removes the superfluous cpu_to_le16() call. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2007-11-10[PKT_SCHED] CLS_U32: Use ffs() instead of C code on hash mask to get first ā†µRadu Rendec
set bit. Computing the rank of the first set bit in the hash mask (for using later in u32_hash_fold()) was done with plain C code. Using ffs() instead makes the code more readable and improves performance (since ffs() is better optimized in assembler). Using the conditional operator on hash mask before applying ntohl() also saves one ntohl() call if mask is 0. Signed-off-by: Radu Rendec <radu.rendec@ines.ro> Signed-off-by: Jarek Poplawski <jarkao2@o2.pl> Acked-by: Jamal Hadi Salim <hadi@cyberus.ca> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-10[VLAN]: Allow setting mac address while device is upPatrick McHardy
Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-10[VLAN]: Don't synchronize addresses while the vlan device is downPatrick McHardy
While the VLAN device is down, the unicast addresses are not configured on the underlying device, so we shouldn't attempt to sync them. Noticed by Dmitry Butskoy <buc@odusz.so-cdu.ru> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-10[INET]: Cleanup the xfrm4_tunnel_(un)registerPavel Emelyanov
Both check for the family to select an appropriate tunnel list. Consolidate this check and make the for() loop more readable. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-10[INET]: Add missed tunnel64_err handlerPavel Emelyanov
The tunnel64_protocol uses the tunnel4_protocol's err_handler and thus calls the tunnel4_protocol's handlers. This is not very good, as in case of (icmp) error the wrong error handlers will be called (e.g. ipip ones instead of sit) and this won't be noticed at all, because the error is not reported. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-10[IPX]: Use existing sock refcnt debugging infrastructurePavel Emelyanov
Just like in the af_packet.c, the ipx_sock_nr variable is used for debugging purposes. Switch to using existing infrastructure. Thanks to Arnaldo for pointing this out. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-10[PACKET]: Use existing sock refcnt debugging infrastructurePavel Emelyanov
The packet_socks_nr variable is used purely for debugging the number of sockets. As Arnaldo pointed out, there's already an infrastructure for this purposes, so switch to using it. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-10[NET]: Fix infinite loop in dev_mc_unsync().Joe Perches
From: Joe Perches <joe@perches.com> Based upon an initial patch and report by Luis R. Rodriguez. Signed-off-by: David S. Miller <davem@davemloft.net>