aboutsummaryrefslogtreecommitdiff
path: root/net
AgeCommit message (Collapse)Author
2006-08-31[IPV4]: Fix SNMPv2 "ipFragFails" counter errorWei Dong
When I tested Linux kernel 2.6.17.7 about statistics "ipFragFails",found that this counter couldn't increase correctly. The criteria is RFC2011: RFC2011 ipFragFails OBJECT-TYPE SYNTAX Counter32 MAX-ACCESS read-only STATUS current DESCRIPTION "The number of IP datagrams that have been discarded because they needed to be fragmented at this entity but could not be, e.g., because their Don't Fragment flag was set." ::= { ip 18 } When I send big IP packet to a router with DF bit set to 1 which need to be fragmented, and router just sends an ICMP error message ICMP_FRAG_NEEDED but no increments for this counter(in the function ip_fragment). Signed-off-by: Wei Dong <weid@nanjing-fnst.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-31[NET]: Rate limiting for socket allocation failure messages.Akinobu Mita
This patch limits the warning messages when socket allocation failures happen. It happens under memory pressure. Signed-off-by: Akinobu Mita <mita@miraclelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-31[IPV6]: Fix kernel OOPs when setting sticky socket options.YOSHIFUJI Hideaki
Bug noticed by Remi Denis-Courmont <rdenis@simphalempin.com>. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-29[IPV6]: ipv6_add_addr should install dstentry earlierKeir Fraser
ipv6_add_addr allocates a struct inet6_ifaddr and a dstentry, but it doesn't install the dstentry in ifa->rt until after it releases the addrconf_hash_lock. This means other CPUs will be able to see the new address while it hasn't been initialized completely yet. One possible fix would be to grab the ifp->lock spinlock when creating the address struct; a simpler fix is to just move the assignment. Acked-by: jbeulich@novell.com Acked-by: okir@suse.de Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-29[NETLINK]: Call panic if nl_table allocation failsAkinobu Mita
This patch makes crash happen if initialization of nl_table fails in initcalls. It is better than getting use after free crash later. Signed-off-by: Akinobu Mita <mita@miraclelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-29[TCP]: Two RFC3465 Appropriate Byte Count fixes.Daikichi Osuga
1) fix slow start after retransmit timeout 2) fix case of L=2*SMSS acked bytes comparison Signed-off-by: Daikichi Osuga <osugad@s1.nttdocomo.co.jp> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-29[IPV6]: SNMPv2 "ipv6IfStatsInAddrErrors" counter errorLv Liangying
When I tested Linux kernel 2.6.17.7 about statistics "ipv6IfStatsInAddrErrors", found that this counter couldn't increase correctly. The criteria is RFC2465: ipv6IfStatsInAddrErrors OBJECT-TYPE SYNTAX Counter32 MAX-ACCESS read-only STATUS current DESCRIPTION "The number of input datagrams discarded because the IPv6 address in their IPv6 header's destination field was not a valid address to be received at this entity. This count includes invalid addresses (e.g., ::0) and unsupported addresses (e.g., addresses with unallocated prefixes). For entities which are not IPv6 routers and therefore do not forward datagrams, this counter includes datagrams discarded because the destination address was not a local address." ::= { ipv6IfStatsEntry 5 } When I send packet to host with destination that is ether invalid address(::0) or unsupported addresses(1::1), the Linux kernel just discard the packet, and the counter doesn't increase(in the function ip6_pkt_discard). Signed-off-by: Lv Liangying <lvly@nanjing-fnst.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-29[SCTP]: Fix sctp_primitive_ABORT() call in sctp_close().Sridhar Samudrala
With the recent fix, the callers of sctp_primitive_ABORT() need to create an ABORT chunk and pass it as an argument rather than msghdr that was passed earlier. Signed-off-by: Sridhar Samudrala <sri@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-26[DCCP]: Fix CCID3Ian McDonald
This fixes CCID3 to give much closer performance to RFC4342. CCID3 is meant to alter sending rate based on RTT and loss. The performance was verified against: http://wand.net.nz/~perry/max_download.php For example I tested with netem and had the following parameters: Delayed Acks 1, MSS 256 bytes, RTT 105 ms, packet loss 5%. This gives a theoretical speed of 71.9 Kbits/s. I measured across three runs with this patch set and got 70.1 Kbits/s. Without this patchset the average was 232 Kbits/s which means Linux can't be used for CCID3 research properly. I also tested with netem turned off so box just acting as router with 1.2 msec RTT. The performance with this is the same with or without the patch at around 30 Mbit/s. Signed off by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-26[BRIDGE] netfilter: memory corruption fixStephen Hemminger
The bridge-netfilter code will overwrite memory if there is not headroom in the skb to save the header. This first showed up when using Xen with sky2 driver that doesn't allocate the extra space. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-26[DCCP]: Introduce dccp_rx_hist_find_entryIan McDonald
This adds a new function dccp_rx_hist_find_entry. Signed off by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-26[DCCP]: Introduces follows48 functionIan McDonald
This adds a new function to see if two sequence numbers follow each other. Signed off by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-26[DCCP]: Update contact details and copyrightIan McDonald
Just updating copyright and contacts Signed off by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-26[DCCP]: Fix typoIan McDonald
This fixes a small typo in net/dccp/libs/packet_history.c Signed off by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-26[IPV6]: Segmentation offload not set correctly on TCP childrenStephen Hemminger
TCP over IPV6 would incorrectly inherit the GSO settings. This would cause kernel to send Tcp Segmentation Offload packets for IPV6 data to devices that can't handle it. It caused the sky2 driver to lock http://bugzilla.kernel.org/show_bug.cgi?id=7050 and the e1000 would generate bogus packets. I can't blame the hardware for gagging if the upper layers feed it garbage. This was a new bug in 2.6.18 introduced with GSO support. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-24Merge branch 'fixes' of git://git.linux-nfs.org/pub/linux/nfs-2.6Greg Kroah-Hartman
2006-08-24NFS: Check lengths more thoroughly in NFS4 readdir XDR decodeDavid Howells
Check the bounds of length specifiers more thoroughly in the XDR decoding of NFS4 readdir reply data. Currently, if the server returns a bitmap or attr length that causes the current decode point pointer to wrap, this could go undetected (consider a small "negative" length on a 32-bit machine). Also add a check into the main XDR decode handler to make sure that the amount of data is a multiple of four bytes (as specified by RFC-1014). This makes sure that we can do u32* pointer subtraction in the NFS client without risking an undefined result (the result is undefined if the pointers are not correctly aligned with respect to one another). Signed-Off-By: David Howells <dhowells@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> (cherry picked from 5861fddd64a7eaf7e8b1a9997455a24e7f688092 commit)
2006-08-24SUNRPC: Fix dentry refcounting issues with users of rpc_pipefsTrond Myklebust
rpc_unlink() and rpc_rmdir() will dput the dentry reference for you. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> (cherry picked from a05a57effa71a1f67ccbfc52335c10c8b85f3f6a commit)
2006-08-24SUNRPC: rpc_unlink() must check for unhashed dentriesTrond Myklebust
A prior call to rpc_depopulate() by rpc_rmdir() on the parent directory may have already called simple_unlink() on this entry. Add the same check to rpc_rmdir(). Also remove a redundant call to rpc_close_pipes() in rpc_rmdir. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> (cherry picked from 0bbfb9d20f6437c4031aa3bf9b4d311a053e58e3 commit)
2006-08-24NFS: clean up rpc_rmdirTrond Myklebust
Make it take a dentry argument instead of a path Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> (cherry picked from 648d4116eb2509f010f7f34704a650150309b3e7 commit)
2006-08-24SUNRPC: make rpc_unlink() take a dentry argument instead of a pathTrond Myklebust
Signe-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> (cherry picked from 88bf6d811b01a4be7fd507d18bf5f1c527989089 commit)
2006-08-22[TCP]: Limit window scaling if window is clamped.Stephen Hemminger
This small change allows for easy per-route workarounds for broken hosts or middleboxes that are not compliant with TCP standards for window scaling. Rather than having to turn off window scaling globally. This patch allows reducing or disabling window scaling if window clamp is present. Example: Mark Lord reported a problem with 2.6.17 kernel being unable to access http://www.everymac.com # ip route add 216.145.246.23/32 via 10.8.0.1 window 65535 Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-22[NETFILTER]: arp_tables: fix table locking in arpt_do_tablePatrick McHardy
table->private might change because of ruleset changes, don't use it without holding the lock. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-22Fix sctp privilege elevation (CVE-2006-3745)Sridhar Samudrala
sctp_make_abort_user() now takes the msg_len along with the msg so that we don't have to recalculate the bytes in iovec. It also uses memcpy_fromiovec() so that we don't go beyond the length allocated. It is good to have this fix even if verify_iovec() is fixed to return error on overflow. Signed-off-by: Sridhar Samudrala <sri@us.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-08-17[BRIDGE]: Disable SG/GSO if TX checksum is offHerbert Xu
When the bridge recomputes features, it does not maintain the constraint that SG/GSO must be off if TX checksum is off. This patch adds that constraint. On a completely unrelated note, I've also added TSO6 and TSO_ECN feature bits if GSO is enabled on the underlying device through the new NETIF_F_GSO_SOFTWARE macro. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-17[NETFILTER]: ip_tables: fix table locking in ipt_do_tablePatrick McHardy
table->private might change because of ruleset changes, don't use it without holding the lock. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-17[NETFILTER]: ctnetlink: fix deadlock in table dumpingPatrick McHardy
ip_conntrack_put must not be called while holding ip_conntrack_lock since destroy_conntrack takes it again. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-17[IPV4]: severe locking bug in fib_semantics.cAlexey Kuznetsov
Found in 2.4 by Yixin Pan <yxpan@hotmail.com>. > When I read fib_semantics.c of Linux-2.4.32, write_lock(&fib_info_lock) = > is used in fib_release_info() instead of write_lock_bh(&fib_info_lock). = > Is the following case possible: a BH interrupts fib_release_info() while = > holding the write lock, and calls ip_check_fib_default() which calls = > read_lock(&fib_info_lock), and spin forever. Signed-off-by: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-17[MCAST]: Fix filter leak on device removal.David L Stevens
This fixes source filter leakage when a device is removed and a process leaves the group thereafter. This also includes corresponding fixes for IPv6 multicast source filters on device removal. Signed-off-by: David L Stevens <dlstevens@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-17[NET]: Disallow whitespace in network device names.David S. Miller
It causes way too much trouble and confusion in userspace. Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-17[PKT_SCHED] cls_u32: Fix typo.Ralf Hildebrandt
Signed-off-by: Ralf Hildebrandt <Ralf.Hildebrandt@charite.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-17[ATM]: Compile error on ARMKevin Hilman
atm_proc_exit() is declared as __exit, and thus in .exit.text. On some architectures (ARM) .exit.text is discarded at compile time, and since atm_proc_exit() is called by some other __init functions, it results in a link error. Signed-off-by: Kevin Hilman <khilman@mvista.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-17[IPV4]: Possible leak of multicast source filter sctructureMichal Ruzicka
There is a leak of a socket's multicast source filter list structure on closing a socket with a multicast source filter set on an interface that does not exist any more. Signed-off-by: Michal Ruzicka <michal.ruzicka@comstar.cz> Acked-by: David L Stevens <dlstevens@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-17[IPV6] lockdep: annotate __icmpv6_socketIngo Molnar
Split off __icmpv6_socket's sk->sk_dst_lock class, because it gets used from softirqs, which is safe for __icmpv6_sockets (because they never get directly used via userspace syscalls), but unsafe for normal sockets. Has no effect on non-lockdep kernels. Signed-off-by: Ingo Molnar <mingo@elte.hu> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-17[NETFILTER]: xt_physdev build fixAndrew Morton
It needs netfilter_bridge.h for brnf_deferred_hooks Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-17[NET]: Fix potential stack overflow in net/core/utils.cSuresh Siddha
On High end systems (1024 or so cpus) this can potentially cause stack overflow. Fix the stack usage. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-17[VLAN]: Make sure bonding packet drop checks get done in hwaccel RX path.David S. Miller
Since __vlan_hwaccel_rx() is essentially bypassing the netif_receive_skb() call that would have occurred if we did the VLAN decapsulation in software, we are missing the skb_bond() call and the assosciated checks it does. Export those checks via an inline function, skb_bond_should_drop(), and use this in __vlan_hwaccel_rx(). Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-13[INET]: Use pskb_trim_unique when trimming paged unique skbsHerbert Xu
The IPv4/IPv6 datagram output path was using skb_trim to trim paged packets because they know that the packet has not been cloned yet (since the packet hasn't been given to anything else in the system). This broke because skb_trim no longer allows paged packets to be trimmed. Paged packets must be given to one of the pskb_trim functions instead. This patch adds a new pskb_trim_unique function to cover the IPv4/IPv6 datagram output path scenario and replaces the corresponding skb_trim calls with it. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-13[NETFILTER]: ulog: fix panic on SMP kernelsMark Huang
Fix kernel panic on various SMP machines. The culprit is a null ub->skb in ulog_send(). If ulog_timer() has already been scheduled on one CPU and is spinning on the lock, and ipt_ulog_packet() flushes the queue on another CPU by calling ulog_send() right before it exits, there will be no skbuff when ulog_timer() acquires the lock and calls ulog_send(). Cancelling the timer in ulog_send() doesn't help because it has already been scheduled and is running on the first CPU. Similar problem exists in ebt_ulog.c and nfnetlink_log.c. Signed-off-by: Mark Huang <mlhuang@cs.princeton.edu> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-13[NETFILTER]: {arp,ip,ip6}_tables: proper error recovery in init pathPatrick McHardy
Neither of {arp,ip,ip6}_tables cleans up behind itself when something goes wrong during initialization. Noticed by Rennie deGraaf <degraaf@cpsc.ucalgary.ca> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-13[LLC]: multicast receive device matchStephen Hemminger
Fix from Aji_Srinivas@emc.com, STP packets are incorrectly received on all LLC datagram sockets, whichever interface they are bound to. The llc_sap datagram receive logic sends packets with a unicast destination MAC to one socket bound to that SAP and MAC, and multicast packets to all sockets bound to that SAP. STP packets are multicast, and we do need to know on which interface they were received. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-13[IPSEC]: Validate properly in xfrm_dst_check()David S. Miller
If dst->obsolete is -1, this is a signal from the bundle creator that we want the XFRM dst and the dsts that it references to be validated on every use. I misunderstood this intention when I changed xfrm_dst_check() to always return NULL. Now, when we purge a dst entry, by running dst_free() on it. This will set the dst->obsolete to a positive integer, and we want to return NULL in that case so that the socket does a relookup for the route. Thus, if dst->obsolete<0, let stale_bundle() validate the state, else always return NULL. In general, we need to do things more intelligently here because we flush too much state during rule changes. Herbert Xu has some ideas wherein the key manager gives us some help in this area. We can also use smarter state management algorithms inside of the kernel as well. Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-13[NETFILTER]: xt_hashlimit: fix limit off-by-onePatrick McHardy
Hashlimit doesn't account for the first packet, which is inconsistent with the limit match. Reported by ryan.castellucci@gmail.com, netfilter bugzilla #500. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-13[NETFILTER]: xt_string: fix negationPhil Oester
The xt_string match is broken with ! negation. This resolves a portion of netfilter bugzilla #497. Signed-off-by: Phil Oester <kernel@linuxace.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-13[TCP]: Fix botched memory leak fix to tcpprobe_read().David S. Miller
Somehow I clobbered James's original fix and only my subsequent compiler warning change went in for that changeset. Get the real fix in there. Noticed by Jesper Juhl. Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-09[IPX]: Fix typo, ipxhdr() --> ipx_hdr()David S. Miller
Noticed by Dave Jones. Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-09[IPV6]: The ifa lock is a BH lockHerbert Xu
The ifa lock is expected to be taken in BH context (by addrconf timers) so we must disable BH when accessing it from user context. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-09Merge gregkh@master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6Greg Kroah-Hartman
2006-08-09[NET]: add_timer -> mod_timer() in dst_run_gc()Dmitry Mishin
Patch from Dmitry Mishin <dim@openvz.org>: Replace add_timer() by mod_timer() in dst_run_gc in order to avoid BUG message. CPU1 CPU2 dst_run_gc() entered dst_run_gc() entered spin_lock(&dst_lock) ..... del_timer(&dst_gc_timer) fail to get lock .... mod_timer() <--- puts timer back to the list add_timer(&dst_gc_timer) <--- BUG because timer is in list already. Found during OpenVZ internal testing. At first we thought that it is OpenVZ specific as we added dst_run_gc(0) call in dst_dev_event(), but as Alexey pointed to me it is possible to trigger this condition in mainstream kernel. F.e. timer has fired on CPU2, but the handler was preeempted by an irq before dst_lock is tried. Meanwhile, someone on CPU1 adds an entry to gc list and starts the timer. If CPU2 was preempted long enough, this timer can expire simultaneously with resuming timer handler on CPU1, arriving exactly to the situation described. Signed-off-by: Dmitry Mishin <dim@openvz.org> Signed-off-by: Kirill Korotaev <dev@openvz.org> Signed-off-by: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-08Merge branch 'upstream-fixes' of ↵Jeff Garzik
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6 into upstream-fixes