aboutsummaryrefslogtreecommitdiff
path: root/net
AgeCommit message (Collapse)Author
2009-01-26Phonet: handle rtnetlink registration failureremi.denis-courmont@nokia
Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-01-26Phonet: allow phonet_device_init() to fail, put it to __init sectionremi.denis-courmont@nokia
Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-01-26Phonet: check destination before delivering packets locallyremi.denis-courmont@nokia
Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-01-26Phonet: move to Networking options like other protocol stacksremi.denis-courmont@nokia
Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-01-26gre: optimize hash lookupTimo Teras
Instead of keeping candidate tunnel device from all categories, keep only one candidate with best score. This optimizes stack usage and speeds up exit code. Signed-off-by: Timo Teras <timo.teras@iki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-01-26Merge branch 'master' of ↵David S. Miller
master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
2009-01-26Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (92 commits) gianfar: Revive VLAN support vlan: Export symbols as non GPL symbols. bnx2x: tx_has_work should not wait for FW netxen: reduce memory footprint netxen: fix vlan tso/checksum offload net: Fix linux/if_frad.h's suitability for userspace. net: Move config NET_NS to from net/Kconfig to init/Kconfig isdn: Fix missing ifdef in isdn_ppp networking: document "nc" in addition to "netcat" in netconsole.txt e1000e: workaround hw errata af_key: initialize xfrm encap_oa virtio_net: Fix MAX_PACKET_LEN to support 802.1Q VLANs lcs: fix compilation for !CONFIG_IP_MULTICAST rtl8187: Add termination packet to prevent stall iwlwifi: fix rs_get_rate WARN_ON() p54usb: fix packet loss with first generation devices sctp: Fix another socket race during accept/peeloff sctp: Properly timestamp outgoing data chunks for rtx purposes sctp: Correctly start rtx timer on new packet transmissions. sctp: Fix crc32c calculations on big-endian arhes. ...
2009-01-26vlan: Export symbols as non GPL symbols.Ben Greear
In previous kernels, any kernel module could get access to the 'real-device' and the VLAN-ID for a particular VLAN. In more recent kernels, the code was restructured such that this is hard to do without accessing private .h files for any module that cannot use GPL-only symbols. Attached is a patch to once again allow non-GPL modules the ability to access the real-device and VLAN id for VLANs. Signed-off-by: Ben Greear <greearb@candelatech.com> Acked-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-01-26net: Move config NET_NS to from net/Kconfig to init/KconfigMatt Helsley
Make NET_NS available underneath the generic Namespaces config option since all of the other namespace options are there. Signed-off-by: Matt Helsley <matthltc@us.ibm.com> Acked-by: Serge Hallyn <serue@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-01-25af_key: initialize xfrm encap_oaTimo Teras
Currently encap_oa is left uninitialized, so it contains garbage data which is visible to userland via Netlink. Initialize it by zeroing it out. Signed-off-by: Timo Teras <timo.teras@iki.fi> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-01-22sctp: Fix another socket race during accept/peeloffVlad Yasevich
There is a race between sctp_rcv() and sctp_accept() where we have moved the association from the listening socket to the accepted socket, but sctp_rcv() processing cached the old socket and continues to use it. The easy solution is to check for the socket mismatch once we've grabed the socket lock. If we hit a mis-match, that means that were are currently holding the lock on the listening socket, but the association is refrencing a newly accepted socket. We need to drop the lock on the old socket and grab the lock on the new one. A more proper solution might be to create accepted sockets when the new association is established, similar to TCP. That would eliminate the race for 1-to-1 style sockets, but it would still existing for 1-to-many sockets where a user wished to peeloff an association. For now, we'll live with this easy solution as it addresses the problem. Reported-by: Michal Hocko <mhocko@suse.cz> Reported-by: Karsten Keil <kkeil@suse.de> Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-01-22sctp: Properly timestamp outgoing data chunks for rtx purposesVlad Yasevich
Recent changes to the retransmit code exposed a long standing bug where it was possible for a chunk to be time stamped after the retransmit timer was reset. This caused a rare situation where the retrnamist timer has expired, but nothing was marked for retrnasmission because all of timesamps on data were less then 1 rto ago. As result, the timer was never restarted since nothing was retransmitted, and this resulted in a hung association that did couldn't complete the data transfer. The solution is to timestamp the chunk when it's added to the packet for transmission purposes. After the packet is trsnmitted the rtx timer is restarted. This guarantees that when the timer expires, there will be data to retransmit. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-01-22sctp: Correctly start rtx timer on new packet transmissions.Vlad Yasevich
Commit 62aeaff5ccd96462b7077046357a6d7886175a57 (sctp: Start T3-RTX timer when fast retransmitting lowest TSN) introduced a regression where it was possible to forcibly restart the sctp retransmit timer at the transmission of any new chunk. This resulted in much longer timeout times and sometimes hung sctp connections. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-01-22netns: ipmr: enable namespace support in ipv4 multicast routing codeBenjamin Thery
This last patch makes the appropriate changes to use and propagate the network namespace where needed in IPv4 multicast routing code. This consists mainly in replacing all the remaining init_net occurences with current netns pointer retrieved from sockets, net devices or mfc_caches depending on the routines' contexts. Some routines receive a new 'struct net' parameter to propagate the current netns: * vif_add/vif_delete * ipmr_new_tunnel * mroute_clean_tables * ipmr_cache_find * ipmr_cache_report * ipmr_cache_unresolved * ipmr_mfc_add/ipmr_mfc_delete * ipmr_get_route * rt_fill_info (in route.c) Signed-off-by: Benjamin Thery <benjamin.thery@bull.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-01-22netns: ipmr: declare ipmr /proc/net entries per-namespaceBenjamin Thery
Declare IPv4 multicast forwarding /proc/net entries per-namespace: /proc/net/ip_mr_vif /proc/net/ip_mr_cache Signed-off-by: Benjamin Thery <benjamin.thery@bull.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-01-22netns: ipmr: declare reg_vif_num per-namespaceBenjamin Thery
Preliminary work to make IPv4 multicast routing netns-aware. Declare variable 'reg_vif_num' per-namespace, move into struct netns_ipv4. At the moment, this variable is only referenced in init_net. Signed-off-by: Benjamin Thery <benjamin.thery@bull.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-01-22netns: ipmr: declare mroute_do_assert and mroute_do_pim per-namespaceBenjamin Thery
Preliminary work to make IPv4 multicast routing netns-aware. Declare IPv multicast routing variables 'mroute_do_assert' and 'mroute_do_pim' per-namespace in struct netns_ipv4. At the moment, these variables are only referenced in init_net. Signed-off-by: Benjamin Thery <benjamin.thery@bull.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-01-22netns: ipmr: declare counter cache_resolve_queue_len per-namespaceBenjamin Thery
Preliminary work to make IPv4 multicast routing netns-aware. Declare variable cache_resolve_queue_len per-namespace: move it into struct netns_ipv4. This variable counts the number of unresolved cache entries queued in the list mfc_unres_queue. This list is kept global to all netns as the number of entries per namespace is limited to 10 (hardcoded in routine ipmr_cache_unresolved). Entries belonging to different namespaces in mfc_unres_queue will be identified by matching the mfc_net member introduced previously in struct mfc_cache. Keeping this list global to all netns, also allows us to keep a single timer (ipmr_expire_timer) to handle their expiration. In some places cache_resolve_queue_len value was tested for arming or deleting the timer. These tests were equivalent to testing mfc_unres_queue value instead and are replaced in this patch. At the moment, cache_resolve_queue_len is only referenced in init_net. Signed-off-by: Benjamin Thery <benjamin.thery@bull.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-01-22netns: ipmr: dynamically allocate mfc_cache_arrayBenjamin Thery
Preliminary work to make IPv4 multicast routing netns-aware. Dynamically allocate IPv4 multicast forwarding cache, mfc_cache_array, and move it to struct netns_ipv4. At the moment, mfc_cache_array is only referenced in init_net. Signed-off-by: Benjamin Thery <benjamin.thery@bull.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-01-22netns: ipmr: store netns in struct mfc_cacheBenjamin Thery
This patch stores into struct mfc_cache the network namespace each mfc_cache belongs to. The new member is mfc_net. mfc_net is assigned at cache allocation and doesn't change during the rest of the cache entry life. A new net parameter is added to ipmr_cache_alloc/ipmr_cache_alloc_unres. This will help to retrieve the current netns around the IPv4 multicast routing code. At the moment, all mfc_cache are allocated in init_net. Signed-off-by: Benjamin Thery <benjamin.thery@bull.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-01-22netns: ipmr: dynamically allocate vif_tableBenjamin Thery
Preliminary work to make IPv6 multicast routing netns-aware. Dynamically allocate interface table vif_table and move it to struct netns_ipv4, and update MIF_EXISTS() macro. At the moment, vif_table is only referenced in init_net. Signed-off-by: Benjamin Thery <benjamin.thery@bull.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-01-22netns: ipmr: allocate mroute_socket per-namespace.Benjamin Thery
Preliminary work to make IPv4 multicast routing netns-aware. Make IPv4 multicast routing mroute_socket per-namespace, moves it into struct netns_ipv4. At the moment, mroute_socket is only referenced in init_net. Signed-off-by: Benjamin Thery <benjamin.thery@bull.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-01-22sctp/ipv6.c: use ipv6_addr_copyJoe Perches
Signed-off-by: Joe Perches <joe@perches.com> Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-01-22mac80211: fix slot time debug messageChristian Lamparter
wlan0: switched to short barker preamble (BSSID=00:01:aa:bb:cc:dd) wlan0: switched to short slot (BSSID=) <something is missing here> should be: wlan0: switched to short barker preamble (BSSID=00:01:aa:bb:cc:dd) wlan0: switched to short slot (BSSID=00:01:aa:bb:cc:dd) Signed-off-by: Christian Lamparter <chunkeey@web.de> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-01-22mac80211: decrement ref count to netdev after launching mesh discoveryBrian Cavagnolo
After launching mesh discovery in tx path, reference count was not being decremented. This was preventing module unload. Signed-off-by: Brian Cavagnolo <brian@cozybit.com> Signed-off-by: Andrey Yurovsky <andrey@cozybit.com> Acked-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-01-22fs/Kconfig: move sunrpc outAlexey Dobriyan
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
2009-01-21gre: strict physical device bindingTimo Teras
Check the device on receive path and allow otherwise identical devices as long as the physical device differs. This is useful for NBMA tunnels, where you want to use different gre IP for each public IP available via different physical devices. Signed-off-by: Timo Teras <timo.teras@iki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-01-21inet: Allowing more than 64k connections and heavily optimize bind(0) time.Evgeniy Polyakov
With simple extension to the binding mechanism, which allows to bind more than 64k sockets (or smaller amount, depending on sysctl parameters), we have to traverse the whole bind hash table to find out empty bucket. And while it is not a problem for example for 32k connections, bind() completion time grows exponentially (since after each successful binding we have to traverse one bucket more to find empty one) even if we start each time from random offset inside the hash table. So, when hash table is full, and we want to add another socket, we have to traverse the whole table no matter what, so effectivelly this will be the worst case performance and it will be constant. Attached picture shows bind() time depending on number of already bound sockets. Green area corresponds to the usual binding to zero port process, which turns on kernel port selection as described above. Red area is the bind process, when number of reuse-bound sockets is not limited by 64k (or sysctl parameters). The same exponential growth (hidden by the green area) before number of ports reaches sysctl limit. At this time bind hash table has exactly one reuse-enbaled socket in a bucket, but it is possible that they have different addresses. Actually kernel selects the first port to try randomly, so at the beginning bind will take roughly constant time, but with time number of port to check after random start will increase. And that will have exponential growth, but because of above random selection, not every next port selection will necessary take longer time than previous. So we have to consider the area below in the graph (if you could zoom it, you could find, that there are many different times placed there), so area can hide another. Blue area corresponds to the port selection optimization. This is rather simple design approach: hashtable now maintains (unprecise and racely updated) number of currently bound sockets, and when number of such sockets becomes greater than predefined value (I use maximum port range defined by sysctls), we stop traversing the whole bind hash table and just stop at first matching bucket after random start. Above limit roughly corresponds to the case, when bind hash table is full and we turned on mechanism of allowing to bind more reuse-enabled sockets, so it does not change behaviour of other sockets. Signed-off-by: Evgeniy Polyakov <zbr@ioremap.net> Tested-by: Denys Fedoryschenko <denys@visp.net.lb> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-01-21dccp: Debugging functions for feature negotiationGerrit Renker
Since all feature-negotiation processing now takes place in feat.c, functions for producing verbose debugging output are concentrated there. New functions to print out values, entry records, and options are provided, and also a macro is defined to not always have the function name in the output line. Thanks a lot to Wei Yongjun and Giuseppe Galeota for help and discussion with an earlier revision of this patch. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-01-21dccp: Initialisation and type-checking of feature sysctlsGerrit Renker
This patch takes care of initialising and type-checking sysctls related to feature negotiation. Type checking is important since some of the sysctls now directly impact the feature-negotiation process. The sysctls are initialised with the known default values for each feature. For the type-checking the value constraints from RFC 4340 are used: * Sequence Window uses the specified Wmin=32, the maximum is ulong (4 bytes), tested and confirmed that it works up to 4294967295 - for Gbps speed; * Ack Ratio is between 0 .. 0xffff (2-byte unsigned integer); * CCIDs are between 0 .. 255; * request_retries, retries1, retries2 also between 0..255 for good measure; * tx_qlen is checked to be non-negative; * sync_ratelimit remains as before. Notes: ------ 1. Die s@sysctl_dccp_feat@sysctl_dccp@g since the sysctls are now in feat.c. 2. As pointed out by Arnaldo, the pattern of type-checking repeats itself in other places, sometimes with exactly the same kind of definitions (e.g. "static int zero;"). It may be a good idea (kernel janitors?) to consolidate type checking. For the sake of keeping the changeset small and in order not to affect other subsystems, I have not strived to generalise here. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-01-21dccp: Implement both feature-local and feature-remote Sequence Window featureGerrit Renker
This adds full support for local/remote Sequence Window feature, from which the * sequence-number-validity (W) and * acknowledgment-number-validity (W') windows derive as specified in RFC 4340, 7.5.3. Specifically, the following is contained in this patch: * integrated new socket fields into dccp_sk; * updated the update_gsr/gss routines with regard to these fields; * updated handler code: the Sequence Window feature is located at the TX side, so the local feature is meant if the handler-rx flag is false; * the initialisation of `rcv_wnd' in reqsk is removed, since - rcv_wnd is not used by the code anywhere; - sequence number checks are not done in the LISTEN state (cf. 7.5.3); - dccp_check_req checks the Ack number validity more rigorously; * the `struct dccp_minisock' became empty and is now removed. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-01-21dccp: Initialisation framework for feature negotiationGerrit Renker
This initialises feature negotiation from two tables, which are in turn are initialised from sysctls. As a novel feature, specifics of the implementation (e.g. that short seqnos and ECN are not yet available) are advertised for robustness. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-01-21appletalk: remove unneeded stubsStephen Hemminger
With net_device_ops if set_mac_address is null, then error is -EOPNOTSUPPORTED. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-01-21rose: convert to network_device_opsStephen Hemminger
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Acked-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-01-21rose: convert to internal net_device_statsStephen Hemminger
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Acked-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-01-21netrom: convert to net_device_opsStephen Hemminger
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Acked-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-01-21netrom: convert to internal net_device_statsStephen Hemminger
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Acked-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-01-21lec: convert to net_device_opsStephen Hemminger
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-01-21lec: convert to internal network_device_statsStephen Hemminger
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-01-21clip: convert to internal network_device_statsStephen Hemminger
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-01-21br2684: convert to net_device_opsStephen Hemminger
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-01-21atm: br2684 internal statsStephen Hemminger
Now that stats are in net_device, use them. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-01-21netfilter: ctnetlink: fix scheduling while atomicPatrick McHardy
Caused by call to request_module() while holding nf_conntrack_lock. Reported-and-tested-by: Kövesdi György <kgy@teledigit.hu> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-01-20gro: Fix merging of paged packetsHerbert Xu
The previous fix to paged packets broke the merging because it reset the skb->len before we added it to the merged packet. This wasn't detected because it simply resulted in the truncation of the packet while the missing bit is subsequently retransmitted. The fix is to store skb->len before we clobber it. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-01-20gro: Fix error handling on extremely short fragsHerbert Xu
When a frag is shorter than an Ethernet header, we'd return a zeroed packet instead of aborting. This patch fixes that. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-01-20gro: Fix handling of complete checksums in IPv6Herbert Xu
We need to perform skb_postpull_rcsum after pulling the IPv6 header in order to maintain the correctness of the complete checksum. This patch also adds a missing iph reload after pulling. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-01-20NET: net_namespace, fix lock imbalanceJiri Slaby
register_pernet_gen_subsys omits mutex_unlock in one fail path. Fix it. Signed-off-by: Jiri Slaby <jirislaby@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-01-20Merge branch 'master' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6
2009-01-20Revert "xfrm: For 32/64 compatability wrt. xfrm_usersa_info"David S. Miller
This reverts commit fc8c7dc1b29560c016a67a34ccff32a712b5aa86. As indicated by Jiri Klimes, this won't work. These numbers are not only used the size validation, they are also used to locate attributes sitting after the message. Signed-off-by: David S. Miller <davem@davemloft.net>
2009-01-19net: Fix data corruption when splicing from sockets.Jarek Poplawski
The trick in socket splicing where we try to convert the skb->data into a page based reference using virt_to_page() does not work so well. The idea is to pass the virt_to_page() reference via the pipe buffer, and refcount the buffer using a SKB reference. But if we are splicing from a socket to a socket (via sendpage) this doesn't work. The from side processing will grab the page (and SKB) references. The sendpage() calls will grab page references only, return, and then the from side processing completes and drops the SKB ref. The page based reference to skb->data is not enough to keep the kmalloc() buffer backing it from being reused. Yet, that is all that the socket send side has at this point. This leads to data corruption if the skb->data buffer is reused by SLAB before the send side socket actually gets the TX packet out to the device. The fix employed here is to simply allocate a page and copy the skb->data bytes into that page. This will hurt performance, but there is no clear way to fix this properly without a copy at the present time, and it is important to get rid of the data corruption. With fixes from Herbert Xu. Tested-by: Willy Tarreau <w@1wt.eu> Foreseen-by: Changli Gao <xiaosuo@gmail.com> Diagnosed-by: Willy Tarreau <w@1wt.eu> Reported-by: Willy Tarreau <w@1wt.eu> Fixed-by: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>