aboutsummaryrefslogtreecommitdiff
path: root/net
AgeCommit message (Collapse)Author
2008-01-28[XFRM]: Fix potential race vs xfrm_state(only)_find and xfrm_hash_resize.Pavel Emelyanov
The _find calls calculate the hash value using the xfrm_state_hmask, without the xfrm_state_lock. But the value of this mask can change in the _resize call under the state_lock, so we risk to fail in finding the desired entry in hash. I think, that the hash value is better to calculate under the state lock. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[UDP]: Move udp_stats_in6 into net/ipv4/udp.cHerbert Xu
Now that external users may increment the counters directly, we need to ensure that udp_stats_in6 is always available. Otherwise we'd either have to requrie the external users to be built as modules or ipv6 to be built-in. This isn't too bad because udp_stats_in6 is just a pair of pointers plus an EXPORT, e.g., just 40 (16 + 24) bytes on x86-64. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[SUNRPC]: Use htonl() where appropriate.YOSHIFUJI Hideaki
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[RXRPC]: Use cpu_to_be32() where appropriate.YOSHIFUJI Hideaki
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[MAC80211]: Use htons() where appropriate.YOSHIFUJI Hideaki
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[IRDA]: Use htons() where appropriate.YOSHIFUJI Hideaki
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[IPVS]: Use htons() where appropriate.YOSHIFUJI Hideaki
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[IEEE80211]: Use htons() where appropriate.YOSHIFUJI Hideaki
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[DECNET]: Use htons() where appropriate.YOSHIFUJI Hideaki
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[BRIDGE]: Use cpu_to_be16() where appropriate.YOSHIFUJI Hideaki
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[DCCP]: Introducing CCMPSGerrit Renker
This introduces a CCMPS field for setting a CCID-specific upper bound on the application payload size, as is defined in RFC 4340, section 14. Only the TX CCID is considered in setting this limit, since the RX CCID generates comparatively small (DCCP-Ack) feedback packets. The CCMPS field includes network and transport layer header lengths. The only current CCMPS customer is CCID4 (via RFC 4828). A wrapper is used to allow querying the CCMPS even at times where the CCID modules may not have been fully negotiated yet. In dccp_sync_mss() the variable `mss_now' has been renamed into `cur_mps', to reflect that we are dealing with an MPS, but not an MSS. Since the DCCP code closely follows the TCP code, the identifiers `dccp_sync_mss' and `dccps_mss_cache' have been kept, as they have direct TCP counterparts. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[CCID]: More informative registrationGerrit Renker
The patch makes the registration messages of CCID 2/3 a bit more informative: instead of repeating the CCID number as currently done, "CCID: Registered CCID 2 (ccid2)" or "CCID: Registered CCID 3 (ccid3)", the descriptive names of the CCID's (from RFCs) are now used: "CCID: Registered CCID 2 (TCP-like)" and "CCID: Registered CCID 3 (TCP-Friendly Rate Control)". To allow spaces in the name, the slab name string has been changed to refer to the numeric CCID identifier, using the same format as before. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[DCCP]: Documentation for CCID operationsGerrit Renker
This adds documentation for the ccid_operations structure. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[IPV4]: Thresholds in fib_trie.c are used as consts, so make them const.Denis V. Lunev
There are several thresholds for trie fib hash management. They are used in the code as a constants. Make them constants from the compiler point of view. Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[IPV6] sit: Rebinding of SIT tunnels to other interfacesMichal Schmidt
This is similar to the change already done for IPIP tunnels. Once created, a SIT tunnel can't be bound to another device. To reproduce: # create a tunnel: ip tunnel add tunneltest0 mode sit remote 10.0.0.1 dev eth0 # try to change the bounding device from eth0 to eth1: ip tunnel change tunneltest0 dev eth1 # show the result: ip tunnel show tunneltest0 tunneltest0: ipv6/ip remote 10.0.0.1 local any dev eth0 ttl inherit Notice the bound device has not changed from eth0 to eth1. This patch fixes it. When changing the binding, it also recalculates the MTU according to the new bound device's MTU. Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[IP_GRE]: Rebinding of GRE tunnels to other interfacesMichal Schmidt
This is similar to the change already done for IPIP tunnels. Once created, a GRE tunnel can't be bound to another device. To reproduce: # create a tunnel: ip tunnel add tunneltest0 mode gre remote 10.0.0.1 dev eth0 # try to change the bounding device from eth0 to eth1: ip tunnel change tunneltest0 dev eth1 # show the result: ip tunnel show tunneltest0 tunneltest0: gre/ip remote 10.0.0.1 local any dev eth0 ttl inherit Notice the bound device has not changed from eth0 to eth1. This patch fixes it. When changing the binding, it also recalculates the MTU according to the new bound device's MTU. Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[IPV6]: Always pass a valid nl_info to inet6_rt_notify.Denis V. Lunev
This makes the code in the inet6_rt_notify more straightforward and provides groud for namespace passing. Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[IPSEC]: Fix zero return value in xfrm_lookup on errorHerbert Xu
Further testing shows that my ICMP relookup patch can cause xfrm_lookup to return zero on error which isn't very nice since it leads to the caller dying on null pointer dereference. The bug is due to not setting err to ENOENT just before we leave xfrm_lookup in case of no policy. This patch moves the err setting to where it should be. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[DCCP]: Ignore feature negotiation on Data packetsGerrit Renker
This implements [RFC 4340, p. 32]: "any feature negotiation options received on DCCP-Data packets MUST be ignored". Also added a FIXME for further processing, since the code currently (wrongly) classifies empty Confirm options as invalid - this needs to be resolved in a separate patch. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[DCCP]: Make code assumptions explicitGerrit Renker
This removes several `XXX' references which indicate a missing support for non-1-byte feature values: this is unnecessary, as all currently known (standardised) SP feature values are 1-byte quantities. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[DCCP]: Remove unused and redundant validation functionsGerrit Renker
This removes two inlines which were both called in a single function only: 1) dccp_feat_change() is always called with either DCCPO_CHANGE_L or DCCPO_CHANGE_R as argument * from dccp_set_socktopt_change() via do_dccp_setsockopt() with DCCP_SOCKOPT_CHANGE_R/L * from __dccp_feat_init() via dccp_feat_init() also with DCCP_SOCKOPT_CHANGE_R/L. Hence the dccp_feat_is_valid_type() is completely unnecessary and always returns true. 2) Due to (1), the length test reduces to 'len >= 4', which in turn makes dccp_feat_is_valid_length() unnecessary. Furthermore, the inline function dccp_feat_is_reserved() was unfolded, since only called in a single place. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[DCCP]: Support inserting options during the 3-way handshakeGerrit Renker
This provides a separate routine to insert options during the initial handshake. The main purpose is to conduct feature negotiation, for the moment the only user is the timestamp echo needed for the (CCID3) handshake RTT sample. Padding of options has been put into a small separate routine, to be shared among the two functions. This could also be used as a generic routine to finish inserting options. Also removed an `XXX' comment since its content was obvious. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[DCCP]: Handle timestamps on Request/Response exchange separatelyGerrit Renker
In DCCP, timestamps can occur on packets anytime, CCID3 uses a timestamp(/echo) on the Request/Response exchange. This patch addresses the following situation: * timestamps are recorded on the listening socket; * Responses are sent from dccp_request_sockets; * suppose two connections reach the listening socket with very small time in between: * the first timestamp value gets overwritten by the second connection request. This is not really good, so this patch separates timestamps into * those which are received by the server during the initial handshake (on dccp_request_sock); * those which are received by the client or the client after connection establishment. As before, a timestamp of 0 is regarded as indicating that no (meaningful) timestamp has been received (in addition, a warning message is printed if hosts send 0-valued timestamps). The timestamp-echoing now works as follows: * when a timestamp is present on the initial Request, it is placed into dreq, due to the call to dccp_parse_options in dccp_v{4,6}_conn_request; * when a timestamp is present on the Ack leading from RESPOND => OPEN, it is copied over from the request_sock into the child cocket in dccp_create_openreq_child; * timestamps received on an (established) dccp_sock are treated as before. Since Elapsed Time is measured in hundredths of milliseconds (13.2), the new dccp_timestamp() function is used, as it is expected that the time between receiving the timestamp and sending the timestamp echo will be very small against the wrap-around time. As a byproduct, this allows smaller timestamping-time fields. Furthermore, inserting the Timestamp Echo option has been taken out of the block starting with '!dccp_packet_without_ack()', since Timestamp Echo can be carried on any packet (5.8 and 13.3). Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[DCCP]: Add (missing) option parsing to request_sock processingGerrit Renker
This adds option-parsing code to processing of Acks in the listening state on request_socks on the server, serving two purposes (i) resolves a FIXME (removed); (ii) paves the way for feature-negotiation during connection-setup. There is an intended subtlety here with regard to dccp_check_req: Parsing options happens only after testing whether the received packet is a retransmitted Request. Otherwise, if the Request contained (a possibly large number of) feature-negotiation options, recomputing state would have to happen each time a retransmitted Request arrives, which opens the door to an easy DoS attack. Since in a genuine retransmission the options should not be different from the original, reusing the already computed state seems better. The other point is - if there are timestamp options on the Request, they will not be answered; which means that in the presence of retransmission (likely due to loss and/or other problems), the use of Request/Response RTT sampling is suspended, so that startup problems here do not propagate. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[DCCP]: Allow to parse options on Request SocketsGerrit Renker
The option parsing code currently only parses on full sk's. This causes a problem for options sent during the initial handshake (in particular timestamps and feature-negotiation options). Therefore, this patch extends the option parsing code with an additional argument for request_socks: if it is non-NULL, options are parsed on the request socket, otherwise the normal path (parsing on the sk) is used. Subsequent patches, which implement feature negotiation during connection setup, make use of this facility. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[DCCP]: Collapse repeated `len' statements into oneGerrit Renker
This replaces 4 individual assignments for `len' with a single one, placed where the control flow of those 4 leads to. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[DCCP]: Support for server holding timewait stateGerrit Renker
This adds a socket option and signalling support for the case where the server holds timewait state on closing the connection, as described in RFC 4340, 8.3. Since holding timewait state at the server is the non-usual case, it is enabled via a socket option. Documentation for this socket option has been added. The setsockopt statement has been made resilient against different possible cases of expressing boolean `true' values using a suggestion by Ian McDonald. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[DCCP]: Use maximum-RTO backoff from DCCP specGerrit Renker
This removes another Fixme, using the TCP maximum RTO rather than the value specified by the DCCP specification. Across the sections in RFC 4340, 64 seconds is consistently suggested as maximum RTO backoff value; and this is the value which is now used. I have checked both termination cases for retransmissions of Close/CloseReq: with the default value 15 of `retries2', and an initial icsk_retransmit = 0, it takes about 614 seconds to declare a non-responding peer as dead, after which the final terminating Reset is sent. With the TCP maximum RTO value of 120 seconds it takes (as might be expected) almost twice as long, about 23 minutes. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[DCCP]: Shift the retransmit timer for active-close into output.cGerrit Renker
When performing active close, RFC 4340, 8.3. requires to retransmit the Close/CloseReq with a backoff-retransmit timer starting at intially 2 RTTs. This patch shifts the existing code for active-close retransmit timer into output.c, so that the retransmit timer is started when the first Close/CloseReq is sent. Previously, the timer was started when, after releasing the socket in dccp_close(), the actively-closing side had not yet reached the CLOSED/TIMEWAIT state. The patch further reduces the initial timeout from 3 seconds to the required 2 RTTs, where - in absence of a known RTT - the fallback value specified in RFC 4340, 3.4 is used. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[IPV6]: fix section mismatch warningsDaniel Lezcano
Removed useless and buggy __exit section in the different ipv6 subsystems. Otherwise they will be called inside an init section during rollbacking in case of an error in the protocol initialization. Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[DCCP]: Perform SHUT_RD and SHUT_WR on receiving closeGerrit Renker
This patch performs two changes: 1) Close the write-end in addition to the read-end when a fin-like segment (Close or CloseReq) is received by DCCP. This accounts for the fact that DCCP, in contrast to TCP, does not have a half-close. RFC 4340 says in this respect that when a fin-like segment has been sent there is no guarantee at all that any further data will be processed. Thus this patch performs SHUT_WR in addition to the SHUT_RD when a fin-like segment is encountered. 2) Minor change: I noted that code appears twice in different places and think it makes sense to put this into a self-contained function (dccp_enqueue()). Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[DECNET]: Fix inverted wait flag in xfrm_lookup callHerbert Xu
My previous patch made the wait flag take the opposite value to what it should be. This patch fixes that. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[NET]: Check RTNL status in unregister_netdeviceHerbert Xu
The caller must hold the RTNL so let's check it in unregister_netdevice. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[IPSEC]: Do not let packets pass when ICMP flag is offHerbert Xu
This fixes a logical error in ICMP policy checks which lets packets through if the state ICMP flag is off. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[IPSEC]: Make callers of xfrm_lookup to use XFRM_LOOKUP_WAITHerbert Xu
This patch converts all callers of xfrm_lookup that used an explicit value of 1 to indiciate blocking to use the new flag XFRM_LOOKUP_WAIT. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[IPSEC]: Fix reversed ICMP6 policy checkHerbert Xu
The policy check I added for ICMP on IPv6 is reversed. This patch fixes that. It also adds an skb->sp check so that unprotected packets that fail the policy check do not crash the machine. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[IPIP]: Allow rebinding the tunnel to another interfaceMichal Schmidt
Once created, an IP tunnel can't be bound to another device. (reported as https://bugzilla.redhat.com/show_bug.cgi?id=419671) To reproduce: # create a tunnel: ip tunnel add tunneltest0 mode ipip remote 10.0.0.1 dev eth0 # try to change the bounding device from eth0 to eth1: ip tunnel change tunneltest0 dev eth1 # show the result: ip tunnel show tunneltest0 tunneltest0: ip/ip remote 10.0.0.1 local any dev eth0 ttl inherit Notice the bound device has not changed from eth0 to eth1. This patch fixes it. When changing the binding, it also recalculates the MTU according to the new bound device's MTU. If the change is acceptable, I'll do the same for GRE and SIT tunnels. Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[NETNS]: network namespace was passed into dev_getbyhwaddr but not usedDenis V. Lunev
Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[IPSEC]: Add ICMP host relookup supportHerbert Xu
RFC 4301 requires us to relookup ICMP traffic that does not match any policies using the reverse of its payload. This patch implements this for ICMP traffic that originates from or terminates on localhost. This is activated on outbound with the new policy flag XFRM_POLICY_ICMP, and on inbound by the new state flag XFRM_STATE_ICMP. On inbound the policy check is now performed by the ICMP protocol so that it can repeat the policy check where necessary. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[IPSEC]: Added xfrm_decode_session_reverse and xfrmX_policy_check_reverseHerbert Xu
RFC 4301 requires us to relookup ICMP traffic that does not match any policies using the reverse of its payload. This patch adds the functions xfrm_decode_session_reverse and xfrmX_policy_check_reverse so we can get the reverse flow to perform such a lookup. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[IPSEC]: Make xfrm_lookup flags argument a bit-fieldHerbert Xu
This patch introduces an enum for bits in the flags argument of xfrm_lookup. This is so that we can cram more information into it later. Since all current users use just the values 0 and 1, XFRM_LOOKUP_WAIT has been added with the value 1 << 0 to represent the current meaning of flags. The test in __xfrm_lookup has been changed accordingly. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[TFRC]: Remove previous loss intervals implementationGerrit Renker
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[CCID3]: Interface CCID3 code with newer Loss Intervals DatabaseGerrit Renker
This hooks up the TFRC Loss Interval database with CCID 3 packet reception. In addition, it makes the CCID-specific computation of the first loss interval (which requires access to all the guts of CCID3) local to ccid3.c. The patch also fixes an omission in the DCCP code, that of a default / fallback RTT value (defined in section 3.4 of RFC 4340 as 0.2 sec); while at it, the upper bound of 4 seconds for an RTT sample has been reduced to match the initial TCP RTO value of 3 seconds from[RFC 1122, 4.2.3.1]. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[TFRC]: CCID3 (and CCID4) needs to access these inlinesGerrit Renker
This moves two inlines back to packet_history.h: these are not private to packet_history.c, but are needed by CCID3/4 to detect whether a new loss is indicated, or whether a loss is already pending. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[CCID3]: Redundant debugging output / documentationGerrit Renker
Each time feedback is sent two lines are printed: ccid3_hc_rx_send_feedback: client ... - entry ccid3_hc_rx_send_feedback: Interval ...usec, X_recv=..., 1/p=... The first line is redundant and thus removed. Further, documentation of ccid3_hc_rx_sock (capitalisation) is made consistent. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[TFRC]: Ringbuffer to track loss interval historyGerrit Renker
A ringbuffer-based implementation of loss interval history is easier to maintain, allocate, and update. The `swap' routine to keep the RX history sorted is due to and was written by Arnaldo Carvalho de Melo, simplifying an earlier macro-based variant. Details: * access to the Loss Interval Records via macro wrappers (with safety checks); * simplified, on-demand allocation of entries (no extra memory consumption on lossless links); cache allocation is local to the module / exported as service; * provision of RFC-compliant algorithm to re-compute average loss interval; * provision of comprehensive, new loss detection algorithm - support for all cases of loss, including re-ordered/duplicate packets; - waiting for NDUPACK=3 packets to fill the hole; - updating loss records when a late-arriving packet fills a hole. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[TFRC]: Loss interval code needs the macros/inlines that were movedGerrit Renker
This moves the inlines (which were previously declared as macros) back into packet_history.h since the loss detection code needs to be able to read entries from the RX history in order to create the relevant loss entries: it needs at least tfrc_rx_hist_loss_prev() and tfrc_rx_hist_last_rcv(), which in turn require the definition of the other inlines (macros). Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[TFRC]: Put RX/TX initialisation into tfrc.cGerrit Renker
This separates RX/TX initialisation and puts all packet history / loss intervals initialisation into tfrc.c. The organisation is uniform: slab declaration -> {rx,tx}_init() -> {rx,tx}_exit() Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[NETNS]: separate af_packet netns dataDenis V. Lunev
Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[NETNS]: struct net content re-work (v3)Denis V. Lunev
Recently David Miller and Herbert Xu pointed out that struct net becomes overbloated and un-maintainable. There are two solutions: - provide a pointer to a network subsystem definition from struct net. This costs an additional dereferrence - place sub-system definition into the structure itself. This will speedup run-time access at the cost of recompilation time The second approach looks better for us. Other sub-systems will follow. Signed-off-by: Denis V. Lunev <den@openvz.org> Acked-by: Daniel Lezcano <dlezcano@fr.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>