aboutsummaryrefslogtreecommitdiff
path: root/net/dccp
AgeCommit message (Collapse)Author
2008-01-28[DCCP]: Allow to parse options on Request SocketsGerrit Renker
The option parsing code currently only parses on full sk's. This causes a problem for options sent during the initial handshake (in particular timestamps and feature-negotiation options). Therefore, this patch extends the option parsing code with an additional argument for request_socks: if it is non-NULL, options are parsed on the request socket, otherwise the normal path (parsing on the sk) is used. Subsequent patches, which implement feature negotiation during connection setup, make use of this facility. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[DCCP]: Collapse repeated `len' statements into oneGerrit Renker
This replaces 4 individual assignments for `len' with a single one, placed where the control flow of those 4 leads to. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[DCCP]: Support for server holding timewait stateGerrit Renker
This adds a socket option and signalling support for the case where the server holds timewait state on closing the connection, as described in RFC 4340, 8.3. Since holding timewait state at the server is the non-usual case, it is enabled via a socket option. Documentation for this socket option has been added. The setsockopt statement has been made resilient against different possible cases of expressing boolean `true' values using a suggestion by Ian McDonald. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[DCCP]: Use maximum-RTO backoff from DCCP specGerrit Renker
This removes another Fixme, using the TCP maximum RTO rather than the value specified by the DCCP specification. Across the sections in RFC 4340, 64 seconds is consistently suggested as maximum RTO backoff value; and this is the value which is now used. I have checked both termination cases for retransmissions of Close/CloseReq: with the default value 15 of `retries2', and an initial icsk_retransmit = 0, it takes about 614 seconds to declare a non-responding peer as dead, after which the final terminating Reset is sent. With the TCP maximum RTO value of 120 seconds it takes (as might be expected) almost twice as long, about 23 minutes. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[DCCP]: Shift the retransmit timer for active-close into output.cGerrit Renker
When performing active close, RFC 4340, 8.3. requires to retransmit the Close/CloseReq with a backoff-retransmit timer starting at intially 2 RTTs. This patch shifts the existing code for active-close retransmit timer into output.c, so that the retransmit timer is started when the first Close/CloseReq is sent. Previously, the timer was started when, after releasing the socket in dccp_close(), the actively-closing side had not yet reached the CLOSED/TIMEWAIT state. The patch further reduces the initial timeout from 3 seconds to the required 2 RTTs, where - in absence of a known RTT - the fallback value specified in RFC 4340, 3.4 is used. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[DCCP]: Perform SHUT_RD and SHUT_WR on receiving closeGerrit Renker
This patch performs two changes: 1) Close the write-end in addition to the read-end when a fin-like segment (Close or CloseReq) is received by DCCP. This accounts for the fact that DCCP, in contrast to TCP, does not have a half-close. RFC 4340 says in this respect that when a fin-like segment has been sent there is no guarantee at all that any further data will be processed. Thus this patch performs SHUT_WR in addition to the SHUT_RD when a fin-like segment is encountered. 2) Minor change: I noted that code appears twice in different places and think it makes sense to put this into a self-contained function (dccp_enqueue()). Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[IPSEC]: Make callers of xfrm_lookup to use XFRM_LOOKUP_WAITHerbert Xu
This patch converts all callers of xfrm_lookup that used an explicit value of 1 to indiciate blocking to use the new flag XFRM_LOOKUP_WAIT. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[TFRC]: Remove previous loss intervals implementationGerrit Renker
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[CCID3]: Interface CCID3 code with newer Loss Intervals DatabaseGerrit Renker
This hooks up the TFRC Loss Interval database with CCID 3 packet reception. In addition, it makes the CCID-specific computation of the first loss interval (which requires access to all the guts of CCID3) local to ccid3.c. The patch also fixes an omission in the DCCP code, that of a default / fallback RTT value (defined in section 3.4 of RFC 4340 as 0.2 sec); while at it, the upper bound of 4 seconds for an RTT sample has been reduced to match the initial TCP RTO value of 3 seconds from[RFC 1122, 4.2.3.1]. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[TFRC]: CCID3 (and CCID4) needs to access these inlinesGerrit Renker
This moves two inlines back to packet_history.h: these are not private to packet_history.c, but are needed by CCID3/4 to detect whether a new loss is indicated, or whether a loss is already pending. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[CCID3]: Redundant debugging output / documentationGerrit Renker
Each time feedback is sent two lines are printed: ccid3_hc_rx_send_feedback: client ... - entry ccid3_hc_rx_send_feedback: Interval ...usec, X_recv=..., 1/p=... The first line is redundant and thus removed. Further, documentation of ccid3_hc_rx_sock (capitalisation) is made consistent. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[TFRC]: Ringbuffer to track loss interval historyGerrit Renker
A ringbuffer-based implementation of loss interval history is easier to maintain, allocate, and update. The `swap' routine to keep the RX history sorted is due to and was written by Arnaldo Carvalho de Melo, simplifying an earlier macro-based variant. Details: * access to the Loss Interval Records via macro wrappers (with safety checks); * simplified, on-demand allocation of entries (no extra memory consumption on lossless links); cache allocation is local to the module / exported as service; * provision of RFC-compliant algorithm to re-compute average loss interval; * provision of comprehensive, new loss detection algorithm - support for all cases of loss, including re-ordered/duplicate packets; - waiting for NDUPACK=3 packets to fill the hole; - updating loss records when a late-arriving packet fills a hole. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[TFRC]: Loss interval code needs the macros/inlines that were movedGerrit Renker
This moves the inlines (which were previously declared as macros) back into packet_history.h since the loss detection code needs to be able to read entries from the RX history in order to create the relevant loss entries: it needs at least tfrc_rx_hist_loss_prev() and tfrc_rx_hist_last_rcv(), which in turn require the definition of the other inlines (macros). Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[TFRC]: Put RX/TX initialisation into tfrc.cGerrit Renker
This separates RX/TX initialisation and puts all packet history / loss intervals initialisation into tfrc.c. The organisation is uniform: slab declaration -> {rx,tx}_init() -> {rx,tx}_exit() Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[CCID3]: HC-receiver should not insert timestamps as HC-sender doesn't uses itGerrit Renker
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[TFRC]: The function tfrc_rx_hist_entry_delete() is not used anymoreGerrit Renker
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[TFRC]: Move comment.Gerrit Renker
Moved up the comment "Receiver routines" above the first occurrence of RX history routines. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[TFRC]: New rx history codeArnaldo Carvalho de Melo
Credit here goes to Gerrit Renker, that provided the initial implementation for this new codebase. I modified it just to try to make it closer to the existing API, renaming some functions, add namespacing and fix one bug where the tfrc_rx_hist_alloc was not freeing the allocated ring entries on the error path. Original changeset comment from Gerrit: ----------- This provides a new, self-contained and generic RX history service for TFRC based protocols. Details: * new data structure, initialisation and cleanup routines; * allocation of dccp_rx_hist entries local to packet_history.c, as a service exported by the dccp_tfrc_lib module. * interface to automatically track highest-received seqno; * receiver-based RTT estimation (needed for instance by RFC 3448, 6.3.1); * a generic function to test for `data packets' as per RFC 4340, sec. 7.7. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[CCID3]: The receiver of a half-connection does not set window counter valuesGerrit Renker
Only the sender sets window counters [RFC 4342, sections 5 and 8.1]. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[TFRC]: Rename dccp_rx_ to tfrc_rx_Arnaldo Carvalho de Melo
This is in preparation for merging the new rx history code written by Gerrit Renker. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[TFRC]: Make the rx history slab be globalArnaldo Carvalho de Melo
This is in preparation for merging the new rx history code written by Gerrit Renker. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[TFRC]: Rename tfrc_tx_hist to tfrc_tx_hist_slab, for consistencyArnaldo Carvalho de Melo
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[DCCP]: Introduce generic function to test for `data packets'Gerrit Renker
as per RFC 4340, sec. 7.7. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[TFRC]: Provide central source file and debug facilityGerrit Renker
This patch changes the tfrc_lib module in the following manner: (1) a dedicated tfrc source file to call the packet history & loss interval init/exit functions. (2) a dedicated tfrc_pr_debug macro with toggle switch `tfrc_debug'. Commiter note: renamed tfrc_module.c to tfrc.c, and made CONFIG_IP_DCCP_CCID3 select IP_DCCP_TFRC_LIB. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[TFRC]: Hide tx history details from the CCIDsArnaldo Carvalho de Melo
Based on a previous patch by Gerrit Renker. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[DCCP]: Remove duplicate test for CloseReqGerrit Renker
This removes a redundant test for unexpected packet types. In dccp_rcv_state_process it is tested twice whether a DCCP-server has received a CloseReq (Step 7): * first in the combined if-statement, * then in the call to dccp_rcv_closereq(). The latter is necesssary since dccp_rcv_closereq() is also called from __dccp_rcv_established(). This patch removes the duplicate test. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[DCCP]: Integrate state transitions for passive-closeGerrit Renker
This adds the necessary state transitions for the two forms of passive-close * PASSIVE_CLOSE - which is entered when a host receives a Close; * PASSIVE_CLOSEREQ - which is entered when a client receives a CloseReq. Here is a detailed account of what the patch does in each state. 1) Receiving CloseReq The pseudo-code in 8.5 says: Step 13: Process CloseReq If P.type == CloseReq and S.state < CLOSEREQ, Generate Close S.state := CLOSING Set CLOSING timer. This means we need to address what to do in CLOSED, LISTEN, REQUEST, RESPOND, PARTOPEN, and OPEN. * CLOSED: silently ignore - it may be a late or duplicate CloseReq; * LISTEN/RESPOND: will not appear, since Step 7 is performed first (we know we are the client); * REQUEST: perform Step 13 directly (no need to enqueue packet); * OPEN/PARTOPEN: enter PASSIVE_CLOSEREQ so that the application has a chance to process unread data. When already in PASSIVE_CLOSEREQ, no second CloseReq is enqueued. In any other state, the CloseReq is ignored. I think that this offers some robustness against rare and pathological cases: e.g. a simultaneous close where the client sends a Close and the server a CloseReq. The client will then be retransmitting its Close until it gets the Reset, so ignoring the CloseReq while in state CLOSING is sane. 2) Receiving Close The code below from 8.5 is unconditional. Step 14: Process Close If P.type == Close, Generate Reset(Closed) Tear down connection Drop packet and return Thus we need to consider all states: * CLOSED: silently ignore, since this can happen when a retransmitted or late Close arrives; * LISTEN: dccp_rcv_state_process() will generate a Reset ("No Connection"); * REQUEST: perform Step 14 directly (no need to enqueue packet); * RESPOND: dccp_check_req() will generate a Reset ("Packet Error") -- left it at that; * OPEN/PARTOPEN: enter PASSIVE_CLOSE so that application has a chance to process unread data; * CLOSEREQ: server performed active-close -- perform Step 14; * CLOSING: simultaneous-close: use a tie-breaker to avoid message ping-pong (see comment); * PASSIVE_CLOSEREQ: ignore - the peer has a bug (sending first a CloseReq and now a Close); * TIMEWAIT: packet is ignored. Note that the condition of receiving a packet in state CLOSED here is different from the condition "there is no socket for such a connection": the socket still exists, but its state indicates it is unusable. Last, dccp_finish_passive_close sets either DCCP_CLOSED or DCCP_CLOSING = TCP_CLOSING, so that sk_stream_wait_close() will wait for the final Reset (which will trigger CLOSING => CLOSED). Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[DCCP]: Dedicated auxiliary states to support passive-closeGerrit Renker
This adds two auxiliary states to deal with passive closes: * PASSIVE_CLOSE (reached from OPEN via reception of Close) and * PASSIVE_CLOSEREQ (reached from OPEN via reception of CloseReq) as internal intermediate states. These states are used to allow a receiver to process unread data before acknowledging the received connection-termination-request (the Close/CloseReq). Without such support, it will happen that passively-closed sockets enter CLOSED state while there is still unprocessed data in the queue; leading to unexpected and erratic API behaviour. PASSIVE_CLOSE has been mapped into TCPF_CLOSE_WAIT, so that the code will seamlessly work with inet_accept() (which tests for this state). The state names are thanks to Arnaldo, who suggested this naming scheme following an earlier revision of this patch. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[DCCP]: Use AF-independent rebuild_header routineGerrit Renker
This fixes a nasty bug: dccp_send_reset() is called by both DCCPv4 and DCCPv6, but uses inet_sk_rebuild_header() in each case. This leads to unpredictable and weird behaviour: under some conditions, DCCPv6 Resets were sent, in other not. The fix is to use the AF-independent rebuild_header routine. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[TFRC]: Migrate TX history to singly-linked lisArnaldo Carvalho de Melo
This patch was based on another made by Gerrit Renker, his changelog was: ------------------------------------------------------ The patch set migrates TFRC TX history to a singly-linked list. The details are: * use of a consistent naming scheme (all TFRC functions now begin with `tfrc_'); * allocation and cleanup are taken care of internally; * provision of a lookup function, which is used by the CCID TX infrastructure to determine the time a packet was sent (in turn used for RTT sampling); * integration of the new interface with the present use in CCID3. ------------------------------------------------------ Simplifications I did: . removing the tfrc_tx_hist_head that had a pointer to the list head and another for the slabcache. . No need for creating a slabcache for each CCID that wants to use the TFRC tx history routines, create a single slabcache when the dccp_tfrc_lib module init routine is called. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[NET]: Name magic constants in sock_wake_async()Pavel Emelyanov
The sock_wake_async() performs a bit different actions depending on "how" argument. Unfortunately this argument ony has numerical magic values. I propose to give names to their constants to help people reading this function callers understand what's going on without looking into this function all the time. I suppose this is 2.6.25 material, but if it's not (or the naming seems poor/bad/awful), I can rework it against the current net-2.6 tree. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[DCCP]: Add support for abortive releaseGerrit Renker
This continues from the previous patch and adds support for actively aborting a DCCP connection, using a Reset Code 2, "Aborted" to inform the peer of an abortive release. I have tried this in various client/server settings and it works as expected. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[DCCP]: Check for unread data on closeGerrit Renker
This removes one FIXME with regard to close when there is still unread data. The mechanism is implemented similar to TCP: with regard to DCCP-specifics, a Reset with Code 2, "Aborted" is sent to the peer. This corresponds in part to RFC 4340, 8.1.1 and 8.1.5. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[CCID2]: Remove misleading commentGerrit Renker
This removes a comment which identifies an `issue' with dccp_write_xmit() where there is none. The comment assumes it is possible that a packet is sent between the calls to ccid_hc_tx_send_packet(), dccp_transmit_skb(), ccid_hc_tx_packet_sent() (in the above order) in dccp_write_xmit(). I think that this is impossible, since dccp_write_xmit() is always called under lock: * when called as dccp_write_xmit(sk, 1) from dccp_send_close(), the socket is locked (see code comment above dccp_send_close()); * when called as dccp_write_xmit(sk, 0) from dccp_send_msg(), it is after lock_sock() has been called; * when called as dccp_write_xmit(sk, 0) from dccp_write_xmit_timer(), bh_lock_sock() has been called and the if/else statement has made sure that sk_lock.owner is not set; * there are no other places where dccp_write_xmit() is called. Furthermore, the debug statement for printing the sequence number of the packet just sent has been removed, since the entire list is being printed anyway and so the entry of that number appears last. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[CCID2]: Remove redundant ack-counting variableGerrit Renker
The code used two different variables to count Acks, one of them redundant. This patch reduces the number of Ack counters to one. The type of the Ack counter has also been changed to u32 (twice the range of int); and the variable has been renamed into `packets_acked' - for consistency with RFC 3465 (and similarly named variables are used by TCP and SCTP). Lastly, a slightly less aggressive `maxincr' increment is used (for even Ack Ratios, maxincr was Ack Ratio/2 + 1 instead of Ack Ratio/2). Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[CCID2]: Remove redundant synchronisation variableGerrit Renker
This removes the synchronisation variable `ccid2hctx_sendwait', which is set to 1 when the CCID2 sender may send a new packet, and which is set to 0 otherwise The variable is redundant, since it is only used in combination with the hc_tx_send_packet/ hc_tx_packet_sent function pair. Both functions are called under socket lock, so the following happens when the CCID2 may send a new packet: * it sets sendwait = 1 in tx_send_packet and returns 0; * the subsequent call to tx_packet_sent clears the sendwait flag; * since tx_send_packet returns 0 if and only if sendwait == 1, the BUG_ON condition in tx_packet_sent is never satisfied, since that function is never called when tx_send_packet returns a value different from 0 (cf. dccp_write_xmit); * the call to tx_packet_sent clears the flag so that the condition "!sendwait" is true the next time tx_packet_sent is called. In other words, it is sufficient to just return 0 / not-0 to synchronise tx_send_packet and tx_packet_sent -- which is what the patch does. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[CCID2]: Redundant debugging outputGerrit Renker
This reduces the amount of redundant debugging messages: * pipe/cwnd are printed in both tx_send_packet() and tx_packet_sent(). Both functions are called immediately after one another, so one occurrence is sufficient. * Since tx_packet_sent() prints pipe/cwnd already, the second printk for pipe is redundant. * In tx_packet_sent() the check_sanity function is called twice (at the begin and at the end). Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[CCID2]: Replace pipe assignment-function with assignmentGerrit Renker
The function ccid2_change_pipe only does an assignment. This patch simplifies the code by replacing the function with the assignment it performs. Furthermore, the type of pipe is promoted from `signed' to unsigned (increasing the range). As a result, a BUG_ON test for negative values now becomes obsolete (for safety not removed, but replaced with a less annoying `DCCP_BUG'). Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[CCID2]: Replace cwnd assignment-function with assignmentGerrit Renker
The current function ccid2_change_cwnd in effect makes only an assignment, as the test whether cwnd has reached 0 is only required when cwnd is halved. This patch simplifies the code by replacing the function with the assignment it performs. Furthermore, since ssthresh derives from cwnd and appears in many assignments and comparisons, the type of ssthresh has also been changed to match that of cwnd. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[CCID2]: Replace read-only variable with constantGerrit Renker
This replaces the field member `numdupack', which was used as a read-only constant in the code, with a #define. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[CCID2]: Remove unused variableGerrit Renker
This removes a variable `ccid2hctx_sent' which is incremented but never referenced/read (i.e., dead code). Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[CCID2]: Disable broken Ack Ratio adaptation algorithmGerrit Renker
This comments out a problematic section comprising a half-finished algorithm: - The variable `ccid2hctx_ackloss' is never initialised to a value different from 0 and hence in fact is a read-only constant. - The `arsent' variable counts packets other than Acks (it is incremented for every packet), and there is no test for Ack Loss. - The concept of counting Acks as such leads to a complex calculation, and the calculation at the moment is inconsistent with this concept. The problem is that the number of Acks - rather than the number of windows - is counted, which leads to a complex (cubic/quadratic) expression - this is not even implemented. In its current state, the commented-out algorithm interfers with normal processing by changing Ack Ratio incorrectly, and at the wrong times. A new algorithm is necessary, which will not necessarily use the same variables as used by the unfinished one; hence the old variables have been removed. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[CCID2]: Larger initial windows also for CCID2Gerrit Renker
RFC 4341, sec. 5 states that "The cwnd parameter is initialized to at most four packets for new connections, following the rules from [RFC3390]", which is implemented by this patch. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[DCCP]: Initialize dccp_sock before calling the ccid constructorsArnaldo Carvalho de Melo
This is because in the next patch CCID2 will assume that dccps_mss_cache is non-zero. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[CCID2]: Deadlock and spurious timeouts when Ack Ratio > cwndGerrit Renker
This patch removes a bug in the current code. I agree with Andrea's comment that there is a problem here but the way it is treated does not fix it. The problem is that whenever Ack Ratio > cwnd, starvation/deadlock occurs: * the receiver will not send an Ack until (Ack Ratio - cwnd) data packets have arrived; * the sender will not send any data packet before the receipt of an Ack advances the send window. The only way that the connection then progresses was via RTO timeout. In one extreme case (bulk transfer), it was observed that this happened for every single packet; i.e. hundreds of packets, each a RTO timeout of 1..3 seconds apart: a transfer which normally would take a fraction of a second thus grew to several minutes. The solution taken by this approach is to observe the relation "Ack Ratio <= cwnd" by using the constraint (1) from RFC 4341, 6.1.2; i.e. set Ack Ratio = ceil(cwnd / 2) and update it whenever either Ack Ratio or cwnd change. This ensures that the deadlock problem can not arise. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[CCID2]: Don't assign negative values to Ack RatioGerrit Renker
Since it makes not sense to assign negative values to Ack Ratio, this patch disallows this possibility. As a consequence, a Bug test for negative Ack Ratio values becomes obsolete. Furthermore, a check against overflow (as Ack Ratio may not exceed 2 bytes, due to RFC 4340, 11.3) has been added. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[CCID2]: Fix sequence number arithmetic/comparisonsGerrit Renker
This replaces use of normal subtraction with modulo-48 subtraction. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[CCID2]: Bug in reading Ack VectorsGerrit Renker
In CCID2 the receiver-history is sorted in ascending order of sequence number, but the processing of received Ack Vectors requires the list traversal in the opposite direction. The current code has a bug in this regard: the list traversal is upwards. As a consequence, only Ack Vectors with a run length of 1 will pass, in all other Ack Vectors the remaining (acked) sequence numbers are missed, and may later falsely be identified as lost. Note: This bug is only visible when Ack Ratio > 1, since otherwise the run lengths of Ack Vectors are 0. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[ACKVEC]: Reduce length of identifiersGerrit Renker
This is reduces the length of the struct ackvec/ackvec_record fields. It is a purely text-based replacement: s#dccpavr_#avr_#g; s#dccpav_#av_#g; and increases readability somewhat. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[DCCP]: Ignore Ack Vectors / Elapsed Time on DCCP-Request alsoGerrit Renker
Small update with regard to RFC 4340 (references added as documentation): on Requests, Ack Vectors / Elapsed Time should be ignored. Length handling of Elapsed Time also simplified. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>