aboutsummaryrefslogtreecommitdiff
path: root/net
AgeCommit message (Collapse)Author
2008-10-08netfilter: netns: fix {ip,6}_route_me_harder() in netnsAlexey Dobriyan
Take netns from skb->dst->dev. It should be safe because, they are called from LOCAL_OUT hook where dst is valid (though, I'm not exactly sure about IPVS and queueing packets to userspace). [Patrick: its safe everywhere since they already expect skb->dst to be set] Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-10-08netfilter: netns nf_conntrack: per-netns conntrack hashAlexey Dobriyan
* make per-netns conntrack hash Other solution is to add ->ct_net pointer to tuplehashes and still has one hash, I tried that it's ugly and requires more code deep down in protocol modules et al. * propagate netns pointer to where needed, e. g. to conntrack iterators. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-10-08netfilter: netns nf_conntrack: per-netns conntrack countAlexey Dobriyan
Sysctls and proc files are stubbed to init_net's one. This is temporary. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-10-08netfilter: netns nf_conntrack: add ->ct_net -- pointer from conntrack to netnsAlexey Dobriyan
Conntrack (struct nf_conn) gets pointer to netns: ->ct_net -- netns in which it was created. It comes from netdevice. ->ct_net is write-once field. Every conntrack in system has ->ct_net initialized, no exceptions. ->ct_net doesn't pin netns: conntracks are recycled after timeouts and pinning background traffic will prevent netns from even starting shutdown sequence. Right now every conntrack is created in init_net. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-10-08netfilter: netns nf_conntrack: add netns boilerplateAlexey Dobriyan
One comment: #ifdefs around #include is necessary to overcome amazing compile breakages in NOTRACK-in-netns patch (see below). Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-10-08netfilter: netns: ip6t_REJECT in netns for realAlexey Dobriyan
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-10-08netfilter: netns: ip6table_mangle in netns for realAlexey Dobriyan
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-10-08netfilter: netns: ip6table_raw in netns for realAlexey Dobriyan
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-10-08netfilter: netns: remove nf_*_net() wrappersAlexey Dobriyan
Now that dev_net() exists, the usefullness of them is even less. Also they're a big problem in resolving circular header dependencies necessary for NOTRACK-in-netns patch. See below. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-10-08netfilter: implement NFPROTO_UNSPEC as a wildcard for extensionsJan Engelhardt
When a match or target is looked up using xt_find_{match,target}, Xtables will also search the NFPROTO_UNSPEC module list. This allows for protocol-independent extensions (like xt_time) to be reused from other components (e.g. arptables, ebtables). Extensions that take different codepaths depending on match->family or target->family of course cannot use NFPROTO_UNSPEC within the registration structure (e.g. xt_pkttype). Signed-off-by: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-10-08netfilter: x_tables: use NFPROTO_* in extensionsJan Engelhardt
Signed-off-by: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-10-08netfilter: Introduce NFPROTO_* constantsJan Engelhardt
The netfilter subsystem only supports a handful of protocols (much less than PF_*) and even non-PF protocols like ARP and pseudo-protocols like PF_BRIDGE. By creating NFPROTO_*, we can earn a few memory savings on arrays that previously were always PF_MAX-sized and keep the pseudo-protocols to ourselves. Signed-off-by: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-10-08netfilter: xt_recent: IPv6 supportJan Engelhardt
This updates xt_recent to support the IPv6 address family. The new /proc/net/xt_recent directory must be used for this. The old proc interface can also be configured out. Signed-off-by: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-10-08netfilter: rename ipt_recent to xt_recentJan Engelhardt
Like with other modules (such as ipt_state), ipt_recent.h is changed to forward definitions to (IOW include) xt_recent.h, and xt_recent.c is changed to use the new constant names. Signed-off-by: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-10-08netfilter: Use unsigned types for hooknum and pf varsJan Engelhardt
and (try to) consistently use u_int8_t for the L3 family. Signed-off-by: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-10-07netns: make uplitev6 mib per/namespaceDenis V. Lunev
Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-07netns: make udpv6 mib per/namespaceDenis V. Lunev
Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-07netns: add stub functions for per/namespace mibs allocationDenis V. Lunev
The content of init_ipv6_mibs/cleanup_ipv6_mibs will be moved to new calls one by one next. Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-07netns: allow per device ipv6 snmp statistics in non-initial namespaceDenis V. Lunev
Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-07netns: register global ipv6 mibs statistics in each namespaceDenis V. Lunev
Unused net variable will become used very soon. Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-07ipv6: separate seq_ops for global & per/device ipv6 statisticsDenis V. Lunev
idev has been stored on seq->private. NULL has been stored for global statistics. The situation is changed with net namespace. We need to store pointer to struct net and the only place is seq->private. So, we'll have for /proc/net/dev_snmp6/* and for /proc/net/snmp6 pointers of two different types stored in the same field. This effectively requires to separate seq_ops of these files. Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-07ipv6: consolidate ipv6 sock_stat code at the beginning of net/ipv6/proc.cDenis V. Lunev
Simple, comsolidate sockstat6 staff in one place, at the beginning of the file. Right now sockstat6_seq_open/sockstat6_seq_fops looks like an intrusion in the middle of snmp6 code. Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-07netns: register /proc/net/dev_snmp6/* in each nsDenis V. Lunev
Do the same for /proc/net/snmp6. Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-07netns: move /proc/net/dev_snmp6 to struct netDenis V. Lunev
Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-07tcp: cleanup messy initializerIlpo Järvinen
I'm quite sure that if I give this function in its old format for you to inspect, you start to wonder what is the type of demanded or if it's a global variable. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-07tcp: kill pointless urg_modeIlpo Järvinen
It all started from me noticing that this urgent check in tcp_clean_rtx_queue is unnecessarily inside the loop. Then I took a longer look to it and found out that the users of urg_mode can trivially do without, well almost, there was one gotcha. Bonus: those funny people who use urg with >= 2^31 write_seq - snd_una could now rejoice too (that's the only purpose for the between being there, otherwise a simple compare would have done the thing). Not that I assume that the rest of the tcp code happily lives with such mind-boggling numbers :-). Alas, it turned out to be impossible to set wmem to such numbers anyway, yes I really tried a big sendfile after setting some wmem but nothing happened :-). ...Tcp_wmem is int and so is sk_sndbuf... So I hacked a bit variable to long and found out that it seems to work... :-) Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-07net: packet split receive apiPeter Zijlstra
Add some packet-split receive hooks. For one this allows to do NUMA node affine page allocs. Later on these hooks will be extended to do emergency reserve allocations for fragments. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-07net: wrap sk->sk_backlog_rcv()Peter Zijlstra
Wrap calling sk->sk_backlog_rcv() in a function. This will allow extending the generic sk_backlog_rcv behaviour. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-07ipv6: initialize ip6_route sysctl vars in ip6_route_net_init()Peter Zijlstra
This makes that ip6_route_net_init() does all of the route init code. There used to be a race between ip6_route_net_init() and ip6_net_init() and someone relying on the combined result was left out cold. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-07ipv6: clean up ip6_route_net_init() error handlingPeter Zijlstra
ip6_route_net_init() error handling looked less than solid, fix 'er up. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-07inet: Don't lookup the socket if there's a socket attached to the skbKOVACS Krisztian
Use the socket cached in the skb if it's present. Signed-off-by: KOVACS Krisztian <hidden@sch.bme.hu> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-07inet: Add udplib_lookup_skb() helpersKOVACS Krisztian
To be able to use the cached socket reference in the skb during input processing we add a new set of lookup functions that receive the skb on their argument list. Signed-off-by: KOVACS Krisztian <hidden@sch.bme.hu> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-07inet_hashtables: Add inet_lookup_skb helpersArnaldo Carvalho de Melo
To be able to use the cached socket reference in the skb during input processing we add a new set of lookup functions that receive the skb on their argument list. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: KOVACS Krisztian <hidden@sch.bme.hu> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-06tcp: Respect SO_RCVLOWAT in tcp_poll().David S. Miller
Based upon a report by Vito Caputo. Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-06pkt_sched: Simplify dev_requeue_skb and dequeue_skbJarek Poplawski
qdisc->requeue was planned to universally replace all requeuing code, but at the top level we never requeue more than one skb, so qdisc-> gso_skb is enough for this. qdisc->requeue would be used on the lower levels only for one level deep requeuing (like in sch_hfsc) after finishing all the changes. Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-06pkt_sched: Fix handling of gso skbs on requeuingJarek Poplawski
Jay Cliburn noticed and diagnosed a bug triggered in dev_gso_skb_destructor() after last change from qdisc->gso_skb to qdisc->requeue list. Since gso_segmented skbs can't be queued to another list this patch brings back qdisc->gso_skb for them. Reported-by: Jay Cliburn <jcliburn@gmail.com> Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-05xfrm: MIGRATE enhancements (draft-ebalard-mext-pfkey-enhanced-migrate)Arnaud Ebalard
Provides implementation of the enhancements of XFRM/PF_KEY MIGRATE mechanism specified in draft-ebalard-mext-pfkey-enhanced-migrate-00. Defines associated PF_KEY SADB_X_EXT_KMADDRESS extension and XFRM/netlink XFRMA_KMADDRESS attribute. Signed-off-by: Arnaud Ebalard <arno@natisbad.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-05Phonet: implement GPRS virtual interface over PEP socketRémi Denis-Courmont
Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-05Phonet: receive pipe control requests as out-of-band dataRémi Denis-Courmont
Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-05Phonet: Pipe End Point for Phonet Pipes protocolRémi Denis-Courmont
This protocol provides some connection handling and negotiated congestion control. Nokia cellular modems use it for bulk transfers. It provides packet boundaries (hence SOCK_SEQPACKET). Congestion control is per packet rather per byte, so we do not re-use the generic socket memory accounting. Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-05Phonet: connected sockets glueRémi Denis-Courmont
Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-05Phonet: modules auto-loading supportRémi Denis-Courmont
Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-01sctp: correctly save sctp_adaptation from parameter.Vlad Yasevich
The INIT perameter carries the adapatation value in network-byte order. We need to store it in host byte order as expected by data types and the user API. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
2008-10-01sctp: enable cookie-echo retransmission transport switchVlad Yasevich
This patch enables cookie-echo retransmission transport switch feature. If COOKIE-ECHO retransmission happens, it will be sent to the address other than the one last sent to. Signed-off-by: Gui Jianfeng <guijianfeng@cn.fujitsu.com> Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
2008-10-01sctp: Fix the SNMP counter of SCTP_MIB_OUTOFBLUESWei Yongjun
RFC3873 defined SCTP_MIB_OUTOFBLUES: sctpOutOfBlues OBJECT-TYPE SYNTAX Counter32 MAX-ACCESS read-only STATUS current DESCRIPTION "The number of out of the blue packets received by the host. An out of the blue packet is an SCTP packet correctly formed, including the proper checksum, but for which the receiver was unable to identify an appropriate association." REFERENCE "Section 8.4 in RFC2960 deals with the Out-Of-The-Blue (OOTB) packet definition and procedures." But OOTB packet INIT, INIT-ACK and SHUTDOWN-ACK(COOKIE-WAIT or COOKIE-ECHOED state) are not counted by SCTP_MIB_OUTOFBLUES. Case 1(INIT): Endpoint A Endpoint B (CLOSED) (CLOSED) INIT ----------> <---------- ABORT Case 2(INIT-ACK): Endpoint A Endpoint B (CLOSED) (CLOSED) INIT-ACK ----------> <---------- ABORT Case 3(SHUTDOWN-ACK): Endpoint A Endpoint B (CLOSED) (CLOSED) <---------- INIT SHUTDOWN-ACK ----------> <---------- SHUTDOWN-COMPLETE Case 4(SHUTDOWN-ACK): Endpoint A Endpoint B (CLOSED) (COOKIE-ECHOED) SHUTDOWN-ACK ----------> <---------- SHUTDOWN-COMPLETE This patch fixed the problem. Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
2008-10-01sctp: Fix to start T5-shutdown-guard timer while enter SHUTDOWN-SENT stateWei Yongjun
RFC 4960: Section 9.2 The sender of the SHUTDOWN MAY also start an overall guard timer 'T5-shutdown-guard' to bound the overall time for the shutdown sequence. At the expiration of this timer, the sender SHOULD abort the association by sending an ABORT chunk. If the 'T5-shutdown- guard' timer is used, it SHOULD be set to the recommended value of 5 times 'RTO.Max'. The timer 'T5-shutdown-guard' is used to counter the overall time for shutdown sequence, and it's start by the sender of the SHUTDOWN. So timer 'T5-shutdown-guard' should be start when we send the first SHUTDOWN chunk and enter the SHUTDOWN-SENT state, not start when we receipt of the SHUTDOWN primitive and enter SHUTDOWN-PENDING state. If 'T5-shutdown-guard' timer is start at SHUTDOWN-PENDING state, the association may be ABORT while data is still transmitting. Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
2008-10-01sctp: try harder to figure out address family when checking wildcardsVlad Yasevich
sctp_is_any() function that is used to check for wildcard addresses only looks at the address itself to determine the address family. This function is used in the API to check the address passed in from the user. If the user simply zerroes out the sockaddr_storage and pass that in, we'll end up failing. So, let's try harder to determine the address family by also checking the socket if it's possible. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
2008-10-01sctp: reduce memory footprint of sctp_chunk structureNeil Horman
sctp_chunks should be put on a diet. This is some of the low hanging fruit that we can strip out. Changes all the __s8/__u8 flags to bitfields. Saves 12 bytes per chunk. Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
2008-10-01sctp: Retransmit list is ineligable for missing indicationsVlad Yasevich
Chunks placed on the retransmit list are marked as inelegible for fast retrasnmission. Since missing indications determine when fast reransmission is done, there is not point in calling sctp_mark_missing() on the retransmit list since those chunks will not be marked. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
2008-10-01sctp: Optimize SFR-CACC transport list walking during SACK processingVlad Yasevich
There is a possibility of walking the transport list twice during SACK processing when doing SFR-CACC algorithm. We can restructure the code to only do this once. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>