aboutsummaryrefslogtreecommitdiff
path: root/net
AgeCommit message (Collapse)Author
2005-06-18[PKT_SCHED]: Add and use prio2list() in the pfifo_fast qdiscThomas Graf
prio2list() returns the relevant sk_buff_head for the band specified by the priority for a given skb. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-06-18[PKT_SCHED]: Transform pfifo_fast to use generic queue management interfaceThomas Graf
Gives pfifo_fast a byte based backlog. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-06-18[PKT_SCHED]: Cleanup fifo qdisc and remove unnecessary codeThomas Graf
Removes the skb trimming code which is not needed since we never touch the skb upon failure. Removes unnecessary includes, initializers, and simplifies the code a bit. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-06-18[PKT_SCHED]: Transform fifo qdisc to use generic queue management interfaceThomas Graf
The simplicity of the fifo qdisc allows several qdisc operations to be redirected to the relevant queue management function directly. Saves a lot of code lines and gives the pfifo a byte based backlog. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-06-18[SCTP]: Replace spin_lock_irqsave with spin_lock_bhHerbert Xu
This patch replaces the spin_lock_irqsave call on the receive queue lock in SCTP with spin_lock_bh. Despite the proliferation of spin_lock_irqsave calls in this stack, it is only entered from the IPv4/IPv6 stack and user space. That is, it is never entered from hardirq context. The call in question is only called from recvmsg which means that IRQs aren't disabled. Therefore it is safe to replace it with spin_lock_bh. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-06-18[IPV4/IPV6]: Replace spin_lock_irq with spin_lock_bhHerbert Xu
In light of my recent patch to net/ipv4/udp.c that replaced the spin_lock_irq calls on the receive queue lock with spin_lock_bh, here is a similar patch for all other occurences of spin_lock_irq on receive/error queue locks in IPv4 and IPv6. In these stacks, we know that they can only be entered from user or softirq context. Therefore it's safe to disable BH only. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-06-18[NETLINK]: Set correct pid for ioctl originating netlink eventsJamal Hadi Salim
This patch ensures that netlink events created as a result of programns using ioctls (such as ifconfig, route etc) contains the correct PID of those events. Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-06-18[NETLINK]: Explicit typingJamal Hadi Salim
This patch converts "unsigned flags" to use more explict types like u16 instead and incrementally introduces NLMSG_NEW(). Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-06-18[DECNET]: Remove unnecessary initilization of unused variable entriesThomas Graf
This patch was supposed to be part of the neighbour tables related patchset but apparently got lost. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-06-18[IPSEC]: Add XFRMA_SA/XFRMA_POLICY for delete notificationHerbert Xu
This patch changes the format of the XFRM_MSG_DELSA and XFRM_MSG_DELPOLICY notification so that the main message sent is of the same format as that received by the kernel if the original message was via netlink. This also means that we won't lose the byid information carried in km_event. Since this user interface is introduced by Jamal's patch we can still afford to change it. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-06-18[NETLINK]: Correctly set NLM_F_MULTI without checking the pidJamal Hadi Salim
This patch rectifies some rtnetlink message builders that derive the flags from the pid. It is now explicit like the other cases which get it right. Also fixes half a dozen dumpers which did not set NLM_F_MULTI at all. Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca> Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-06-18[NETLINK]: Introduce NLMSG_NEW macro to better handle netlink flagsThomas Graf
Introduces a new macro NLMSG_NEW which extends NLMSG_PUT but takes a flags argument. NLMSG_PUT stays there for compatibility but now calls NLMSG_NEW with flags == 0. NLMSG_PUT_ANSWER is renamed to NLMSG_NEW_ANSWER which now also takes a flags argument. Also converts the users of NLMSG_PUT_ANSWER to use NLMSG_NEW_ANSWER and fixes the two direct users of __nlmsg_put to either provide the flags or use NLMSG_NEW(_ANSWER). Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-06-18[PKT_SCHED]: Logic simplifications and codingstyle/whitespace cleanupsThomas Graf
Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-06-18[PKT_SCHED]: Make dsmark use the new dumping macrosThomas Graf
Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-06-18[PKT_SCHED]: Fix dsmark to apply changes consistentThomas Graf
Fixes dsmark to do all configuration sanity checks first and only apply the changes if all of them can be applied without any errors. Also fixes the weak sanity checks for DSMARK_VALUE and DSMASK_MASK. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-06-18[NEIGH]: Fix use of uninitialized variable when trimming in neightbl_fill_parmsThomas Graf
Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-06-18[NETLINK]: Kill bogus NLMSG_SET_MULTIPART uses.Thomas Graf
Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-06-18[NETLINK]: Neighbour table configuration and statistics via rtnetlinkThomas Graf
To retrieve the neighbour tables send RTM_GETNEIGHTBL with the NLM_F_DUMP flag set. Every neighbour table configuration is spread over multiple messages to avoid running into message size limits on systems with many interfaces. The first message in the sequence transports all not device specific data such as statistics, configuration, and the default parameter set. This message is followed by 0..n messages carrying device specific parameter sets. Although the ordering should be sufficient, NDTA_NAME can be used to identify sequences. The initial message can be identified by checking for NDTA_CONFIG. The device specific messages do not contain this TLV but have NDTPA_IFINDEX set to the corresponding interface index. To change neighbour table attributes, send RTM_SETNEIGHTBL with NDTA_NAME set. Changeable attribute include NDTA_THRESH[1-3], NDTA_GC_INTERVAL, and all TLVs in NDTA_PARMS unless marked otherwise. Device specific parameter sets can be changed by setting NDTPA_IFINDEX to the interface index of the corresponding device. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-06-18[NET]: Move sysctl_max_syn_backlog into request_sock.cDavid S. Miller
This fixes the CONFIG_INET=n build failure noticed by Andrew Morton. Signed-off-by: David S. Miller <davem@davemloft.net>
2005-06-18[NET] rename struct tcp_listen_opt to struct listen_sockArnaldo Carvalho de Melo
Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-06-18[NET] Generalise tcp_listen_optArnaldo Carvalho de Melo
This chunks out the accept_queue and tcp_listen_opt code and moves them to net/core/request_sock.c and include/net/request_sock.h, to make it useful for other transport protocols, DCCP being the first one to use it. Next patches will rename tcp_listen_opt to accept_sock and remove the inline tcp functions that just call a reqsk_queue_ function. Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-06-18[NET] Rename open_request to request_sockArnaldo Carvalho de Melo
Ok, this one just renames some stuff to have a better namespace and to dissassociate it from TCP: struct open_request -> struct request_sock tcp_openreq_alloc -> reqsk_alloc tcp_openreq_free -> reqsk_free tcp_openreq_fastfree -> __reqsk_free With this most of the infrastructure closely resembles a struct sock methods subset. Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-06-18[NET] Generalise TCP's struct open_request minisock infrastructureArnaldo Carvalho de Melo
Kept this first changeset minimal, without changing existing names to ease peer review. Basicaly tcp_openreq_alloc now receives the or_calltable, that in turn has two new members: ->slab, that replaces tcp_openreq_cachep ->obj_size, to inform the size of the openreq descendant for a specific protocol The protocol specific fields in struct open_request were moved to a class hierarchy, with the things that are common to all connection oriented PF_INET protocols in struct inet_request_sock, the TCP ones in tcp_request_sock, that is an inet_request_sock, that is an open_request. I.e. this uses the same approach used for the struct sock class hierarchy, with sk_prot indicating if the protocol wants to use the open_request infrastructure by filling in sk_prot->rsk_prot with an or_calltable. Results? Performance is improved and TCP v4 now uses only 64 bytes per open request minisock, down from 96 without this patch :-) Next changeset will rename some of the structs, fields and functions mentioned above, struct or_calltable is way unclear, better name it struct request_sock_ops, s/struct open_request/struct request_sock/g, etc. Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-06-18[IPSEC] Use NLMSG_LENGTH in xfrm_exp_state_notifyJamal Hadi Salim
Small fixup to use netlink macros instead of hardcoding. Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-06-18[IPSEC] Fix xfrm_state leaks in error pathPatrick McHardy
Herbert Xu wrote: > @@ -1254,6 +1326,7 @@ static int pfkey_add(struct sock *sk, st > if (IS_ERR(x)) > return PTR_ERR(x); > > + xfrm_state_hold(x); This introduces a leak when xfrm_state_add()/xfrm_state_update() fail. We hold two references (one from xfrm_state_alloc(), one from xfrm_state_hold()), but only drop one. We need to take the reference because the reference from xfrm_state_alloc() can be dropped by __xfrm_state_delete(), so the fix is to drop both references on error. Same problem in xfrm_user.c. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-06-18[IPSEC] Use XFRM_MSG_* instead of XFRM_SAP_*Herbert Xu
This patch removes XFRM_SAP_* and converts them over to XFRM_MSG_*. The netlink interface is meant to map directly onto the underlying xfrm subsystem. Therefore rather than using a new independent representation for the events we can simply use the existing ones from xfrm_user. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2005-06-18[IPSEC] Set byid for km_event in xfrm_get_policyHerbert Xu
This patch fixes policy deletion in xfrm_user so that it sets km_event.data.byid. This puts xfrm_user on par with what af_key does in this case. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2005-06-18[IPSEC] Turn km_event.data into a unionHerbert Xu
This patch turns km_event.data into a union. This makes code that uses it clearer. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2005-06-18[IPSEC] Fix xfrm to pfkey SA state conversionHerbert Xu
This patch adjusts the SA state conversion in af_key such that XFRM_STATE_ERROR/XFRM_STATE_DEAD will be converted to SADB_STATE_DEAD instead of SADB_STATE_DYING. According to RFC 2367, SADB_STATE_DYING SAs can be turned into mature ones through updating their lifetime settings. Since SAs which are in the states XFRM_STATE_ERROR/XFRM_STATE_DEAD cannot be resurrected, this value is unsuitable. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2005-06-18[IPSEC] Kill spurious hard expire messagesHerbert Xu
This patch ensures that the hard state/policy expire notifications are only sent when the state/policy is successfully removed from their respective tables. As it is, it's possible for a state/policy to both expire through reaching a hard limit, as well as being deleted by the user. Note that this behaviour isn't actually forbidden by RFC 2367. However, it is a quality of implementation issue. As an added bonus, the restructuring in this patch will help eventually in moving the expire notifications from softirq context into process context, thus improving their reliability. One important side-effect from this change is that SAs reaching their hard byte/packet limits are now deleted immediately, just like SAs that have reached their hard time limits. Previously they were announced immediately but only deleted after 30 seconds. This is bad because it prevents the system from issuing an ACQUIRE command until the existing state was deleted by the user or expires after the time is up. In the scenario where the expire notification was lost this introduces a 30 second delay into the system for no good reason. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2005-06-18[IPSEC] Add complete xfrm event notificationJamal Hadi Salim
Heres the final patch. What this patch provides - netlink xfrm events - ability to have events generated by netlink propagated to pfkey and vice versa. - fixes the acquire lets-be-happy-with-one-success issue Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2005-06-18Merge master.kernel.org:/pub/scm/linux/kernel/git/dwmw2/audit-2.6Linus Torvalds
2005-06-18Manual merge of ↵Linus Torvalds
rsync://rsync.kernel.org/pub/scm/linux/kernel/git/jgarzik/netdev-2.6.git This is a fixed-up version of the broken "upstream-2.6.13" branch, where I re-did the manual merge of drivers/net/r8169.c by hand, and made sure the history is all good.
2005-06-18Merge with master.kernel.org:/pub/scm/linux/kernel/git/torvalds/linux-2.6.gitDavid Woodhouse
2005-06-15[NETFILTER]: ipt_recent: last_pkts is an array of "unsigned long" not ↵David S. Miller
"u_int32_t" This fixes various crashes on 64-bit when using this module. Based upon a patch by Juergen Kreileder <jk@blackdown.de>. Signed-off-by: David S. Miller <davem@davemloft.net> ACKed-by: Patrick McHardy <kaber@trash.net>
2005-06-13[NETFILTER]: Advance seq-file position in exp_next_seq()Patrick McHardy
Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-06-13[IPV4]: Sysctl configurable icmp error source address.J. Simonetti
This patch alows you to change the source address of icmp error messages. It applies cleanly to 2.6.11.11 and retains the default behaviour. In the old (default) behaviour icmp error messages are sent with the ip of the exiting interface. The new behaviour (when the sysctl variable is toggled on), it will send the message with the ip of the interface that received the packet that caused the icmp error. This is the behaviour network administrators will expect from a router. It makes debugging complicated network layouts much easier. Also, all 'vendor routers' I know of have the later behaviour. Signed-off-by: David S. Miller <davem@davemloft.net>
2005-06-13[SCTP] Fix incorrect setting of sk_bound_dev_if when binding/sending to a ipv6Sridhar Samudrala
link local address. Signed-off-by: Sridhar Samudrala <sri@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-06-13[SCTP] Add support for ip_nonlocal_bind sysctl & IP_FREEBIND socket optionNeil Horman
Signed-off-by: Neil Horman <nhorman@redhat.com> Signed-off-by: Sridhar Samudrala <sri@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-06-13[SCTP] Extend the info exported via /proc/net/sctp to support netstat for SCTP.Vladislav Yasevich
Signed-off-by: Vladislav Yasevich <vladislav.yasevich@hp.com> Signed-off-by: Sridhar Samudrala <sri@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-06-13[SCTP] Support SO_BINDTODEVICE socket option on incoming packets.Neil Horman
Signed-off-by: Neil Horman <nhorman@redhat.com> Signed-off-by: Sridhar Samudrala <sri@us.ibm.com Signed-off-by: David S. Miller <davem@davemloft.net>
2005-06-13[SCTP]: Fix bug in restart of peeled-off associations.Vladislav Yasevich
Signed-off-by: Vladislav Yasevich <vladislav.yasevich@hp.com> Signed-off-by: Sridhar Samudrala <sri@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-06-13[IPv6] Don't generate temporary for TUN devicesRémi Denis-Courmont
Userland layer-2 tunneling devices allocated through the TUNTAP driver (drivers/net/tun.c) have a type of ARPHRD_NONE, and have no link-layer address. The kernel complains at regular interval when IPv6 Privacy extension are enabled because it can't find an hardware address : Dec 29 11:02:04 auguste kernel: __ipv6_regen_rndid(idev=cb3e0c00): cannot get EUI64 identifier; use random bytes. IPv6 Privacy extensions should probably be disabled on that sort of device. They won't work anyway. If userland wants a more usual Ethernet-ish interface with usual IPv6 autoconfiguration, it will use a TAP device with an emulated link-layer and a random hardware address rather than a TUN device. As far as I could fine, TUN virtual device from TUNTAP is the very only sort of device using ARPHRD_NONE as kernel device type. Signed-off-by: Rémi Denis-Courmont <rdenis@simphalempin.com> Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-06-13[IPV6]: Ensure to use icmpv6_socket in non-preemptive context.YOSHIFUJI Hideaki
We saw following trace several times: |BUG: using smp_processor_id() in preemptible [00000001] code: httpd/30137 |caller is icmpv6_send+0x23/0x540 | [<c01ad63b>] smp_processor_id+0x9b/0xb8 | [<c02993e7>] icmpv6_send+0x23/0x540 This is because of icmpv6_socket, which is the only one user of smp_processor_id() in icmpv6_send(), AFAIK. Since it should be used in non-preemptive context, let's defer the dereference after disabling preemption (by icmpv6_xmit_lock()). Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-06-13[NET]: Move the netdev list to vger.kernel.org.Ralf Baechle
From: Ralf Baechle <ralf@linux-mips.org> There are archives of the old list at http://oss.sgi.com/archives/netdev Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-06-13[IPV4]: Multipath modules need a license to prevent kernel tainting.Randy Dunlap
Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-06-13[TCP]: Adjust TCP mem order check to new alloc_large_system_hashAndi Kleen
Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-06-08[PKT_SCHED]: Fix numeric comparison in meta ematchThomas Graf
This patch is brought to you by the department of applied stupidity. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-06-08[PKT_SCHED]: Dump classification result for basic classifierThomas Graf
Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-06-08[PKT_SCHED]: Allow socket attributes to be matched on via meta ematchThomas Graf
Adds meta collectors for all socket attributes that make sense to be filtered upon. Some of them are only useful for debugging but having them doesn't hurt. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>