aboutsummaryrefslogtreecommitdiff
path: root/net/core
AgeCommit message (Collapse)Author
2005-09-29Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds
2005-09-29[PATCH] proc_mkdir() should be used to create procfs directoriesAl Viro
A bunch of create_proc_dir_entry() calls creating directories had crept in since the last sweep; converted to proc_mkdir(). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-27[NET]: Fix module reference counts for loadable protocol modulesFrank Filz
I have been experimenting with loadable protocol modules, and ran into several issues with module reference counting. The first issue was that __module_get failed at the BUG_ON check at the top of the routine (checking that my module reference count was not zero) when I created the first socket. When sk_alloc() is called, my module reference count was still 0. When I looked at why sctp didn't have this problem, I discovered that sctp creates a control socket during module init (when the module ref count is not 0), which keeps the reference count non-zero. This section has been updated to address the point Stephen raised about checking the return value of try_module_get(). The next problem arose when my socket init routine returned an error. This resulted in my module reference count being decremented below 0. My socket ops->release routine was also being called. The issue here is that sock_release() calls the ops->release routine and decrements the ref count if sock->ops is not NULL. Since the socket probably didn't get correctly initialized, this should not be done, so we will set sock->ops to NULL because we will not call try_module_get(). While searching for another bug, I also noticed that sys_accept() has a possibility of doing a module_put() when it did not do an __module_get so I re-ordered the call to security_socket_accept(). Signed-off-by: Frank Filz <ffilzlnx@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-27[NET]: Prefetch dev->qdisc_lock in dev_queue_xmit()Eric Dumazet
We know the lock is going to be taken. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-27[NET]: Use non-recursive algorithm in skb_copy_datagram_iovec()Daniel Phillips
Use iteration instead of recursion. Fraglists within fraglists should never occur, so we BUG check this. Signed-off-by: Daniel Phillips <phillips@istop.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-27[NEIGH]: Add debugging check when adding timers.David S. Miller
If we double-add a neighbour entry timer, which should be impossible but has been reported, dump the current state of the entry so that we can debug this. Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-26Merge master.kernel.org:/pub/scm/linux/kernel/git/acme/llc-2.6David S. Miller
2005-09-24[NET]: Protect neigh_stat_seq_fops by CONFIG_PROC_FSAmos Waterland
From: Amos Waterland <apw@us.ibm.com> If CONFIG_PROC_FS is not selected, the compiler emits this warning: net/core/neighbour.c:64: warning: `neigh_stat_seq_fops' defined but not used Which is correct, because neigh_stat_seq_fops is in fact only initialized and used by code that is protected by CONFIG_PROC_FS. So this patch fixes that up. Signed-off-by: Amos Waterland <apw@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-22[LLC]: Fix for Bugzilla ticket #5156Jochen Friedrich
Signed-off-by: Jochen Friedrich <jochen@scram.de> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-09-12[NET]: fix-up schedule_timeout() usageNishanth Aravamudan
Use schedule_timeout_{,un}interruptible() instead of set_current_state()/schedule_timeout() to reduce kernel size. Also use human-time conversion functions instead of hard-coded division to avoid rounding issues. Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-09[PATCH] more SPIN_LOCK_UNLOCKED -> DEFINE_SPINLOCK conversionsIngo Molnar
This converts the final 20 DEFINE_SPINLOCK holdouts. (another 580 places are already using DEFINE_SPINLOCK). Build tested on x86. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09[PATCH] timer initialization cleanup: DEFINE_TIMERIngo Molnar
Clean up timer initialization by introducing DEFINE_TIMER a'la DEFINE_SPINLOCK. Build and boot-tested on x86. A similar patch has been been in the -RT tree for some time. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-07Merge branch 'upstream' of ↵Linus Torvalds
master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6
2005-09-06[NET]: proto_unregister: fix sleeping while atomicPatrick McHardy
proto_unregister holds a lock while calling kmem_cache_destroy, which can sleep. Noticed by Daniele Orlandi <daniele@orlandi.com>. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-06[PATCH] WE-19 for kernel 2.6.13Jean Tourrilhes
Hi Jeff, This is version 19 of the Wireless Extensions. It was supposed to be the fallback of the WPA API changes, but people seem quite happy about it (especially Jouni), so the patch is rather small. The patch has been fully tested with 2.6.13 and various wireless drivers, and is in its final version. Would you mind pushing that into Linus's kernel so that the driver and the apps can take advantage ot it ? It includes : o iwstat improvement (explicit dBm). This is the result of long discussions with Dan Williams, the authors of NetworkManager. Thanks to him for all the fruitful feedback. o remove pointer from event stream. I was not totally sure if this pointer was 32-64 bits clean, so I'd rather remove it and be at peace with it. o remove linux header from wireless.h. This has long been requested by people writting user space apps, now it's done, and it was not even painful. o final deprecation of spy_offset. You did not like it, it's now gone for good. o Start deprecating dev->get_wireless_stats -> debloat netdev o Add "check" version of event macros for ieee802.11 stack. Jiri Benc doesn't like the current macros, we aim to please ;-) All those changes, except the last one, have been bit-roting on my web pages for a while... Patches for most kernel drivers will follow. Patches for the Orinoco and the HostAP drivers have been sent to their respective maintainers. Have fun... Jean Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
2005-09-06[NET]: Make sure l_linger is unsigned to avoid negative timeoutsEric Dumazet
One of my x86_64 (linux 2.6.13) server log is filled with : schedule_timeout: wrong timeout value ffffffffffffff06 from ffffffff802e63ca schedule_timeout: wrong timeout value ffffffffffffff06 from ffffffff802e63ca schedule_timeout: wrong timeout value ffffffffffffff06 from ffffffff802e63ca schedule_timeout: wrong timeout value ffffffffffffff06 from ffffffff802e63ca schedule_timeout: wrong timeout value ffffffffffffff06 from ffffffff802e63ca This is because some application does a struct linger li; li.l_onoff = 1; li.l_linger = -1; setsockopt(sock, SOL_SOCKET, SO_LINGER, &li, sizeof(li)); And unfortunatly l_linger is defined as a 'signed int' in include/linux/socket.h: struct linger { int l_onoff; /* Linger active */ int l_linger; /* How long to linger for */ }; I dont know if it's safe to change l_linger to 'unsigned int' in the include file (It might be defined as int in ABI specs) Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-06Merge branch 'upstream' of ↵Linus Torvalds
master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6
2005-09-05[NET]: 2.6.13 breaks libpcap (and tcpdump)Herbert Xu
Patrick McHardy says: Never mind, I got it, we never fall through to the second switch statement anymore. I think we could simply break when load_pointer returns NULL. The switch statement will fall through to the default case and return 0 for all cases but 0 > k >= SKF_AD_OFF. Here's a patch to do just that. I left BPF_MSH alone because it's really a hack to calculate the IP header length, which makes no sense when applied to the special data. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-05[NET]: Do not protect sysctl_optmem_max with CONFIG_SYSCTLDavid S. Miller
The ipv4 and ipv6 protocols need to access it unconditionally. SYSCTL=n build failure reported by Russell King. Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-05[PATCH] (7/7) __user annotations (ethtool)viro@ftp.linux.org.uk
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
2005-08-29[NET]: use __read_mostly on kmem_cache_t , DEFINE_SNMP_STAT pointersEric Dumazet
This patch puts mostly read only data in the right section (read_mostly), to help sharing of these data between CPUS without memory ping pongs. On one of my production machine, tcp_statistics was sitting in a heavily modified cache line, so *every* SNMP update had to force a reload. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29[NET]: Add support for getting the permanent hardware address.Jon Wetzel
This patch adds a new field to net device to hold the permanent hardware address, and adds a new generic ethtool_op function to get that address. Signed-off-by: Jon Wetzel <jon_wetzel@dell.com> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29[NET]: Implement SKB fast cloning.David S. Miller
Protocols that make extensive use of SKB cloning, for example TCP, eat at least 2 allocations per packet sent as a result. To cut the kmalloc() count in half, we implement a pre-allocation scheme wherein we allocate 2 sk_buff objects in advance, then use a simple reference count to free up the memory at the correct time. Based upon an initial patch by Thomas Graf and suggestions from Herbert Xu. Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29[NET]: Fix sparse warningsArnaldo Carvalho de Melo
Of this type, mostly: CHECK net/ipv6/netfilter.c net/ipv6/netfilter.c:96:12: warning: symbol 'ipv6_netfilter_init' was not declared. Should it be static? net/ipv6/netfilter.c:101:6: warning: symbol 'ipv6_netfilter_fini' was not declared. Should it be static? Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29[NETLINK]: Add "groups" argument to netlink_kernel_createPatrick McHardy
Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29[NETLINK]: Convert netlink users to use group numbers instead of bitmasksPatrick McHardy
Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29[NET]: Store skb->timestamp as offset to a base timestampPatrick McHardy
Reduces skb size by 8 bytes on 64-bit. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29[NETFILTER]: split net/core/netfilter.c into net/netfilter/*.cHarald Welte
This patch doesn't introduce any code changes, but merely splits the core netfilter code into four separate files. It also moves it from it's old location in net/core/ to the recently-created net/netfilter/ directory. Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29[ICSK]: Introduce reqsk_queue_prune from code in tcp_synack_timerArnaldo Carvalho de Melo
With this we're very close to getting all of the current TCP refactorings in my dccp-2.6 tree merged, next changeset will export some functions needed by the current DCCP code and then dccp-2.6.git will be born! Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29[SOCK]: Introduce sk_cloneArnaldo Carvalho de Melo
Out of tcp_create_openreq_child, will be used in dccp_create_openreq_child, and is a nice sock function anyway. Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29[INET]: Generalise tcp_tw_bucket, aka TIME_WAIT socketsArnaldo Carvalho de Melo
This paves the way to generalise the rest of the sock ID lookup routines and saves some bytes in TCPv4 TIME_WAIT sockets on distro kernels (where IPv6 is always built as a module): [root@qemu ~]# grep tw_sock /proc/slabinfo tw_sock_TCPv6 0 0 128 31 1 tw_sock_TCP 0 0 96 41 1 [root@qemu ~]# Now if a protocol wants to use the TIME_WAIT generic infrastructure it only has to set the sk_prot->twsk_obj_size field with the size of its inet_timewait_sock derived sock and proto_register will create sk_prot->twsk_slab, for now its only for INET sockets, but we can introduce timewait_sock later if some non INET transport protocolo wants to use this stuff. Next changesets will take advantage of this new infrastructure to generalise even more TCP code. [acme@toy net-2.6.14]$ grep built-in /tmp/before.size /tmp/after.size /tmp/before.size: 188646 11764 5068 205478 322a6 net/ipv4/built-in.o /tmp/after.size: 188144 11764 5068 204976 320b0 net/ipv4/built-in.o [acme@toy net-2.6.14]$ Tested with both IPv4 & IPv6 (::1 (localhost) & ::ffff:172.20.0.1 (qemu host)). Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29[TCP]: Move the tcp sock states to net/tcp_states.hArnaldo Carvalho de Melo
Lots of places just needs the states, not even linux/tcp.h, where this enum was, needs it. This speeds up development of the refactorings as less sources are rebuilt when things get moved from net/tcp.h. Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29[NETFILTER]: Extend netfilter logging APIHarald Welte
This patch is in preparation to nfnetlink_log: - loggers now have to register struct nf_logger instead of nf_logfn - nf_log_unregister() replaced by nf_log_unregister_pf() and nf_log_unregister_logger() - add comment to ip[6]t_LOG.h to assure nobody redefines flags - add /proc/net/netfilter/nf_log to tell user which logger is currently registered for which address family - if user has configured logging, but no logging backend (logger) is available, always spit a message to syslog, not just the first time. - split ip[6]t_LOG.c into two parts: Backend: Always try to register as logger for the respective address family Frontend: Always log via nf_log_packet() API - modify all users of nf_log_packet() to accomodate additional argument Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29[NET]: Cleanup INET_REFCNT_DEBUG codeArnaldo Carvalho de Melo
Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29[NETFILTER]: Core changes required by upcoming nfnetlink_queue codeHarald Welte
- split netfiler verdict in 16bit verdict and 16bit queue number - add 'queuenum' argument to nf_queue_outfn_t and its users ip[6]_queue - move NFNL_SUBSYS_ definitions from enum to #define - introduce autoloading for nfnetlink subsystem modules - add MODULE_ALIAS_NFNL_SUBSYS macro - add nf_unregister_queue_handlers() to register all handlers for a given nf_queue_outfn_t - add more verbose DEBUGP macro definition to nfnetlink.c - make nfnetlink_subsys_register fail if subsys already exists - add some more comments and debug statements to nfnetlink.c Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29[NETFILTER]: Move reroute-after-queue code up to the nf_queue layer.Harald Welte
The rerouting functionality is required by the core, therefore it has to be implemented by the core and not in individual queue handlers. Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29[NETLINK]: Add properly module refcounting for kernel netlink sockets.Harald Welte
- Remove bogus code for compiling netlink as module - Add module refcounting support for modules implementing a netlink protocol - Add support for autoloading modules that implement a netlink protocol as soon as someone opens a socket for that protocol Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29[NETFILTER]: Move ipv4 specific code from net/core/netfilter.c to ↵Harald Welte
net/ipv4/netfilter.c Netfilter cleanup - Move ipv4 code from net/core/netfilter.c to net/ipv4/netfilter.c - Move ipv6 netfilter code from net/ipv6/ip6_output.c to net/ipv6/netfilter.c Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29[NETFILTER]: Rename skb_ip_make_writable() to skb_make_writable()Harald Welte
There is nothing IPv4-specific in it. In fact, it was already used by IPv6, too... Upcoming nfnetlink_queue code will use it for any kind of packet. Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29[NET]: Remove explicit initializations of skb->input_devDavid S. Miller
Instead, set it in one place, namely the beginning of netif_receive_skb(). Based upon suggestions from Jamal Hadi Salim. Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29[IPV4]: possible cleanupsAdrian Bunk
This patch contains the following possible cleanups: - make needlessly global code static - #if 0 the following unused global function: - xfrm4_state.c: xfrm4_state_fini - remove the following unneeded EXPORT_SYMBOL's: - ip_output.c: ip_finish_output - ip_output.c: sysctl_ip_default_ttl - fib_frontend.c: ip_dev_find - inetpeer.c: inet_peer_idlock - ip_options.c: ip_options_compile - ip_options.c: ip_options_undo - net/core/request_sock.c: sysctl_max_syn_backlog Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29[NET]: Kill skb->real_devDavid S. Miller
Bonding just wants the device before the skb_bond() decapsulation occurs, so simply pass that original device into packet_type->func() as an argument. It remains to be seen whether we can use this same exact thing to get rid of skb->input_dev as well. Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29[REQSK]: Move the syn_table destroy from tcp_listen_stop to reqsk_queue_destroyArnaldo Carvalho de Melo
Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29[NET]: Remove HIPPI private from skbuff.hStephen Hemminger
This removes the private element from skbuff, that is only used by HIPPI. Instead it uses skb->cb[] to hold the additional data that is needed in the output path from hard_header to device driver. PS: The only qdisc that might potentially corrupt this cb[] is if netem was used over HIPPI. I will take care of that by fixing netem to use skb->stamp. I don't expect many users of netem over HIPPI Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29[NET]: Introduce SO_{SND,RCV}BUFFORCE socket optionsPatrick McHardy
Allows overriding of sysctl_{wmem,rmrm}_max Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29[NET]: Kill skb->tc_classidPatrick McHardy
Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29[NET]: Kill skb->listDavid S. Miller
Remove the "list" member of struct sk_buff, as it is entirely redundant. All SKB list removal callers know which list the SKB is on, so storing this in sk_buff does nothing other than taking up some space. Two tricky bits were SCTP, which I took care of, and two ATM drivers which Francois Romieu <romieu@fr.zoreil.com> fixed up. Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
2005-08-29[NETFILTER]: reduce netfilter sk_buff enlargementHarald Welte
As discussed at netconf'05, we're trying to save every bit in sk_buff. The patch below makes sk_buff 8 bytes smaller. I did some basic testing on my notebook and it seems to work. The only real in-tree user of nfcache was IPVS, who only needs a single bit. Unfortunately I couldn't find some other free bit in sk_buff to stuff that bit into, so I introduced a separate field for them. Maybe the IPVS guys can resolve that to further save space. Initially I wanted to shrink pkt_type to three bits (PACKET_HOST and alike are only 6 values defined), but unfortunately the bluetooth code overloads pkt_type :( The conntrack-event-api (out-of-tree) uses nfcache, but Rusty just came up with a way how to do it without any skb fields, so it's safe to remove it. - remove all never-implemented 'nfcache' code - don't have ipvs code abuse 'nfcache' field. currently get's their own compile-conditional skb->ipvs_property field. IPVS maintainers can decide to move this bit elswhere, but nfcache needs to die. - remove skb->nfcache field to save 4 bytes - move skb->nfctinfo into three unused bits to save further 4 bytes Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-11[NETPOLL]: remove unused variableMatt Mackall
Remove unused variable Signed-off-by: Matt Mackall <mpm@selenic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-11[NETPOLL]: fix initialization/NAPI raceMatt Mackall
This fixes a race during initialization with the NAPI softirq processing by using an RCU approach. This race was discovered when refill_skbs() was added to the setup code. Signed-off-by: Matt Mackall <mpm@selenic.com> Signed-off-by: David S. Miller <davem@davemloft.net>