aboutsummaryrefslogtreecommitdiff
path: root/net/core/dev.c
AgeCommit message (Collapse)Author
2007-07-10[NET]: Fix secondary unicast/multicast address count maintenancePatrick McHardy
When a reference to an existing address is increased or decreased without hitting zero, the address count is incorrectly adjusted. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-10[CORE] Stack changes to add multiqueue hardware support APIPeter P Waskiewicz Jr
Add the multiqueue hardware device support API to the core network stack. Allow drivers to allocate multiple queues and manage them at the netdev level if they choose to do so. Added a new field to sk_buff, namely queue_mapping, for drivers to know which tx_ring to select based on OS classification of the flow. Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-10[NET]: Fix TX checksum feature checkHerbert Xu
This patch fixes a boolean error in the new TX checksum check that causes bogus TSO packets to be generated. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-10[NET]: dev: secondary unicast address supportPatrick McHardy
Add support for configuring secondary unicast addresses on network devices. To support this devices capable of filtering multiple unicast addresses need to change their set_multicast_list function to configure unicast filters as well and assign it to dev->set_rx_mode instead of dev->set_multicast_list. Other devices are put into promiscous mode when secondary unicast addresses are present. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-10[NET]: dev: introduce generic net_device address listsPatrick McHardy
Introduce struct dev_addr_list and list maintenance functions based on dev_mc_list and the related functions. This will be used by follow-up patches for both multicast and secondary unicast addresses. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-10[NET]: IPV6 checksum offloading in network devicesStephen Hemminger
The existing model for checksum offload does not correctly handle devices that can offload IPV4 and IPV6 only. The NETIF_F_HW_CSUM flag implies device can do any arbitrary protocol. This patch: * adds NETIF_F_IPV6_CSUM for those devices * fixes bnx2 and tg3 devices that need it * add NETIF_F_IPV6_CSUM to ipv6 output (incl GSO) * fixes assumptions about NETIF_F_ALL_CSUM in nat * adjusts bridge union of checksumming computation Signed-off-by: David S. Miller <davem@davemloft.net>
2007-06-23[NET]: Re-enable irqs before pushing pending DMA requestsShannon Nelson
This moves the local_irq_enable() call in net_rx_action() to before calling the CONFIG_NET_DMA's dma_async_memcpy_issue_pending() rather than after. This shortens the irq disabled window and allows for DMA drivers that need to do their own irq hold. Signed-off-by: Shannon Nelson <shannon.nelson@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-06-07[NET]: Avoid duplicate netlink notification when changing link stateThomas Graf
When changing the link state from userspace not affecting any other flags. Two duplicate notification are being sent, once as action in the NETDEV_UP/NETDEV_DOWN notification chain and a second time when comparing old and new device flags after the change has been completed. Although harmless, the duplicates should be avoided. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-19[NET]: Fix race condition about network device name allocation.Stephen Hemminger
Kenji Kaneshige found this race between device removal and registration. On unregister it is possible for the old device to exist, because sysfs file is still open. A new device with 'eth%d' will select the same name, but sysfs kobject register will fial. The following changes the shutdown order slightly. It hold a removes the sysfs entries earlier (on unregister_netdevice), but holds a kobject reference. Then when todo runs the actual last put free happens. Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-17[NET]: lockdep classes in register_netdeviceJarek Poplawski
After initializing dev->_xmit_lock register_netdevice() sets lockdep class according to dev->type. Idea of this patch - by David Miller. Reported & tested by: "Yuriy N. Shkandybin" <jura@netams.com> Signed-off-by: Jarek Poplawski <jarkao2@o2.pl> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-09Add suspend-related notifications for CPU hotplugRafael J. Wysocki
Since nonboot CPUs are now disabled after tasks and devices have been frozen and the CPU hotplug infrastructure is used for this purpose, we need special CPU hotplug notifications that will help the CPU-hotplug-aware subsystems distinguish normal CPU hotplug events from CPU hotplug events related to a system-wide suspend or resume operation in progress. This patch introduces such notifications and causes them to be used during suspend and resume transitions. It also changes all of the CPU-hotplug-aware subsystems to take these notifications into consideration (for now they are handled in the same way as the corresponding "normal" ones). [oleg@tv-sign.ru: cleanups] Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Cc: Gautham R Shenoy <ego@in.ibm.com> Cc: Pavel Machek <pavel@ucw.cz> Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-07[NET] net/core: Fix error handlingJosef 'Jeff' Sipek
Upon failure to register "ptype" procfs entry, "softnet_stat" was not removed, and an incorrect attempt was made to remove the "ptype" entry. Signed-off-by: Josef 'Jeff' Sipek <jsipek@cs.sunysb.edu> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-03[NET]: Rework dev_base via list_head (v3)Pavel Emelianov
Cleanup of dev_base list use, with the aim to simplify making device list per-namespace. In almost every occasion, use of dev_base variable and dev->next pointer could be easily replaced by for_each_netdev loop. A few most complicated places were converted to using first_netdev()/next_netdev(). Signed-off-by: Pavel Emelianov <xemul@openvz.org> Acked-by: Kirill Korotaev <dev@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-03[NET]: Add __dev_getfirstbyhwtypePatrick McHardy
Add __dev_getfirstbyhwtype for callers that don't want a reference but some data from the device and thus need to take the rtnl anyway. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-28[NET]: Remove NETIF_F_INTERNAL_STATS, default to internal stats.Rusty Russell
Herbert Xu conviced me that a new flag was overkill; every driver currently overrides get_stats, so we might as well make the internal one the default. If someone did fail to set get_stats, they would now get all 0 stats instead of "No statistics available". Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26[WEXT]: Clean up how wext is called.Johannes Berg
This patch cleans up the call paths from the core code into wext. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26[WEXT]: Move to net/wirelessJohannes Berg
This patch moves dev/core/wireless.c to net/wireless/wext.c. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25[NET]: Warn about GSO/checksum abuseHerbert Xu
Now that Patrick has added the code to deal with GSO in netfilter, we no longer need the crutch that computes partial checksums just before transmission. This patch turns this into a warning again. If this goes OK, we can then turn it into a BUG_ON and remove the gso_send_check cruft. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25[NET]: Prevent much sadness in qdisc_lock_tree().Andrew Morton
Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25[NET]: Fix comments for register_netdev().Borislav Petkov
Correct the function name in the comments supplied with register_netdev() Signed-off-by: Borislav Petkov <bbpetkov@yahoo.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25[NET]: Get rid of netdev_nitStephen Hemminger
It isn't any faster to test a boolean global variable than do a simple check for empty list. Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25[NET_SCHED]: ingress: switch back to using ingress_lockPatrick McHardy
Switch ingress queueing back to use ingress_lock. qdisc_lock_tree now locks both the ingress and egress qdiscs on the device. All changes to data that might be used on both ingress and egress needs to be protected by using qdisc_lock_tree instead of manually taking dev->queue_lock. Additionally the qdisc stats_lock needs to be initialized to ingress_lock for ingress qdiscs. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25bridge: eliminate call by referenceStephen Hemminger
Change the bridging hook to be simple function with return value rather than modifying the skb argument. This could generate better code and is cleaner. Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
2007-04-25[NET]: Use csum_start offset instead of skb_transport_headerHerbert Xu
The skb transport pointer is currently used to specify the start of the checksum region for transmit checksum offload. Unfortunately, the same pointer is also used during receive side processing. This creates a problem when we want to retransmit a received packet with partial checksums since the skb transport pointer would be overwritten. This patch solves this problem by creating a new 16-bit csum_start offset value to replace the skb transport header for the purpose of checksums. This offset is calculated from skb->head so that it does not have to change when skb->data changes. No extra space is required since csum_offset itself fits within a 16-bit word so we can use the other 16 bits for csum_start. For backwards compatibility, just before we push a packet with partial checksums off into the device driver, we set the skb transport header to what it would have been under the old scheme. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25[NET]: Inline net_device_statsRusty Russell
Network drivers which keep stats allocate their own stats structure then write a get_stats() function to return them. It would be nice if this were done by default. 1) Add a new "stats" field to "struct net_device". 2) Add a new feature field to say "this driver uses the internal one" 3) Have a default "get_stats" which returns NULL if that feature not set. 4) Change callers to check result of get_stats call for NULL, not if ->get_stats is set. This should not break backwards compatibility with older drivers, yet allow modern drivers to shed some boilerplate code. Lightly tested: works for a modified lguest network driver. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25[SK_BUFF]: Convert skb->tail to sk_buff_data_tArnaldo Carvalho de Melo
So that it is also an offset from skb->head, reduces its size from 8 to 4 bytes on 64bit architectures, allowing us to combine the 4 bytes hole left by the layer headers conversion, reducing struct sk_buff size to 256 bytes, i.e. 4 64byte cachelines, and since the sk_buff slab cache is SLAB_HWCACHE_ALIGN... :-) Many calculations that previously required that skb->{transport,network, mac}_header be first converted to a pointer now can be done directly, being meaningful as offsets or pointers. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25[SK_BUFF]: unions of just one member don't get anything done, kill themArnaldo Carvalho de Melo
Renaming skb->h to skb->transport_header, skb->nh to skb->network_header and skb->mac to skb->mac_header, to match the names of the associated helpers (skb[_[re]set]_{transport,network,mac}_header). Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25[SK_BUFF]: Introduce skb_transport_header(skb)Arnaldo Carvalho de Melo
For the places where we need a pointer to the transport header, it is still legal to touch skb->h.raw directly if just adding to, subtracting from or setting it to another layer header. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25[SK_BUFF]: Introduce skb_transport_offset()Arnaldo Carvalho de Melo
For the quite common 'skb->h.raw - skb->data' sequence. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25[SK_BUFF]: Introduce skb_reset_transport_header(skb)Arnaldo Carvalho de Melo
For the common, open coded 'skb->h.raw = skb->data' operation, so that we can later turn skb->h.raw into a offset, reducing the size of struct sk_buff in 64bit land while possibly keeping it as a pointer on 32bit. This one touches just the most simple cases: skb->h.raw = skb->data; skb->h.raw = {skb_push|[__]skb_pull}() The next ones will handle the slightly more "complex" cases. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25[NET]: show bound packet typesStephen Hemminger
Show what protocols are bound to what packet types in /proc/net/ptype Uses kallsyms to decode function pointers if possible. Example: Type Device Function ALL eth1 packet_rcv_spkt+0x0 0800 ip_rcv+0x0 0806 arp_rcv+0x0 86dd :ipv6:ipv6_rcv+0x0 Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25[NET]: make seq_operations constStephen Hemminger
The seq_file operations stuff can be marked constant to get it out of dirty cache. Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25[NET]: network dev read_mostlyStephen Hemminger
For Eric, mark packet type and network device watermarks as read mostly. Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25[SK_BUFF]: Introduce skb_network_header()Arnaldo Carvalho de Melo
For the places where we need a pointer to the network header, it is still legal to touch skb->nh.raw directly if just adding to, subtracting from or setting it to another layer header. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25[SK_BUFF]: Introduce skb_reset_network_header(skb)Arnaldo Carvalho de Melo
For the common, open coded 'skb->nh.raw = skb->data' operation, so that we can later turn skb->nh.raw into a offset, reducing the size of struct sk_buff in 64bit land while possibly keeping it as a pointer on 32bit. This one touches just the most simple case, next will handle the slightly more "complex" cases. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25[SK_BUFF]: Introduce skb_mac_header()Arnaldo Carvalho de Melo
For the places where we need a pointer to the mac header, it is still legal to touch skb->mac.raw directly if just adding to, subtracting from or setting it to another layer header. This one also converts some more cases to skb_reset_mac_header() that my regex missed as it had no spaces before nor after '=', ugh. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25[SK_BUFF]: Introduce skb_reset_mac_header(skb)Arnaldo Carvalho de Melo
For the common, open coded 'skb->mac.raw = skb->data' operation, so that we can later turn skb->mac.raw into a offset, reducing the size of struct sk_buff in 64bit land while possibly keeping it as a pointer on 32bit. This one touches just the most simple case, next will handle the slightly more "complex" cases. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25[NET]: deinline some functionsStephen Hemminger
Several functions are marked inline or forced inline, but it would be better to let the compiler decide. Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25[NET]: convert network timestamps to ktime_tEric Dumazet
We currently use a special structure (struct skb_timeval) and plain 'struct timeval' to store packet timestamps in sk_buffs and struct sock. This has some drawbacks : - Fixed resolution of micro second. - Waste of space on 64bit platforms where sizeof(struct timeval)=16 I suggest using ktime_t that is a nice abstraction of high resolution time services, currently capable of nanosecond resolution. As sizeof(ktime_t) is 8 bytes, using ktime_t in 'struct sock' permits a 8 byte shrink of this structure on 64bit architectures. Some other structures also benefit from this size reduction (struct ipq in ipv4/ip_fragment.c, struct frag_queue in ipv6/reassembly.c, ...) Once this ktime infrastructure adopted, we can more easily provide nanosecond resolution on top of it. (ioctl SIOCGSTAMPNS and/or SO_TIMESTAMPNS/SCM_TIMESTAMPNS) Note : this patch includes a bug correction in compat_sock_get_timestamp() where a "err = 0;" was missing (so this syscall returned -ENOENT instead of 0) Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> CC: Stephen Hemminger <shemminger@linux-foundation.org> CC: John find <linux.kernel@free.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-04[PATCH] net: Ignore sysfs network device rename bugs.Eric W. Biederman
The generic networking code ensures that no two networking devices have the same name, so there is no time except when sysfs has implementation bugs that device_rename when called from dev_change_name will fail. The current error handling for errors from device_rename in dev_change_name is wrong and results in an unusable and unrecoverable network device if device_rename is happens to return an error. This patch removes the buggy error handling. Which confines the mess when device_rename hits a problem to sysfs, instead of propagating it the rest of the network stack. Making linux a little more robust. Without this patch you can observe what happens when sysfs has a bug when CONFIG_SYSFS_DEPRECATED is not set and you attempt to rename a real network device to a name like (broken_parity_status, device, modalias, power, resource2, subsystem_vendor, class, driver, irq, msi_bus, resource, subsystem, uevent, config, enable, local_cpus, numa_node, resource0, subsystem_device, vendor) Greg has a patch that fixes the sysfs bugs but he doesn't trust it for a 2.6.21 timeframe. This patch which just ignores errors should be safe and it keeps the system from going completely wacky. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-29[IFB]: Fix crash on input device removalPatrick McHardy
The input_device pointer is not refcounted, which means the device may disappear while packets are queued, causing a crash when ifb passes packets with a stale skb->dev pointer to netif_rx(). Fix by storing the interface index instead and do a lookup where neccessary. Signed-off-by: Patrick McHardy <kaber@trash.net> Acked-by: Jamal Hadi Salim <hadi@cyberus.ca> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-03-25[NET_SCHED]: Fix ingress lockingPatrick McHardy
Ingress queueing uses a seperate lock for serializing enqueue operations, but fails to properly protect itself against concurrent changes to the qdisc tree. Use queue_lock for now since the real fix it quite intrusive. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-02-12[PATCH] mark struct file_operations const 7Arjan van de Ven
Many struct file_operations in the kernel can be "const". Marking them const moves these to the .rodata section, which avoids false sharing with potential dirty data. In addition it'll catch accidental writes at compile time to these shared resources. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-10[NET] CORE: Fix whitespace errors.YOSHIFUJI Hideaki
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-02-08[NET]: unregister_netdevice as voidStephen Hemminger
There was no real useful information from the unregister_netdevice() return code, the only error occurred in a situation that was a driver bug. So change it to a void function. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-02-07Network: convert network devices to use struct device instead of class_deviceGreg Kroah-Hartman
This lets the network core have the ability to handle suspend/resume issues, if it wants to. Thanks to Frederik Deweerdt <frederik.deweerdt@gmail.com> for the arm driver fixes. Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-12-07[PATCH] hotplug CPU: clean up hotcpu_notifier() useIngo Molnar
There was lots of #ifdef noise in the kernel due to hotcpu_notifier(fn, prio) not correctly marking 'fn' as used in the !HOTPLUG_CPU case, and thus generating compiler warnings of unused symbols, hence forcing people to add #ifdefs. the compiler can skip truly unused functions just fine: text data bss dec hex filename 1624412 728710 3674856 6027978 5bfaca vmlinux.before 1624412 728710 3674856 6027978 5bfaca vmlinux.after [akpm@osdl.org: topology.c fix] Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-02[NET]: Split skb->csumAl Viro
... into anonymous union of __wsum and __u32 (csum and csum_offset resp.) Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02[NET]: Annotate callers of csum_fold() in net/*Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02[NET] net/core: Annotations.Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>