aboutsummaryrefslogtreecommitdiff
path: root/net/core
AgeCommit message (Collapse)Author
2009-04-21tun: fix tun_chr_aio_write so that aio worksMichael S. Tsirkin
aio_write gets const struct iovec * but tun_chr_aio_write casts this to struct iovec * and modifies the iovec. As a result, attempts to use io_submit to send packets to a tun device fail with weird errors such as EINVAL. Since tun is the only user of skb_copy_datagram_from_iovec, we can fix this simply by changing the later so that it does not touch the iovec passed to it. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-04-21net: skb_copy_datagram_const_iovec()Michael S. Tsirkin
There's an skb_copy_datagram_iovec() to copy out of a paged skb, but it modifies the iovec, and does not support starting at an offset in the destination. We want both in tun.c, so let's add the function. It's a carbon copy of skb_copy_datagram_iovec() with enough changes to be annoying. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-04-21Merge branch 'master' of ↵David S. Miller
master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: net/core/dev.c
2009-04-20net: Fix GRO for multiple page fragmentsBen Hutchings
This loop over fragments in napi_fraginfo_skb() was "interesting". Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-04-20net: fix "compatibility" typosMarcin Slusarz
Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-04-20net: sch_netem: Fix an inconsistency in ingress netem timestamps.Jarek Poplawski
Alex Sidorenko reported: "while experimenting with 'netem' we have found some strange behaviour. It seemed that ingress delay as measured by 'ping' command shows up on some hosts but not on others. After some investigation I have found that the problem is that skbuff->tstamp field value depends on whether there are any packet sniffers enabled. That is: - if any ptype_all handler is registered, the tstamp field is as expected - if there are no ptype_all handlers, the tstamp field does not show the delay" This patch prevents unnecessary update of tstamp in dev_queue_xmit_nit() on ingress path (with act_mirred) adding a check, so minimal overhead on the fast path, but only when sniffers etc. are active. Since netem at ingress seems to logically emulate a network before a host, tstamp is zeroed to trigger the update and pretend delays are from the outside. Reported-by: Alex Sidorenko <alexandre.sidorenko@hp.com> Tested-by: Alex Sidorenko <alexandre.sidorenko@hp.com> Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-04-16Merge branch 'master' of ↵David S. Miller
master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
2009-04-16gro: New frags interface to avoid copying shinfoHerbert Xu
It turns out that copying a 16-byte area at ~800k times a second can be really expensive :) This patch redesigns the frags GRO interface to avoid copying that area twice. The two disciples of the frags interface have been converted. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-04-14gro: Restore correct value to gso_sizeHerbert Xu
Since everybody has been focusing on baremetal GRO performance no one noticed when I added a bug that zapped gso_size for all GRO packets. This only gets picked up when you forward the skb out of an interface. Thanks to Mark Wagner for noticing this bug when testing kvm. Reported-by: Mark Wagner <mwagner@redhat.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-04-11net: netif_device_attach/detach should start/stop all queuesAlexander Duyck
Currently netif_device_attach/detach are only stopping one queue. They should be starting and stopping all the queues on a given device. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-04-02Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (54 commits) glge: remove unused #include <version.h> dnet: remove unused #include <version.h> tcp: miscounts due to tcp_fragment pcount reset tcp: add helper for counter tweaking due mid-wq change hso: fix for the 'invalid frame length' messages hso: fix for crash when unplugging the device fsl_pq_mdio: Fix compile failure fsl_pq_mdio: Revive UCC MDIO support ucc_geth: Pass proper device to DMA routines, otherwise oops happens i.MX31: Fixing cs89x0 network building to i.MX31ADS tc35815: Fix build error if NAPI enabled hso: add Vendor/Product ID's for new devices ucc_geth: Remove unused header gianfar: Remove unused header kaweth: Fix locking to be SMP-safe net: allow multiple dev per napi with GRO r8169: reset IntrStatus after chip reset ixgbe: Fix potential memory leak/driver panic issue while setting up Tx & Rx ring parameters ixgbe: fix ethtool -A|a behavior ixgbe: Patch to fix driver panic while freeing up tx & rx resources ...
2009-04-02net: allow multiple dev per napi with GROStephen Hemminger
GRO assumes that there is a one-to-one relationship between NAPI structure and network device. Some devices like sky2 share multiple devices on a single interrupt so only have one NAPI handler. Rather than split GRO from NAPI, just have GRO assume if device changes that it is a different flow. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-04-01epoll keyed wakeups: make sockets use keyed wakeupsDavide Libenzi
Add support for event-aware wakeups to the sockets code. Events are delivered to the wakeup target, so that epoll can avoid spurious wakeups for non-interesting events. Signed-off-by: Davide Libenzi <davidel@xmailserver.org> Acked-by: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Ingo Molnar <mingo@elte.hu> Cc: David Miller <davem@davemloft.net> Cc: William Lee Irwin III <wli@movementarian.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-03-31core: remove pointless conditional before kfree()Wei Yongjun
Remove pointless conditional before kfree(). Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-30Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: wireless: remove duplicated .ndo_set_mac_address netfilter: xtables: fix IPv6 dependency in the cluster match tg3: Add GRO support. niu: Add GRO support. ucc_geth: Fix use-after-of_node_put() in ucc_geth_probe(). gianfar: Fix use-after-of_node_put() in gfar_of_init(). kernel: remove HIPQUAD() netpoll: store local and remote ip in net-endian netfilter: fix endian bug in conntrack printks dmascc: fix incomplete conversion to network_device_ops gso: Fix support for linear packets skbuff.h: fix missing kernel-doc ni5010: convert to net_device_ops
2009-03-31proc 2/2: remove struct proc_dir_entry::ownerAlexey Dobriyan
Setting ->owner as done currently (pde->owner = THIS_MODULE) is racy as correctly noted at bug #12454. Someone can lookup entry with NULL ->owner, thus not pinning enything, and release it later resulting in module refcount underflow. We can keep ->owner and supply it at registration time like ->proc_fops and ->data. But this leaves ->owner as easy-manipulative field (just one C assignment) and somebody will forget to unpin previous/pin current module when switching ->owner. ->proc_fops is declared as "const" which should give some thoughts. ->read_proc/->write_proc were just fixed to not require ->owner for protection. rmmod'ed directories will be empty and return "." and ".." -- no harm. And directories with tricky enough readdir and lookup shouldn't be modular. We definitely don't want such modular code. Removing ->owner will also make PDE smaller. So, let's nuke it. Kudos to Jeff Layton for reminding about this, let's say, oversight. http://bugzilla.kernel.org/show_bug.cgi?id=12454 Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
2009-03-29Merge branch 'master' of /home/davem/src/GIT/linux-2.6/David S. Miller
2009-03-28netpoll: store local and remote ip in net-endianHarvey Harrison
Allows for the removal of byteswapping in some places and the removal of HIPQUAD (replaced by %pI4). Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-28gso: Fix support for linear packetsHerbert Xu
When GRO/frag_list support was added to GSO, I made an error which broke the support for segmenting linear GSO packets (GSO packets are normally non-linear in the payload). These days most of these packets are constructed by the tun driver, which prefers to allocate linear memory if possible. This is fixed in the latest kernel, but for 2.6.29 and earlier it is still the norm. Therefore this bug causes failures with GSO when used with tun in 2.6.29. Reported-by: James Huang <jamesclhuang@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-28Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (119 commits) [SCSI] scsi_dh_rdac: Retry for NOT_READY check condition [SCSI] mpt2sas: make global symbols unique [SCSI] sd: Make revalidate less chatty [SCSI] sd: Try READ CAPACITY 16 first for SBC-2 devices [SCSI] sd: Refactor sd_read_capacity() [SCSI] mpt2sas v00.100.11.15 [SCSI] mpt2sas: add MPT2SAS_MINOR(221) to miscdevice.h [SCSI] ch: Add scsi type modalias [SCSI] 3w-9xxx: add power management support [SCSI] bsg: add linux/types.h include to bsg.h [SCSI] cxgb3i: fix function descriptions [SCSI] libiscsi: fix possbile null ptr session command cleanup [SCSI] iscsi class: remove host no argument from session creation callout [SCSI] libiscsi: pass session failure a session struct [SCSI] iscsi lib: remove qdepth param from iscsi host allocation [SCSI] iscsi lib: have lib create work queue for transmitting IO [SCSI] iscsi class: fix lock dep warning on logout [SCSI] libiscsi: don't cap queue depth in iscsi modules [SCSI] iscsi_tcp: replace scsi_debug/tcp_debug logging with iscsi conn logging [SCSI] libiscsi_tcp: replace tcp_debug/scsi_debug logging with session/conn logging ...
2009-03-26GRO: Disable GRO on legacy netif_rx pathHerbert Xu
When I fixed the GRO crash in the legacy receive path I used napi_complete to replace __napi_complete. Unfortunately they're not the same when NETPOLL is enabled, which may result in us not calling __napi_complete at all. What's more, we really do need to keep the __napi_complete call within the IRQ-off section since in theory an IRQ can occur in between and fill up the backlog to the maximum, causing us to lock up. Since we can't seem to find a fix that works properly right now, this patch reverts all the GRO support from the netif_rx path. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-26net: core: remove unneeded include in net/core/utils.c.Rami Rosen
Signed-off-by: Rami Rosen <ramirose@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-21net: remove useless prefetch() callEric Dumazet
There is no gain using prefetch() in dev_hard_start_xmit(), since we already had to read ops->ndo_select_queue pointer in dev_pick_tx(), and both pointers are probably located in the same cache line. This prefetch call slows down fast path because of a stall in address computation. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-21skb: expose and constify hash primitivesStephen Hemminger
Some minor changes to queue hashing: 1. Use const on accessor functions 2. Export skb_tx_hash for use in drivers (see ixgbe) Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-20Merge branch 'master' of ↵David S. Miller
master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/virtio_net.c
2009-03-18net: kfree(napi->skb) => kfree_skbRoel Kluin
struct sk_buff pointers should be freed with kfree_skb. Signed-off-by: Roel Kluin <roel.kluin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-17Merge branch 'master' of ↵David S. Miller
master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/igb/igb_main.c drivers/net/qlge/qlge_main.c drivers/net/wireless/ath9k/ath9k.h drivers/net/wireless/ath9k/core.h drivers/net/wireless/ath9k/hw.c
2009-03-17gro: Fix legacy path napi_complete crashHerbert Xu
On the legacy netif_rx path, I incorrectly tried to optimise the napi_complete call by using __napi_complete before we reenable IRQs. This simply doesn't work since we need to flush the held GRO packets first. This patch fixes it by doing the obvious thing of reenabling IRQs first and then calling napi_complete. Reported-by: Frank Blaschka <blaschka@linux.vnet.ibm.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-16GRO: Move netpoll checks to correct locationHerbert Xu
As my netpoll fix for net doesn't really work for net-next, we need this update to move the checks into the right place. As it stands we may pass freed skbs to netpoll_receive_skb. This patch also introduces a netpoll_rx_on function to avoid GRO completely if we're invoked through netpoll. This might seem paranoid but as netpoll may have an external receive hook it's better to be safe than sorry. I don't think we need this for 2.6.29 though since there's nothing immediately broken by it. This patch also moves the GRO_* return values to netdevice.h since VLAN needs them too (I tried to avoid this originally but alas this seems to be the easiest way out). This fixes a bug in VLAN where it continued to use the old return value 2 instead of the correct GRO_DROP. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-13[SCSI] net: add NETIF_F_FCOE_CRC to can_checksum_protocolYi Zou
Add FC CRC offload check for ETH_P_FCOE. Signed-off-by: Yi Zou <yi.zou@intel.com> Acked-by: David Miller <davem@davemloft.net> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-03-13Network Drop Monitor: Adding Build changes to enable drop monitorNeil Horman
Network Drop Monitor: Adding Build changes to enable drop monitor Signed-off-by: Neil Horman <nhorman@tuxdriver.com> include/linux/Kbuild | 1 + net/Kconfig | 11 +++++++++++ net/core/Makefile | 1 + 3 files changed, 13 insertions(+) Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-13Network Drop Monitor: Adding drop monitor implementation & Netlink protocolNeil Horman
Signed-off-by: Neil Horman <nhorman@tuxdriver.com> include/linux/net_dropmon.h | 56 +++++++++ net/core/drop_monitor.c | 263 ++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 319 insertions(+) Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-13Network Drop Monitor: Adding kfree_skb_clean for non-drops and modifying ↵Neil Horman
end-of-line points for skbs Signed-off-by: Neil Horman <nhorman@tuxdriver.com> include/linux/skbuff.h | 4 +++- net/core/datagram.c | 2 +- net/core/skbuff.c | 22 ++++++++++++++++++++++ net/ipv4/arp.c | 2 +- net/ipv4/udp.c | 2 +- net/packet/af_packet.c | 2 +- 6 files changed, 29 insertions(+), 5 deletions(-) Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-13Network Drop Monitor: Add trace declaration for skb freesNeil Horman
Signed-off-by: Neil Horman <nhorman@tuxdriver.com> include/trace/skb.h | 8 ++++++++ net/core/Makefile | 2 ++ net/core/net-traces.c | 29 +++++++++++++++++++++++++++++ 3 files changed, 39 insertions(+) Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-10net: fix warning about non-const stringStephen Hemminger
Since dev_set_name takes a printf style string, new gcc complains if arg is not const. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-05Merge branch 'master' of ↵David S. Miller
master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/tokenring/tmspci.c drivers/net/ucc_geth_mii.c
2009-03-04vlan: Fix vlan-in-vlan crashes.David S. Miller
As analyzed by Patrick McHardy, vlan needs to reset it's netdev_ops pointer in it's ->init() function but this leaves the compat method pointers stale. Add a netdev_resync_ops() and call it from the vlan code. Any other driver which changes ->netdev_ops after register_netdevice() will need to call this new function after doing so too. With help from Patrick McHardy. Tested-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-04net: Fix missing dev->neigh_setup in register_netdevice().David S. Miller
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-04neigh: Allow for user space users of the neighbour tableEric Biederman
Currently it is possible to do just about everything with the arp table from user space except treat an entry like you are using it. To that end implement and a flag NTF_USE that when set in a netwlink update request treats the neighbour table entry like the kernel does on the output path. This allows user space applications to share the kernel's arp cache. Signed-off-by: Eric Biederman <ebiederm@aristanetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-03netns: Remove net_aliveEric W. Biederman
It turns out that net_alive is unnecessary, and the original problem that led to it being added was simply that the icmp code thought it was a network device and wound up being unable to handle packets while there were still packets in the network namespace. Now that icmp and tcp have been fixed to properly register themselves this problem is no longer present and we have a stronger guarantee that packets will not arrive in a network namespace then that provided by net_alive in netif_receive_skb. So remove net_alive allowing packet reception run a little faster. Additionally document the strong reason why network namespace cleanup is safe so that if something happens again someone else will have a chance of figuring it out. Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-03net: Avoid race between network down and sysfsStephen Hemminger
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-01Merge branch 'master' of ↵David S. Miller
master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/wireless/iwlwifi/iwl-tx.c net/8021q/vlan_core.c net/core/dev.c
2009-03-01netpoll: Add drop checks to all entry pointsHerbert Xu
The netpoll entry checks are required to ensure that we don't receive normal packets when invoked via netpoll. Unfortunately it only ever worked for the netif_receive_skb/netif_rx entry points. The VLAN (and subsequently GRO) entry point didn't have the check and therefore can trigger all sorts of weird problems. This patch adds the netpoll check to all entry points. I'm still uneasy with receiving at all under netpoll (which apparently is only used by the out-of-tree kdump code). The reason is it is perfectly legal to receive all data including headers into highmem if netpoll is off, but if you try to do that with netpoll on and someone gets a printk in an IRQ handler you're going to get a nice BUG_ON. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-26RDS: Add RDS to AF key stringsAndy Grover
Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-26sysctl: fix sparse warning: Should it be static?Hannes Eder
Impact: Include header file. Fix this sparse warning: net/core/sysctl_net_core.c:123:32: warning: symbol 'net_core_path' was not declared. Should it be static? Signed-off-by: Hannes Eder <hannes@hanneseder.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-26core: remove some pointless conditionals before kfree_skb()Wei Yongjun
Remove some pointless conditionals before kfree_skb(). Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-26pktgen: remove some pointless conditionals before kfree_skb()Wei Yongjun
Remove some pointless conditionals before kfree_skb(). Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-24netlink: change nlmsg_notify() return value logicPablo Neira Ayuso
This patch changes the return value of nlmsg_notify() as follows: If NETLINK_BROADCAST_ERROR is set by any of the listeners and an error in the delivery happened, return the broadcast error; else if there are no listeners apart from the socket that requested a change with the echo flag, return the result of the unicast notification. Thus, with this patch, the unicast notification is handled in the same way of a broadcast listener that has set the NETLINK_BROADCAST_ERROR socket flag. This patch is useful in case that the caller of nlmsg_notify() wants to know the result of the delivery of a netlink notification (including the broadcast delivery) and take any action in case that the delivery failed. For example, ctnetlink can drop packets if the event delivery failed to provide reliable logging and state-synchronization at the cost of dropping packets. This patch also modifies the rtnetlink code to ignore the return value of rtnl_notify() in all callers. The function rtnl_notify() (before this patch) returned the error of the unicast notification which makes rtnl_set_sk_err() reports errors to all listeners. This is not of any help since the origin of the change (the socket that requested the echoing) notices the ENOBUFS error if the notification fails and should resync itself. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Acked-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-24Merge branch 'master' of /home/davem/src/GIT/linux-2.6/David S. Miller
2009-02-23net: amend the fix for SO_BSDCOMPAT gsopt infoleakEugene Teo
The fix for CVE-2009-0676 (upstream commit df0bca04) is incomplete. Note that the same problem of leaking kernel memory will reappear if someone on some architecture uses struct timeval with some internal padding (for example tv_sec 64-bit and tv_usec 32-bit) --- then, you are going to leak the padded bytes to userspace. Signed-off-by: Eugene Teo <eugeneteo@kernel.sg> Reported-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>