aboutsummaryrefslogtreecommitdiff
path: root/net
AgeCommit message (Collapse)Author
2009-10-07dccp ccid-2: Remove CCID naming redundancy 1/2Gerrit Renker
This removes a redundancy in the CCID half-connection (hc) naming scheme: * instead of 'hctx->tx_...', write 'hc->tx_...'; * instead of 'hcrx->rx_...', write 'hc->rx_...'; which works because the 'type' of the half-connection is encoded in the 'rx_' / 'tx_' prefixes. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-07dccp ccid-3: Overhaul CCID naming convention 2/2Gerrit Renker
This implements the new naming scheme also for CCID-3. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-07dccp ccid-2: Overhaul CCID naming convention 1/2Gerrit Renker
This patch starts a less problematic naming convention for CCID structs. The old naming convention used 'hc{tx,rx}->ccid?hc{tx,rx}->...' as recurring prefixes, which made the code * hard to write (not easy to fit into 80 characters); * hard to read (most of the space is occupied by prefixes). The new naming scheme: * struct entries for the TX socket are prefixed by 'tx_'; * and those for the RX socket are prefixed by 'rx_'. The identifiers then remain distinguishable when grep-ing through the tree: (a) RX/TX sockets are distinguished by the naming scheme, (b) individual CCIDs are distinguished by filename (ccid{2,3,4}.{c,h}). This first patch implements the scheme for CCID-2. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-07net/wireless/ethtool.h: drop unnecessary include of linux/ethtool.hJohn W. Linville
Everything including this header includes net/cfg80211.h, which includes linux/netdevice.h, which includes linux/ethtool.h already. Why slow-down the build, even a little bit? Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-10-07mac80211: support ETHTOOL_GPERMADDRJohn W. Linville
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-10-07cfg80211: add firmware and hardware version to wiphyKalle Valo
It's useful to provide firmware and hardware version to user space and have a generic interface to retrieve them. Users can provide the version information in bug reports etc. Add fields for firmware and hardware version to struct wiphy. (Dropped nl80211 bits for now and modified remaining bits in favor of ethtool. -- JWL) Cc: Kalle Valo <kalle.valo@nokia.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-10-07wireless: implement basic ethtool support for cfg80211 devicesJohn W. Linville
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-10-07wext: refactorJohannes Berg
Refactor wext to * split out iwpriv handling * split out iwspy handling * split out procfs support * allow cfg80211 to have wireless extensions compat code w/o CONFIG_WIRELESS_EXT After this, drivers need to - select WIRELESS_EXT - for wext support - select WEXT_PRIV - for iwpriv support - select WEXT_SPY - for iwspy support except cfg80211 -- which gets new hooks in wext-core.c and can then get wext handlers without CONFIG_WIRELESS_EXT. Wireless extensions procfs support is auto-selected based on PROC_FS and anything that requires the wext core (i.e. WIRELESS_EXT or CFG80211_WEXT). Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-10-07nl80211: report age of scan resultsHolger Schurig
Linux keeps scan results up to 15 seconds. This can be a problem for fast moving clients: they get back stale data. But if the kernel reports the age of the BSS items, then user-space can simply weed out old entries by itself. Signed-off-by: Holger Schurig <hs4233@mail.mn-solutions.de> Acked-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-10-07mac80211: use kfree_skb() to free struct sk_buff pointersRoel Kluin
kfree_skb() should be used to free struct sk_buff pointers. Signed-off-by: Roel Kluin <roel.kluin@gmail.com> Acked-by: Johannes Berg <johannes@sipsolutions.net> Cc: stable@kernel.org Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-10-07mac80211: fix vlan and optimise RXJohannes Berg
When receiving data frames, we can send them only to the interface they belong to based on transmitting station (this doesn't work for probe requests). Also, don't try to handle other frames for AP_VLAN at all since those interface should only receive data. Additionally, the transmit side must check that the station we're sending a frame to is actually on the interface we're transmitting on, and not transmit packets to functions that live on other interfaces, so validate that as well. Another bug fix is needed in sta_info.c where in the VLAN case when adding/removing stations we overwrite the sdata variable we still need. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Cc: stable@kernel.org Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-10-07ipv4: arp_notify address list bugStephen Hemminger
This fixes a bug with arp_notify. If arp_notify is enabled, kernel will crash if address is changed and no IP address is assigned. http://bugzilla.kernel.org/show_bug.cgi?id=14330 Reported-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-07net: mark net_proto_ops as constStephen Hemminger
All usages of structure net_proto_ops should be declared const. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-07make TLLAO option for NA packets configurableOctavian Purdila
On Friday 02 October 2009 20:53:51 you wrote: > This is good although I would have shortened the name. Ah, I knew I forgot something :) Here is v4. tavi >From 24d96d825b9fa832b22878cc6c990d5711968734 Mon Sep 17 00:00:00 2001 From: Octavian Purdila <opurdila@ixiacom.com> Date: Fri, 2 Oct 2009 00:51:15 +0300 Subject: [PATCH] ipv6: new sysctl for sending TLLAO with unicast NAs Neighbor advertisements responding to unicast neighbor solicitations did not include the target link-layer address option. This patch adds a new sysctl option (disabled by default) which controls whether this option should be sent even with unicast NAs. The need for this arose because certain routers expect the TLLAO in some situations even as a response to unicast NS packets. Moreover, RFC 2461 recommends sending this to avoid a race condition (section 4.4, Target link-layer address) Signed-off-by: Cosmin Ratiu <cratiu@ixiacom.com> Signed-off-by: Octavian Purdila <opurdila@ixiacom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-07Use sk_mark for IPv6 routing lookupsBrian Haley
Atis Elsts wrote: > Not sure if there is need to fill the mark from skb in tunnel xmit functions. In any case, it's not done for GRE or IPIP tunnels at the moment. Ok, I'll just drop that part, I'm not sure what should be done in this case. > Also, in this patch you are doing that for SIT (v6-in-v4) tunnels only, and not doing it for v4-in-v6 or v6-in-v6 tunnels. Any reason for that? I just sent that patch out too quickly, here's a better one with the updates. Add support for IPv6 route lookups using sk_mark. Signed-off-by: Brian Haley <brian.haley@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-07ethtool: Add reset operationBen Hutchings
After updating firmware stored in flash, users may wish to reset the relevant hardware and start the new firmware immediately. This should not be completely automatic as it may be disruptive. A selective reset may also be useful for debugging or diagnostics. This adds a separate reset operation which takes flags indicating the components to be reset. Drivers are allowed to reset only a subset of those requested, and must indicate the actual subset. This allows the use of generic component masks and some future expansion. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-07pkt_sched: gen_estimator: Dont report fake rate estimatorsEric Dumazet
Jarek Poplawski a écrit : > > > Hmm... So you made me to do some "real" work here, and guess what?: > there is one serious checkpatch warning! ;-) Plus, this new parameter > should be added to the function description. Otherwise: > Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> > > Thanks, > Jarek P. > > PS: I guess full "Don't" would show we really mean it... Okay :) Here is the last round, before the night ! Thanks again [RFC] pkt_sched: gen_estimator: Don't report fake rate estimators We currently send TCA_STATS_RATE_EST elements to netlink users, even if no estimator is running. # tc -s -d qdisc qdisc pfifo_fast 0: dev eth0 root bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1 Sent 112833764978 bytes 1495081739 pkt (dropped 0, overlimits 0 requeues 0) rate 0bit 0pps backlog 0b 0p requeues 0 User has no way to tell if the "rate 0bit 0pps" is a real estimation, or a fake one (because no estimator is active) After this patch, tc command output is : $ tc -s -d qdisc qdisc pfifo_fast 0: dev eth0 root bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1 Sent 561075 bytes 1196 pkt (dropped 0, overlimits 0 requeues 0) backlog 0b 0p requeues 0 We add a parameter to gnet_stats_copy_rate_est() function so that it can use gen_estimator_active(bstats, r), as suggested by Jarek. This parameter can be NULL if check is not necessary, (htb for example has a mandatory rate estimator) Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-07Use sk_mark for routing lookup in more placesEric Dumazet
Here is a followup on this area, thanks. [RFC] af_packet: fill skb->mark at xmit skb->mark may be used by classifiers, so fill it in case user set a SO_MARK option on socket. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-07ipv6 sit: 6rd (IPv6 Rapid Deployment) Support.YOSHIFUJI Hideaki / 吉藤英明
IPv6 Rapid Deployment (6rd; draft-ietf-softwire-ipv6-6rd) builds upon mechanisms of 6to4 (RFC3056) to enable a service provider to rapidly deploy IPv6 unicast service to IPv4 sites to which it provides customer premise equipment. Like 6to4, it utilizes stateless IPv6 in IPv4 encapsulation in order to transit IPv4-only network infrastructure. Unlike 6to4, a 6rd service provider uses an IPv6 prefix of its own in place of the fixed 6to4 prefix. With this option enabled, the SIT driver offers 6rd functionality by providing additional ioctl API to configure the IPv6 Prefix for in stead of static 2002::/16 for 6to4. Original patch was done by Alexandre Cassen <acassen@freebox.fr> based on old Internet-Draft. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-07add vif using local interface index instead of IPIlia K
When routing daemon wants to enable forwarding of multicast traffic it performs something like: struct vifctl vc = { .vifc_vifi = 1, .vifc_flags = 0, .vifc_threshold = 1, .vifc_rate_limit = 0, .vifc_lcl_addr = ip, /* <--- ip address of physical interface, e.g. eth0 */ .vifc_rmt_addr.s_addr = htonl(INADDR_ANY), }; setsockopt(fd, IPPROTO_IP, MRT_ADD_VIF, &vc, sizeof(vc)); This leads (in the kernel) to calling vif_add() function call which search the (physical) device using assigned IP address: dev = ip_dev_find(net, vifc->vifc_lcl_addr.s_addr); The current API (struct vifctl) does not allow to specify an interface other way than using it's IP, and if there are more than a single interface with specified IP only the first one will be found. The attached patch (against 2.6.30.4) allows to specify an interface by its index, instead of IP address: struct vifctl vc = { .vifc_vifi = 1, .vifc_flags = VIFF_USE_IFINDEX, /* NEW */ .vifc_threshold = 1, .vifc_rate_limit = 0, .vifc_lcl_ifindex = if_nametoindex("eth0"), /* NEW */ .vifc_rmt_addr.s_addr = htonl(INADDR_ANY), }; setsockopt(fd, IPPROTO_IP, MRT_ADD_VIF, &vc, sizeof(vc)); Signed-off-by: Ilia K. <mail4ilia@gmail.com> === modified file 'include/linux/mroute.h' Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-06Merge branch 'master' of ↵David S. Miller
master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
2009-10-06net: speedup sk_wake_async()Eric Dumazet
An incoming datagram must bring into cpu cache *lot* of cache lines, in particular : (other parts omitted (hash chains, ip route cache...)) On 32bit arches : offsetof(struct sock, sk_rcvbuf) =0x30 (read) offsetof(struct sock, sk_lock) =0x34 (rw) offsetof(struct sock, sk_sleep) =0x50 (read) offsetof(struct sock, sk_rmem_alloc) =0x64 (rw) offsetof(struct sock, sk_receive_queue)=0x74 (rw) offsetof(struct sock, sk_forward_alloc)=0x98 (rw) offsetof(struct sock, sk_callback_lock)=0xcc (rw) offsetof(struct sock, sk_drops) =0xd8 (read if we add dropcount support, rw if frame dropped) offsetof(struct sock, sk_filter) =0xf8 (read) offsetof(struct sock, sk_socket) =0x138 (read) offsetof(struct sock, sk_data_ready) =0x15c (read) We can avoid sk->sk_socket and socket->fasync_list referencing on sockets with no fasync() structures. (socket->fasync_list ptr is probably already in cache because it shares a cache line with socket->wait, ie location pointed by sk->sk_sleep) This avoids one cache line load per incoming packet for common cases (no fasync()) We can leave (or even move in a future patch) sk->sk_socket in a cold location Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-05wext: let get_wireless_stats() sleepJohannes Berg
A number of drivers (recently including cfg80211-based ones) assume that all wireless handlers, including statistics, can sleep and they often also implicitly assume that the rtnl is held around their invocation. This is almost always true now except when reading from sysfs: BUG: sleeping function called from invalid context at kernel/mutex.c:280 in_atomic(): 1, irqs_disabled(): 0, pid: 10450, name: head 2 locks held by head/10450: #0: (&buffer->mutex){+.+.+.}, at: [<c10ceb99>] sysfs_read_file+0x24/0xf4 #1: (dev_base_lock){++.?..}, at: [<c12844ee>] wireless_show+0x1a/0x4c Pid: 10450, comm: head Not tainted 2.6.32-rc3 #1 Call Trace: [<c102301c>] __might_sleep+0xf0/0xf7 [<c1324355>] mutex_lock_nested+0x1a/0x33 [<f8cea53b>] wdev_lock+0xd/0xf [cfg80211] [<f8cea58f>] cfg80211_wireless_stats+0x45/0x12d [cfg80211] [<c13118d6>] get_wireless_stats+0x16/0x1c [<c12844fe>] wireless_show+0x2a/0x4c Fix this by using the rtnl instead of dev_base_lock. Reported-by: Miles Lane <miles.lane@gmail.com> Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-05net: export device speed and duplex via sysfsAndy Gospodarek
This patch exports the link-speed (in Mbps) and duplex of an interface via sysfs. This eliminates the need to use ethtool just to check the link-speed. Not requiring 'ethtool' and not relying on the SIOCETHTOOL ioctl should be helpful in an embedded environment where space is at a premium as well. NOTE: This patch also intentionally allows non-root users to check the link speed and duplex -- something not possible with ethtool. Here's some sample output: # cat /sys/class/net/eth0/speed 100 # cat /sys/class/net/eth0/duplex half # ethtool eth0 Settings for eth0: Supported ports: [ TP ] Supported link modes: 10baseT/Half 10baseT/Full 100baseT/Half 100baseT/Full 1000baseT/Half 1000baseT/Full Supports auto-negotiation: Yes Advertised link modes: Not reported Advertised auto-negotiation: No Speed: 100Mb/s Duplex: Half Port: Twisted Pair PHYAD: 1 Transceiver: internal Auto-negotiation: off Supports Wake-on: g Wake-on: g Current message level: 0x000000ff (255) Link detected: yes Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-05cfg80211: assign device type in netdev notifier callbackMarcel Holtmann
Instead of having to modify every non-mac80211 for device type assignment, do this inside the netdev notifier callback of cfg80211. So all drivers that integrate with cfg80211 will export a proper device type. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-05net: introduce NETDEV_POST_INIT notifierJohannes Berg
For various purposes including a wireless extensions bugfix, we need to hook into the netdev creation before before netdev_register_kobject(). This will also ease doing the dev type assignment that Marcel was working on for cfg80211 drivers w/o touching them all. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-05tunnels: Optimize tx pathEric Dumazet
We currently dirty a cache line to update tunnel device stats (tx_packets/tx_bytes). We better use the txq->tx_bytes/tx_packets counters that already are present in cpu cache, in the cache line shared with txq->_xmit_lock This patch extends IPTUNNEL_XMIT() macro to use txq pointer provided by the caller. Also &tunnel->dev->stats can be replaced by &dev->stats Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-05ipv4: fib table algorithm performance improvementStephen Hemminger
The FIB algorithim for IPV4 is set at compile time, but kernel goes through the overhead of function call indirection at runtime. Save some cycles by turning the indirect calls to direct calls to either hash or trie code. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-05af_packet: add interframe drop cmsg (v6)Neil Horman
Add Ancilliary data to better represent loss information I've had a few requests recently to provide more detail regarding frame loss during an AF_PACKET packet capture session. Specifically the requestors want to see where in a packet sequence frames were lost, i.e. they want to see that 40 frames were lost between frames 302 and 303 in a packet capture file. In order to do this we need: 1) The kernel to export this data to user space 2) The applications to make use of it This patch addresses item (1). It does this by doing the following: A) Anytime we drop a frame for which we would increment po->stats.tp_drops, we also no increment a stats called po->stats.tp_gap. B) Every time we successfully enqueue a frame to sk_receive_queue, we record the value of po->stats.tp_gap in skb->mark. skb->cb would nominally be the place to record this, but since all the space there is used up, we're overloading skb->mark. Its safe to do since any enqueued packet is guaranteed to be unshared at this point, and skb->mark isn't used for anything else in the rx path to the application. After we record tp_gap in the skb, we zero po->stats.tp_gap. This allows us to keep a counter of the number of frames lost between any two enqueued packets C) When the application goes to dequeue a frame from the packet socket, we look at skb->mark for that frame. If it is non-zero, we add a cmsg chunk to the msghdr of level SOL_PACKET and type PACKET_GAPDATA. Its a 32 bit integer that represents the number of frames lost between this packet and the last previous frame received. Note there is a chance that if there is frame loss after a receive, and then the socket is closed, some gap data might be lost. This is covered by the use of the PACKET_AUXDATA socket option, which gives total loss data. With a bit of math, the final gap can be determined that way. I've tested this patch myself, and it works well. Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> include/linux/if_packet.h | 2 ++ net/packet/af_packet.c | 33 +++++++++++++++++++++++++++++++++ 2 files changed, 35 insertions(+) Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-05pktgen: Avoid dirtying skb->users when txq is fullEric Dumazet
We can avoid two atomic ops on skb->users if packet is not going to be sent to the device (because hardware txqueue is full) Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-05icmp: No need to call sk_write_space()Eric Dumazet
We can make icmp messages tx completion callback a litle bit faster. Setting SOCK_USE_WRITE_QUEUE sk flag tells sock_wfree() to not call sk_write_space() on a socket we know no thread is posssibly waiting for write space. (on per cpu kernel internal icmp sockets only) This avoids the sock_def_write_space() call and read_lock(&sk->sk_callback_lock)/read_unlock(&sk->sk_callback_lock) calls as well. We avoid three atomic ops. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-05ethtool: Remove support for obsolete string query operationsBen Hutchings
The in-tree implementations have all been converted to get_sset_count(). Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-04pktgen: restore nanosec delaysEric Dumazet
Commit fd29cf72 (pktgen: convert to use ktime_t) inadvertantly converted "delay" parameter from nanosec to microsec. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-04pktgen: Fix multiqueue handlingEric Dumazet
It is not currently possible to instruct pktgen to use one selected tx queue. When Robert added multiqueue support in commit 45b270f8, he added an interval (queue_map_min, queue_map_max), and his code doesnt take into account the case of min = max, to select one tx queue exactly. I suspect a high performance setup on a eight txqueue device wants to use exactly eight cpus, and assign one tx queue to each sender. This patchs makes pktgen select the right tx queue, not the first one. Also updates Documentation to reflect Robert changes. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Robert Olsson <robert.olsson@its.uu.se> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-04headers: remove sched.h from poll.hAlexey Dobriyan
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-10-02Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (46 commits) cnic: Fix NETDEV_UP event processing. uvesafb/connector: Disallow unpliviged users to send netlink packets pohmelfs/connector: Disallow unpliviged users to configure pohmelfs dst/connector: Disallow unpliviged users to configure dst dm/connector: Only process connector packages from privileged processes connector: Removed the destruct_data callback since it is always kfree_skb() connector/dm: Fixed a compilation warning connector: Provide the sender's credentials to the callback connector: Keep the skb in cn_callback_data e1000e/igb/ixgbe: Don't report an error if devices don't support AER net: Fix wrong sizeof net: splice() from tcp to pipe should take into account O_NONBLOCK net: Use sk_mark for routing lookup in more places sky2: irqname based on pci address skge: use unique IRQ name IPv4 TCP fails to send window scale option when window scale is zero net/ipv4/tcp.c: fix min() type mismatch warning Kconfig: STRIP: Remove stale bits of STRIP help text NET: mkiss: Fix typo tg3: Remove prev_vlan_tag from struct tx_ring_info ...
2009-10-02net: splice() from tcp to pipe should take into account O_NONBLOCKEric Dumazet
tcp_splice_read() doesnt take into account socket's O_NONBLOCK flag Before this patch : splice(socket,0,pipe,0,128*1024,SPLICE_F_MOVE); causes a random endless block (if pipe is full) and splice(socket,0,pipe,0,128*1024,SPLICE_F_MOVE | SPLICE_F_NONBLOCK); will return 0 immediately if the TCP buffer is empty. User application has no way to instruct splice() that socket should be in blocking mode but pipe in nonblock more. Many projects cannot use splice(tcp -> pipe) because of this flaw. http://git.samba.org/?p=samba.git;a=history;f=source3/lib/recvfile.c;h=ea0159642137390a0f7e57a123684e6e63e47581;hb=HEAD http://lkml.indiana.edu/hypermail/linux/kernel/0807.2/0687.html Linus introduced SPLICE_F_NONBLOCK in commit 29e350944fdc2dfca102500790d8ad6d6ff4f69d (splice: add SPLICE_F_NONBLOCK flag ) It doesn't make the splice itself necessarily nonblocking (because the actual file descriptors that are spliced from/to may block unless they have the O_NONBLOCK flag set), but it makes the splice pipe operations nonblocking. Linus intention was clear : let SPLICE_F_NONBLOCK control the splice pipe mode only This patch instruct tcp_splice_read() to use the underlying file O_NONBLOCK flag, as other socket operations do. Users will then call : splice(socket,0,pipe,0,128*1024,SPLICE_F_MOVE | SPLICE_F_NONBLOCK ); to block on data coming from socket (if file is in blocking mode), and not block on pipe output (to avoid deadlock) First version of this patch was submitted by Octavian Purdila Reported-by: Volker Lendecke <vl@samba.org> Reported-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Octavian Purdila <opurdila@ixiacom.com> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Acked-by: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-01net: Use sk_mark for routing lookup in more placesAtis Elsts
This patch against v2.6.31 adds support for route lookup using sk_mark in some more places. The benefits from this patch are the following. First, SO_MARK option now has effect on UDP sockets too. Second, ip_queue_xmit() and inet_sk_rebuild_header() could fail to do routing lookup correctly if TCP sockets with SO_MARK were used. Signed-off-by: Atis Elsts <atis@mikrotik.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
2009-10-01IPv4 TCP fails to send window scale option when window scale is zeroOri Finkelman
Acknowledge TCP window scale support by inserting the proper option in SYN/ACK and SYN headers even if our window scale is zero. This fixes the following observed behavior: 1. Client sends a SYN with TCP window scaling option and non zero window scale value to a Linux box. 2. Linux box notes large receive window from client. 3. Linux decides on a zero value of window scale for its part. 4. Due to compare against requested window scale size option, Linux does not to send windows scale TCP option header on SYN/ACK at all. With the following result: Client box thinks TCP window scaling is not supported, since SYN/ACK had no TCP window scale option, while Linux thinks that TCP window scaling is supported (and scale might be non zero), since SYN had TCP window scale option and we have a mismatched idea between the client and server regarding window sizes. Probably it also fixes up the following bug (not observed in practice): 1. Linux box opens TCP connection to some server. 2. Linux decides on zero value of window scale. 3. Due to compare against computed window scale size option, Linux does not to set windows scale TCP option header on SYN. With the expected result that the server OS does not use window scale option due to not receiving such an option in the SYN headers, leading to suboptimal performance. Signed-off-by: Gilad Ben-Yossef <gilad@codefidence.com> Signed-off-by: Ori Finkelman <ori@comsleep.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-01net/ipv4/tcp.c: fix min() type mismatch warningAndrew Morton
net/ipv4/tcp.c: In function 'do_tcp_setsockopt': net/ipv4/tcp.c:2050: warning: comparison of distinct pointer types lacks a cast Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-01Merge branch 'master' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6
2009-10-01pktgen: Fix delay handlingEric Dumazet
After last pktgen changes, delay handling is wrong. pktgen actually sends packets at full line speed. Fix is to update pkt_dev->next_tx even if spin() returns early, so that next spin() calls have a chance to see a positive delay. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-09-30Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: ax25: Fix possible oops in ax25_make_new net: restore tx timestamping for accelerated vlans Phonet: fix mutex imbalance sit: fix off-by-one in ipip6_tunnel_get_prl net: Fix sock_wfree() race net: Make setsockopt() optlen be unsigned.
2009-09-30ax25: Fix possible oops in ax25_make_newJarek Poplawski
In ax25_make_new, if kmemdup of digipeat returns an error, there would be an oops in sk_free while calling sk_destruct, because sk_protinfo is NULL at the moment; move sk->sk_destruct initialization after this. BTW of reported-by: Bernard Pidoux F6BVP <f6bvp@free.fr> Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-09-30net: restore tx timestamping for accelerated vlansEric Dumazet
Since commit 9b22ea560957de1484e6b3e8538f7eef202e3596 ( net: fix packet socket delivery in rx irq handler ) We lost rx timestamping of packets received on accelerated vlans. Effect is that tcpdump on real dev can show strange timings, since it gets rx timestamps too late (ie at skb dequeueing time, not at skb queueing time) 14:47:26.986871 IP 192.168.20.110 > 192.168.20.141: icmp 64: echo request seq 1 14:47:26.986786 IP 192.168.20.141 > 192.168.20.110: icmp 64: echo reply seq 1 14:47:27.986888 IP 192.168.20.110 > 192.168.20.141: icmp 64: echo request seq 2 14:47:27.986781 IP 192.168.20.141 > 192.168.20.110: icmp 64: echo reply seq 2 14:47:28.986896 IP 192.168.20.110 > 192.168.20.141: icmp 64: echo request seq 3 14:47:28.986780 IP 192.168.20.141 > 192.168.20.110: icmp 64: echo reply seq 3 Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-09-30Phonet: fix mutex imbalanceRémi Denis-Courmont
From: Rémi Denis-Courmont <remi.denis-courmont@nokia.com> port_mutex was unlocked twice. Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-09-30sit: fix off-by-one in ipip6_tunnel_get_prlSascha Hlusiak
When requesting all prl entries (kprl.addr == INADDR_ANY) and there are more prl entries than there is space passed from userspace, the existing code would always copy cmax+1 entries, which is more than can be handled. This patch makes the kernel copy only exactly cmax entries. Signed-off-by: Sascha Hlusiak <contact@saschahlusiak.de> Acked-By: Fred L. Templin <Fred.L.Templin@boeing.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-09-30net: Fix sock_wfree() raceEric Dumazet
Commit 2b85a34e911bf483c27cfdd124aeb1605145dc80 (net: No more expensive sock_hold()/sock_put() on each tx) opens a window in sock_wfree() where another cpu might free the socket we are working on. A fix is to call sk->sk_write_space(sk) while still holding a reference on sk. Reported-by: Jike Song <albcamus@gmail.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-09-30net: Make setsockopt() optlen be unsigned.David S. Miller
This provides safety against negative optlen at the type level instead of depending upon (sometimes non-trivial) checks against this sprinkled all over the the place, in each and every implementation. Based upon work done by Arjan van de Ven and feedback from Linus Torvalds. Signed-off-by: David S. Miller <davem@davemloft.net>
2009-09-30Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (33 commits) sony-laptop: re-read the rfkill state when resuming from suspend sony-laptop: check for rfkill hard block at load time wext: add back wireless/ dir in sysfs for cfg80211 interfaces wext: Add bound checks for copy_from_user mac80211: improve/fix mlme messages cfg80211: always get BSS iwlwifi: fix 3945 ucode info retrieval after failure iwlwifi: fix memory leak in command queue handling iwlwifi: fix debugfs buffer handling cfg80211: don't set privacy w/o key cfg80211: wext: don't display BSSID unless associated net: Add explicit bound checks in net/socket.c bridge: Fix double-free in br_add_if. isdn: fix netjet/isdnhdlc build errors atm: dereference of he_dev->rbps_virt in he_init_group() ax25: Add missing dev_put in ax25_setsockopt Revert "sit: stateless autoconf for isatap" net: fix double skb free in dcbnl net: fix nlmsg len size for skb when error bit is set. net: fix vlan_get_size to include vlan_flags size ...