]> www.infradead.org Git - users/willy/xarray.git/log
users/willy/xarray.git
18 years ago[AFS]: Implement the CB.InitCallBackState3 operation.
David Howells [Thu, 26 Apr 2007 22:58:49 +0000 (15:58 -0700)]
[AFS]: Implement the CB.InitCallBackState3 operation.

Implement the CB.InitCallBackState3 operation for the fileserver to
call.  This reduces the amount of network traffic because if this op
is aborted, the fileserver will then attempt an CB.InitCallBackState
operation.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[AFS]: Add support for the CB.GetCapabilities operation.
David Howells [Thu, 26 Apr 2007 22:58:17 +0000 (15:58 -0700)]
[AFS]: Add support for the CB.GetCapabilities operation.

Add support for the CB.GetCapabilities operation with which the fileserver can
ask the client for the following information:

 (1) The list of network interfaces it has available as IPv4 address + netmask
     plus the MTUs.

 (2) The client's UUID.

 (3) The extended capabilities of the client, for which the only current one
     is unified error mapping (abort code interpretation).

To support this, the patch adds the following routines to AFS:

 (1) A function to iterate through all the network interfaces using RTNETLINK
     to extract IPv4 addresses and MTUs.

 (2) A function to iterate through all the network interfaces using RTNETLINK
     to pull out the MAC address of the lowest index interface to use in UUID
     construction.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[AFS]: Update the AFS fs documentation.
David Howells [Thu, 26 Apr 2007 22:57:43 +0000 (15:57 -0700)]
[AFS]: Update the AFS fs documentation.

Update the AFS fs documentation.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[AFS]: Add security support.
David Howells [Thu, 26 Apr 2007 22:57:07 +0000 (15:57 -0700)]
[AFS]: Add security support.

Add security support to the AFS filesystem.  Kerberos IV tickets are added as
RxRPC keys are added to the session keyring with the klog program.  open() and
other VFS operations then find this ticket with request_key() and either use
it immediately (eg: mkdir, unlink) or attach it to a file descriptor (open).

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[AFS]: Handle multiple mounts of an AFS superblock correctly.
David Howells [Thu, 26 Apr 2007 22:56:24 +0000 (15:56 -0700)]
[AFS]: Handle multiple mounts of an AFS superblock correctly.

Handle multiple mounts of an AFS superblock correctly, checking to see
whether the superblock is already initialised after calling sget()
rather than just unconditionally stamping all over it.

Also delete the "silent" parameter to afs_fill_super() as it's not
used and can, in any case, be obtained from sb->s_flags.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[AF_RXRPC]: Delete the old RxRPC code.
David Howells [Thu, 26 Apr 2007 22:55:48 +0000 (15:55 -0700)]
[AF_RXRPC]: Delete the old RxRPC code.

Delete the old RxRPC code as it's now no longer used.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[AF_RXRPC]: Make the in-kernel AFS filesystem use AF_RXRPC.
David Howells [Thu, 26 Apr 2007 22:55:03 +0000 (15:55 -0700)]
[AF_RXRPC]: Make the in-kernel AFS filesystem use AF_RXRPC.

Make the in-kernel AFS filesystem use AF_RXRPC instead of the old RxRPC code.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[AF_RXRPC]: Add an interface to the AF_RXRPC module for the AFS filesystem to use
David Howells [Thu, 26 Apr 2007 22:50:17 +0000 (15:50 -0700)]
[AF_RXRPC]: Add an interface to the AF_RXRPC module for the AFS filesystem to use

Add an interface to the AF_RXRPC module so that the AFS filesystem module can
more easily make use of the services available.  AFS still opens a socket but
then uses the action functions in lieu of sendmsg() and registers an intercept
functions to grab messages before they're queued on the socket Rx queue.

This permits AFS (or whatever) to:

 (1) Avoid the overhead of using the recvmsg() call.

 (2) Use different keys directly on individual client calls on one socket
     rather than having to open a whole slew of sockets, one for each key it
     might want to use.

 (3) Avoid calling request_key() at the point of issue of a call or opening of
     a socket.  This is done instead by AFS at the point of open(), unlink() or
     other VFS operation and the key handed through.

 (4) Request the use of something other than GFP_KERNEL to allocate memory.

Furthermore:

 (*) The socket buffer markings used by RxRPC are made available for AFS so
     that it can interpret the cooked RxRPC messages itself.

 (*) rxgen (un)marshalling abort codes are made available.

The following documentation for the kernel interface is added to
Documentation/networking/rxrpc.txt:

=========================
AF_RXRPC KERNEL INTERFACE
=========================

The AF_RXRPC module also provides an interface for use by in-kernel utilities
such as the AFS filesystem.  This permits such a utility to:

 (1) Use different keys directly on individual client calls on one socket
     rather than having to open a whole slew of sockets, one for each key it
     might want to use.

 (2) Avoid having RxRPC call request_key() at the point of issue of a call or
     opening of a socket.  Instead the utility is responsible for requesting a
     key at the appropriate point.  AFS, for instance, would do this during VFS
     operations such as open() or unlink().  The key is then handed through
     when the call is initiated.

 (3) Request the use of something other than GFP_KERNEL to allocate memory.

 (4) Avoid the overhead of using the recvmsg() call.  RxRPC messages can be
     intercepted before they get put into the socket Rx queue and the socket
     buffers manipulated directly.

To use the RxRPC facility, a kernel utility must still open an AF_RXRPC socket,
bind an addess as appropriate and listen if it's to be a server socket, but
then it passes this to the kernel interface functions.

The kernel interface functions are as follows:

 (*) Begin a new client call.

struct rxrpc_call *
rxrpc_kernel_begin_call(struct socket *sock,
struct sockaddr_rxrpc *srx,
struct key *key,
unsigned long user_call_ID,
gfp_t gfp);

     This allocates the infrastructure to make a new RxRPC call and assigns
     call and connection numbers.  The call will be made on the UDP port that
     the socket is bound to.  The call will go to the destination address of a
     connected client socket unless an alternative is supplied (srx is
     non-NULL).

     If a key is supplied then this will be used to secure the call instead of
     the key bound to the socket with the RXRPC_SECURITY_KEY sockopt.  Calls
     secured in this way will still share connections if at all possible.

     The user_call_ID is equivalent to that supplied to sendmsg() in the
     control data buffer.  It is entirely feasible to use this to point to a
     kernel data structure.

     If this function is successful, an opaque reference to the RxRPC call is
     returned.  The caller now holds a reference on this and it must be
     properly ended.

 (*) End a client call.

void rxrpc_kernel_end_call(struct rxrpc_call *call);

     This is used to end a previously begun call.  The user_call_ID is expunged
     from AF_RXRPC's knowledge and will not be seen again in association with
     the specified call.

 (*) Send data through a call.

int rxrpc_kernel_send_data(struct rxrpc_call *call, struct msghdr *msg,
   size_t len);

     This is used to supply either the request part of a client call or the
     reply part of a server call.  msg.msg_iovlen and msg.msg_iov specify the
     data buffers to be used.  msg_iov may not be NULL and must point
     exclusively to in-kernel virtual addresses.  msg.msg_flags may be given
     MSG_MORE if there will be subsequent data sends for this call.

     The msg must not specify a destination address, control data or any flags
     other than MSG_MORE.  len is the total amount of data to transmit.

 (*) Abort a call.

void rxrpc_kernel_abort_call(struct rxrpc_call *call, u32 abort_code);

     This is used to abort a call if it's still in an abortable state.  The
     abort code specified will be placed in the ABORT message sent.

 (*) Intercept received RxRPC messages.

typedef void (*rxrpc_interceptor_t)(struct sock *sk,
    unsigned long user_call_ID,
    struct sk_buff *skb);

void
rxrpc_kernel_intercept_rx_messages(struct socket *sock,
   rxrpc_interceptor_t interceptor);

     This installs an interceptor function on the specified AF_RXRPC socket.
     All messages that would otherwise wind up in the socket's Rx queue are
     then diverted to this function.  Note that care must be taken to process
     the messages in the right order to maintain DATA message sequentiality.

     The interceptor function itself is provided with the address of the socket
     and handling the incoming message, the ID assigned by the kernel utility
     to the call and the socket buffer containing the message.

     The skb->mark field indicates the type of message:

MARK MEANING
=============================== =======================================
RXRPC_SKB_MARK_DATA Data message
RXRPC_SKB_MARK_FINAL_ACK Final ACK received for an incoming call
RXRPC_SKB_MARK_BUSY Client call rejected as server busy
RXRPC_SKB_MARK_REMOTE_ABORT Call aborted by peer
RXRPC_SKB_MARK_NET_ERROR Network error detected
RXRPC_SKB_MARK_LOCAL_ERROR Local error encountered
RXRPC_SKB_MARK_NEW_CALL New incoming call awaiting acceptance

     The remote abort message can be probed with rxrpc_kernel_get_abort_code().
     The two error messages can be probed with rxrpc_kernel_get_error_number().
     A new call can be accepted with rxrpc_kernel_accept_call().

     Data messages can have their contents extracted with the usual bunch of
     socket buffer manipulation functions.  A data message can be determined to
     be the last one in a sequence with rxrpc_kernel_is_data_last().  When a
     data message has been used up, rxrpc_kernel_data_delivered() should be
     called on it..

     Non-data messages should be handled to rxrpc_kernel_free_skb() to dispose
     of.  It is possible to get extra refs on all types of message for later
     freeing, but this may pin the state of a call until the message is finally
     freed.

 (*) Accept an incoming call.

struct rxrpc_call *
rxrpc_kernel_accept_call(struct socket *sock,
 unsigned long user_call_ID);

     This is used to accept an incoming call and to assign it a call ID.  This
     function is similar to rxrpc_kernel_begin_call() and calls accepted must
     be ended in the same way.

     If this function is successful, an opaque reference to the RxRPC call is
     returned.  The caller now holds a reference on this and it must be
     properly ended.

 (*) Reject an incoming call.

int rxrpc_kernel_reject_call(struct socket *sock);

     This is used to reject the first incoming call on the socket's queue with
     a BUSY message.  -ENODATA is returned if there were no incoming calls.
     Other errors may be returned if the call had been aborted (-ECONNABORTED)
     or had timed out (-ETIME).

 (*) Record the delivery of a data message and free it.

void rxrpc_kernel_data_delivered(struct sk_buff *skb);

     This is used to record a data message as having been delivered and to
     update the ACK state for the call.  The socket buffer will be freed.

 (*) Free a message.

void rxrpc_kernel_free_skb(struct sk_buff *skb);

     This is used to free a non-DATA socket buffer intercepted from an AF_RXRPC
     socket.

 (*) Determine if a data message is the last one on a call.

bool rxrpc_kernel_is_data_last(struct sk_buff *skb);

     This is used to determine if a socket buffer holds the last data message
     to be received for a call (true will be returned if it does, false
     if not).

     The data message will be part of the reply on a client call and the
     request on an incoming call.  In the latter case there will be more
     messages, but in the former case there will not.

 (*) Get the abort code from an abort message.

u32 rxrpc_kernel_get_abort_code(struct sk_buff *skb);

     This is used to extract the abort code from a remote abort message.

 (*) Get the error number from a local or network error message.

int rxrpc_kernel_get_error_number(struct sk_buff *skb);

     This is used to extract the error number from a message indicating either
     a local error occurred or a network error occurred.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[AFS]: Clean up the AFS sources
David Howells [Thu, 26 Apr 2007 22:49:28 +0000 (15:49 -0700)]
[AFS]: Clean up the AFS sources

Clean up the AFS sources.

Also remove references to AFS keys.  RxRPC keys are used instead.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[AF_RXRPC]: Provide secure RxRPC sockets for use by userspace and kernel both
David Howells [Thu, 26 Apr 2007 22:48:28 +0000 (15:48 -0700)]
[AF_RXRPC]: Provide secure RxRPC sockets for use by userspace and kernel both

Provide AF_RXRPC sockets that can be used to talk to AFS servers, or serve
answers to AFS clients.  KerberosIV security is fully supported.  The patches
and some example test programs can be found in:

http://people.redhat.com/~dhowells/rxrpc/

This will eventually replace the old implementation of kernel-only RxRPC
currently resident in net/rxrpc/.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[AF_RXRPC]: Make it possible to merely try to cancel timers from a module
David Howells [Thu, 26 Apr 2007 22:46:56 +0000 (15:46 -0700)]
[AF_RXRPC]: Make it possible to merely try to cancel timers from a module

Export try_to_del_timer_sync() for use by the AF_RXRPC module.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[AF_RXRPC]: Key facility changes for AF_RXRPC
David Howells [Thu, 26 Apr 2007 22:46:23 +0000 (15:46 -0700)]
[AF_RXRPC]: Key facility changes for AF_RXRPC

Export the keyring key type definition and document its availability.

Add alternative types into the key's type_data union to make it more useful.
Not all users necessarily want to use it as a list_head (AF_RXRPC doesn't, for
example), so make it clear that it can be used in other ways.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[WORKQUEUE]: cancel_delayed_work: use del_timer() instead of del_timer_sync()
Oleg Nesterov [Thu, 26 Apr 2007 22:45:32 +0000 (15:45 -0700)]
[WORKQUEUE]: cancel_delayed_work: use del_timer() instead of del_timer_sync()

del_timer_sync() buys nothing for cancel_delayed_work(), but it is less
efficient since it locks the timer unconditionally, and may wait for the
completion of the delayed_work_timer_fn().

cancel_delayed_work() == 0 means:

before this patch:
work->func may still be running or queued

after this patch:
work->func may still be running or queued, or
delayed_work_timer_fn->__queue_work() in progress.

The latter doesn't differ from the caller's POV,
delayed_work_timer_fn() is called with _PENDING
bit set.

cancel_delayed_work() == 1 with this patch adds a new possibility:

delayed_work->work was cancelled, but delayed_work_timer_fn
is still running (this is only possible for the re-arming
works on single-threaded workqueue).

In this case the timer was re-started by work->func(), nobody
else can do this. This in turn means that delayed_work_timer_fn
has already passed __queue_work() (and wont't touch delayed_work)
because nobody else can queue delayed_work->work.

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Signed-Off-By: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[XFRM]: Missing bits to SAD info.
Jamal Hadi Salim [Thu, 26 Apr 2007 21:12:15 +0000 (14:12 -0700)]
[XFRM]: Missing bits to SAD info.

This brings the SAD info in sync with net-2.6.22/net-2.6

Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[ATM]: Use mutex instead of binary semaphore in FORE Systems 200E-series driver
Matthias Kaehlcke [Thu, 26 Apr 2007 08:41:49 +0000 (01:41 -0700)]
[ATM]: Use mutex instead of binary semaphore in FORE Systems 200E-series driver

(akpm: remove CVS control string too)

Signed-off-by: Matthias Kaehlcke <matthias.kaehlcke@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[BLUETOOTH] rfcomm_worker(): fix wakeup race
Andrew Morton [Thu, 26 Apr 2007 08:41:01 +0000 (01:41 -0700)]
[BLUETOOTH] rfcomm_worker(): fix wakeup race

Set TASK_INTERRUPTIBLE prior to testing the flag to avoid missed wakeups.

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[NET]: bonding documentation fix for multiple bonding interfaces
Alexandra N. Kossovsky [Thu, 26 Apr 2007 08:40:13 +0000 (01:40 -0700)]
[NET]: bonding documentation fix for multiple bonding interfaces

Fix bonding driver documentation for the case of multiple bonding interfaces.

Signed-off-by: "Alexandra N. Kossovsky" <Alexandra.Kossovsky@oktetlabs.ru>
Acked-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[NET]: SPIN_LOCK_UNLOCKED cleanup in drivers/atm, net
Milind Arun Choudhary [Thu, 26 Apr 2007 08:37:44 +0000 (01:37 -0700)]
[NET]: SPIN_LOCK_UNLOCKED cleanup in drivers/atm, net

SPIN_LOCK_UNLOCKED cleanup,use __SPIN_LOCK_UNLOCKED instead

Signed-off-by: Milind Arun Choudhary <milindchoudhary@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[IRDA] irda_device_dongle_init: fix kzalloc(GFP_KERNEL) in spinlock
Andrew Morton [Thu, 26 Apr 2007 08:36:49 +0000 (01:36 -0700)]
[IRDA] irda_device_dongle_init: fix kzalloc(GFP_KERNEL) in spinlock

Fix http://bugzilla.kernel.org/show_bug.cgi?id=8343

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Samuel Ortiz <samuel@sortiz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[SUNRPC]: cleanup: use seq_release_private() where appropriate
Martin Peschke [Thu, 26 Apr 2007 08:03:43 +0000 (01:03 -0700)]
[SUNRPC]: cleanup: use seq_release_private() where appropriate

We can save some lines of code by using seq_release_private().

Signed-off-by: Martin Peschke <mp3@de.ibm.com>
Acked-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[AF_IUCV]: Fix compilation on s390-up
Alexey Dobriyan [Thu, 26 Apr 2007 08:02:51 +0000 (01:02 -0700)]
[AF_IUCV]: Fix compilation on s390-up

  CC [M]  net/iucv/iucv.o
net/iucv/iucv.c: In function 'iucv_init':
net/iucv/iucv.c:1556: error: 'iucv_cpu_notifier' undeclared (first use in this function)

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[NET]: ROUND_UP macro cleanup in drivers/net/ppp_generic.c
Milind Arun Choudhary [Thu, 26 Apr 2007 08:01:53 +0000 (01:01 -0700)]
[NET]: ROUND_UP macro cleanup in drivers/net/ppp_generic.c

ROUND_UP macro cleanup use DIV_ROUND_UP

Signed-off-by: Milind Arun Choudhary <milindchoudhary@gmail.com>
Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[NET] tun/tap: fixed hw address handling
Brian Braunstein [Thu, 26 Apr 2007 08:00:55 +0000 (01:00 -0700)]
[NET] tun/tap: fixed hw address handling

Fixed tun/tap driver's handling of hw addresses.  The hw address is stored
in both the net_device.dev_addr and tun.dev_addr fields.  These fields were
not kept synchronized, and in fact weren't even initialized to the same
value.  Now during both init and when performing SIOCSIFHWADDR on the tun
device these values are both updated.  However, if SIOCSIFHWADDR is
performed on the net device directly (for instance, setting the hw address
using ifconfig), the tun device does not get updated.  Perhaps the
tun.dev_addr field should be removed completely at some point, as it is
redundant and net_device.dev_addr can be used anywhere it is used.

Signed-off-by: Brian Braunstein <linuxkernel@bristyle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[NET]: Delete unused header file linux/if_wanpipe_common.h
Robert P. J. Day [Thu, 26 Apr 2007 07:59:27 +0000 (00:59 -0700)]
[NET]: Delete unused header file linux/if_wanpipe_common.h

Delete the unreferenced header file include/linux/if_wanpipe_common.h,
as well as the reference to it in the Doc file.

Signed-off-by: Robert P. J. Day <rpjday@mindspring.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[NET]: Delete unused header file linux/sdla_fr.h.
Robert P. J. Day [Thu, 26 Apr 2007 07:58:39 +0000 (00:58 -0700)]
[NET]: Delete unused header file linux/sdla_fr.h.

Delete the unreferenced header file include/linux/sdla_fr.h.

Signed-off-by: Robert P. J. Day <rpjday@mindspring.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
18 years ago[NETLINK]: Possible cleanups.
Adrian Bunk [Thu, 26 Apr 2007 07:57:41 +0000 (00:57 -0700)]
[NETLINK]: Possible cleanups.

- make the following needlessly global variables static:
  - core/rtnetlink.c: struct rtnl_msg_handlers[]
  - netfilter/nf_conntrack_proto.c: struct nf_ct_protos[]
- make the following needlessly global functions static:
  - core/rtnetlink.c: rtnl_dump_all()
  - netlink/af_netlink.c: netlink_queue_skip()

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[NET]: Fix yam.c
Andrew Morton [Thu, 26 Apr 2007 07:55:53 +0000 (00:55 -0700)]
[NET]: Fix yam.c

drivers/net/hamradio/yam.c: In function `yam_tx_byte':
drivers/net/hamradio/yam.c:643: warning: passing arg 1 of `skb_copy_from_linear_data_offset' from incompatible pointer type

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[NET]: Clean up sk_buff walkers.
Jean Delvare [Thu, 26 Apr 2007 07:44:22 +0000 (00:44 -0700)]
[NET]: Clean up sk_buff walkers.

I noticed recently that, in skb_checksum(), "offset" and "start" are
essentially the same thing and have the same value throughout the
function, despite being computed differently. Using a single variable
allows some cleanups and makes the skb_checksum() function smaller,
more readable, and presumably marginally faster.

We appear to have many other "sk_buff walker" functions built on the
exact same model, so the cleanup applies to them, too. Here is a list
of the functions I found to be affected:

net/appletalk/ddp.c:atalk_sum_skb()
net/core/datagram.c:skb_copy_datagram_iovec()
net/core/datagram.c:skb_copy_and_csum_datagram()
net/core/skbuff.c:skb_copy_bits()
net/core/skbuff.c:skb_store_bits()
net/core/skbuff.c:skb_checksum()
net/core/skbuff.c:skb_copy_and_csum_bit()
net/core/user_dma.c:dma_skb_copy_datagram_iovec()
net/xfrm/xfrm_algo.c:skb_icv_walk()
net/xfrm/xfrm_algo.c:skb_to_sgvec()

OTOH, I admit I'm a bit surprised, the cleanup is rather obvious so I'm
really wondering if I am missing something. Can anyone please comment
on this?

Signed-off-by: Jean Delvare <jdelvare@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[XFRM]: Export SAD info.
Jamal Hadi Salim [Thu, 26 Apr 2007 07:10:29 +0000 (00:10 -0700)]
[XFRM]: Export SAD info.

On a system with a lot of SAs, counting SAD entries chews useful
CPU time since you need to dump the whole SAD to user space;
i.e something like ip xfrm state ls | grep -i src | wc -l
I have seen taking literally minutes on a 40K SAs when the system
is swapping.
With this patch, some of the SAD info (that was already being tracked)
is exposed to user space. i.e you do:
ip xfrm state count
And you get the count; you can also pass -s to the command line and
get the hash info.

Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[BRIDGE]: Missing rtnl.
Stephen Hemminger [Thu, 26 Apr 2007 05:08:46 +0000 (22:08 -0700)]
[BRIDGE]: Missing rtnl.

Writing to /sys/class/net/brX/bridge/stp_state causes a warning because
RTNL is not held when call br_stp_if.c

Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[BRIDGE]: if no STP then forward all BPDUs
Stephen Hemminger [Thu, 26 Apr 2007 05:07:58 +0000 (22:07 -0700)]
[BRIDGE]: if no STP then forward all BPDUs

If a bridge is not running STP, then it has no way to detect a cycle
in the network. But if it is not running STP and some other machine
or device is running STP, then if STP BPDU's get forwarded to it can
detect the cycle.

This is how the old 2.4 and early 2.6 code worked.

Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[BRIDGE]: drop PAUSE frames
Stephen Hemminger [Thu, 26 Apr 2007 05:05:55 +0000 (22:05 -0700)]
[BRIDGE]: drop PAUSE frames

Pause frames should never make it out of the network device into
the stack. But if a device was misconfigured, it might happen.
So drop pause frames in bridge.

Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[BRIDGE]: don't change packet type
Stephen Hemminger [Thu, 26 Apr 2007 05:03:10 +0000 (22:03 -0700)]
[BRIDGE]: don't change packet type

The change to forward STP bpdu's (for usermode STP) through normal path,
changed the packet type in the process. Since link local stuff is multicast, it
should stay pkt_type = PACKET_MULTICAST.  The code was probably copy/pasted
incorrectly from the bridge pseudo-device receive path.

Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[IPV6] NDISC: Unify main process of sending ND messages.
YOSHIFUJI Hideaki [Tue, 24 Apr 2007 11:44:52 +0000 (20:44 +0900)]
[IPV6] NDISC: Unify main process of sending ND messages.

Because ndisc_send_na(), ndisc_send_ns() and ndisc_send_rs()
are almost identical, so let's unify their common part.

With gcc (GCC) 3.3.5 (Debian 1:3.3.5-13) on i386,
Before:
   text    data     bss     dec     hex filename
  14689     364      24   15077    3ae5 net/ipv6/ndisc.o
After:
   text    data     bss     dec     hex filename
  12317     364      24   12705    31a1 net/ipv6/ndisc.o

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
18 years ago[IPV6] XFRM: Use ip6addr_any where applicable.
YOSHIFUJI Hideaki [Tue, 24 Apr 2007 11:44:50 +0000 (20:44 +0900)]
[IPV6] XFRM: Use ip6addr_any where applicable.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
18 years ago[IPV6]: Export in6addr_any for future use.
YOSHIFUJI Hideaki [Tue, 24 Apr 2007 11:44:49 +0000 (20:44 +0900)]
[IPV6]: Export in6addr_any for future use.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
18 years ago[IPV4] IP_GRE: Unify code path to get hash array index.
YOSHIFUJI Hideaki [Tue, 24 Apr 2007 11:44:48 +0000 (20:44 +0900)]
[IPV4] IP_GRE: Unify code path to get hash array index.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
18 years ago[IPV4] IPIP: Unify code path to get hash array index.
YOSHIFUJI Hideaki [Tue, 24 Apr 2007 11:44:47 +0000 (20:44 +0900)]
[IPV4] IPIP: Unify code path to get hash array index.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
18 years ago[IPV6] SIT: Unify code path to get hash array index.
YOSHIFUJI Hideaki [Tue, 24 Apr 2007 11:44:47 +0000 (20:44 +0900)]
[IPV6] SIT: Unify code path to get hash array index.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
18 years ago[IPV6]: Fix Makefile thinko.
David S. Miller [Wed, 25 Apr 2007 05:15:40 +0000 (22:15 -0700)]
[IPV6]: Fix Makefile thinko.

obj-$(CONFIG_PROC_FS) --> ipv6-$(CONFIG_PROC_FS)

Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[IPV6]: Consolidate common SNMP code
Herbert Xu [Wed, 25 Apr 2007 04:54:09 +0000 (21:54 -0700)]
[IPV6]: Consolidate common SNMP code

This patch moves the non-proc SNMP code into addrconf.c and reuses
IPv4 SNMP code where applicable.

As a result we can skip proc.o if /proc is disabled.

Note that I've made a number of functions static since they're only
used by addrconf.c for now.  If they ever get used elsewhere we can
always remove the static.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[IPV4]: Consolidate common SNMP code
Herbert Xu [Wed, 25 Apr 2007 04:53:35 +0000 (21:53 -0700)]
[IPV4]: Consolidate common SNMP code

This patch moves the SNMP code shared between IPv4/IPv6 from proc.c
into net/ipv4/af_inet.c.  This makes sense because these functions
aren't specific to /proc.

As a result we can again skip proc.o if /proc is disabled.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[IPV4]: Fix build without procfs.
YOSHIFUJI Hideaki [Tue, 24 Apr 2007 23:22:42 +0000 (16:22 -0700)]
[IPV4]: Fix build without procfs.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[TCP]: Fix linkage errors on i386.
YOSHIFUJI Hideaki [Tue, 24 Apr 2007 23:21:38 +0000 (16:21 -0700)]
[TCP]: Fix linkage errors on i386.

To avoid raw division, use ktime_to_timeval() to get usec.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[TIPC]: Enhancements to msg_set_bits() routine
Allan Stephens [Tue, 24 Apr 2007 21:51:55 +0000 (14:51 -0700)]
[TIPC]: Enhancements to msg_set_bits() routine

This patch makes two enhancements to msg_set_bits():

1) It now ignores any bits of the new field value that are not
   covered by the mask being used.  (Previously, if the new value
   exceeded the size of the mask the extra bits could corrupt
   other fields in the message header word being updated.)

2) The code has been optimized to minimize the number of run-time
   endianness conversion operations by leveraging the fact that the
   mask (and, in some cases, the value as well) is constant and the
   necessary conversion can be performed by the compiler.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Jon Paul Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[WIRELESS] cfg80211: Update comment for locking.
Johannes Berg [Tue, 24 Apr 2007 21:07:27 +0000 (14:07 -0700)]
[WIRELESS] cfg80211: Update comment for locking.

This patch adds a comment that was part of my rtnl locking patch for
cfg80211 but which I forgot for the merge.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[NET]: Warn about GSO/checksum abuse
Herbert Xu [Tue, 24 Apr 2007 05:36:13 +0000 (22:36 -0700)]
[NET]: Warn about GSO/checksum abuse

Now that Patrick has added the code to deal with GSO in netfilter,
we no longer need the crutch that computes partial checksums just
before transmission.

This patch turns this into a warning again.  If this goes OK, we
can then turn it into a BUG_ON and remove the gso_send_check cruft.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[TCP] TCP YEAH: Use vegas dont copy it.
Stephen Hemminger [Tue, 24 Apr 2007 05:28:23 +0000 (22:28 -0700)]
[TCP] TCP YEAH: Use vegas dont copy it.

Rather than using a copy of vegas code, the YEAH code should just have
it exported so there is common code.

Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[TCP]: Congestion control API update.
Stephen Hemminger [Tue, 24 Apr 2007 05:26:16 +0000 (22:26 -0700)]
[TCP]: Congestion control API update.

Do some simple changes to make congestion control API faster/cleaner.
* use ktime_t rather than timeval
* merge rtt sampling into existing ack callback
  this means one indirect call versus two per ack.
* use flags bits to store options/settings

Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[TCP]: TCP Illinois update.
Stephen Hemminger [Tue, 24 Apr 2007 05:24:32 +0000 (22:24 -0700)]
[TCP]: TCP Illinois update.

This version more closely matches the paper, and fixes several
math errors. The biggest difference is that it updates alpha/beta
once per RTT

Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[WIRELESS] drivers/net/wireless/Kconfig: correct minor typo
John W. Linville [Mon, 23 Apr 2007 20:28:49 +0000 (13:28 -0700)]
[WIRELESS] drivers/net/wireless/Kconfig: correct minor typo

Correct minor typo in drivers/net/wireless/Kconfig identified by
Stefano Brivio <stefano.brivio@polimi.it>.

Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[WIRELESS]: Remove wext over netlink.
Johannes Berg [Mon, 23 Apr 2007 19:20:55 +0000 (12:20 -0700)]
[WIRELESS]: Remove wext over netlink.

As scheduled, this patch removes the pointless wext over netlink code.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[WIRELESS] cfg80211: New wireless config infrastructure.
Johannes Berg [Mon, 23 Apr 2007 19:20:05 +0000 (12:20 -0700)]
[WIRELESS] cfg80211: New wireless config infrastructure.

This patch creates the core cfg80211 code along with some sysfs bits.
This is a stripped down version to allow mac80211 to function, but
doesn't include any configuration yet except for creating and removing
virtual interfaces.

This patch includes the nl80211 header file but it only contains the
interface types which the cfg80211 interface for creating virtual
interfaces relies on.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[WIRELESS]: Refactor wireless Kconfig.
Johannes Berg [Mon, 23 Apr 2007 19:19:12 +0000 (12:19 -0700)]
[WIRELESS]: Refactor wireless Kconfig.

This patch refactors the wireless Kconfig all over and already
introduces net/wireless/Kconfig with just the WEXT bit for now,
the cfg80211 patch will add to that as well.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[WIRELESS]: Update MAINTAINERS for wireless mailing list.
Johannes Berg [Mon, 23 Apr 2007 19:18:20 +0000 (12:18 -0700)]
[WIRELESS]: Update MAINTAINERS for wireless mailing list.

This patch adds the linux-wireless mailing list to all appropriate
entries in the MAINTAINERS file.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[NET]: Prevent much sadness in qdisc_lock_tree().
Andrew Morton [Mon, 23 Apr 2007 06:22:24 +0000 (23:22 -0700)]
[NET]: Prevent much sadness in qdisc_lock_tree().

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[IPV6] SNMP: Use put_unaligned() instead of memcpy().
YOSHIFUJI Hideaki [Sun, 22 Apr 2007 02:52:04 +0000 (19:52 -0700)]
[IPV6] SNMP: Use put_unaligned() instead of memcpy().

Hint from David Miller <davem@davemloft.net>.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[IPV6] SNMP: Fix several warnings without procfs.
YOSHIFUJI Hideaki [Sat, 21 Apr 2007 11:13:44 +0000 (20:13 +0900)]
[IPV6] SNMP: Fix several warnings without procfs.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
18 years ago[IPV6] SNMP: Avoid unaligned accesses.
YOSHIFUJI Hideaki [Sat, 21 Apr 2007 11:12:43 +0000 (20:12 +0900)]
[IPV6] SNMP: Avoid unaligned accesses.

Because stats pointer may not be aligned for u64, use memcpy
to fill u64 values.
Issue reported by David Miller <davem@davemloft.net>.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
18 years ago[TCP]: Sed magic converts func(sk, tp, ...) -> func(sk, ...)
Ilpo Järvinen [Sat, 21 Apr 2007 05:18:02 +0000 (22:18 -0700)]
[TCP]: Sed magic converts func(sk, tp, ...) -> func(sk, ...)

This is (mostly) automated change using magic:

sed -e '/struct sock \*sk/ N' -e '/struct sock \*sk/ N'
    -e '/struct sock \*sk/ N' -e '/struct sock \*sk/ N'
    -e 's|struct sock \*sk,[\n\t ]*struct tcp_sock \*tp\([^{]*\n{\n\)|
  struct sock \*sk\1\tstruct tcp_sock *tp = tcp_sk(sk);\n|g'
    -e 's|struct sock \*sk, struct tcp_sock \*tp|
  struct sock \*sk|g' -e 's|sk, tp\([^-]\)|sk\1|g'

Fixed four unused variable (tp) warnings that were introduced.

In addition, manually added newlines after local variables and
tweaked function arguments positioning.

$ gcc --version
gcc (GCC) 4.1.1 20060525 (Red Hat 4.1.1-1)
...
$ codiff -fV built-in.o.old built-in.o.new
net/ipv4/route.c:
  rt_cache_flush |  +14
 1 function changed, 14 bytes added

net/ipv4/tcp.c:
  tcp_setsockopt |   -5
  tcp_sendpage   |  -25
  tcp_sendmsg    |  -16
 3 functions changed, 46 bytes removed

net/ipv4/tcp_input.c:
  tcp_try_undo_recovery |   +3
  tcp_try_undo_dsack    |   +2
  tcp_mark_head_lost    |  -12
  tcp_ack               |  -15
  tcp_event_data_recv   |  -32
  tcp_rcv_state_process |  -10
  tcp_rcv_established   |   +1
 7 functions changed, 6 bytes added, 69 bytes removed, diff: -63

net/ipv4/tcp_output.c:
  update_send_head          |   -9
  tcp_transmit_skb          |  +19
  tcp_cwnd_validate         |   +1
  tcp_write_wakeup          |  -17
  __tcp_push_pending_frames |  -25
  tcp_push_one              |   -8
  tcp_send_fin              |   -4
 7 functions changed, 20 bytes added, 63 bytes removed, diff: -43

built-in.o.new:
 18 functions changed, 40 bytes added, 178 bytes removed, diff: -138

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[NET]: Fix comments for register_netdev().
Borislav Petkov [Sat, 21 Apr 2007 05:14:10 +0000 (22:14 -0700)]
[NET]: Fix comments for register_netdev().

Correct the function name in the comments supplied with
register_netdev()

Signed-off-by: Borislav Petkov <bbpetkov@yahoo.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[IrDA]: Misc spelling corrections.
G. Liakhovetski [Sat, 21 Apr 2007 05:12:48 +0000 (22:12 -0700)]
[IrDA]: Misc spelling corrections.

Spelling corrections, from "to" to "too".

Signed-off-by: G. Liakhovetski <gl@dsa-ac.de>
Signed-off-by: Samuel Ortiz <samuel@sortiz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[IrDA]: Adding carriage returns to mcs7780 debug statements
Samuel Ortiz [Sat, 21 Apr 2007 05:12:07 +0000 (22:12 -0700)]
[IrDA]: Adding carriage returns to mcs7780 debug statements

Signed-off-by: Samuel Ortiz <samuel@sortiz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[IrDA] af_irda: IRDA_ASSERT cleanups
Samuel Ortiz [Sat, 21 Apr 2007 05:10:13 +0000 (22:10 -0700)]
[IrDA] af_irda: IRDA_ASSERT cleanups

In af_irda.c, the multiple IRDA_ASSERT() are either hiding bugs, useless, or
returning the wrong value.
Let's clean that up.

Signed-off-by: Samuel Ortiz <samuel@sortiz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[IrDA] af_irda: irda_accept cleanup
Samuel Ortiz [Sat, 21 Apr 2007 05:09:33 +0000 (22:09 -0700)]
[IrDA] af_irda: irda_accept cleanup

This patch removes a cut'n'paste copy of wait_event_interruptible
from irda_accept.

Signed-off-by: Samuel Ortiz <samuel@ortiz.org>
Acked-by: Olaf Kirch <olaf.kirch@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[IrDA] af_irda: Silence kernel message in irda_recvmsg_stream
Olaf Kirch [Sat, 21 Apr 2007 05:08:15 +0000 (22:08 -0700)]
[IrDA] af_irda: Silence kernel message in irda_recvmsg_stream

This patch silences an IRDA_ASSERT in irda_recvmsg_stream, as described in
http://bugzilla.kernel.org/show_bug.cgi?id=7512 irda_disconnect_indication
would set sk->sk_err to ECONNRESET, and a subsequent call to recvmsg
would print an irritating kernel message and return -1.

When a connected socket is closed by the peer, recvmsg should return 0
rather than an error. This patch fixes this.

Signed-off-by: Olaf Kirch <olaf.kirch@oracle.com>
Signed-off-by: Samuel Ortiz <samuel@sortiz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[IrDA] af_irda: irda_recvmsg_stream cleanup
Olaf Kirch [Sat, 21 Apr 2007 05:05:27 +0000 (22:05 -0700)]
[IrDA] af_irda: irda_recvmsg_stream cleanup

This patch cleans up some code in irda_recvmsg_stream, replacing some
homebrew code with prepare_to_wait/finish_wait, and by making the
code honor sock_rcvtimeo.

Signed-off-by: Olaf Kirch <olaf.kirch@oracle.com>
Signed-off-by: Samuel Ortiz <samuel@sortiz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[NET]: Move sk_setup_caps() out of line.
Andi Kleen [Sat, 21 Apr 2007 00:12:43 +0000 (17:12 -0700)]
[NET]: Move sk_setup_caps() out of line.

It is far too large to be an inline and not in any hot paths.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[TCP]: Uninline tcp_done().
Andi Kleen [Sat, 21 Apr 2007 00:11:46 +0000 (17:11 -0700)]
[TCP]: Uninline tcp_done().

The function is quite big and has several call sites and nothing
to collapse by compiler optimization on inlining.

Besides it's nicer to read in a in .c file.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[NET]: cleanup extra semicolons
Stephen Hemminger [Sat, 21 Apr 2007 00:09:22 +0000 (17:09 -0700)]
[NET]: cleanup extra semicolons

Spring cleaning time...

There seems to be a lot of places in the network code that have
extra bogus semicolons after conditionals.  Most commonly is a
bogus semicolon after: switch() { }

Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[TCP]: TCP Illinois congestion control (rev3)
Stephen Hemminger [Sat, 21 Apr 2007 00:07:51 +0000 (17:07 -0700)]
[TCP]: TCP Illinois congestion control (rev3)

This is an implementation of TCP Illinois invented by Shao Liu
at University of Illinois. It is a another variant of Reno which adapts
the alpha and beta parameters based on RTT. The basic idea is to increase
window less rapidly as delay approaches the maximum. See the papers
and talks to get a more complete description.

Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[NET]: Get rid of netdev_nit
Stephen Hemminger [Sat, 21 Apr 2007 00:02:45 +0000 (17:02 -0700)]
[NET]: Get rid of netdev_nit

It isn't any faster to test a boolean global variable than do a simple
check for empty list.

Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[PPPOE]: Fix device tear-down notification.
Michal Ostrowski [Fri, 20 Apr 2007 23:59:24 +0000 (16:59 -0700)]
[PPPOE]: Fix device tear-down notification.

pppoe_flush_dev() kicks all sockets bound to a device that is going down.
In doing so, locks must be taken in the right order consistently (sock lock,
followed by the pppoe_hash_lock).  However, the scan process is based on
us holding the sock lock.  So, when something is found in the scan we must
release the lock we're holding and grab the sock lock.

This patch fixes race conditions between this code and pppoe_release(),
both of which perform similar functions but would naturally prefer to grab
locks in opposing orders.  Both code paths are now going after these locks
in a consistent manner.

pppoe_hash_lock protects the contents of the "pppox_sock" objects that reside
inside the hash.  Thus, NULL'ing out the pppoe_dev field should be done
under the protection of this lock.

Signed-off-by: Michal Ostrowski <mostrows@earthlink.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[PPPOE]: memory leak when socket is release()d before PPPIOCGCHAN has been called...
Florian Zumbiehl [Fri, 20 Apr 2007 23:58:14 +0000 (16:58 -0700)]
[PPPOE]: memory leak when socket is release()d before PPPIOCGCHAN has been called on it

below you find a patch that fixes a memory leak when a PPPoE socket is
release()d after it has been connect()ed, but before the PPPIOCGCHAN ioctl
ever has been called on it.

This is somewhat of a security problem, too, since PPPoE sockets can be
created by any user, so any user can easily allocate all the machine's
RAM to non-swappable address space and thus DoS the system.

Is there any specific reason for PPPoE sockets being available to any
unprivileged process, BTW? After all, you need a packet socket for the
discovery stage anyway, so it's unlikely that any unprivileged process
will ever need to create a PPPoE socket, no? Allocating all session IDs
for a known AC is a kind of DoS, too, after all - with Juniper ERXes,
this is really easy, actually, since they don't ever assign session ids
above 8000 ...

Signed-off-by: Florian Zumbiehl <florz@florz.de>
Acked-by: Michal Ostrowski <mostrows@earthlink.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[PPPOE]: race between interface going down and connect()
Florian Zumbiehl [Fri, 20 Apr 2007 23:57:27 +0000 (16:57 -0700)]
[PPPOE]: race between interface going down and connect()

below you find a patch that (hopefully) fixes a race between an interface
going down and a connect() to a peer on that interface. Before,
connect() would determine that an interface is up, then the interface
could go down and all entries referring to that interface in the
item_hash_table would be marked as ZOMBIEs and their references to
the device would be freed, and after that, connect() would put a new
entry into the hash table referring to the device that meanwhile is
down already - which also would cause unregister_netdevice() to wait
until the socket has been release()d.

This patch does not suffice if we are not allowed to accept connect()s
referring to a device that we already acked a NETDEV_GOING_DOWN for
(that is: all references are only guaranteed to be freed after
NETDEV_DOWN has been acknowledged, not necessarily after the
NETDEV_GOING_DOWN already). And if we are allowed to, we could avoid
looking through the hash table upon NETDEV_GOING_DOWN completely and
only do that once we get the NETDEV_DOWN ...

mostrows:
pppoe_flush_dev is called on NETDEV_GOING_DOWN and NETDEV_DOWN to deal with
this "late connect" issue.  Ideally one would hope to notify users at the
"NETDEV_GOING_DOWN" phase (just to pretend to be nice).  However, it is the
NETDEV_DOWN scan that takes all the responsibility for ensuring nobody is
hanging around at that time.

Signed-off-by: Florian Zumbiehl <florz@florz.de>
Acked-by: Michal Ostrowski <mostrows@earthlink.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[PPPoE]: miscellaneous smaller cleanups
Florian Zumbiehl [Fri, 20 Apr 2007 23:56:31 +0000 (16:56 -0700)]
[PPPoE]: miscellaneous smaller cleanups

below is a patch that just removes dead code/initializers without any
effect (first access is an assignment) that I stumbled accross while
reading the source.

Signed-off-by: Florian Zumbiehl <florz@florz.de>
Acked-by: Michal Ostrowski <mostrows@earthlink.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[NET] skbuff: skb_store_bits const is backwards
Stephen Hemminger [Fri, 20 Apr 2007 23:40:01 +0000 (16:40 -0700)]
[NET] skbuff: skb_store_bits const is backwards

Getting warnings becuase skb_store_bits has skb as constant,
but the function overwrites it. Looks like const was on the
wrong side.

Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[BRIDGE]: Fix warning in net-2.6.22
Stephen Hemminger [Fri, 20 Apr 2007 23:39:17 +0000 (16:39 -0700)]
[BRIDGE]: Fix warning in net-2.6.22

The following is leftover from earlier change in net-2.6.22.

Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[AX25/NETROM/ROSE]: Convert to use modern wait queue API
Ralf Baechle [Fri, 20 Apr 2007 23:06:45 +0000 (16:06 -0700)]
[AX25/NETROM/ROSE]: Convert to use modern wait queue API

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[AF_PACKET]: Add option to return orig_dev to userspace.
Peter P. Waskiewicz Jr [Fri, 20 Apr 2007 23:05:39 +0000 (16:05 -0700)]
[AF_PACKET]: Add option to return orig_dev to userspace.

Add a packet socket option to allow the orig_dev index to be returned
to userspace when passing traffic through a decapsulated device, such
as the bonding driver.

This is very useful for layer 2 traffic being able to report which
physical device actually received the traffic, instead of having the
encapsulating device hide that information.

The new option is called PACKET_ORIGDEV.

Signed-off-by: Peter P. Waskiewicz Jr. <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[IPV6] SNMP: Export statistics via netlink without CONFIG_PROC_FS.
YOSHIFUJI Hideaki [Fri, 20 Apr 2007 22:57:45 +0000 (15:57 -0700)]
[IPV6] SNMP: Export statistics via netlink without CONFIG_PROC_FS.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[IPV4] SNMP: Move some statistic bits to net/ipv4/proc.c.
YOSHIFUJI Hideaki [Fri, 20 Apr 2007 22:57:15 +0000 (15:57 -0700)]
[IPV4] SNMP: Move some statistic bits to net/ipv4/proc.c.

This also fixes memory leak in error path.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[IPV6] SNMP: Move some statistic bits to net/ipv6/proc.c.
YOSHIFUJI Hideaki [Fri, 20 Apr 2007 22:56:48 +0000 (15:56 -0700)]
[IPV6] SNMP: Move some statistic bits to net/ipv6/proc.c.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[IPV6] SNMP: Netlink interface.
YOSHIFUJI Hideaki [Fri, 20 Apr 2007 22:56:20 +0000 (15:56 -0700)]
[IPV6] SNMP: Netlink interface.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[INET]: Add IP(V6)_PMTUDISC_RPOBE
John Heffner [Fri, 20 Apr 2007 22:53:27 +0000 (15:53 -0700)]
[INET]: Add IP(V6)_PMTUDISC_RPOBE

Add IP(V6)_PMTUDISC_PROBE value for IP(V6)_MTU_DISCOVER.  This option forces
us not to fragment, but does not make use of the kernel path MTU discovery.
That is, it allows for user-mode MTU probing (or, packetization-layer path
MTU discovery).  This is particularly useful for diagnostic utilities, like
traceroute/tracepath.

Signed-off-by: John Heffner <jheffner@psc.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[IPV6]: MTU discovery check in ip6_fragment()
John Heffner [Fri, 20 Apr 2007 22:52:39 +0000 (15:52 -0700)]
[IPV6]: MTU discovery check in ip6_fragment()

Adds a check in ip6_fragment() mirroring ip_fragment() for packets
that we can't fragment, and sends an ICMP Packet Too Big message
in response.

Signed-off-by: John Heffner <jheffner@psc.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[NET_SCHED]: ingress: switch back to using ingress_lock
Patrick McHardy [Tue, 17 Apr 2007 00:07:08 +0000 (17:07 -0700)]
[NET_SCHED]: ingress: switch back to using ingress_lock

Switch ingress queueing back to use ingress_lock. qdisc_lock_tree now locks
both the ingress and egress qdiscs on the device. All changes to data that
might be used on both ingress and egress needs to be protected by using
qdisc_lock_tree instead of manually taking dev->queue_lock. Additionally
the qdisc stats_lock needs to be initialized to ingress_lock for ingress
qdiscs.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[NET_SCHED]: Eliminate qdisc_tree_lock
Patrick McHardy [Tue, 17 Apr 2007 00:02:10 +0000 (17:02 -0700)]
[NET_SCHED]: Eliminate qdisc_tree_lock

Since we're now holding the rtnl during the entire dump operation, we
can remove qdisc_tree_lock, whose only purpose is to protect dump
callbacks from concurrent changes to the qdisc tree.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[NETLINK]: don't reinitialize callback mutex
Patrick McHardy [Wed, 25 Apr 2007 21:01:17 +0000 (14:01 -0700)]
[NETLINK]: don't reinitialize callback mutex

Don't reinitialize the callback mutex the netlink_kernel_create caller
handed in, it is supposed to already be initialized and could already
be held by someone.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[RTNETLINK]: Remove unnecessary locking in dump callbacks
Patrick McHardy [Tue, 17 Apr 2007 00:00:53 +0000 (17:00 -0700)]
[RTNETLINK]: Remove unnecessary locking in dump callbacks

Since we're now holding the rtnl during the entire dump operation, we can
remove additional locking for rtnl protected data. This patch does that
for all simple cases (dev_base_lock for dev_base walking, RCU protection
for FIB rule dumping).

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[RTNETLINK]: Hold rtnl_mutex during netlink dump callbacks
Patrick McHardy [Mon, 16 Apr 2007 23:59:10 +0000 (16:59 -0700)]
[RTNETLINK]: Hold rtnl_mutex during netlink dump callbacks

Hold rtnl_mutex during the entire netlink dump operation. This allows
to simplify locking in the dump callbacks, since they can now rely on
that no concurrent changes happen.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[NETLINK]: Switch cb_lock spinlock to mutex and allow to override it
Patrick McHardy [Fri, 20 Apr 2007 21:14:21 +0000 (14:14 -0700)]
[NETLINK]: Switch cb_lock spinlock to mutex and allow to override it

Switch cb_lock to mutex and allow netlink kernel users to override it
with a subsystem specific mutex for consistent locking in dump callbacks.
All netlink_dump_start users have been audited not to rely on any
side-effects of the previously used spinlock.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[NETFILTER]: ipt_ULOG: add compat conversion functions
Patrick McHardy [Fri, 13 Apr 2007 05:17:05 +0000 (22:17 -0700)]
[NETFILTER]: ipt_ULOG: add compat conversion functions

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[NETFILTER]: nfnetlink_log: remove fallback to group 0
Patrick McHardy [Fri, 13 Apr 2007 05:16:38 +0000 (22:16 -0700)]
[NETFILTER]: nfnetlink_log: remove fallback to group 0

Don't fallback to group 0 if no instance can be found for the given group.
This potentially confuses the listener and is not what the user configured.
Also remove the ring buffer spamming that happens when rules are set up
before the logging daemon is started.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[NETFILTER]: {eb,ip6,ip}t_LOG: remove remains of LOG target overloading
Patrick McHardy [Fri, 13 Apr 2007 05:16:18 +0000 (22:16 -0700)]
[NETFILTER]: {eb,ip6,ip}t_LOG: remove remains of LOG target overloading

All LOG targets always use their internal logging function nowadays, so
remove the incorrect error message and handle real errors (!= -EEXIST)
by failing to load.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[NETFILTER]: nf_nat: use HW checksumming when possible
Patrick McHardy [Fri, 13 Apr 2007 05:15:50 +0000 (22:15 -0700)]
[NETFILTER]: nf_nat: use HW checksumming when possible

When mangling packets forwarded to a HW checksumming capable device,
offload recalculation of the checksum instead of doing it in software.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[NETFILTER]: ebt_arp: add gratuitous arp filtering
Bart De Schuymer [Fri, 13 Apr 2007 05:15:06 +0000 (22:15 -0700)]
[NETFILTER]: ebt_arp: add gratuitous arp filtering

The attached patch adds gratuitous arp filtering, more precisely: it
allows checking that the IPv4 source address matches the IPv4
destination address inside the ARP header. It also adds a check for the
hardware address type when matching MAC addresses (nothing critical,
just for better consistency).

Signed-off-by: Bart De Schuymer <bdschuym@pandora.be>
Acked-by: Carl-Daniel Hailfinger <c-d.hailfinger.devel.2006@gmx.net>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[NETFILTER]: bridge-nf: filter bridged IPv4/IPv6 encapsulated in pppoe traffic
Michael Milner [Fri, 13 Apr 2007 05:14:23 +0000 (22:14 -0700)]
[NETFILTER]: bridge-nf: filter bridged IPv4/IPv6 encapsulated in pppoe traffic

The attached patch by Michael Milner adds support for using iptables and
ip6tables on bridged traffic encapsulated in ppoe frames, similar to
what's already supported for vlan.

Signed-off-by: Michael Milner <milner@blissisland.ca>
Signed-off-by: Bart De Schuymer <bdschuym@pandora.be>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[DCCP]: Complete documentation of dccp_sock
Gerrit Renker [Fri, 20 Apr 2007 20:57:21 +0000 (13:57 -0700)]
[DCCP]: Complete documentation of dccp_sock

This fills in missing documentation for dccp_sock fields.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[DCCP]: Debug statements for Elapsed Time option
Gerrit Renker [Fri, 20 Apr 2007 20:56:47 +0000 (13:56 -0700)]
[DCCP]: Debug statements for Elapsed Time option

This prints the value of the parsed Elapsed Time when received via a
Timestamp Echo option [RFC 4342, 13.3].

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: David S. Miller <davem@davemloft.net>