]> www.infradead.org Git - users/hch/misc.git/log
users/hch/misc.git
12 years agocifs: simplify id_to_sid and sid_to_id mapping code
Jeff Layton [Mon, 3 Dec 2012 11:05:29 +0000 (06:05 -0500)]
cifs: simplify id_to_sid and sid_to_id mapping code

The cifs.idmap handling code currently causes the kernel to cache the
data from userspace twice. It first looks in a rbtree to see if there is
a matching entry for the given id. If there isn't then it calls
request_key which then checks its cache and then calls out to userland
if it doesn't have one. If the userland program establishes a mapping
and downcalls with that info, it then gets cached in the keyring and in
this rbtree.

Aside from the double memory usage and the performance penalty in doing
all of these extra copies, there are some nasty bugs in here too. The
code declares four rbtrees and spinlocks to protect them, but only seems
to use two of them. The upshot is that the same tree is used to hold
(eg) uid:sid and sid:uid mappings. The comparitors aren't equipped to
deal with that.

I think we'd be best off to remove a layer of caching in this code. If
this was originally done for performance reasons, then that really seems
like a premature optimization.

This patch does that -- it removes the rbtrees and the locks that
protect them and simply has the code do a request_key call on each call
into sid_to_id and id_to_sid. This greatly simplifies this code and
should roughly halve the memory utilization from using the idmapping
code.

Reviewed-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
12 years agoCIFS: Fix possible data coherency problem after oplock break to None
Pavel Shilovsky [Thu, 6 Dec 2012 17:24:33 +0000 (21:24 +0400)]
CIFS: Fix possible data coherency problem after oplock break to None

by using cifs_invalidate_mapping rather than invalidate_remote_inode
in cifs_oplock_break - this invalidates all inode pages and resets
fscache cookies.

Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Pavel Shilovsky <piastry@etersoft.ru>
Signed-off-by: Steve French <smfrench@gmail.com>
12 years agoCIFS: Do not permit write to a range mandatory locked with a read lock
Pavel Shilovsky [Tue, 27 Nov 2012 14:38:53 +0000 (18:38 +0400)]
CIFS: Do not permit write to a range mandatory locked with a read lock

We don't need to permit a write to the area locked with a read lock
by any process including the process that issues the write.

Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Pavel Shilovsky <piastry@etersoft.ru>
Signed-off-by: Steve French <smfrench@gmail.com>
12 years agocifs: rename cifs_readdir_lookup to cifs_prime_dcache and make it void return
Jeff Layton [Mon, 3 Dec 2012 11:05:37 +0000 (06:05 -0500)]
cifs: rename cifs_readdir_lookup to cifs_prime_dcache and make it void return

The caller doesn't do anything with the dentry, so there's no point in
holding a reference to it on return. Also cifs_prime_dcache better
describes the actual purpose of the function.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
12 years agocifs: Add CONFIG_CIFS_DEBUG and rename use of CIFS_DEBUG
Joe Perches [Wed, 5 Dec 2012 20:42:58 +0000 (12:42 -0800)]
cifs: Add CONFIG_CIFS_DEBUG and rename use of CIFS_DEBUG

This can reduce the size of the module by ~120KB which
could be useful for embedded systems.

$ size fs/cifs/built-in.o*
   text    data     bss     dec     hex filename
 388567   34459  100440  523466   7fcca fs/cifs/built-in.o.new
 495970   34599  117904  648473   9e519 fs/cifs/built-in.o.old

Signed-off-by: Joe Perches <joe@perches.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
12 years agocifs: Make CIFS_DEBUG possible to undefine
Joe Perches [Wed, 5 Dec 2012 20:42:47 +0000 (12:42 -0800)]
cifs: Make CIFS_DEBUG possible to undefine

Make the compilation work again when CIFS_DEBUG is not #define'd.

Add format and argument verification for the various macros when
CIFS_DEBUG is not #define'd.

Signed-off-by: Joe Perches <joe@perches.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
12 years agoSMB3 mounts fail with access denied to some servers
Steve French [Tue, 4 Dec 2012 22:56:37 +0000 (16:56 -0600)]
SMB3 mounts fail with access denied to some servers

We were checking incorrectly if signatures were required to be sent,
so were always sending signatures after the initial session establishment.
For SMB3 mounts (vers=3.0) this was a problem because we were putting
SMB2 signatures in SMB3 requests which would cause access denied
on mount (the tree connection would fail).

This might also be worth considering for stable (for 3.7), as the
error message on mount (access denied) is confusing to users and
there is no workaround if the server is configured to only
support smb3.0. I am ok either way.

CC: stable <stable@kernel.org>
Signed-off-by: Steve French <smfrench@gmail.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
12 years agocifs: Remove unused cEVENT macro
Joe Perches [Thu, 29 Nov 2012 19:37:18 +0000 (11:37 -0800)]
cifs: Remove unused cEVENT macro

It uses an undefined KERN_EVENT and is itself unused.

Signed-off-by: Joe Perches <joe@perches.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Signed-off-by: Steve French <smfrench@gmail.com>
12 years agocifs: always zero out smb_vol before parsing options
Jeff Layton [Mon, 26 Nov 2012 16:09:57 +0000 (11:09 -0500)]
cifs: always zero out smb_vol before parsing options

Currently, the code relies on the callers to do that and they all do,
but this will ensure that it's always done.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
12 years agocifs: remove unneeded address argument from cifs_find_tcp_session and match_server
Jeff Layton [Mon, 26 Nov 2012 16:09:57 +0000 (11:09 -0500)]
cifs: remove unneeded address argument from cifs_find_tcp_session and match_server

Now that the smb_vol contains the destination sockaddr, there's no need
to pass it in separately.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
12 years agomake convert_delimiter use strchr instead of open-coding it
Steve French [Fri, 30 Nov 2012 00:07:51 +0000 (18:07 -0600)]
make convert_delimiter use strchr instead of open-coding it

Take advantage of accelerated strchr() on arches that support it.

Also, no caller ever passes in a NULL pointer. Get rid of the unneeded
NULL pointer check.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
12 years agocifs: get rid of smb_vol->UNCip and smb_vol->port
Jeff Layton [Mon, 26 Nov 2012 16:09:55 +0000 (11:09 -0500)]
cifs: get rid of smb_vol->UNCip and smb_vol->port

Passing this around as a string is contorted and painful. Instead, just
convert these to a sockaddr as soon as possible, since that's how we're
going to work with it later anyway.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
12 years agocifs: ensure we revalidate the inode after readdir if cifsacl is enabled
Jeff Layton [Sun, 25 Nov 2012 13:00:40 +0000 (08:00 -0500)]
cifs: ensure we revalidate the inode after readdir if cifsacl is enabled

Otherwise, "ls -l" will simply show the ownership of the files as
the default mnt_uid/gid. This may make "ls -l" performance on large
directories super-suck in some cases, but that's the cost of cifsacl.

One possibility to make it suck less would be to somehow proactively
dispatch the ACL requests asynchronously from readdir codepath, but
that's non-trivial to implement.

Reviewed-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
12 years agocifs: Add handling of blank password option
Jesper Nilsson [Thu, 29 Nov 2012 16:31:16 +0000 (17:31 +0100)]
cifs: Add handling of blank password option

The option to have a blank "pass=" already exists, and with
a password specified both "pass=%s" and "password=%s" are supported.
Also, both blank "user=" and "username=" are supported, making
"password=" the odd man out.

Signed-off-by: Jesper Nilsson <jesper.nilsson@axis.com>
Acked-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
12 years agoAdd SMB2.02 dialect support
Steve French [Thu, 29 Nov 2012 05:21:06 +0000 (23:21 -0600)]
Add SMB2.02 dialect support

This patch enables optional for original SMB2 (SMB2.02) dialect
by specifying vers=2.0 on mount.

Reviewed-by: Pavel Shilovsky <piastry@etersoft.ru>
Signed-off-by: Steve French <smfrench@gmail.com>
12 years agoCIFS: Fix lock consistensy bug in cifs_setlk
Pavel Shilovsky [Thu, 22 Nov 2012 14:56:39 +0000 (18:56 +0400)]
CIFS: Fix lock consistensy bug in cifs_setlk

If we netogiate mandatory locking style, have a read lock and try
to set a write lock we end up with a write lock in vfs cache and
no lock in cifs lock cache - that's wrong. Fix it by returning
from cifs_setlk immediately if a error occurs during setting a lock.

Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Pavel Shilovsky <piastry@etersoft.ru>
Signed-off-by: Steve French <smfrench@gmail.com>
12 years agoCIFS: Implement cifs_relock_file
Pavel Shilovsky [Thu, 22 Nov 2012 13:10:57 +0000 (17:10 +0400)]
CIFS: Implement cifs_relock_file

that reacquires byte-range locks when a file is reopened.

Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Pavel Shilovsky <piastry@etersoft.ru>
Signed-off-by: Steve French <smfrench@gmail.com>
12 years agoCIFS: Separate pushing mandatory locks and lock_sem handling
Pavel Shilovsky [Thu, 22 Nov 2012 13:07:16 +0000 (17:07 +0400)]
CIFS: Separate pushing mandatory locks and lock_sem handling

Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Pavel Shilovsky <piastry@etersoft.ru>
Signed-off-by: Steve French <smfrench@gmail.com>
12 years agoCIFS: Separate pushing posix locks and lock_sem handling
Pavel Shilovsky [Thu, 22 Nov 2012 13:00:10 +0000 (17:00 +0400)]
CIFS: Separate pushing posix locks and lock_sem handling

Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Pavel Shilovsky <piastry@etersoft.ru>
Signed-off-by: Steve French <smfrench@gmail.com>
12 years agoCIFS: Make use of common cifs_build_path_to_root for CIFS and SMB2
Steve French [Thu, 29 Nov 2012 04:34:41 +0000 (22:34 -0600)]
CIFS: Make use of common cifs_build_path_to_root for CIFS and SMB2

because the is no difference here. This also adds support of prefixpath
mount option for SMB2.

Signed-off-by: Pavel Shilovsky <piastry@etersoft.ru>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
12 years agocifs: make error on lack of a unc= option more explicit
Jeff Layton [Sun, 25 Nov 2012 13:00:42 +0000 (08:00 -0500)]
cifs: make error on lack of a unc= option more explicit

Error out with a clear error message if there is no unc= option. The
existing code doesn't handle this in a clear fashion, and the check for
a UNCip option with no UNC string is just plain wrong.

Later, we'll fix the code to not require a unc= option, but for now we
need this to at least clarify why people are getting errors about DFS
parsing. With this change we can also get rid of some later NULL pointer
checks since we know the UNC and UNCip will never be NULL there.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
12 years agocifs: don't override the uid/gid in getattr when cifsacl is enabled
Jeff Layton [Sun, 25 Nov 2012 13:00:40 +0000 (08:00 -0500)]
cifs: don't override the uid/gid in getattr when cifsacl is enabled

If we're using cifsacl, then we don't want to override the uid/gid with
the current uid/gid, since that would prevent you from being able to
upcall for this info.

Reviewed-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
12 years agocifs: remove uneeded __KERNEL__ block from cifsacl.h
Jeff Layton [Sun, 25 Nov 2012 13:00:38 +0000 (08:00 -0500)]
cifs: remove uneeded __KERNEL__ block from cifsacl.h

...and make those symbols static in cifsacl.c. Nothing outside
of that file refers to them.

Reviewed-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
12 years agocifs: fix the format specifiers in sid_to_str
Jeff Layton [Sun, 25 Nov 2012 13:00:38 +0000 (08:00 -0500)]
cifs: fix the format specifiers in sid_to_str

The format specifiers are for signed values, but these are unsigned.
Given that '-' is a delimiter between fields, I don't think you'd get
what you'd expect if you got a value here that would overflow the sign
bit.

The version and authority fields are 8 bit values so use a "hh" length
modifier there. The subauths are 32 bit values, so there's no need to
use a "l" length modifier there.

Reviewed-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
12 years agocifs: redefine NUM_SUBAUTH constant from 5 to 15
Jeff Layton [Sun, 25 Nov 2012 13:00:37 +0000 (08:00 -0500)]
cifs: redefine NUM_SUBAUTH constant from 5 to 15

According to several places on the Internet and the samba winbind code,
this is hard limited to 15 in windows, not 5. This does balloon out
the allocation of each by 40 bytes, but I don't see any alternative.

Also, rename it to SID_MAX_SUB_AUTHORITIES to match the alleged name
of this constant in the windows header files

Finally, rename SIDLEN to SID_STRING_MAX, fix the value to reflect
the change to SID_MAX_SUB_AUTHORITIES and document how it was
determined.

Reviewed-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
12 years agocifs: make cifs_copy_sid handle a source sid with variable size subauth arrays
Jeff Layton [Sun, 25 Nov 2012 13:00:37 +0000 (08:00 -0500)]
cifs: make cifs_copy_sid handle a source sid with variable size subauth arrays

...and lift the restriction in id_to_sid upcall that the size must be
at least as big as a full cifs_sid.

Reviewed-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
12 years agocifs: make compare_sids static
Jeff Layton [Sun, 25 Nov 2012 13:00:36 +0000 (08:00 -0500)]
cifs: make compare_sids static

..nothing outside of cifsacl.c calls it. Also fix the incorrect
comment on the function. It returns 0 when they match.

Reviewed-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
12 years agocifs: use the NUM_AUTHS and NUM_SUBAUTHS constants in cifsacl code
Jeff Layton [Sun, 25 Nov 2012 13:00:36 +0000 (08:00 -0500)]
cifs: use the NUM_AUTHS and NUM_SUBAUTHS constants in cifsacl code

...instead of hardcoding in '5' and '6' all over the place.

Reviewed-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
12 years agocifs: move num_subauth check inside of CONFIG_CIFS_DEBUG2 check in parse_sid()
Jeff Layton [Sun, 25 Nov 2012 13:00:35 +0000 (08:00 -0500)]
cifs: move num_subauth check inside of CONFIG_CIFS_DEBUG2 check in parse_sid()

Reviewed-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
12 years agocifs: clean up id_mode_to_cifs_acl
Jeff Layton [Sun, 25 Nov 2012 13:00:35 +0000 (08:00 -0500)]
cifs: clean up id_mode_to_cifs_acl

Add a label we can goto on error, and get rid of some excess indentation.
Also move to kernel-style comments.

Reviewed-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
12 years agocifs: fix types on module parameters
Jeff Layton [Sun, 25 Nov 2012 13:00:34 +0000 (08:00 -0500)]
cifs: fix types on module parameters

Most of these are unsigned ints, so we should be passing "uint" to
module_param. Also, get rid of the extra "(bool)" in the description
of enable_oplocks.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
12 years agodefault authentication needs to be at least ntlmv2 security for cifs mounts
Steve French [Sun, 25 Nov 2012 06:07:44 +0000 (00:07 -0600)]
default authentication needs to be at least ntlmv2 security for cifs mounts

We had planned to upgrade to ntlmv2 security a few releases ago,
and have been warning users in dmesg on mount about the impending
upgrade, but had to make a change (to use nltmssp with ntlmv2) due
to testing issues with some non-Windows, non-Samba servers.

The approach in this patch is simpler than earlier patches,
and changes the default authentication mechanism to ntlmv2
password hashes (encapsulated in ntlmssp) from ntlm (ntlm is
too weak for current use and ntlmv2 has been broadly
supported for many, many years).

Signed-off-by: Steve French <smfrench@gmail.com>
Acked-by: Jeff Layton <jlayton@redhat.com>
12 years agovfs: clear to the end of the buffer on partial buffer reads
Dan Carpenter [Wed, 5 Dec 2012 17:01:24 +0000 (20:01 +0300)]
vfs: clear to the end of the buffer on partial buffer reads

READ is zero so the "rw & READ" test is always false.  The intended test
was "((rw & RW_MASK) == READ)".

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agoMerge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux
Linus Torvalds [Tue, 4 Dec 2012 17:32:12 +0000 (09:32 -0800)]
Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux

Pull module fixes from Rusty Russell:
 "Module signing build fixes for blackfin and metag"

* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
  modsign: add symbol prefix to certificate list
  linux/kernel.h: define SYMBOL_PREFIX

12 years agoMerge tag 'upstream-3.7-rc9' of git://git.infradead.org/linux-ubi
Linus Torvalds [Tue, 4 Dec 2012 17:15:51 +0000 (09:15 -0800)]
Merge tag 'upstream-3.7-rc9' of git://git.infradead.org/linux-ubi

Pull UBI changes from Artem Bityutskiy:
 "Fixes for 2 brown-paperbag bugs introduced this merge window by the
  fastmap code:

   1.  The UBI background thread got stuck when a bit-flip happened
       because free LEBs was not removed from the "free" tree when we
       started using it.
   2.  I/O debugging checks did not work because we called a sleeping
       function in atomic context."

* tag 'upstream-3.7-rc9' of git://git.infradead.org/linux-ubi:
  UBI: dont call ubi_self_check_all_ff() in __wl_get_peb()
  UBI: remove PEB from free tree in get_peb_for_wl()

12 years agoMerge branch 'for-3.7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq
Linus Torvalds [Tue, 4 Dec 2012 17:02:45 +0000 (09:02 -0800)]
Merge branch 'for-3.7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq

Pull workqueue fixes from Tejun Heo:
 "So, safe fixes my ass.

  Commit 8852aac25e79 ("workqueue: mod_delayed_work_on() shouldn't queue
  timer on 0 delay") had the side-effect of performing delayed_work
  sanity checks even when @delay is 0, which should be fine for any sane
  use cases.

  Unfortunately, megaraid was being overly ingenious.  It seemingly
  wanted to use cancel_delayed_work_sync() before cancel_work_sync() was
  introduced, but didn't want to waste the space for full delayed_work
  as it was only going to use 0 @delay.  So, it only allocated space for
  struct work_struct and then cast it to struct delayed_work and passed
  it into delayed_work functions - truly awesome engineering tradeoff to
  save some bytes.

  Xiaotian fixed it by making megraid allocate full delayed_work for
  now.  It should be converted to use work_struct and cancel_work_sync()
  but I think we better do that after 3.7.

  I added another commit to change BUG_ON()s in __queue_delayed_work()
  to WARN_ON_ONCE()s so that the kernel doesn't crash even if there are
  more such abuses."

* 'for-3.7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
  workqueue: convert BUG_ON()s in __queue_delayed_work() to WARN_ON_ONCE()s
  megaraid: fix BUG_ON() from incorrect use of delayed work

12 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc
Linus Torvalds [Tue, 4 Dec 2012 16:42:29 +0000 (08:42 -0800)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc

Pull sparc fixes from David Miller:
 "Two small fixes for Sparc, nobody uses sparc, so these are low risk :-)

   1) Piggyback is too picky about the symbol types that _start and _end
      have in the final kernel image, and it thus breaks with newer
      binutils.  Future proof by getting rid of the symbol type checks.

   2) exit_group() should kill register windows on sparc64 the same way
      we do for plain exit().  Thanks to Al Viro for spotting this."

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
  sparc: Fix piggyback with newer binutils.
  sparc64: exit_group should kill register windows just like plain exit.

12 years agovfs: avoid "attempt to access beyond end of device" warnings
Linus Torvalds [Tue, 4 Dec 2012 16:25:11 +0000 (08:25 -0800)]
vfs: avoid "attempt to access beyond end of device" warnings

The block device access simplification that avoided accessing the (racy)
block size information (commit bbec0270bdd8: "blkdev_max_block: make
private to fs/buffer.c") no longer checks the maximum block size in the
block mapping path.

That was _almost_ as simple as just removing the code entirely, because
the readers and writers all check the size of the device anyway, so
under normal circumstances it "just worked".

However, the block size may be such that the end of the device may
straddle one single buffer_head.  At which point we may still want to
access the end of the device, but the buffer we use to access it
partially extends past the end.

The 'bd_set_size()' function intentionally sets the block size to avoid
this, but mounting the device - or setting the block size by hand to
some other value - can modify that block size.

So instead, teach 'submit_bh()' about the special case of the buffer
head straddling the end of the device, and turning such an access into a
smaller IO access, avoiding the problem.

This, btw, also means that unlike before, we can now access the whole
device regardless of device block size setting.  So now, even if the
device size is only 512-byte aligned, we can read and write even the
last sector even when having a much bigger block size for accessing the
rest of the device.

So with this, we could now get rid of the 'bd_set_size()' block size
code entirely - resulting in faster IO for the common case - but that
would be a separate patch.

Reported-and-tested-by: Romain Francoise <romain@orebokech.com>
Reporeted-and-tested-by: Meelis Roos <mroos@linux.ee>
Reported-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agoworkqueue: convert BUG_ON()s in __queue_delayed_work() to WARN_ON_ONCE()s
Tejun Heo [Tue, 4 Dec 2012 15:40:39 +0000 (07:40 -0800)]
workqueue: convert BUG_ON()s in __queue_delayed_work() to WARN_ON_ONCE()s

8852aac25e ("workqueue: mod_delayed_work_on() shouldn't queue timer on
0 delay") unexpectedly uncovered a very nasty abuse of delayed_work in
megaraid - it allocated work_struct, casted it to delayed_work and
then pass that into queue_delayed_work().

Previously, this was okay because 0 @delay short-circuited to
queue_work() before doing anything with delayed_work.  8852aac25e
moved 0 @delay test into __queue_delayed_work() after sanity check on
delayed_work making megaraid trigger BUG_ON().

Although megaraid is already fixed by c1d390d8e6 ("megaraid: fix
BUG_ON() from incorrect use of delayed work"), this patch converts
BUG_ON()s in __queue_delayed_work() to WARN_ON_ONCE()s so that such
abusers, if there are more, trigger warning but don't crash the
machine.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Xiaotian Feng <xtfeng@gmail.com>
12 years agomegaraid: fix BUG_ON() from incorrect use of delayed work
Xiaotian Feng [Tue, 4 Dec 2012 11:33:54 +0000 (19:33 +0800)]
megaraid: fix BUG_ON() from incorrect use of delayed work

megaraid use INIT_WORK to declare a hotplug_work, but cast the
hotplug_work from work_struct to delayed_work and
schedule_delayed_work on it.  This is very dangerous, as other part of
delayed_work might be kernel memories allocated by others.

With commit 8852aac ("workqueue: mod_delayed_work_on() shouldn't queue
timer on 0 delay"), schedule_delayed_work() will check dwork->timer
before queue_work even when @delay is 0, this causes megaraid code to
hit the BUG_ON() in workqueue code.  Change megaraid code to use
delayed work.

Signed-off-by: Xiaotian Feng <dannyfeng@tencent.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Neela Syam Kolli <megaraidlinux@lsi.com>
Cc: "James E.J. Bottomley" <JBottomley@parallels.com>
Cc: linux-scsi@vger.kernel.org
12 years agoUBI: dont call ubi_self_check_all_ff() in __wl_get_peb()
Richard Weinberger [Mon, 3 Dec 2012 19:57:47 +0000 (20:57 +0100)]
UBI: dont call ubi_self_check_all_ff() in __wl_get_peb()

As ubi_self_check_all_ff() might sleep we are not allowed
to call it from atomic context.
For now we call it only from ubi_wl_get_peb().
There are some code paths where it would also make sense,
but these paths are currently atomic and only enabled
when fastmap is used.

Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
12 years agoUBI: remove PEB from free tree in get_peb_for_wl()
Richard Weinberger [Mon, 3 Dec 2012 19:57:46 +0000 (20:57 +0100)]
UBI: remove PEB from free tree in get_peb_for_wl()

If UBI is built without fastmap, get_peb_for_wl() has to
remove the PEB manially from the free tree.
Otherwise the requested PEB lives in two trees.

Reported-by: Zach Sadecki <zsadecki@itwatchdogs.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
12 years agosparc: Fix piggyback with newer binutils.
David S. Miller [Mon, 3 Dec 2012 19:24:25 +0000 (11:24 -0800)]
sparc: Fix piggyback with newer binutils.

Newer versions of binutils mark '_end' as 'B' instead of 'A' for
whatever reason.

To be honest, the piggyback code doesn't actually care what kind
of symbol _start and _end are, it just wants to find them and
record the address.

So remove the type from the match strings.

Reported-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoLinux 3.7-rc8 v3.7-rc8
Linus Torvalds [Mon, 3 Dec 2012 19:22:37 +0000 (11:22 -0800)]
Linux 3.7-rc8

12 years agosparc64: exit_group should kill register windows just like plain exit.
David S. Miller [Mon, 3 Dec 2012 19:17:57 +0000 (11:17 -0800)]
sparc64: exit_group should kill register windows just like plain exit.

Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-edac
Linus Torvalds [Mon, 3 Dec 2012 19:16:37 +0000 (11:16 -0800)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-edac

Pull EDAC fixes from Mauro Carvalho Chehab:
 "One EDAC core fix, and a few driver fixes (i7300, i9275x, i7core)."

* git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-edac:
  i7core_edac: fix panic when accessing sysfs files
  i7300_edac: Fix error flag testing
  edac: Fix the dimm filling for csrows-based layouts
  i82975x_edac: Fix dimm label initialization

12 years agoMerge branch 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab...
Linus Torvalds [Mon, 3 Dec 2012 19:13:32 +0000 (11:13 -0800)]
Merge branch 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media

Pull media fixes from Mauro Carvalho Chehab:
 "Some driver fixes for s5p/exynos (mostly race fixes)"

* 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
  [media] s5p-mfc: Handle multi-frame input buffer
  [media] s5p-mfc: Bug fix of timestamp/timecode copy mechanism
  [media] exynos-gsc: Add missing video device vfl_dir flag initialization
  [media] exynos-gsc: Fix settings for input and output image RGB type
  [media] exynos-gsc: Don't use mutex_lock_interruptible() in device release()
  [media] fimc-lite: Don't use mutex_lock_interruptible() in device release()
  [media] s5p-fimc: Don't use mutex_lock_interruptible() in device release()
  [media] s5p-fimc: Prevent race conditions during subdevs registration

12 years ago[parisc] open(2) compat bug
Al Viro [Mon, 3 Dec 2012 18:15:30 +0000 (18:15 +0000)]
[parisc] open(2) compat bug

In commit 9d73fc2d641f ("open*(2) compat fixes (s390, arm64)") I said:
>
>  The usual rules for open()/openat()/open_by_handle_at() are
> 1) native 32bit - don't force O_LARGEFILE in flags
> 2) native 64bit - force O_LARGEFILE in flags
> 3) compat on 64bit host - as for native 32bit
> 4) native 32bit ABI for 64bit system (mips/n32, x86/x32) - as for native 64bit
>
> There are only two exceptions - s390 compat has open() forcing O_LARGEFILE and
> arm64 compat has open_by_handle_at() doing the same thing.  The same binaries
> on native host (s390/31 and arm resp.) will *not* force O_LARGEFILE, so IMO
> both are emulation bugs.

Three exceptions, actually - parisc open() is another case like that.
Native 32bit won't force O_LARGEFILE, the same binary on parisc64 will.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agoRevert "sched, autogroup: Stop going ahead if autogroup is disabled"
Mike Galbraith [Mon, 3 Dec 2012 05:25:25 +0000 (06:25 +0100)]
Revert "sched, autogroup: Stop going ahead if autogroup is disabled"

This reverts commit 800d4d30c8f20bd728e5741a3b77c4859a613f7c.

Between commits 8323f26ce342 ("sched: Fix race in task_group()") and
800d4d30c8f2 ("sched, autogroup: Stop going ahead if autogroup is
disabled"), autogroup is a wreck.

With both applied, all you have to do to crash a box is disable
autogroup during boot up, then reboot..  boom, NULL pointer dereference
due to commit 800d4d30c8f2 not allowing autogroup to move things, and
commit 8323f26ce342 making that the only way to switch runqueues:

  BUG: unable to handle kernel NULL pointer dereference at           (null)
  IP: [<ffffffff81063ac0>] effective_load.isra.43+0x50/0x90
  Pid: 7047, comm: systemd-user-se Not tainted 3.6.8-smp #7 MEDIONPC MS-7502/MS-7502
  RIP: effective_load.isra.43+0x50/0x90
  Process systemd-user-se (pid: 7047, threadinfo ffff880221dde000, task ffff88022618b3a0)
  Call Trace:
    select_task_rq_fair+0x255/0x780
    try_to_wake_up+0x156/0x2c0
    wake_up_state+0xb/0x10
    signal_wake_up+0x28/0x40
    complete_signal+0x1d6/0x250
    __send_signal+0x170/0x310
    send_signal+0x40/0x80
    do_send_sig_info+0x47/0x90
    group_send_sig_info+0x4a/0x70
    kill_pid_info+0x3a/0x60
    sys_kill+0x97/0x1a0
    ? vfs_read+0x120/0x160
    ? sys_read+0x45/0x90
    system_call_fastpath+0x16/0x1b
  Code: 49 0f af 41 50 31 d2 49 f7 f0 48 83 f8 01 48 0f 46 c6 48 2b 07 48 8b bf 40 01 00 00 48 85 ff 74 3a 45 31 c0 48 8b 8f 50 01 00 00 <48> 8b 11 4c 8b 89 80 00 00 00 49 89 d2 48 01 d0 45 8b 59 58 4c
  RIP  [<ffffffff81063ac0>] effective_load.isra.43+0x50/0x90
   RSP <ffff880221ddfbd8>
  CR2: 0000000000000000

Signed-off-by: Mike Galbraith <efault@gmx.de>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: Yong Zhang <yong.zhang0@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: stable@vger.kernel.org # 2.6.39+
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agoMerge branch 'block-dev'
Linus Torvalds [Mon, 3 Dec 2012 18:53:25 +0000 (10:53 -0800)]
Merge branch 'block-dev'

Merge 'block-dev' branch.

I was going to just mark everything here for stable and leave it to the
3.8 merge window, but having decided on doing another -rc, I migth as
well merge it now.

This removes the bd_block_size_semaphore semaphore that was added in
this release to fix a race condition between block size changes and
block IO, and replaces it with atomicity guaratees in fs/buffer.c
instead, along with simplifying fs/block-dev.c.

This removes more lines than it adds, makes the code generally simpler,
and avoids the latency/rt issues that the block size semaphore
introduced for mount.

I'm not happy with the timing, but it wouldn't be much better doing this
during the merge window and then having some delayed back-port of it
into stable.

* block-dev:
  blkdev_max_block: make private to fs/buffer.c
  direct-io: don't read inode->i_blkbits multiple times
  blockdev: remove bd_block_size_semaphore again
  fs/buffer.c: make block-size be per-page and protected by the page lock

12 years agomodsign: add symbol prefix to certificate list
James Hogan [Fri, 23 Nov 2012 12:08:16 +0000 (12:08 +0000)]
modsign: add symbol prefix to certificate list

Add the arch symbol prefix (if applicable) to the asm definition of
modsign_certificate_list and modsign_certificate_list_end. This uses the
recently defined SYMBOL_PREFIX which is derived from
CONFIG_SYMBOL_PREFIX.

This fixes the build of module signing on the blackfin and metag
architectures.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: David Howells <dhowells@redhat.com>
Cc: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
12 years agolinux/kernel.h: define SYMBOL_PREFIX
James Hogan [Fri, 23 Nov 2012 12:08:15 +0000 (12:08 +0000)]
linux/kernel.h: define SYMBOL_PREFIX

Define SYMBOL_PREFIX to be the same as CONFIG_SYMBOL_PREFIX if set by
the architecture, or "" otherwise. This avoids the need for ugly #ifdefs
whenever symbols are referenced in asm blocks.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Joe Perches <joe@perches.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Jean Delvare <khali@linux-fr.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
12 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Linus Torvalds [Mon, 3 Dec 2012 00:39:00 +0000 (16:39 -0800)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Pull networking fixes from David Miller:

 1) 8139cp leaks memory in error paths, from Francois Romieu.

 2) do_tcp_sendpages() cannot handle order > 0 pages, but they can
    certainly arrive there now, fix from Eric Dumazet.

 3) Race condition and sysfs fixes in bonding from Nikolay Aleksandrov.

 4) Remain-on-Channel fix in mac80211 from Felix Liao.

 5) CCK rate calculation fix in iwlwifi, from Emmanuel Grumbach.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
  8139cp: fix coherent mapping leak in error path.
  tcp: fix crashes in do_tcp_sendpages()
  bonding: fix race condition in bonding_store_slaves_active
  bonding: make arp_ip_target parameter checks consistent with sysfs
  bonding: fix miimon and arp_interval delayed work race conditions
  mac80211: fix remain-on-channel (non-)cancelling
  iwlwifi: fix the basic CCK rates calculation

12 years agoMerge tag 'md-3.7-fixes' of git://neil.brown.name/md
Linus Torvalds [Mon, 3 Dec 2012 00:24:31 +0000 (16:24 -0800)]
Merge tag 'md-3.7-fixes' of git://neil.brown.name/md

Pull md bugfix from NeilBrown:
 "Single bugfix for raid1/raid10.

  Fixes a recently introduced deadlock."

* tag 'md-3.7-fixes' of git://neil.brown.name/md:
  md/raid1{,0}: fix deadlock in bitmap_unplug.

12 years agoopen*(2) compat fixes (s390, arm64)
Al Viro [Sun, 2 Dec 2012 17:55:07 +0000 (17:55 +0000)]
open*(2) compat fixes (s390, arm64)

The usual rules for open()/openat()/open_by_handle_at() are
 1) native 32bit - don't force O_LARGEFILE in flags
 2) native 64bit - force O_LARGEFILE in flags
 3) compat on 64bit host - as for native 32bit
 4) native 32bit ABI for 64bit system (mips/n32, x86/x32) - as for
    native 64bit

There are only two exceptions - s390 compat has open() forcing
O_LARGEFILE and arm64 compat has open_by_handle_at() doing the same
thing.  The same binaries on native host (s390/31 and arm resp.) will
*not* force O_LARGEFILE, so IMO both are emulation bugs.

Objections? The fix is obvious...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agoMerge branch 'for-3.7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq
Linus Torvalds [Sun, 2 Dec 2012 01:55:13 +0000 (17:55 -0800)]
Merge branch 'for-3.7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq

Pull  late workqueue fixes from Tejun Heo:
 "Unfortunately, I have two really late fixes.  One was for a
  long-standing bug and queued for 3.8 but I found out about a
  regression introduced during 3.7-rc1 two days ago, so I'm sending out
  the two fixes together.

  The first (long-standing) one is rescuer_thread() entering exit path
  w/ TASK_INTERRUPTIBLE.  It only triggers on workqueue destructions
  which isn't very frequent and the exit path can usually survive being
  called with TASK_INTERRUPT, so it was hidden pretty well.  Apparently,
  if you're reiserfs, this could lead to the exiting kthread sleeping
  indefinitely holding a mutex, which is never good.

  The fix is simple - restoring TASK_RUNNING before returning from the
  kthread function.

  The second one is introduced by the new mod_delayed_work().
  mod_delayed_work() was missing special case handling for 0 delay.
  Instead of queueing the work item immediately, it queued the timer
  which expires on the closest next tick.  Some users of the new
  function converted from "[__]cancel_delayed_work() +
  queue_delayed_work()" combination became unhappy with the extra delay.

  Block unplugging led to noticeably higher number of context switches
  and intel 6250 wireless failed to associate with WPA-Enterprise
  network.  The fix, again, is fairly simple.  The 0 delay special case
  logic from queue_delayed_work_on() should be moved to
  __queue_delayed_work() which is shared by both queue_delayed_work_on()
  and mod_delayed_work_on().

  The first one is difficult to trigger and the failure mode for the
  latter isn't completely catastrophic, so missing these two for 3.7
  wouldn't make it a disastrous release, but both bugs are nasty and the
  fixes are fairly safe"

* 'for-3.7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
  workqueue: mod_delayed_work_on() shouldn't queue timer on 0 delay
  workqueue: exit rescuer_thread() as TASK_RUNNING

12 years ago8139cp: fix coherent mapping leak in error path.
françois romieu [Sat, 1 Dec 2012 13:08:50 +0000 (13:08 +0000)]
8139cp: fix coherent mapping leak in error path.

cp_open
[...]
        rc = cp_alloc_rings(cp);
        if (rc)
                return rc;

cp_alloc_rings
[...]
        mem = dma_alloc_coherent(&cp->pdev->dev, CP_RING_BYTES,
                                 &cp->ring_dma, GFP_KERNEL);

- cp_alloc_rings never frees the coherent mapping it allocates
- neither do cp_open when cp_alloc_rings fails

Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agotcp: fix crashes in do_tcp_sendpages()
Eric Dumazet [Sat, 1 Dec 2012 13:07:02 +0000 (13:07 +0000)]
tcp: fix crashes in do_tcp_sendpages()

Recent network changes allowed high order pages being used
for skb fragments.

This uncovered a bug in do_tcp_sendpages() which was assuming its caller
provided an array of order-0 page pointers.

We only have to deal with a single page in this function, and its order
is irrelevant.

Reported-by: Willy Tarreau <w@1wt.eu>
Tested-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoworkqueue: mod_delayed_work_on() shouldn't queue timer on 0 delay
Tejun Heo [Sun, 2 Dec 2012 00:23:42 +0000 (16:23 -0800)]
workqueue: mod_delayed_work_on() shouldn't queue timer on 0 delay

8376fe22c7 ("workqueue: implement mod_delayed_work[_on]()")
implemented mod_delayed_work[_on]() using the improved
try_to_grab_pending().  The function is later used, among others, to
replace [__]candel_delayed_work() + queue_delayed_work() combinations.

Unfortunately, a delayed_work item w/ zero @delay is handled slightly
differently by mod_delayed_work_on() compared to
queue_delayed_work_on().  The latter skips timer altogether and
directly queues it using queue_work_on() while the former schedules
timer which will expire on the closest tick.  This means, when @delay
is zero, that [__]cancel_delayed_work() + queue_delayed_work_on()
makes the target item immediately executable while
mod_delayed_work_on() may induce delay of upto a full tick.

This somewhat subtle difference breaks some of the converted users.
e.g. block queue plugging uses delayed_work for deferred processing
and uses mod_delayed_work_on() when the queue needs to be immediately
unplugged.  The above problem manifested as noticeably higher number
of context switches under certain circumstances.

The difference in behavior was caused by missing special case handling
for 0 delay in mod_delayed_work_on() compared to
queue_delayed_work_on().  Joonsoo Kim posted a patch to add it -
("workqueue: optimize mod_delayed_work_on() when @delay == 0")[1].
The patch was queued for 3.8 but it was described as optimization and
I missed that it was a correctness issue.

As both queue_delayed_work_on() and mod_delayed_work_on() use
__queue_delayed_work() for queueing, it seems that the better approach
is to move the 0 delay special handling to the function instead of
duplicating it in mod_delayed_work_on().

Fix the problem by moving 0 delay special case handling from
queue_delayed_work_on() to __queue_delayed_work().  This replaces
Joonsoo's patch.

[1] http://thread.gmane.org/gmane.linux.kernel/1379011/focus=1379012

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-and-tested-by: Anders Kaseorg <andersk@MIT.EDU>
Reported-and-tested-by: Zlatko Calusic <zlatko.calusic@iskon.hr>
LKML-Reference: <alpine.DEB.2.00.1211280953350.26602@dr-wily.mit.edu>
LKML-Reference: <50A78AA9.5040904@iskon.hr>
Cc: Joonsoo Kim <js1304@gmail.com>
12 years agoworkqueue: exit rescuer_thread() as TASK_RUNNING
Mike Galbraith [Wed, 28 Nov 2012 06:17:18 +0000 (07:17 +0100)]
workqueue: exit rescuer_thread() as TASK_RUNNING

A rescue thread exiting TASK_INTERRUPTIBLE can lead to a task scheduling
off, never to be seen again.  In the case where this occurred, an exiting
thread hit reiserfs homebrew conditional resched while holding a mutex,
bringing the box to its knees.

PID: 18105  TASK: ffff8807fd412180  CPU: 5   COMMAND: "kdmflush"
 #0 [ffff8808157e7670] schedule at ffffffff8143f489
 #1 [ffff8808157e77b8] reiserfs_get_block at ffffffffa038ab2d [reiserfs]
 #2 [ffff8808157e79a8] __block_write_begin at ffffffff8117fb14
 #3 [ffff8808157e7a98] reiserfs_write_begin at ffffffffa0388695 [reiserfs]
 #4 [ffff8808157e7ad8] generic_perform_write at ffffffff810ee9e2
 #5 [ffff8808157e7b58] generic_file_buffered_write at ffffffff810eeb41
 #6 [ffff8808157e7ba8] __generic_file_aio_write at ffffffff810f1a3a
 #7 [ffff8808157e7c58] generic_file_aio_write at ffffffff810f1c88
 #8 [ffff8808157e7cc8] do_sync_write at ffffffff8114f850
 #9 [ffff8808157e7dd8] do_acct_process at ffffffff810a268f
    [exception RIP: kernel_thread_helper]
    RIP: ffffffff8144a5c0  RSP: ffff8808157e7f58  RFLAGS: 00000202
    RAX: 0000000000000000  RBX: 0000000000000000  RCX: 0000000000000000
    RDX: 0000000000000000  RSI: ffffffff8107af60  RDI: ffff8803ee491d18
    RBP: 0000000000000000   R8: 0000000000000000   R9: 0000000000000000
    R10: 0000000000000000  R11: 0000000000000000  R12: 0000000000000000
    R13: 0000000000000000  R14: 0000000000000000  R15: 0000000000000000
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018

Signed-off-by: Mike Galbraith <mgalbraith@suse.de>
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: stable@vger.kernel.org
12 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Linus Torvalds [Sat, 1 Dec 2012 21:29:55 +0000 (13:29 -0800)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull vfs fixes from Al Viro:
 "A bunch of fixes; the last one is this cycle regression, the rest are
  -stable fodder."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  fix off-by-one in argument passed by iterate_fd() to callbacks
  lookup_one_len: don't accept . and ..
  cifs: get rid of blind d_drop() in readdir
  nfs_lookup_revalidate(): fix a leak
  don't do blind d_drop() in nfs_prime_dcache()

12 years agoMerge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sat, 1 Dec 2012 21:08:36 +0000 (13:08 -0800)]
Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull RCU fix from Ingo Molnar:
 "Fix leaking RCU extended quiescent state, which might trigger warnings
  and mess up the extended quiescent state tracking logic into thinking
  that we are in "RCU user mode" while we aren't."

* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  rcu: Fix unrecovered RCU user mode in syscall_trace_leave()

12 years agoMerge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sat, 1 Dec 2012 21:07:48 +0000 (13:07 -0800)]
Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf fixes from Ingo Molnar:
 "This is mostly about unbreaking architectures that took the UAPI
  changes in the v3.7 cycle, plus misc fixes."

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf kvm: Fix building perf kvm on non x86 arches
  perf kvm: Rename perf_kvm to perf_kvm_stat
  perf: Make perf build for x86 with UAPI disintegration applied
  perf powerpc: Use uapi/unistd.h to fix build error
  tools: Pass the target in descend
  tools: Honour the O= flag when tool build called from a higher Makefile
  tools: Define a Makefile function to do subdir processing
  x86: Export asm/{svm.h,vmx.h,perf_regs.h}
  perf tools: Fix strbuf_addf() when the buffer needs to grow
  perf header: Fix numa topology printing
  perf, powerpc: Fix hw breakpoints returning -ENOSPC

12 years agoMerge tag 'perf-urgent-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git...
Ingo Molnar [Sat, 1 Dec 2012 10:56:03 +0000 (11:56 +0100)]
Merge tag 'perf-urgent-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent

Pull perf/urgent fixes from Arnaldo Carvalho de Melo:

 - Don't build 'perf kvm stat" on non-x86 arches, fix from Xiao Guangrong.

 - UAPI fixes to get perf building again in non-x86 arches, from David Howells.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
12 years agoMerge branch 'x86/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Linus Torvalds [Sat, 1 Dec 2012 01:00:23 +0000 (17:00 -0800)]
Merge branch 'x86/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixes from Peter Anvin.

This includes the resume-time FPU corruption fix from the chromeos guys,
marked for stable.

* 'x86/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86, fpu: Avoid FPU lazy restore after suspend
  x86-32: Unbreak booting on some 486 clones
  x86, kvm: Remove incorrect redundant assembly constraint

12 years agoMerge tag 'for-linus' of git://linux-c6x.org/git/projects/linux-c6x-upstreaming
Linus Torvalds [Sat, 1 Dec 2012 00:59:50 +0000 (16:59 -0800)]
Merge tag 'for-linus' of git://linux-c6x.org/git/projects/linux-c6x-upstreaming

Pull C6X fixes from Mark Salter.

* tag 'for-linus' of git://linux-c6x.org/git/projects/linux-c6x-upstreaming:
  c6x: use generic kvm_para.h
  c6x: remove internal kernel symbols from exported setup.h
  c6x: fix misleading comment
  c6x: run do_notify_resume with interrupts enabled

12 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal
Linus Torvalds [Sat, 1 Dec 2012 00:58:55 +0000 (16:58 -0800)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal

Pull assorted signal-related fixes from Al Viro:
 "uml regression fix (braino in sys_execve() patch) + a bunch of fucked
  sigaltstack-on-rt_sigreturn uses, similar to sparc64 fix that went in
  through davem's tree.  m32r horrors not included - that one's waiting
  for maintainer."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal:
  microblaze: rt_sigreturn is too trigger-happy about sigaltstack errors
  score: do_sigaltstack() expects a userland pointer...
  sh64: fix altstack switching on sigreturn
  openrisk: fix altstack switching on sigreturn
  um: get_safe_registers() should be done in flush_thread(), not start_thread()

12 years agoMerge branch 'for-linus' of git://git.samba.org/sfrench/cifs-2.6
Linus Torvalds [Sat, 1 Dec 2012 00:57:18 +0000 (16:57 -0800)]
Merge branch 'for-linus' of git://git.samba.org/sfrench/cifs-2.6

Pull CIFS fixes from Steve French:
 "Two low risk, small fixes, that fix cifs regressions introduced in
  3.7."

* 'for-linus' of git://git.samba.org/sfrench/cifs-2.6:
  CIFS: Fix wrong buffer pointer usage in smb_set_file_info
  cifs: fix writeback race with file that is growing

12 years agoMerge tag 'rproc-3.7-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/ohad/remot...
Linus Torvalds [Sat, 1 Dec 2012 00:56:38 +0000 (16:56 -0800)]
Merge tag 'rproc-3.7-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/ohad/remoteproc

Pull remoteproc fix from Ohad Ben-Cohen:
 "A single remoteproc fix for an error path issue reported by Ido Yariv."

* tag 'rproc-3.7-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/ohad/remoteproc:
  remoteproc: fix error path of ->find_vqs

12 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending
Linus Torvalds [Sat, 1 Dec 2012 00:55:51 +0000 (16:55 -0800)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending

Pull target fix from Nicholas Bellinger:
 "So just a single target fix for v3.7.0 this time around from Roland to
  address a aborted command bug w/ tcm_qla2xxx fabric ports.

  Also, there is one outstanding IBLOCK + virtio-blk bug that is still
  being tracked down effecting v3.6.x, but AFAICT thus far this appears
  to be a bug outside of target code."

* git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending:
  target: Fix handling of aborted commands

12 years agox86, fpu: Avoid FPU lazy restore after suspend
Vincent Palatin [Fri, 30 Nov 2012 20:15:32 +0000 (12:15 -0800)]
x86, fpu: Avoid FPU lazy restore after suspend

When a cpu enters S3 state, the FPU state is lost.
After resuming for S3, if we try to lazy restore the FPU for a process running
on the same CPU, this will result in a corrupted FPU context.

Ensure that "fpu_owner_task" is properly invalided when (re-)initializing a CPU,
so nobody will try to lazy restore a state which doesn't exist in the hardware.

Tested with a 64-bit kernel on a 4-core Ivybridge CPU with eagerfpu=off,
by doing thousands of suspend/resume cycles with 4 processes doing FPU
operations running. Without the patch, a process is killed after a
few hundreds cycles by a SIGFPE.

Cc: Duncan Laurie <dlaurie@chromium.org>
Cc: Olof Johansson <olofj@chromium.org>
Cc: <stable@kernel.org> v3.4+ # for 3.4 need to replace this_cpu_write by percpu_write
Signed-off-by: Vincent Palatin <vpalatin@chromium.org>
Link: http://lkml.kernel.org/r/1354306532-1014-1-git-send-email-vpalatin@chromium.org
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
12 years agoMerge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Linus Torvalds [Fri, 30 Nov 2012 18:47:55 +0000 (10:47 -0800)]
Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux

Pull DRM fixes from Dave Airlie:
 "Just driver fixes, nothing major, except maybe the Ironlake rc6
  disable:

   - intel:
     * revert ironlake rc6 - we still have one ilk regression, but this
       gets rid of one big one
     * turn off cloning
     * a directed fix for Apple edp
   - radeon: one modesetting fix
   - exynos: minor fixes"

* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
  radeon: fix pll/ctrc mapping on dce2 and dce3 hardware
  Revert "drm/i915: enable rc6 on ilk again"
  drm/i915: do not default to 18 bpp for eDP if missing from VBT
  drm/exynos: Fix potential NULL pointer dereference in exynos_drm_encoder.c
  drm/exynos: Make exynos4/5_fimd_driver_data static
  drm/exynos: fix overlay updating issue
  drm/exynos: remove unnecessary code.
  drm/exynos: fix linux framebuffer address setting.
  drm/i915: disable cloning on sdvo

12 years agoMerge branch 'akpm' (Fixes from Andrew)
Linus Torvalds [Fri, 30 Nov 2012 18:46:43 +0000 (10:46 -0800)]
Merge branch 'akpm' (Fixes from Andrew)

Merge misc fixes from Andrew Morton:
 "Seven fixes, some of them fingers-crossed :("

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (7 patches)
  drivers/rtc/rtc-tps65910.c: fix invalid pointer access on _remove()
  mm: soft offline: split thp at the beginning of soft_offline_page()
  mm: avoid waking kswapd for THP allocations when compaction is deferred or contended
  revert "Revert "mm: remove __GFP_NO_KSWAPD""
  mm: vmscan: fix endless loop in kswapd balancing
  mm/vmemmap: fix wrong use of virt_to_page
  mm: compaction: fix return value of capture_free_page()

12 years agoMerge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm...
Linus Torvalds [Fri, 30 Nov 2012 18:30:34 +0000 (10:30 -0800)]
Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc

Pull ARM SoC fixes from Arnd Bergmann:
 "These are three fixes for the Marvell EBU family and one for the
  Samsung s3c platforms.  All of them are obvious should still make it
  into 3.7."

* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
  ARM: Kirkwood: Update PCI-E fixup
  Dove: Fix irq_to_pmu()
  Dove: Attempt to fix PMU/RTC interrupts
  ARM: S3C24XX: Fix potential NULL pointer dereference error

12 years agoMerge tag 'ixp4xx-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Linus Torvalds [Fri, 30 Nov 2012 18:28:09 +0000 (10:28 -0800)]
Merge tag 'ixp4xx-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc

Pull ARM ixp4xx bug fixes from Arnd Bergmann:
 "These were originally prepared by Krzysztof Halasa but not submitted
  in time for v3.7 due to some confusion about how ixp4xx patches should
  be handled.  Jason Cooper thankfully offered to help out sending the
  patches upstream through arm-soc now, but given the timing, we could
  as well delay them for 3.8."

* tag 'ixp4xx-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
  IXP4xx: use __iomem for MMIO
  IXP4xx: map CPU config registers within VMALLOC region.
  IXP4xx: Always ioremap() Queue Manager MMIO region at boot.
  ixp4xx: Declare MODULE_FIRMWARE usage
  IXP4xx crypto: MOD_AES{128,192,256} already include key size.
  WAN: Remove redundant HDLC info printed by IXP4xx HSS driver.
  IXP4xx: Remove time limit for PCI TRDY to enable use of slow devices.
  IXP4xx: ixp4xx_crypto driver requires Queue Manager and NPE drivers.
  IXP4xx: HW pseudo-random generator is available on IXP45x/46x only.
  IXP4xx: Fix off-by-one bug in Goramo MultiLink platform.
  IXP4xx: Fix Goramo MultiLink platform compilation.

12 years agoMerge branch 'fixes' of git://git.linaro.org/people/rmk/linux-arm
Linus Torvalds [Fri, 30 Nov 2012 16:53:53 +0000 (08:53 -0800)]
Merge branch 'fixes' of git://git.linaro.org/people/rmk/linux-arm

Pull final ARM fix from Russell King:
 "One final fix, spotted by Will, to do with what happens when we boot a
  SMP kernel on UP."

* 'fixes' of git://git.linaro.org/people/rmk/linux-arm:
  ARM: 7586/1: sp804: set cpumask to cpu_possible_mask for clock event device

12 years agodrivers/rtc/rtc-tps65910.c: fix invalid pointer access on _remove()
Kim, Milo [Thu, 29 Nov 2012 21:54:36 +0000 (13:54 -0800)]
drivers/rtc/rtc-tps65910.c: fix invalid pointer access on _remove()

The tps65910_rtc data is registered as the platform driver data in
_probe(= ).  Therefore the tps65910_rtc should be used on unregistering
the rtc device.  And device pointer should be retrieved from the
platform_device structure.

This patch fixes the below oops:

 Unable to handle kernel NULL pointer dereference at virtual address 00000008
 Modules linked in: rtc_tps65910(-)
 CPU: 0    Not tainted  (3.7.0-rc7-next-20121128-g6b1f974-dirty #7)
 PC is at tps65910_rtc_alarm_irq_enable+0x20/0x2c [rtc_tps65910]
     (tps65910_rtc_alarm_irq_enable+0x20/0x2c [rtc_tps65910])
     (tps65910_rtc_remove+0x18/0x28 [rtc_tps65910])
     (platform_drv_remove+0x18/0x1c)
     (__device_release_driver+0x70/0xcc)
     (driver_detach+0xb4/0xb8)
     (bus_remove_driver+0x7c/0xc0)
     (sys_delete_module+0x148/0x21c)

Signed-off-by: Milo(Woogyom) Kim <milo.kim@ti.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agomm: soft offline: split thp at the beginning of soft_offline_page()
Naoya Horiguchi [Thu, 29 Nov 2012 21:54:34 +0000 (13:54 -0800)]
mm: soft offline: split thp at the beginning of soft_offline_page()

When we try to soft-offline a thp tail page, put_page() is called on the
tail page unthinkingly and VM_BUG_ON is triggered in put_compound_page().

This patch splits thp before going into the main body of soft-offlining.

Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Andi Kleen <andi.kleen@intel.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agomm: avoid waking kswapd for THP allocations when compaction is deferred or contended
Mel Gorman [Thu, 29 Nov 2012 21:54:30 +0000 (13:54 -0800)]
mm: avoid waking kswapd for THP allocations when compaction is deferred or contended

With "mm: vmscan: scale number of pages reclaimed by reclaim/compaction
based on failures" reverted, Zdenek Kabelac reported the following

  Hmm,  so it's just took longer to hit the problem and observe
  kswapd0 spinning on my CPU again - it's not as endless like before -
  but still it easily eats minutes - it helps to turn off  Firefox
  or TB  (memory hungry apps) so kswapd0 stops soon - and restart
  those apps again.  (And I still have like >1GB of cached memory)

  kswapd0         R  running task        0    30      2 0x00000000
  Call Trace:
    preempt_schedule+0x42/0x60
    _raw_spin_unlock+0x55/0x60
    put_super+0x31/0x40
    drop_super+0x22/0x30
    prune_super+0x149/0x1b0
    shrink_slab+0xba/0x510

The sysrq+m indicates the system has no swap so it'll never reclaim
anonymous pages as part of reclaim/compaction.  That is one part of the
problem but not the root cause as file-backed pages could also be
reclaimed.

The likely underlying problem is that kswapd is woken up or kept awake
for each THP allocation request in the page allocator slow path.

If compaction fails for the requesting process then compaction will be
deferred for a time and direct reclaim is avoided.  However, if there
are a storm of THP requests that are simply rejected, it will still be
the the case that kswapd is awake for a prolonged period of time as
pgdat->kswapd_max_order is updated each time.  This is noticed by the
main kswapd() loop and it will not call kswapd_try_to_sleep().  Instead
it will loopp, shrinking a small number of pages and calling
shrink_slab() on each iteration.

This patch defers when kswapd gets woken up for THP allocations.  For
!THP allocations, kswapd is always woken up.  For THP allocations,
kswapd is woken up iff the process is willing to enter into direct
reclaim/compaction.

[akpm@linux-foundation.org: fix typo in comment]
Signed-off-by: Mel Gorman <mgorman@suse.de>
Cc: Zdenek Kabelac <zkabelac@redhat.com>
Cc: Seth Jennings <sjenning@linux.vnet.ibm.com>
Cc: Jiri Slaby <jirislaby@gmail.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Robert Jennings <rcj@linux.vnet.ibm.com>
Cc: Valdis Kletnieks <Valdis.Kletnieks@vt.edu>
Cc: Glauber Costa <glommer@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agorevert "Revert "mm: remove __GFP_NO_KSWAPD""
Andrew Morton [Thu, 29 Nov 2012 21:54:27 +0000 (13:54 -0800)]
revert "Revert "mm: remove __GFP_NO_KSWAPD""

It apepars that this patch was innocent, and we hope that "mm: avoid
waking kswapd for THP allocations when compaction is deferred or
contended" will fix the final kswapd-spinning cause.

Cc: Zdenek Kabelac <zkabelac@redhat.com>
Cc: Seth Jennings <sjenning@linux.vnet.ibm.com>
Cc: Valdis Kletnieks <Valdis.Kletnieks@vt.edu>
Cc: Jiri Slaby <jirislaby@gmail.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Robert Jennings <rcj@linux.vnet.ibm.com>
Cc: Mel Gorman <mgorman@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agomm: vmscan: fix endless loop in kswapd balancing
Johannes Weiner [Thu, 29 Nov 2012 21:54:23 +0000 (13:54 -0800)]
mm: vmscan: fix endless loop in kswapd balancing

Kswapd does not in all places have the same criteria for a balanced
zone.  Zones are only being reclaimed when their high watermark is
breached, but compaction checks loop over the zonelist again when the
zone does not meet the low watermark plus two times the size of the
allocation.  This gets kswapd stuck in an endless loop over a small
zone, like the DMA zone, where the high watermark is smaller than the
compaction requirement.

Add a function, zone_balanced(), that checks the watermark, and, for
higher order allocations, if compaction has enough free memory.  Then
use it uniformly to check for balanced zones.

This makes sure that when the compaction watermark is not met, at least
reclaim happens and progress is made - or the zone is declared
unreclaimable at some point and skipped entirely.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reported-by: George Spelvin <linux@horizon.com>
Reported-by: Johannes Hirte <johannes.hirte@fem.tu-ilmenau.de>
Reported-by: Tomas Racek <tracek@redhat.com>
Tested-by: Johannes Hirte <johannes.hirte@fem.tu-ilmenau.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agomm/vmemmap: fix wrong use of virt_to_page
Jianguo Wu [Thu, 29 Nov 2012 21:54:21 +0000 (13:54 -0800)]
mm/vmemmap: fix wrong use of virt_to_page

I enable CONFIG_DEBUG_VIRTUAL and CONFIG_SPARSEMEM_VMEMMAP, when doing
memory hotremove, there is a kernel BUG at arch/x86/mm/physaddr.c:20.

It is caused by free_section_usemap()->virt_to_page(), virt_to_page() is
only used for kernel direct mapping address, but sparse-vmemmap uses
vmemmap address, so it is going wrong here.

  ------------[ cut here ]------------
  kernel BUG at arch/x86/mm/physaddr.c:20!
  invalid opcode: 0000 [#1] SMP
  Modules linked in: acpihp_drv acpihp_slot edd cpufreq_conservative cpufreq_userspace cpufreq_powersave acpi_cpufreq mperf fuse vfat fat loop dm_mod coretemp kvm crc32c_intel ipv6 ixgbe igb iTCO_wdt i7core_edac edac_core pcspkr iTCO_vendor_support ioatdma microcode joydev sr_mod i2c_i801 dca lpc_ich mfd_core mdio tpm_tis i2c_core hid_generic tpm cdrom sg tpm_bios rtc_cmos button ext3 jbd mbcache usbhid hid uhci_hcd ehci_hcd usbcore usb_common sd_mod crc_t10dif processor thermal_sys hwmon scsi_dh_alua scsi_dh_hp_sw scsi_dh_rdac scsi_dh_emc scsi_dh ata_generic ata_piix libata megaraid_sas scsi_mod
  CPU 39
  Pid: 6454, comm: sh Not tainted 3.7.0-rc1-acpihp-final+ #45 QCI QSSC-S4R/QSSC-S4R
  RIP: 0010:[<ffffffff8103c908>]  [<ffffffff8103c908>] __phys_addr+0x88/0x90
  RSP: 0018:ffff8804440d7c08  EFLAGS: 00010006
  RAX: 0000000000000006 RBX: ffffea0012000000 RCX: 000000000000002c
  ...

Signed-off-by: Jianguo Wu <wujianguo@huawei.com>
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Reviewd-by: Wen Congyang <wency@cn.fujitsu.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agomm: compaction: fix return value of capture_free_page()
Mel Gorman [Thu, 29 Nov 2012 21:54:20 +0000 (13:54 -0800)]
mm: compaction: fix return value of capture_free_page()

Commit ef6c5be658f6 ("fix incorrect NR_FREE_PAGES accounting (appears
like memory leak)") fixes a NR_FREE_PAGE accounting leak but missed the
return value which was also missed by this reviewer until today.

That return value is used by compaction when adding pages to a list of
isolated free pages and without this follow-up fix, there is a risk of
free list corruption.

Signed-off-by: Mel Gorman <mgorman@suse.de>
Cc: Dave Hansen <dave@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agoMerge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wirel...
John W. Linville [Fri, 30 Nov 2012 16:27:32 +0000 (11:27 -0500)]
Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem

12 years agofix off-by-one in argument passed by iterate_fd() to callbacks
Al Viro [Fri, 30 Nov 2012 03:57:33 +0000 (22:57 -0500)]
fix off-by-one in argument passed by iterate_fd() to callbacks

Noticed by Pavel Roskin; the thing in his patch I disagree with
was compensating for that shite in callbacks instead of fixing
it once in the iterator itself.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
12 years agolookup_one_len: don't accept . and ..
Al Viro [Fri, 30 Nov 2012 03:17:21 +0000 (22:17 -0500)]
lookup_one_len: don't accept . and ..

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
12 years agocifs: get rid of blind d_drop() in readdir
Al Viro [Fri, 30 Nov 2012 03:11:06 +0000 (22:11 -0500)]
cifs: get rid of blind d_drop() in readdir

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
12 years agonfs_lookup_revalidate(): fix a leak
Al Viro [Fri, 30 Nov 2012 03:04:36 +0000 (22:04 -0500)]
nfs_lookup_revalidate(): fix a leak

We are leaking fattr and fhandle if we decide that dentry is not to
be invalidated, after all (e.g. happens to be a mountpoint).  Just
free both before that...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
12 years agodon't do blind d_drop() in nfs_prime_dcache()
Al Viro [Fri, 30 Nov 2012 03:00:51 +0000 (22:00 -0500)]
don't do blind d_drop() in nfs_prime_dcache()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
12 years agoblkdev_max_block: make private to fs/buffer.c
Linus Torvalds [Thu, 29 Nov 2012 20:31:52 +0000 (12:31 -0800)]
blkdev_max_block: make private to fs/buffer.c

We really don't want to look at the block size for the raw block device
accesses in fs/block-dev.c, because it may be changing from under us.
So get rid of the max_block logic entirely, since the caller should
already have done it anyway.

That leaves the only user of this function in fs/buffer.c, so move the
whole function there and make it static.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agodirect-io: don't read inode->i_blkbits multiple times
Linus Torvalds [Thu, 29 Nov 2012 20:27:00 +0000 (12:27 -0800)]
direct-io: don't read inode->i_blkbits multiple times

Since directio can work on a raw block device, and the block size of the
device can change under it, we need to do the same thing that
fs/buffer.c now does: read the block size a single time, using
ACCESS_ONCE().

Reading it multiple times can get different results, which will then
confuse the code because it actually encodes the i_blksize in
relationship to the underlying logical blocksize.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agoblockdev: remove bd_block_size_semaphore again
Linus Torvalds [Thu, 29 Nov 2012 18:49:50 +0000 (10:49 -0800)]
blockdev: remove bd_block_size_semaphore again

This reverts the block-device direct access code to the previous
unlocked code, now that fs/buffer.c no longer needs external locking.

With this, fs/block_dev.c is back to the original version, apart from a
whitespace cleanup that I didn't want to revert.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agofs/buffer.c: make block-size be per-page and protected by the page lock
Linus Torvalds [Thu, 29 Nov 2012 18:21:43 +0000 (10:21 -0800)]
fs/buffer.c: make block-size be per-page and protected by the page lock

This makes the buffer size handling be a per-page thing, which allows us
to not have to worry about locking too much when changing the buffer
size.  If a page doesn't have buffers, we still need to read the block
size from the inode, but we can do that with ACCESS_ONCE(), so that even
if the size is changing, we get a consistent value.

This doesn't convert all functions - many of the buffer functions are
used purely by filesystems, which in turn results in the buffer size
being fixed at mount-time.  So they don't have the same consistency
issues that the raw device access can have.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agobonding: fix race condition in bonding_store_slaves_active
nikolay@redhat.com [Thu, 29 Nov 2012 01:37:59 +0000 (01:37 +0000)]
bonding: fix race condition in bonding_store_slaves_active

Race between bonding_store_slaves_active() and slave manipulation
 functions. The bond_for_each_slave use in bonding_store_slaves_active()
 is not protected by any synchronization mechanism.
 NULL pointer dereference is easy to reach.
 Fixed by acquiring the bond->lock for the slave walk.

 v2: Make description text < 75 columns

Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agobonding: make arp_ip_target parameter checks consistent with sysfs
nikolay@redhat.com [Thu, 29 Nov 2012 01:34:06 +0000 (01:34 +0000)]
bonding: make arp_ip_target parameter checks consistent with sysfs

The module can be loaded with arp_ip_target="255.255.255.255" which makes
 it impossible to remove as the function in sysfs checks for that value,
 so we make the parameter checks consistent with sysfs.

 v2: Fix formatting
 v3: Make description text < 75 columns

Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agobonding: fix miimon and arp_interval delayed work race conditions
nikolay@redhat.com [Thu, 29 Nov 2012 01:31:31 +0000 (01:31 +0000)]
bonding: fix miimon and arp_interval delayed work race conditions

First I would give three observations which will be used later.
Observation 1: if (delayed_work_pending(wq)) cancel_delayed_work(wq)
 This usage is wrong because the pending bit is cleared just before the
 work's fn is executed and if the function re-arms itself we might end up
 with the work still running. It's safe to call cancel_delayed_work_sync()
 even if the work is not queued at all.
Observation 2: Use of INIT_DELAYED_WORK()
 Work needs to be initialized only once prior to (de/en)queueing.
Observation 3: IFF_UP is set only after ndo_open is called

Related race conditions:
1. Race between bonding_store_miimon() and bonding_store_arp_interval()
 Because of Obs.1 we can end up having both works enqueued.
2. Multiple races with INIT_DELAYED_WORK()
 Since the works are not protected by anything between INIT_DELAYED_WORK()
 and calls to (en/de)queue it is possible for races between the following
 functions:
 (races are also possible between the calls to INIT_DELAYED_WORK()
  and workqueue code)
 bonding_store_miimon() - bonding_store_arp_interval(), bond_close(),
  bond_open(), enqueued functions
 bonding_store_arp_interval() - bonding_store_miimon(), bond_close(),
bond_open(), enqueued functions
3. By Obs.1 we need to change bond_cancel_all()

Bugs 1 and 2 are fixed by moving all work initializations in bond_open
which by Obs. 2 and Obs. 3 and the fact that we make sure that all works
are cancelled in bond_close(), is guaranteed not to have any work
enqueued.
Also RTNL lock is now acquired in bonding_store_miimon/arp_interval so
they can't race with bond_close and bond_open. The opposing work is
cancelled only if the IFF_UP flag is set and it is cancelled
unconditionally. The opposing work is already cancelled if the interface
is down so no need to cancel it again. This way we don't need new
synchronizations for the bonding workqueue. These bugs (and fixes) are
tied together and belong in the same patch.
Note: I have left 1 line intentionally over 80 characters (84) because I
      didn't like how it looks broken down. If you'd prefer it otherwise,
      then simply break it.

 v2: Make description text < 75 columns

Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoMerge branch 'v3.7-samsung-fixes-4' of git://git.kernel.org/pub/scm/linux/kernel...
Arnd Bergmann [Thu, 29 Nov 2012 14:07:27 +0000 (15:07 +0100)]
Merge branch 'v3.7-samsung-fixes-4' of git://git.kernel.org/pub/scm/linux/kernel/git/kgene/linux-samsung into fixes

From Kukjin Kim <kgene.kim@samsung.com>:

Samsung fixes for v3.7

* 'v3.7-samsung-fixes-4' of git://git.kernel.org/pub/scm/linux/kernel/git/kgene/linux-samsung:
  ARM: S3C24XX: Fix potential NULL pointer dereference error

This would have been ok to delay to 3.8 according to Kukjin, but since
it's an obvious bug fix and a potential NULL pointer dereference, it
seem appropriate for a late 3.7 submission.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
12 years agoremoteproc: fix error path of ->find_vqs
Ohad Ben-Cohen [Mon, 12 Nov 2012 09:13:51 +0000 (11:13 +0200)]
remoteproc: fix error path of ->find_vqs

Eliminate an erroneous invocation of rproc_shutdown inside
the error path of rproc_virtio_find_vqs.

Reported-by: Ido Yariv <ido@wizery.com>
Signed-off-by: Ohad Ben-Cohen <ohad@wizery.com>
12 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Linus Torvalds [Thu, 29 Nov 2012 05:54:07 +0000 (21:54 -0800)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Pull networking fixes from David Miller:
 "Some more fixes trickled in over the past few days:

   1) PIM device names can overflow the IFNAMSIZ buffer unless we
      properly limit the allowed indexes, fix from Eric Dumazet.

   2) Under heavy load we can OOPS in icmp reply processing due to an
      unchecked inet_putpeer() call.  Fix from Neal Cardwell.

   3) SCTP round trip calculations need to use 64-bit math to avoid
      overflows, fix from Schoch Christian.

   4) Fix a memory leak and an error return flub in SCTP and IRDA
      triggerable by userspace.  Fix from Tommi Rantala and found by the
      syscall fuzzer (trinity).

   5) MLX4 driver gives bogus size to memcpy() call, fix from Amir
      Vadai.

   6) Fix length calculation in VHOST descriptor translation, from
      Michael S Tsirkin.

   7) Ambassador ATM driver loops forever while loading firmware, fix
      from Dan Carpenter.

   8) Over MTU packets in openvswitch warn about wrong device, fix from
      Jesse Gross.

   9) Netfilter IPSET's netlink code can overrun a string buffer because
      it's not properly limited to IFNAMSIZ.  Fix from Florian Westphal.

  10) PCAN USB driver sets wrong timestamp in SKB, from Oliver Hartkopp.

  11) Make sure the RX ifindex always has a valid value in the CAN BCM
      driver, even if we haven't received a frame yet.  Fix also from
      Oliver Hartkopp."

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
  team: fix hw_features setup
  atm: forever loop loading ambassador firmware
  vhost: fix length for cross region descriptor
  irda: irttp: fix memory leak in irttp_open_tsap() error path
  net: qmi_wwan: add Huawei E173
  net/mlx4_en: Can set maxrate only for TC0
  sctp: Error in calculation of RTTvar
  sctp: fix -ENOMEM result with invalid user space pointer in sendto() syscall
  sctp: fix memory leak in sctp_datamsg_from_user() when copy from user space fails
  net: ipmr: limit MRT_TABLE identifiers
  ipv4: avoid passing NULL to inet_putpeer() in icmpv4_xrlim_allow()
  can: bcm: initialize ifindex for timeouts without previous frame reception
  can: peak_usb: fix hwtstamp assignment
  netfilter: ipset: fix netiface set name overflow
  openvswitch: Store flow key len if ARP opcode is not request or reply.
  openvswitch: Print device when warning about over MTU packets.

12 years agomicroblaze: rt_sigreturn is too trigger-happy about sigaltstack errors
Al Viro [Tue, 20 Nov 2012 15:41:11 +0000 (10:41 -0500)]
microblaze: rt_sigreturn is too trigger-happy about sigaltstack errors

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>