Sheng Yang [Tue, 11 Nov 2008 07:30:40 +0000 (15:30 +0800)]
KVM: Fix kernel allocated memory slot
Commit 7fd49de9773fdcb7b75e823b21c1c5dc1e218c14 "KVM: ensure that memslot
userspace addresses are page-aligned" broke kernel space allocated memory
slot, for the userspace_addr is invalid.
Signed-off-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>
Hollis Blanchard [Fri, 7 Nov 2008 19:32:12 +0000 (13:32 -0600)]
KVM: ensure that memslot userspace addresses are page-aligned
Bad page translation and silent guest failure ensue if the userspace address is
not page-aligned. I hit this problem using large (host) pages with qemu,
because qemu currently has a hardcoded 4096-byte alignment for guest memory
allocations.
Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
Xiantao Zhang [Sat, 8 Nov 2008 07:46:59 +0000 (15:46 +0800)]
KVM: ia64: fix vmm_spin_{un}lock for !CONFIG_SMP
In the case of !CONFIG_SMP, raw_spinlock_t is empty and the spinlock functions
don't build. Fix by defining spinlock functions for the uniprocessor case.
Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>
Nitin A Kamble [Wed, 5 Nov 2008 23:56:21 +0000 (15:56 -0800)]
KVM: Fix cpuid iteration on multiple leaves per eac
The code to traverse the cpuid data array list for counting type of leaves is
currently broken.
This patches fixes the 2 things in it.
1. Set the 1st counting entry's flag KVM_CPUID_FLAG_STATE_READ_NEXT. Without
it the code will never find a valid entry.
2. Also the stop condition in the for loop while looking for the next unflaged
entry is broken. It needs to stop when it find one matching entry;
and in the case of count of 1, it will be the same entry found in this
iteration.
Signed-Off-By: Nitin A Kamble <nitin.a.kamble@intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>
Nitin A Kamble [Wed, 5 Nov 2008 23:37:36 +0000 (15:37 -0800)]
KVM: Fix cpuid leaf 0xb loop termination
For cpuid leaf 0xb the bits 8-15 in ECX register define the end of counting
leaf. The previous code was using bits 0-7 for this purpose, which is
a bug.
Signed-off-by: Nitin A Kamble <nitin.a.kamble@intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>
Sheng Yang [Thu, 6 Nov 2008 06:55:45 +0000 (14:55 +0800)]
KVM: VMX: Set IGMT bit in EPT entry
There is a potential issue that, when guest using pagetable without vmexit when
EPT enabled, guest would use PAT/PCD/PWT bits to index PAT msr for it's memory,
which would be inconsistent with host side and would cause host MCE due to
inconsistent cache attribute.
The patch set IGMT bit in EPT entry to ignore guest PAT and use WB as default
memory type to protect host (notice that all memory mapped by KVM should be WB).
Signed-off-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>
Hollis Blanchard [Wed, 5 Nov 2008 15:36:23 +0000 (09:36 -0600)]
KVM: ppc: optimize irq delivery path
In kvmppc_deliver_interrupt is just one case left in the switch and it is a
rare one (less than 8%) when looking at the exit numbers. Therefore we can
at least drop the switch/case and if an if. I inserted an unlikely too, but
that's open for discussion.
In kvmppc_can_deliver_interrupt all frequent cases are in the default case.
I know compilers are smart but we can make it easier for them. By writing
down all options and removing the default case combined with the fact that
ithe values are constants 0..15 should allow the compiler to write an easy
jump table.
Modifying kvmppc_can_deliver_interrupt pointed me to the fact that gcc seems
to be unable to reduce priority_exception[x] to a build time constant.
Therefore I changed the usage of the translation arrays in the interrupt
delivery path completely. It is now using priority without translation to irq
on the full irq delivery path.
To be able to do that ivpr regs are stored by their priority now.
Additionally the decision made in kvmppc_can_deliver_interrupt is already
sufficient to get the value of interrupt_msr_mask[x]. Therefore we can replace
the 16x4byte array used here with a single 4byte variable (might still be one
miss, but the chance to find this in cache should be better than the right
entry of the whole array).
Signed-off-by: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com> Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
Hollis Blanchard [Wed, 5 Nov 2008 15:36:21 +0000 (09:36 -0600)]
KVM: ppc: optimize kvm stat handling
Currently we use an unnecessary if&switch to detect some cases.
To be honest we don't need the ligh_exits counter anyway, because we can
calculate it out of others. Sum_exits can also be calculated, so we can
remove that too.
MMIO, DCR and INTR can be counted on other places without these
additional control structures (The INTR case was never hit anyway).
The handling of BOOKE_INTERRUPT_EXTERNAL/BOOKE_INTERRUPT_DECREMENTER is
similar, but we can avoid the additional if when copying 3 lines of code.
I thought about a goto there to prevent duplicate lines, but rewriting three
lines should be better style than a goto cross switch/case statements (its
also not enough code to justify a new inline function).
Signed-off-by: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com> Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
Hollis Blanchard [Wed, 5 Nov 2008 15:36:20 +0000 (09:36 -0600)]
KVM: ppc: fix set regs to take care of msr change
When changing some msr bits e.g. problem state we need to take special
care of that. We call the function in our mtmsr emulation (not needed for
wrtee[i]), but we don't call kvmppc_set_msr if we change msr via set_regs
ioctl.
It's a corner case we never hit so far, but I assume it should be
kvmppc_set_msr in our arch set regs function (I found it because it is also
a corner case when using pv support which would miss the update otherwise).
Signed-off-by: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com> Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
Hollis Blanchard [Wed, 5 Nov 2008 15:36:18 +0000 (09:36 -0600)]
KVM: ppc: create struct kvm_vcpu_44x and introduce container_of() accessor
This patch doesn't yet move all 44x-specific data into the new structure, but
is the first step down that path. In the future we may also want to create a
struct kvm_vcpu_booke.
Based on patch from Liu Yu <yu.liu@freescale.com>.
Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
Hollis Blanchard [Wed, 5 Nov 2008 15:36:16 +0000 (09:36 -0600)]
KVM: ppc: refactor instruction emulation into generic and core-specific pieces
Cores provide 3 emulation hooks, implemented for example in the new
4xx_emulate.c:
kvmppc_core_emulate_op
kvmppc_core_emulate_mtspr
kvmppc_core_emulate_mfspr
Strictly speaking the last two aren't necessary, but provide for more
informative error reporting ("unknown SPR").
Long term I'd like to have instruction decoding autogenerated from tables of
opcodes, and that way we could aggregate universal, Book E, and core-specific
instructions more easily and without redundant switch statements.
Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
Hollis Blanchard [Wed, 5 Nov 2008 15:36:15 +0000 (09:36 -0600)]
ppc: Create disassemble.h to extract instruction fields
This is used in a couple places in KVM, but isn't KVM-specific.
However, this patch doesn't modify other in-kernel emulation code:
- xmon uses a direct copy of ppc_opc.c from binutils
- emulate_instruction() doesn't need it because it can use a series
of mask tests.
Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Acked-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Avi Kivity <avi@redhat.com>
Hollis Blanchard [Wed, 5 Nov 2008 15:36:14 +0000 (09:36 -0600)]
KVM: ppc: Refactor powerpc.c to relocate 440-specific code
This introduces a set of core-provided hooks. For 440, some of these are
implemented by booke.c, with the rest in (the new) 44x.c.
Note that these hooks are link-time, not run-time. Since it is not possible to
build a single kernel for both e500 and 440 (for example), using function
pointers would only add overhead.
Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
Izik Eidus [Fri, 3 Oct 2008 14:40:32 +0000 (17:40 +0300)]
KVM: MMU: Fix aliased gfns treated as unaliased
Some areas of kvm x86 mmu are using gfn offset inside a slot without
unaliasing the gfn first. This patch makes sure that the gfn will be
unaliased and add gfn_to_memslot_unaliased() to save the calculating
of the gfn unaliasing in case we have it unaliased already.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
xfrm: Fix xfrm_policy_gc_lock handling.
niu: Use pci_ioremap_bar().
bnx2x: Version Update
bnx2x: Calling netif_carrier_off at the end of the probe
bnx2x: PCI configuration bug on big-endian
bnx2x: Removing the PMF indication when unloading
mv643xx_eth: fix SMI bus access timeouts
net: kconfig cleanup
fs_enet: fix polling
XFRM: copy_to_user_kmaddress() reports local address twice
SMC91x: Fix compilation on some platforms.
udp: Fix the SNMP counter of UDP_MIB_INERRORS
udp: Fix the SNMP counter of UDP_MIB_INDATAGRAMS
drivers/net/smc911x.c: Fix lockdep warning on xmit.
Linus Torvalds [Tue, 4 Nov 2008 16:19:01 +0000 (08:19 -0800)]
Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev:
libata: mask off DET when restoring SControl for detach
libata: implement ATA_HORKAGE_ATAPI_MOD16_DMA and apply it
libata: Fix a potential race condition in ata_scsi_park_show()
sata_nv: fix generic, nf2/3 detection regression
sata_via: restore vt*_prepare_host error handling
sata_promise: add ATA engine reset to reset ops
Tejun Heo [Mon, 3 Nov 2008 10:27:07 +0000 (19:27 +0900)]
libata: mask off DET when restoring SControl for detach
libata restores SControl on detach; however, trying to restore
non-zero DET can cause undeterministic behavior including PMP device
going offline till power cycling. Mask off DET when restoring
SControl.
Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Tejun Heo [Mon, 3 Nov 2008 10:01:09 +0000 (19:01 +0900)]
libata: implement ATA_HORKAGE_ATAPI_MOD16_DMA and apply it
libata always uses PIO for ATAPI commands when the number of bytes to
transfer isn't multiple of 16 but quantum DAT72 chokes on odd bytes
PIO transfers. Implement a horkage to skip the mod16 check and apply
it to the quantum device.
This is reported by John Clark in the following thread.
http://thread.gmane.org/gmane.linux.ide/34748
Signed-off-by: Tejun Heo <tj@kernel.org> Cc: John Clark <clarkjc@runbox.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Elias Oltmanns [Mon, 3 Nov 2008 10:01:08 +0000 (19:01 +0900)]
libata: Fix a potential race condition in ata_scsi_park_show()
Peter Moulder has pointed out that there is a slight chance that a
negative value might be passed to jiffies_to_msecs() in
ata_scsi_park_show(). This is fixed by saving the value of jiffies in a
local variable, thus also reducing code since the volatile variable
jiffies is accessed only once.
Signed-off-by: Elias Oltmanns <eo@nebensachen.de> Signed-off-by: Tejun Heo <tj.kernel.org> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Tejun Heo [Mon, 3 Nov 2008 03:37:49 +0000 (12:37 +0900)]
sata_nv: fix generic, nf2/3 detection regression
All three flavors of sata_nv's are different in how their hardreset
behaves.
* generic: Hardreset is not reliable. Link often doesn't come online
after hardreset.
* nf2/3: A little bit better - link comes online with longer debounce
timing. However, nf2/3 can't reliable wait for the first D2H
Register FIS, so it can't wait for device readiness or classify the
device after hardreset. Follow-up SRST required.
* ck804: Hardreset finally works.
The core layer change to prefer hardreset and follow up changes
exposed the above issues and caused various detection regressions for
all three flavors. This patch, hopefully, fixes all the known issues
and should make sata_nv error handling more reliable.
Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
catched by gcc:
drivers/ata/sata_via.c: In function 'svia_init_one':
drivers/ata/sata_via.c:567: warning: 'host' may be used uninitialized in this function
Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com> Cc: Tejun Heo <tj@kernel.org> Cc: Joseph Chan <JosephChan@via.com.tw> Cc: Jeff Garzik <jgarzik@redhat.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Mikael Pettersson [Fri, 31 Oct 2008 07:03:55 +0000 (08:03 +0100)]
sata_promise: add ATA engine reset to reset ops
Promise ATA engines need to be reset when errors occur.
That's currently done for errors detected by sata_promise itself,
but it's not done for errors like timeouts detected outside of
the low-level driver.
The effect of this omission is that a timeout tends to result
in a sequence of failed COMRESETs after which libata EH gives
up and disables the port. At that point the port's ATA engine
hangs and even reloading the driver will not resume it.
To fix this, make sata_promise override ->hardreset on SATA
ports with code which calls pdc_reset_port() on the port in
question before calling libata's hardreset. PATA ports don't
use ->hardreset, so for those we override ->softreset instead.
Signed-off-by: Mikael Pettersson <mikpe@it.uu.se> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Eilon Greenstein [Tue, 4 Nov 2008 00:46:40 +0000 (16:46 -0800)]
bnx2x: Calling netif_carrier_off at the end of the probe
netif_carrier_off was called too early at the probe. In case of failure
or simply bad timing, this can cause a fatal error since linkwatch_event
might run too soon.
Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Eilon Greenstein [Tue, 4 Nov 2008 00:46:19 +0000 (16:46 -0800)]
bnx2x: PCI configuration bug on big-endian
The current code read nothing but zeros on big-endian (wrong part of the
32bits). This caused poor performance on big-endian machines. Though this
issue did not cause the system to crash, the performance is significantly
better with the fix so I view it as critical bug fix.
Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Eilon Greenstein [Tue, 4 Nov 2008 00:45:55 +0000 (16:45 -0800)]
bnx2x: Removing the PMF indication when unloading
When the PMF flag is set, the driver can access the HW freely. When the
driver is unloaded, it should not access the HW. The problem caused fatal
errors when "ethtool -i" was called after the calling instance was unloaded
and another instance was already loaded
Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Lennert Buytenhek [Sat, 1 Nov 2008 05:32:20 +0000 (06:32 +0100)]
mv643xx_eth: fix SMI bus access timeouts
The mv643xx_eth mii bus implementation uses wait_event_timeout() to
wait for SMI completion interrupts.
If wait_event_timeout() would return zero, mv643xx_eth would conclude
that the SMI access timed out, but this is not necessarily true --
wait_event_timeout() can also return zero in the case where the SMI
completion interrupt did happen in time but where it took longer than
the requested timeout for the process performing the SMI access to be
scheduled again. This would lead to occasional SMI access timeouts
when the system would be under heavy load.
The fix is to ignore the return value of wait_event_timeout(), and
to re-check the SMI done bit after wait_event_timeout() returns to
determine whether or not the SMI access timed out.
Signed-off-by: Lennert Buytenhek <buytenh@marvell.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
* git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6:
cifs: fix renaming one hardlink on top of another
[CIFS] fix error in smb_send2
[CIFS] Reduce number of socket retries in large write path
Jeff Layton [Mon, 3 Nov 2008 19:05:08 +0000 (14:05 -0500)]
cifs: fix renaming one hardlink on top of another
cifs: fix renaming one hardlink on top of another
POSIX says that renaming one hardlink on top of another to the same
inode is a no-op. We had the logic mostly right, but forgot to clear
the return code.
Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
Linus Torvalds [Mon, 3 Nov 2008 18:21:40 +0000 (10:21 -0800)]
Merge branch 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
tracing, ring-buffer: add paranoid checks for loops
ftrace: use kretprobe trampoline name to test in output
tracing, alpha: undefined reference to `save_stack_trace'
Linus Torvalds [Mon, 3 Nov 2008 18:15:40 +0000 (10:15 -0800)]
Merge branch 'io-mappings-for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'io-mappings-for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
io mapping: clean up #ifdefs
io mapping: improve documentation
i915: use io-mapping interfaces instead of a variety of mapping kludges
resources: add io-mapping functions to dynamically map large device apertures
x86: add iomap_atomic*()/iounmap_atomic() on 32-bit using fixmaps
Linus Torvalds [Mon, 3 Nov 2008 18:14:59 +0000 (10:14 -0800)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6:
ALSA: hda: make a STAC_DELL_EQ option
ALSA: emu10k1 - Add more invert_shared_spdif flag to Audigy models
ALSA: hda - Add a quirk for another Acer Aspire (1025:0090)
ALSA: remove direct access of dev->bus_id in sound/isa/*
sound: struct device - replace bus_id with dev_name(), dev_set_name()
ALSA: Fix PIT lockup on some chipsets when using the PC-Speaker
ALSA: rawmidi - Add open check in rawmidi callbacks
ALSA: hda - Add digital-mic for ALC269 auto-probe mode
ALSA: hda - Disable broken mic auto-muting in Realtek codes
Linus Torvalds [Mon, 3 Nov 2008 17:58:40 +0000 (09:58 -0800)]
Merge branch 'drm-next' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6
* 'drm-next' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6:
i915: Add GEM ioctl to get available aperture size.
drm/radeon: fixup further bus mastering confusion.
build fix: CONFIG_DRM_I915=y && CONFIG_ACPI=n
Steven Rostedt [Fri, 31 Oct 2008 13:58:35 +0000 (09:58 -0400)]
tracing, ring-buffer: add paranoid checks for loops
While writing a new tracer, I had a bug where I caused the ring-buffer
to recurse in a bad way. The bug was with the tracer I was writing
and not the ring-buffer itself. But it took a long time to find the
problem.
This patch adds paranoid checks into the ring-buffer infrastructure
that will catch bugs of this nature.
Note: I put the bug back in the tracer and this patch showed the error
nicely and prevented the lockup.
Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Steven Rostedt [Fri, 31 Oct 2008 19:44:07 +0000 (15:44 -0400)]
ftrace: use kretprobe trampoline name to test in output
Impact: ia64+tracing build fix
When a function is kprobed, the return address is set to the
kprobe_trampoline, or something similar. This caused the output
of the trace to look confusing when the parent seemed to be this
"kprobe_trampoline" function.
To fix this, Abhishek Sagar added a test of the instruction pointer
of the parent to see if it matched the kprobe_trampoline. If it
did, the output would print a "[unknown/kretprobe'd]" instead.
Unfortunately, not all archs do this the same way, and the trampoline
function may not be exported, which causes failures in builds.
This patch will compare the name instead of the pointer to see
if it matches. This prevents us from depending on a function from
being exported, and should work on all archs. The worst that can
happen is that an arch might use a different name and then we
go back to the confusing output. At least the arch will still build.
Arnaud Ebalard [Mon, 3 Nov 2008 09:30:23 +0000 (01:30 -0800)]
XFRM: copy_to_user_kmaddress() reports local address twice
While adding support for MIGRATE/KMADDRESS in strongSwan (as specified
in draft-ebalard-mext-pfkey-enhanced-migrate-00), Andreas Steffen
noticed that XFRMA_KMADDRESS attribute passed to userland contains the
local address twice (remote provides local address instead of remote
one).
This bug in copy_to_user_kmaddress() affects only key managers that use
native XFRM interface (key managers that use PF_KEY are not affected).
For the record, the bug was in the initial changeset I posted which
added support for KMADDRESS (13c1d18931ebb5cf407cb348ef2cd6284d68902d
'xfrm: MIGRATE enhancements (draft-ebalard-mext-pfkey-enhanced-migrate)').
Signed-off-by: Arnaud Ebalard <arno@natisbad.org> Reported-by: Andreas Steffen <andreas.steffen@strongswan.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Al Viro [Fri, 31 Oct 2008 19:50:41 +0000 (19:50 +0000)]
tracing, alpha: undefined reference to `save_stack_trace'
Impact: build fix on !stacktrace architectures
only select STACKTRACE on architectures that have STACKTRACE_SUPPORT
... since we also need to ifdef out the guts of ftrace_trace_stack().
We also want to disallow setting TRACE_ITER_STACKTRACE in trace_flags
on such configs, but that can wait.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Acked-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Takashi Iwai [Mon, 3 Nov 2008 09:07:43 +0000 (10:07 +0100)]
ALSA: hda - Add a quirk for another Acer Aspire (1025:0090)
Added a quirk for another Acer Aspier laptop (1025:0090) with ALC883
codec. Reported in Novell bnc#426935:
https://bugzilla.novell.com/show_bug.cgi?id=426935
David S. Miller [Mon, 3 Nov 2008 08:04:24 +0000 (00:04 -0800)]
SMC91x: Fix compilation on some platforms.
This reverts 51ac3beffd4afaea4350526cf01fe74aaff25eff ('SMC91x: delete
unused local variable "lp"') and adds __maybe_unused markers to these
(potentially) unused variables.
The issue is that in some configurations SMC_IO_SHIFT evaluates
to '(lp->io_shift)', but in some others it's plain '0'.
Based upon a build failure report from Manuel Lauss.
Signed-off-by: David S. Miller <davem@davemloft.net>
Wei Yongjun [Sun, 2 Nov 2008 16:14:27 +0000 (16:14 +0000)]
udp: Fix the SNMP counter of UDP_MIB_INERRORS
UDP packets received in udpv6_recvmsg() are not only IPv6 UDP packets, but
also have IPv4 UDP packets, so when do the counter of UDP_MIB_INERRORS in
udpv6_recvmsg(), we should check whether the packet is a IPv6 UDP packet
or a IPv4 UDP packet.
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Wei Yongjun [Sun, 2 Nov 2008 16:11:01 +0000 (16:11 +0000)]
udp: Fix the SNMP counter of UDP_MIB_INDATAGRAMS
If UDP echo is sent to xinetd/echo-dgram, the UDP reply will be received
at the sender. But the SNMP counter of UDP_MIB_INDATAGRAMS will be not
increased, UDP6_MIB_INDATAGRAMS will be increased instead.
Endpoint A Endpoint B
UDP Echo request ----------->
(IPv4, Dst port=7)
<---------- UDP Echo Reply
(IPv4, Src port=7)
It do counter UDP[6]_MIB_INDATAGRAMS until udp[v6]_recvmsg. Because
xinetd used IPv6 socket to receive UDP messages, thus, when received
UDP packet, the UDP6_MIB_INDATAGRAMS will be increased in function
udpv6_recvmsg() even if the packet is a IPv4 UDP packet.
This patch fixed the problem.
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6:
ide-gd: re-get capacity on revalidate
tx4938ide: Avoid underflow on calculation of a wait cycle
tx4938ide: Do not call devm_ioremap for whole 128KB
tx4938ide: Check minimum cycle time and SHWT range (v2)
ide: Switch to a common address
ide-cd: fix DMA alignment regression
Atsushi Nemoto [Sun, 2 Nov 2008 20:40:09 +0000 (21:40 +0100)]
tx4938ide: Check minimum cycle time and SHWT range (v2)
SHWT value is used as address valid to -CSx assertion and -CSx to -DIOx
assertion setup time, and contrarywise, -DIOx to -CSx release and -CSx
release to address invalid hold time, so it actualy applies 4 times and
so constitutes -DIOx recovery time. Check requirement of the recovery
time and cycle time. Also check SHWT maximum value.
Suggested-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp> Cc: ralf@linux-mips.org Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Borislav Petkov [Sun, 2 Nov 2008 20:40:07 +0000 (21:40 +0100)]
ide-cd: fix DMA alignment regression
e5318b531b008c79d2a0c0df06a7b8628da38e2f ("ide: use the dma safe check for
REQ_TYPE_ATA_PC") introduced a regression which caused some ATAPI drives to
turn off DMA for REQ_TYPE_BLOCK_PC commands while burning and thus degrading
performance and ultimately causing an excessive amount of underruns.
The issue is documented also in:
http://bugzilla.kernel.org/show_bug.cgi?id=11742.
Signed-off-by: Borislav Petkov <petkovbb@gmail.com> Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Tested-by: Valerio Passini <valerio.passini@unicam.it>
[bart: fixup patch description per comments from Sergei Shtylyov] Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (33 commits)
af_unix: netns: fix problem of return value
IRDA: remove double inclusion of module.h
udp: multicast packets need to check namespace
net: add documentation for skb recycling
key: fix setkey(8) policy set breakage
bpa10x: free sk_buff with kfree_skb
xfrm: do not leak ESRCH to user space
net: Really remove all of LOOPBACK_TSO code.
netfilter: nf_conntrack_proto_gre: switch to register_pernet_gen_subsys()
netns: add register_pernet_gen_subsys/unregister_pernet_gen_subsys
net: delete excess kernel-doc notation
pppoe: Fix socket leak.
gianfar: Don't reset TBI<->SerDes link if it's already up
gianfar: Fix race in TBI/SerDes configuration
at91_ether: request/free GPIO for PHY interrupt
amd8111e: fix dma_free_coherent context
atl1: fix vlan tag regression
SMC91x: delete unused local variable "lp"
myri10ge: fix stop/go mmio ordering
bonding: fix panic when taking bond interface down before removing module
...
Will Newton [Tue, 28 Oct 2008 10:52:36 +0000 (10:52 +0000)]
drivers/net/smc911x.c: Fix lockdep warning on xmit.
dev_kfree_skb should not be called with irqs disabled, use dev_kfree_skb_irq
instead. The warning caused looks like this:
======================================================
[ INFO: hard-safe -> hard-unsafe lock order detected ]
2.6.28-rc1 #273
------------------------------------------------------
swapper/0 [HC0[0]:SC1[2]:HE0:SE0] is trying to acquire:
(clock-AF_INET){-..+}, at: [<4015c17c>] _sock_def_write_space+0x28/0xd8
and this task is already holding:
(&lp->lock){++..}, at: [<4013f230>] _smc911x_hard_start_xmit+0x30/0x4b8
which would create a new lock dependency:
(&lp->lock){++..} -> (clock-AF_INET){-..+}
Signed-off-by: Will Newton <will.newton@gmail.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Sheng Yang [Fri, 31 Oct 2008 04:37:41 +0000 (12:37 +0800)]
KVM: Enable Function Level Reset for assigned device
Ideally, every assigned device should in a clear condition before and after
assignment, so that the former state of device won't affect later work.
Some devices provide a mechanism named Function Level Reset, which is
defined in PCI/PCI-e document. We should execute it before and after device
assignment.
(But sadly, the feature is new, and most device on the market now don't
support it. We are considering using D0/D3hot transmit to emulate it later,
but not that elegant and reliable as FLR itself.)
[Update: Reminded by Xiantao, execute FLR after we ensure that the device can
be assigned to the guest.]
Signed-off-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>
Rakib Mullick [Wed, 29 Oct 2008 21:13:39 +0000 (14:13 -0700)]
x86: KVM guest: fix section mismatch warning in kvmclock.c
WARNING: arch/x86/kernel/built-in.o(.text+0x1722c): Section mismatch
in reference from the function kvm_setup_secondary_clock() to the
function .devinit.text:setup_secondary_APIC_clock()
The function kvm_setup_secondary_clock() references
the function __devinit setup_secondary_APIC_clock().
This is often because kvm_setup_secondary_clock lacks a __devinit
annotation or the annotation of setup_secondary_APIC_clock is wrong.
Signed-off-by: Md.Rakib H. Mullick <rakib.mullick@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Avi Kivity <avi@redhat.com>
Max Dmitrichenko [Sun, 2 Nov 2008 07:34:10 +0000 (00:34 -0700)]
sparc64: Fix PCI resource mapping on sparc64
There is a problem discovered in recent versions of ATI Mach64 driver
in X.org on sparc64 architecture. In short, the driver fails to mmap
MMIO aperture (PCI resource #2).
I've found that kernel's __pci_mmap_make_offset() returns EINVAL. It
checks whether user attempts to mmap more than the resource length,
which is 0x1000 bytes in our case. But PAGE_SIZE on SPARC64 is 0x2000
and this is what actually is being mmaped. So __pci_mmap_make_offset()
failed for this PCI resource.
Signed-off-by: Max Dmitrichenko <dmitrmax@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Beregalov reports oops in __bzero() called from
copy_from_user_fixup() called from iov_iter_copy_from_user_atomic(),
when running dbench on tmpfs on sparc64: its __copy_from_user_inatomic
and __copy_to_user_inatomic should be avoiding, not calling, the fixups.
Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Stephen Hemminger [Sun, 2 Nov 2008 04:01:09 +0000 (21:01 -0700)]
net: add documentation for skb recycling
Commit 04a4bb55bcf35b63d40fd2725e58599ff8310dd7 ("net: add
skb_recycle_check() to enable netdriver skb recycling") added a
method for network drivers to recycle skbuffs, but while use of
this mechanism was documented in the commit message, it should
really have been added as a docbook comment as well -- this
patch does that.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: Lennert Buytenhek <buytenh@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Al Viro [Sat, 1 Nov 2008 18:20:39 +0000 (18:20 +0000)]
section fixes for cirrusfb
cirrusfb_zorro_unmap() may be called both from __devexit and (on
cleanup path) from __devinit. So it needs to be a normal function,
same as for cirrusfb_pci_unmap()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Al Viro [Sat, 1 Nov 2008 18:19:49 +0000 (18:19 +0000)]
oss: fix O_NONBLOCK in dmasound_core
We broke O_NONBLOCK handling in OSS dmasound_core in 2.3.11-pre3 - the
original code copied f_flags to open_mode and then checked for
O_NONBLOCK in there, but that got changed to copying f_mode and
O_NONBLOCK has not reached that field in any kernel version.
Since we do not care for any other bits, the fix is obvious...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Sat, 1 Nov 2008 17:36:30 +0000 (10:36 -0700)]
Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86: fix AMDC1E and XTOPOLOGY conflict in cpufeature
x86: build fix
by people adding the <linux/delay.h> include in two slightly different
places. Andrew's quilt scripts happily ignore the fuzz, and will
re-apply the patch even though they had conflicts.
Linus Torvalds [Sat, 1 Nov 2008 17:17:22 +0000 (10:17 -0700)]
x86: Clean up late e820 resource allocation
This makes the late e820 resources use 'insert_resource_expand_to_fit()'
instead of doing a 'reserve_region_with_split()', and also avoids
marking them as IORESOURCE_BUSY.
This results in us being perfectly happy to use pre-existing PCI
resources even if they were marked as being in a reserved region, while
still avoiding any _new_ allocations in the reserved regions. It also
makes for a simpler and more accurate resource tree.
Example resource allocation from Jonathan Corbet, who has firmware that
has an e820 reserved entry that covered a big range (e0000000-fed003ff),
and that had various PCI resources in it set up by firmware.
With old kernels, the reserved range would force us to re-allocate all
pre-existing PCI resources, and his reserved range would end up looking
like this:
and because the reserved entry had been split and moved into the
individual resources, and because it used the IORESOURCE_BUSY flag, the
drivers that actually wanted to _use_ those resources couldn't actually
attach to them:
e1000e 0000:00:19.0: BAR 0: can't reserve mem region [0xfe9e0000-0xfe9fffff]
HDA Intel 0000:00:1b.0: BAR 0: can't reserve mem region [0xfe9dc000-0xfe9dffff]
with this patch, the resource tree instead becomes
ie the one reserved region now ends up surrounding all the PCI resources
that were allocated inside of it by firmware, and because it is not
marked BUSY, drivers have no problem attaching to the pre-allocated
resources.
Reported-and-tested-by: Jonathan Corbet <corbet@lwn.net> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Robert Hancock <hancockr@shaw.ca> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>