Currently, GART alloc_coherent tries to allocate pages with GFP_DMA32
for a device having dma_masks > 24bit < 32bits. If GART gets an
address that a device can't access to, GART try to map the address to
a virtual I/O address that the device can access to.
But Andi pointed out, "The GART is somewhere in the 4GB range so you
cannot use it to map anything < 4GB. Also GART is pretty small."
http://lkml.org/lkml/2008/9/12/43
That is, it's possible that GART doesn't have virtual I/O address
space that a device can access to. The above behavior doesn't work for
a device having dma_masks > 24bit < 32bits.
This patch restores old GART alloc_coherent behavior (before the
alloc_coherent rewrite).
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This reverts:
commit bee44f294e
Author: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Date: Fri Sep 12 19:42:35 2008 +0900
x86: make GART to respect device's dma_mask about virtual mappings
I wrote the above commit to fix a GART alloc_coherent regression, that
can't handle a device having dma_masks > 24bit < 32bits, introduced by
the alloc_coherent rewrite:
http://lkml.org/lkml/2008/8/12/200
After the alloc_coherent rewrite, GART alloc_coherent tried to
allocate pages with GFP_DMA32. If GART got an address that a device
can't access to, GART mapped the address to a virtual I/O address. But
GART mapping mechanism didn't take account of dma mask, so GART could
use a virtual I/O address that the device can't access to again.
Alan pointed out:
" This is indeed a specific problem found with things like older
AACRAID where control blocks must be below 31bits and the GART
is above 0x80000000. "
The above commit modified GART mapping mechanism to take care of dma
mask. But Andi pointed out, "The GART is somewhere in the 4GB range so
you cannot use it to map anything < 4GB. Also GART is pretty small."
http://lkml.org/lkml/2008/9/12/43
That means it's possible that GART doesn't have virtual I/O address
space that a device can access to. The above commit (to modify GART
mapping mechanism to take care of dma mask) can't fix the regression
reliably so let's avoid making GART more complicated.
We need a solution that always works for dma_masks > 24bit <
32bits. That's how GART worked before the alloc_coherent rewrite.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Joerg Roedel <joerg.roedel@amd.com>
Acked-by: Alan Cox <alan@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Make 32-bit setup_rt_frame() look like 64-bit version for unification.
Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Introduce new macro is_ia32 for unification of setup_rt_frame().
No effect in binary, compiler will optimize.
Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This helper function is for unification of setup_rt_frame().
No effect in binary.
Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Introduce signr_convert().
This function will help unification of setup_rt_frame().
Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86: prevent stale state of c1e_mask across CPU offline/online, fix
(1) mark mc_size in generic_load_microcode() as unitialized_var to avoid
gcc's (false) warning;
(2) mark request_microcode_user() as unsupported. The required changes
can be added later. Note, we don't break any user-space interfaces
here, as there were no kernels with support for AMD-specific ucode
update yet. The ucode has to be updated via 'firmware'.
Signed-off-by: Dmitry Adamushko <dmitry.adamushko@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Boardrev is always treated as a u32 everywhere else, no reason to
byteswap the 0xc2 value. The only use is to print out if it is
a prerelease board, the test being:
(olpc_platform_info.boardrev & 0xf) < 8
Which is currently always true as be32_to_cpu(0xc2) & 0xf = 0
but I doubt that was the intention here. The consequences of the bug
are pretty minor though (incorrect boardrev displayed in dmesg when
ofw support not configured)
Also annotate the temporary used to read the boardrev in the ofw
case.
The confusion was noticed by Sparse:
arch/x86/kernel/olpc.c:206:32: warning: cast to restricted __be32
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
fix:
arch/x86/kernel/ptrace.c:763:29: warning: Using plain integer as NULL pointer
arch/x86/kernel/ptrace.c:777:46: warning: Using plain integer as NULL pointer
arch/x86/kernel/ptrace.c:1115:45: warning: Using plain integer as NULL pointer
arch/x86/kernel/ds.c:482:26: warning: Using plain integer as NULL pointer
arch/x86/kernel/ds.c:487:25: warning: Using plain integer as NULL pointer
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Acked-by: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The OLPC doesn't support APM but also doesn't have DMI, so we can't detect
and disable it based on DMI data. So, just disable based on machine_is_olpc()
Signed-off-by: Jeremy Katz <katzj@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Fix build error introduced by commit 4faac97d44 ("x86: prevent stale
state of c1e_mask across CPU offline/online").
process_32.c needs to include idle.h to get the prototype for
c1e_remove_cpu()
Signed-off-by: Marc Dionne <marc.c.dionne@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
timers: fix build error in !oneshot case
x86: c1e_idle: don't mark TSC unstable if CPU has invariant TSC
x86: prevent C-states hang on AMD C1E enabled machines
clockevents: prevent mode mismatch on cpu online
clockevents: check broadcast device not tick device
clockevents: prevent stale tick_next_period for onlining CPUs
x86: prevent stale state of c1e_mask across CPU offline/online
clockevents: prevent cpu online to interfere with nohz
Renaming based on patch from Dmitry Adamushko.
Further clarification by renaming define and variable related to
microcode container file.
Signed-off-by: Peter Oruba <peter.oruba@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Renaming based on patch from Dmitry Adamushko.
Made code more readable by renaming define and variables related
to microcode _container_file_ header to make it distinguishable from
microcode _patch_ header.
Signed-off-by: Peter Oruba <peter.oruba@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Currently a SIGTRAP can denote any one of below reasons.
- Breakpoint hit
- H/W debug register hit
- Single step
- Signal sent through kill() or rasie()
Architectures like powerpc/parisc provides infrastructure to demultiplex
SIGTRAP signal by passing down the information for receiving SIGTRAP through
si_code of siginfot_t structure. Here is an attempt is generalise this
infrastructure by extending it to x86 and x86_64 archs.
Signed-off-by: Srinivasa DS <srinivasa@in.ibm.com>
Cc: Roland McGrath <roland@redhat.com>
Cc: akpm@linux-foundation.org
Cc: paulus@samba.org
Cc: linuxppc-dev@ozlabs.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Combine both generic and arch-specific parts of microcode into a
single module (arch-specific parts are config-dependent).
Also while we are at it, move arch-specific parts from microcode.h
into their respective arch-specific .c files.
Signed-off-by: Dmitry Adamushko <dmitry.adamushko@gmail.com>
Cc: "Peter Oruba" <peter.oruba@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: Functional TSC is marked unstable on AMD family 0x10 and 0x11 CPUs.
This would be wrong because for those CPUs "invariant TSC" means:
"The TSC counts at the same rate in all P-states, all C states, S0,
or S1"
(See "Processor BIOS and Kernel Developer's Guides" for those CPUs.)
[ tglx: Changed C1E to AMD C1E in the printks to avoid confusion
with Intel C1E ]
Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Impact: System hang when AMD C1E machines switch into C2/C3
AMD C1E enabled systems do not work with normal ACPI C-states
even if the BIOS is advertising them. Limit the C-states to
C1 for the ACPI processor idle code.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Impact: hang which happens across CPU offline/online on AMD C1E systems.
When a CPU goes offline then the corresponding bit in the broadcast
mask is cleared. For AMD C1E enabled CPUs we do not reenable the
broadcast when the CPU comes online again as we do not clear the
corresponding bit in the c1e_mask, which keeps track which CPUs
have been switched to broadcast already. So on those !$@#& machines
we never switch back to broadcasting after a CPU offline/online cycle.
Clear the bit when the CPU plays dead.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
27-rc fails to boot up if configured to use modules.
Turns out vsmp_patch was marked __init, and vsmp_patch being the
pvops 'patch' routine for vsmp, a call to vsmp_patch just turns out
to execute a code page with series of 0xcc (POISON_FREE_INITMEM -- int3).
vsmp_patch has been marked with __init ever since pvops, however,
apply_paravirt can be called during module load causing calls to
freed memory location.
Since apply_paravirt can only be called during init/module load, make
vsmp_patch with "__init_or_module"
Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
DMI tables need a blank NULL tail.
fixes the crash on Ingo's test box.
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Make use of FW_BUG interface to give vendors and users the ability to
automatically check for powernow-k8 related BIOS bugs by:
dmesg |grep "Firmware Bug"
Signed-off-by: Thomas Renninger <trenn@suse.de>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
This patch against tip/x86/iommu virtually reverts
2842e5bf31. But just reverting the
commit breaks AMD IOMMU so this patch also includes some fixes.
The above commit adds new two options to x86 IOMMU generic kernel boot
options, fullflush and nofullflush. But such change that affects all
the IOMMUs needs more discussion (all IOMMU parties need the chance to
discuss it):
http://lkml.org/lkml/2008/9/19/106
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
There's a small window when NMI watchdog is being set up that if any NMIs
are triggered, the NMI code will make make use of not initalized wd_ops
elements:
void setup_apic_nmi_watchdog(void *unused)
{
if (__get_cpu_var(wd_enabled))
return;
/* cheap hack to support suspend/resume */
/* if cpu0 is not active neither should the other cpus */
if (smp_processor_id() != 0 && atomic_read(&nmi_active) <= 0)
return;
switch (nmi_watchdog) {
case NMI_LOCAL_APIC:
/* enable it before to avoid race with handler */
--> __get_cpu_var(wd_enabled) = 1;
--> if (lapic_watchdog_init(nmi_hz) < 0) {
(...)
asmlinkage notrace __kprobes void default_do_nmi(struct pt_regs *regs)
{
(...)
if (nmi_watchdog_tick(regs, reason))
return;
(...)
notrace __kprobes int
nmi_watchdog_tick(struct pt_regs *regs, unsigned reason)
{
(...)
if (!__get_cpu_var(wd_enabled))
return rc;
switch (nmi_watchdog) {
case NMI_LOCAL_APIC:
rc |= lapic_wd_event(nmi_hz);
(...)
int lapic_wd_event(unsigned nmi_hz)
{
struct nmi_watchdog_ctlblk *wd = &__get_cpu_var(nmi_watchdog_ctlblk);
u64 ctr;
--> rdmsrl(wd->perfctr_msr, ctr);
and wd->*_msr will be initialized on each processor type specific setup, after
enabling NMIs for PMIs. Since the counter was just set, the chances of an
performance counter generated NMI is minimal, but any other unknown NMI would
trigger the problem. This patch fixes the problem by setting everything up
before enabling performance counter generated NMIs and will set wd_enabled
using a callback function.
Signed-off-by: Aristeu Rozanski <aris@redhat.com>
Acked-by: Don Zickus <dzickus@redhat.com>
Acked-by: Prarit Bhargava <prarit@redhat.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
P4s have a quirk that makes necessary to clear P4_CCCR_OVF bit on the CCCR
everytime the PMI is triggered. When booting the kernel with reset_devices
(more specific kdump case), the counters reach zero and the PMI will be
generated. This is not a problem on other processors but on P4s, it'll
continue to generate NMIs until that bit is cleared. Since there may be
other users of the performance counters, clear and disable all of them
when booting with reset_devices option.
We have a P4 box here that crashes because of this problem. Since the kdump
kernel usually boots with only one processor active, the second logical
unit won't be set up, therefore, MSR_P4_IQ_CCCR1 (and other performance
counter registers) won't be cleared and P4_CCCR_OVF may be still set because
the previous kernel was using this register. An NMI is triggered because of
the MSR_P4_IQ_CCCR1 right after the NMI delivery is enabled, triggering the
race fixed on my previous email.
Signed-off-by: Aristeu Rozanski <aris@redhat.com>
Acked-by: Don Zickus <dzickus@redhat.com>
Acked-by: Prarit Bhargava <prarit@redhat.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
x86 has set_bit_string() that does the exact same thing that
set_bit_area() in lib/iommu-helper.c does.
This patch exports set_bit_area() in lib/iommu-helper.c as
iommu_area_reserve(), converts GART, Calgary, and AMD IOMMU to use it.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
so could help catch attention about bug in bios about mtrr mask setting.
WARN_ONCE got into mainline already, lets use it.
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Add a check for ioremap() failure in copy_oldmem_page().
This patch also includes small coding style fixes.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The bad_bios_dmi_table() quirk never triggered because we do DMI setup
too late. Move it a bit earlier.
Also change the CONFIG_X86_RESERVE_LOW_64K quirk to operate on the e820
table directly instead of messing with early reservations - this handles
overlaps (which do occur in this low range of RAM) more gracefully.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
the exact timing of the corruption check isn't too important (it's once a
minute timer), use round_jiffies() to align it and avoid extra wakeups.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The alloc_coherent implementation for AMD IOMMU currently uses
*dev->dma_mask per default. This patch changes it to prefer
dev->coherent_dma_mask if it is set.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The command buffer release function uses the CMD_BUF_SIZE macro for
get_order. Replace this with iommu->cmd_buf_size which is more reliable
about the actual size of the buffer.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The current calculation of the IVHD entry size is hard to read. So move
this code to a seperate function to make it more clear what this
calculation does.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The ctrl variable is only u32 and readl also returns a 32 bit value. So
the cast to u64 is pointless. Remove it with this patch.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The amd_iommu_pd_alloc_bitmap is allocated with a calculated order and
freed with order 1. This is not a bug since the calculated order always
evaluates to 1, but its unclean code. So replace the 1 with the
calculation in the release path.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The current calculation is very complicated. This patch replaces it with
a much simpler version.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Remove the memset and use __GFP_ZERO at allocation time instead.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
x86's common alloc_coherent (dma_alloc_coherent in dma-mapping.h) sets
up the gfp flag according to the device dma_mask but AMD IOMMU doesn't
need it for devices that the IOMMU can do virtual mappings for. This
patch avoids unnecessary low zone allocation.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>