Commit Graph

31810 Commits

Author SHA1 Message Date
Ingo Molnar
b814d41f09 x86, mm: fault.c, simplify kmmio_fault()
Impact: cleanup

Remove an #ifdef from kmmio_fault() - we can do this by
providing default implementations for is_kmmio_active()
and kmmio_handler(). The compiler optimizes it all away
in the !CONFIG_MMIOTRACE case.

Also, while at it, clean up mmiotrace.h a bit:

 - standard header guards
 - standard vertical spaces for structure definitions

No code changed (both with mmiotrace on and off in the config):

   text	   data	    bss	    dec	    hex	filename
   2947	     12	     12	   2971	    b9b	fault.o.before
   2947	     12	     12	   2971	    b9b	fault.o.after

Cc: Pekka Paalanen <pq@iki.fi>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-21 00:09:42 +01:00
Ingo Molnar
121d5d0a7e x86, mm: fault.c, enable PF_RSVD checks on 32-bit too
Impact: improve page fault handling robustness

The 'PF_RSVD' flag (bit 3) of the page-fault error_code is a
relatively recent addition to x86 CPUs, so the 32-bit do_fault()
implementation never had it. This flag gets set when the CPU
detects nonzero values in any reserved bits of the page directory
entries.

Extend the existing 64-bit check for PF_RSVD in do_page_fault()
to 32-bit too. If we detect such a fault then we print a more
informative oops and the pagetables.

This unifies the code some more, removes an ugly #ifdef and improves
the 32-bit page fault code robustness a bit. It slightly increases
the 32-bit kernel text size.

Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-21 00:09:41 +01:00
Ingo Molnar
8c938f9fae x86, mm: fault.c, factor out the vm86 fault check
Impact: cleanup

Instead of an ugly, open-coded, #ifdef-ed vm86 related legacy check
in do_page_fault(), put it into the check_v8086_mode() helper
function and merge it with an existing #ifdef.

Also, simplify the code flow a tiny bit in the helper.

No code changed:

arch/x86/mm/fault.o:

   text	   data	    bss	    dec	    hex	filename
   2711	     12	     12	   2735	    aaf	fault.o.before
   2711	     12	     12	   2735	    aaf	fault.o.after

Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-21 00:09:41 +01:00
Ingo Molnar
107a03678c x86, mm: fault.c, refactor/simplify the is_prefetch() code
Impact: no functionality changed

Factor out the opcode checker into a helper inline.

The code got a tiny bit smaller:

   text	   data	    bss	    dec	    hex	filename
   4632	     32	     24	   4688	   1250	fault.o.before
   4618	     32	     24	   4674	   1242	fault.o.after

And it got cleaner / easier to review as well.

Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-21 00:09:40 +01:00
Ingo Molnar
2d4a71676f x86, mm: fault.c cleanup
Impact: cleanup, no code changed

Clean up various small details, which can be correctness checked
automatically:

 - tidy up the include file section
 - eliminate unnecessary includes
 - introduce show_signal_msg() to clean up code flow
 - standardize the code flow
 - standardize comments and other style details
 - more cleanups, pointed out by checkpatch

No code changed on either 32-bit nor 64-bit:

arch/x86/mm/fault.o:

   text	   data	    bss	    dec	    hex	filename
   4632	     32	     24	   4688	   1250	fault.o.before
   4632	     32	     24	   4688	   1250	fault.o.after

the md5 changed due to a change in a single instruction:

   2e8a8241e7f0d69706776a5a26c90bc0  fault.o.before.asm
   c5c3d36e725586eb74f0e10692f0193e  fault.o.after.asm

Because a __LINE__ reference in a WARN_ONCE() has changed.

On 32-bit a few stack offsets changed - no code size difference
nor any functionality difference.

Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-21 00:09:39 +01:00
Alok Kataria
fdb17aeb28 x86, vmi: TSC going backwards check in vmi clocksource, cleanup
clean up vmi_read_cycles to use max()

Reported-b: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Alok N Kataria <akataria@vmware.com>
Cc: Zach Amsden <zach@vmware.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-20 19:31:03 +01:00
Ingo Molnar
c9e1585b1b Merge branch 'tip/x86/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into x86/mm 2009-02-20 18:51:43 +01:00
Ingo Molnar
7a5714e018 x86, pat: add large-PAT check to split_large_page()
Impact: future-proof the split_large_page() function

Linus noticed that split_large_page() is not safe wrt. the
PAT bit: it is bit 12 on the 1GB and 2MB page table level
(_PAGE_BIT_PAT_LARGE), and it is bit 7 on the 4K page
table level (_PAGE_BIT_PAT).

Currently it is not a problem because we never set
_PAGE_BIT_PAT_LARGE on any of the large-page mappings - but
should this happen in the future the split_large_page() would
silently lift bit 12 into the lowlevel 4K pte and would start
corrupting the physical page frame offset. Not fun.

So add a debug warning, to make sure if something ever sets
the PAT bit then this function gets updated too.

Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-20 17:48:49 +01:00
Steven Rostedt
3c3e5694ad x86: check PMD in spurious_fault handler
Impact: fix to prevent hard lockup on bad PMD permissions

If the PMD does not have the correct permissions for a page access,
but the PTE does, the spurious fault handler will mistake the fault
as a lazy TLB transaction. This will result in an infinite loop of:

 fault -> spurious_fault check (pass) -> return to code -> fault

This patch adds a check and a warn on if the PTE passes the permissions
but the PMD does not.

[ Updated: Ingo Molnar suggested using WARN_ONCE with some text ]

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-02-20 11:44:47 -05:00
Ingo Molnar
609162850d Merge branches 'x86/asm', 'x86/cleanups' and 'x86/headers' into x86/core 2009-02-20 17:40:50 +01:00
Ingo Molnar
3b6f7b9beb Merge branch 'x86/urgent' into x86/core 2009-02-20 17:40:43 +01:00
Vegard Nossum
ecab22aa6d x86: use symbolic constants for MSR_IA32_MISC_ENABLE bits
Impact: Cleanup. No functional changes.

Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-20 12:07:43 +01:00
Ingo Molnar
07a66d7c53 x86: use the right protections for split-up pagetables
Steven Rostedt found a bug in where in his modified kernel
ftrace was unable to modify the kernel text, due to the PMD
itself having been marked read-only as well in
split_large_page().

The fix, suggested by Linus, is to not try to 'clone' the
reference protection of a huge-page, but to use the standard
(and permissive) page protection bits of KERNPG_TABLE.

The 'cloning' makes sense for the ptes but it's a confused and
incorrect concept at the page table level - because the
pagetable entry is a set of all ptes and hence cannot
'clone' any single protection attribute - the ptes can be any
mixture of protections.

With the permissive KERNPG_TABLE, even if the pte protections
get changed after this point (due to ftrace doing code-patching
or other similar activities like kprobes), the resulting combined
protections will still be correct and the pte's restrictive
(or permissive) protections will control it.

Also update the comment.

This bug was there for a long time but has not caused visible
problems before as it needs a rather large read-only area to
trigger. Steve possibly hacked his kernel with some really
large arrays or so. Anyway, the bug is definitely worth fixing.

[ Huang Ying also experienced problems in this area when writing
  the EFI code, but the real bug in split_large_page() was not
  realized back then. ]

Reported-by: Steven Rostedt <rostedt@goodmis.org>
Reported-by: Huang Ying <ying.huang@intel.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-20 08:35:03 +01:00
Tejun Heo
11124411aa x86: convert to the new dynamic percpu allocator
Impact: use new dynamic allocator, unified access to static/dynamic
        percpu memory

Convert to the new dynamic percpu allocator.

* implement populate_extra_pte() for both 32 and 64
* update setup_per_cpu_areas() to use pcpu_setup_static()
* define __addr_to_pcpu_ptr() and __pcpu_ptr_to_addr()
* define config HAVE_DYNAMIC_PER_CPU_AREA

Signed-off-by: Tejun Heo <tj@kernel.org>
2009-02-20 16:29:09 +09:00
Tejun Heo
f0aa661790 vmalloc: implement vm_area_register_early()
Impact: allow multiple early vm areas

There are places where kernel VM area needs to be allocated before
vmalloc is initialized.  This is done by allocating static vm_struct,
initializing several fields and linking it to vmlist and later vmalloc
initialization picking up these from vmlist.  This is currently done
manually and if there's more than one such areas, there's no defined
way to arbitrate who gets which address.

This patch implements vm_area_register_early(), which takes vm_area
struct with flags and size initialized, assigns address to it and puts
it on the vmlist.  This way, multiple early vm areas can determine
which addresses they should use.  The only current user - alpha mm
init - is converted to use it.

Signed-off-by: Tejun Heo <tj@kernel.org>
2009-02-20 16:29:08 +09:00
Rusty Russell
b36128c830 alloc_percpu: change percpu_ptr to per_cpu_ptr
Impact: cleanup

There are two allocated per-cpu accessor macros with almost identical
spelling.  The original and far more popular is per_cpu_ptr (44
files), so change over the other 4 files.

tj: kill percpu_ptr() and update UP too

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: mingo@redhat.com
Cc: lenb@kernel.org
Cc: cpufreq@vger.kernel.org
Signed-off-by: Tejun Heo <tj@kernel.org>
2009-02-20 16:29:08 +09:00
Lai Jiangshan
42f8faecf7 x86: use percpu data for 4k hardirq and softirq stacks
Impact: economize memory for large NR_CPUS

percpu data is setup earlier than irq, we can use percpu data
to economize memory.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2009-02-20 16:26:10 +09:00
Alok N Kataria
48ffc70b67 x86, vmi: TSC going backwards check in vmi clocksource
Impact: fix time warps under vmware

Similar to the check for TSC going backwards in the TSC clocksource,
we also need this check for VMI clocksource.

Signed-off-by: Alok N Kataria <akataria@vmware.com>
Cc: Zachary Amsden <zach@vmware.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: stable@kernel.org
2009-02-20 07:53:08 +01:00
Linus Torvalds
a5e7536388 Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6:
  [IA64] xen_domu build fix
  [IA64] fixes configs and add default config for ia64 xen domU
  [IA64] Remove redundant cpu_clear() in __cpu_disable path
  [IA64] Revert "prevent ia64 from invoking irq handlers on offline CPUs"
  [IA64] bte_copy of BTE_MAX_XFER trips BUG_ON.
  [IA64] Build fix for __early_pfn_to_nid() undefined link error
2009-02-19 13:09:20 -08:00
Tony Luck
ec8148de85 [IA64] xen_domu build fix
arch/ia64/xen/xen_pv_ops.c:156: error: xen_init_ops causes a section type conflict
arch/ia64/xen/xen_pv_ops.c:340: error: xen_iosapic_ops causes a section type conflict

Signed-off-by: Tony Luck <tony.luck@intel.com>
2009-02-19 12:05:00 -08:00
Isaku Yamahata
1d5b20f490 [IA64] fixes configs and add default config for ia64 xen domU
This patch fixes xen related Kconfigs and add default config
file for ia64 xen domU.

Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp>
Signed-off-by: Tony Luck <aegl@agluck-desktop.(none)>
2009-02-19 11:39:06 -08:00
Alex Chiang
c0acdea214 [IA64] Remove redundant cpu_clear() in __cpu_disable path
The second call to cpu_clear() is redundant, as we've already removed
the CPU from cpu_online_map before calling migrate_platform_irqs().

Signed-off-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Tony Luck <aegl@agluck-desktop.(none)>
2009-02-19 11:32:50 -08:00
Alex Chiang
66db2e6331 [IA64] Revert "prevent ia64 from invoking irq handlers on offline CPUs"
This reverts commit e7b140365b.

Commit e7b14036 removes the targetted disabled CPU from the
cpu_online_map after calls to migrate_platform_irqs and fixup_irqs.

Paul McKenney states that the reasoning behind the patch was to
prevent irq handlers from running on CPUs marked offline because:

	RCU happily ignores CPUs that don't have their bits set in
	cpu_online_map, so if there are RCU read-side critical sections
	in the irq handlers being run, RCU will ignore them.  If the
	other CPUs were running, they might sequence through the RCU
	state machine, which could result in data structures being
	yanked out from under those irq handlers, which in turn could
	result in oopses or worse.

Unfortunately, both ia64 functions above look at cpu_online_map to find
a new CPU to migrate interrupts onto. This means we can potentially
migrate an interrupt off ourself back to... ourself. Uh oh.

This causes an oops when we finally try to process pending interrupts on
the CPU we want to disable. The oops results from calling __do_IRQ with
a NULL pt_regs:

Unable to handle kernel NULL pointer dereference (address 0000000000000040)
Call Trace:
 [<a000000100016930>] show_stack+0x50/0xa0
                                sp=e0000009c922fa00 bsp=e0000009c92214d0
 [<a0000001000171a0>] show_regs+0x820/0x860
                                sp=e0000009c922fbd0 bsp=e0000009c9221478
 [<a00000010003c700>] die+0x1a0/0x2e0
                                sp=e0000009c922fbd0 bsp=e0000009c9221438
 [<a0000001006e92f0>] ia64_do_page_fault+0x950/0xa80
                                sp=e0000009c922fbd0 bsp=e0000009c92213d8
 [<a00000010000c7a0>] ia64_native_leave_kernel+0x0/0x270
                                sp=e0000009c922fc60 bsp=e0000009c92213d8
 [<a0000001000ecdb0>] profile_tick+0xd0/0x1c0
                                sp=e0000009c922fe30 bsp=e0000009c9221398
 [<a00000010003bb90>] timer_interrupt+0x170/0x3e0
                                sp=e0000009c922fe30 bsp=e0000009c9221330
 [<a00000010013a800>] handle_IRQ_event+0x80/0x120
                                sp=e0000009c922fe30 bsp=e0000009c92212f8
 [<a00000010013aa00>] __do_IRQ+0x160/0x4a0
                                sp=e0000009c922fe30 bsp=e0000009c9221290
 [<a000000100012290>] ia64_process_pending_intr+0x2b0/0x360
                                sp=e0000009c922fe30 bsp=e0000009c9221208
 [<a0000001000112d0>] fixup_irqs+0xf0/0x2a0
                                sp=e0000009c922fe30 bsp=e0000009c92211a8
 [<a00000010005bd80>] __cpu_disable+0x140/0x240
                                sp=e0000009c922fe30 bsp=e0000009c9221168
 [<a0000001006c5870>] take_cpu_down+0x50/0xa0
                                sp=e0000009c922fe30 bsp=e0000009c9221148
 [<a000000100122610>] stop_cpu+0xd0/0x200
                                sp=e0000009c922fe30 bsp=e0000009c92210f0
 [<a0000001000e0440>] kthread+0xc0/0x140
                                sp=e0000009c922fe30 bsp=e0000009c92210c8
 [<a000000100014ab0>] kernel_thread_helper+0xd0/0x100
                                sp=e0000009c922fe30 bsp=e0000009c92210a0
 [<a00000010000a4c0>] start_kernel_thread+0x20/0x40
                                sp=e0000009c922fe30 bsp=e0000009c92210a0

I don't like this revert because it is fragile. ia64 is getting lucky
because we seem to only ever process timer interrupts in this path, but
if we ever race with an IPI here, we definitely use RCU and have the
potential of hitting an oops that Paul describes above.

Patching ia64's timer_interrupt() to check for NULL pt_regs is
insufficient though, as we still hit the above oops.

As a short term solution, I do think that this revert is the right
answer. The revert hold up under repeated testing (24+ hour test runs)
with this setup:

	- 8-way rx6600
	- randomly toggling CPU online/offline state every 2 seconds
	- running CPU exercisers, memory hog, disk exercisers, and
	  network stressors
	- average system load around ~160

In the long term, we really need to figure out why we set pt_regs = NULL
in ia64_process_pending_intr(). If it turns out that it is unnecessary
to do so, then we could safely re-introduce e7b14036 (along with some
other logic to be smarter about migrating interrupts).

One final note: x86 also removes the disabled CPU from cpu_online_map
and then re-enables interrupts for 1ms, presumably to handle any pending
interrupts:

arch/x86/kernel/irq_32.c (and irq_64.c):
cpu_disable_common:
	[remove cpu from cpu_online_map]

	fixup_irqs():
		for_each_irq:
			[break CPU affinities]

		local_irq_enable();
		mdelay(1);
		local_irq_disable();

So they are doing implicitly what ia64 is doing explicitly.

Signed-off-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Tony Luck <aegl@agluck-desktop.(none)>
2009-02-19 11:32:26 -08:00
Robin Holt
39d481cba2 [IA64] bte_copy of BTE_MAX_XFER trips BUG_ON.
BTE_MAX_XFER is wrong.  It is one greater than the number of cache
lines the BTE is actually able to transfer.  If you request a transfer
of exactly BTE_MAX_XFER size, you trip a very cryptic BUG_ON() which
should certainly be made more clear.

This patch fixes that constant and also cleans up the BUG_ON()s in
arch/ia64/sn/kernel/bte.c to test one condition per line.

Signed-off-by: Robin Holt <holt@sgi.com>
Signed-off-by: Tony Luck <aegl@agluck-desktop.(none)>
2009-02-19 11:29:31 -08:00
Tony Luck
334f85b647 [IA64] Build fix for __early_pfn_to_nid() undefined link error
ia64 only defines __early_pfn_to_nid() for SPARSEMEM && NUMA configurations,
so the recent:

	commit: f2dbcfa738
	mm: clean up for early_pfn_to_nid()

ends up with some link problems for certain configuration files.

Fix arch/ia64/Kconfig to only define HAVE_ARCH_EARLY_PFN_TO_NID in the
cases where we do provide this function.

Signed-off-by: Tony Luck <tony.luck@intel.com>
2009-02-19 11:22:36 -08:00
Linus Torvalds
402a917aca Merge master.kernel.org:/home/rmk/linux-2.6-arm
* master.kernel.org:/home/rmk/linux-2.6-arm:
  [ARM] 5405/1: ep93xx: remove unused gesbc9312.h header
  [ARM] 5404/1: Fix condition in arm_elf_read_implies_exec() to set READ_IMPLIES_EXEC
  [ARM] omap: fix clock reparenting in omap2_clk_set_parent()
  [ARM] 5403/1: pxa25x_ep_fifo_flush() *ep->reg_udccs always set to 0
  [ARM] 5402/1: fix a case of wrap-around in sanity_check_meminfo()
  [ARM] 5401/1: Orion: fix edge triggered GPIO interrupt support
  [ARM] 5400/1: Add support for inverted rdy_busy pin for Atmel nand device controller
  [ARM] 5391/1: AT91: Enable GPIO clocks earlier
  [ARM] 5390/1: AT91: Watchdog fixes
  [ARM] 5398/1: Add Wan ZongShun to MAINTAINERS for W90P910
  [ARM] omap: fix _omap2_clksel_get_src_field()
  [ARM] omap: fix omap2_divisor_to_clksel() error return value
2009-02-19 09:52:12 -08:00
Ingo Molnar
e9ce0c37c2 Merge branch 'x86/untangle2' of git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen into x86/headers 2009-02-19 18:15:01 +01:00
Linus Torvalds
bcf8951fc2 Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86, mce: fix ifdef for 64bit thermal apic vector clear on shutdown
  x86, mce: use force_sig_info to kill process in machine check
  x86, mce: reinitialize per cpu features on resume
  x86, rcu: fix strange load average and ksoftirqd behavior
2009-02-19 09:14:35 -08:00
Hartley Sweeten
9dd446f657 [ARM] 5405/1: ep93xx: remove unused gesbc9312.h header
Remove the gesbc9312.h header since it is unused.

Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2009-02-19 16:13:02 +00:00
Cyrill Gorcunov
cb425afd21 x86: compressed head_32 - use ENTRY,ENDPROC macros
Impact: clenaup

Linker script will put startup_32 at predefined
address so using startup_32 will not bloat the
code size.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-19 17:13:01 +01:00
Cyrill Gorcunov
2d4eeecb98 x86: compressed head_64 - use ENTRY,ENDPROC macros
Impact: clenaup

Linker script will put startup_32 at predefined
address so using ENTRY will not bloat the code
size.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-19 17:13:01 +01:00
Cyrill Gorcunov
324bda9e47 x86: pmjump - use GLOBAL,ENDPROC macros
Impact: cleanup

We are in setup stage so we use GLOBAL
instead of ENTRY and do not increase code
size.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-19 17:13:00 +01:00
Cyrill Gorcunov
2f79555097 x86: copy.S - use GLOBAL,ENDPROC macros
Impact: cleanup

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-19 17:13:00 +01:00
Cyrill Gorcunov
1b25f3b4e1 x86: linkage - get rid of _X86 macros
Impact: cleanup

There was an attempt to bring build-time checking for
missed ENTRY_X86/END_X86 and KPROBE... pairs. Using
them will add messy in code. Get just rid of them.
This commit could be easily restored if the need appear
in future.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-19 17:12:59 +01:00
Cyrill Gorcunov
95695547a7 x86: asm linkage - introduce GLOBAL macro
If the code is time critical and this entry is called
from other places we use ENTRY to have it globally defined
and especially aligned.

Contrary we have some snippets which are size
critical. So we use plane ".globl name; name:"
directive. Introduce GLOBAL macro for this.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-19 17:12:59 +01:00
Makito SHIOKAWA
9da616fb99 [ARM] 5404/1: Fix condition in arm_elf_read_implies_exec() to set READ_IMPLIES_EXEC
READ_IMPLIES_EXEC must be set when:
o binary _is_ an executable stack (i.e. not EXSTACK_DISABLE_X)
o processor architecture is _under_ ARMv6 (XN bit is supported from ARMv6)

Signed-off-by: Makito SHIOKAWA <lkhmkt@gmail.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2009-02-19 14:45:27 +00:00
Heiko Carstens
23d75d9cad [S390] fix "mem=" handling in case of standby memory
Standby memory detected with the sclp interface gets always registered
with add_memory calls without considering the limitationt that the
"mem=" kernel paramater implies.
So fix this and only register standby memory that is below the specified
limit.
This fixes zfcpdump since it uses "mem=32M". In case there is appr.
2GB standby memory present all of usable memory would be used for the
struct pages needed for standby memory.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-02-19 15:19:19 +01:00
Christian Borntraeger
d5cd0343d2 [S390] Fix timeval regression on s390
commit aa5e97ce4b
[PATCH] improve precision of process accounting.

Introduced a timing regression:
-bash-3.2# time ls
real    0m0.006s
user    0m1.754s
sys     0m1.094s

The problem was introduced by an error in cputime_to_timeval.
Cputime is now 1/4096 microsecond, therefore, we have to divide
the remainder with 4096 to get the microseconds.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-02-19 15:19:19 +01:00
Russell King
41f3103fcf [ARM] omap: fix clock reparenting in omap2_clk_set_parent()
When changing the parent of a clock, it is necessary to keep the
clock use counts balanced otherwise things the parent state will
get corrupted.  Since we already disable and re-enable the clock,
we might as well use the recursive versions instead.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2009-02-19 13:25:16 +00:00
Hiroshi Shimamoto
71d8f9784a x86: syscalls.h: remove asmlinkage from declaration of sys_rt_sigreturn()
Impact: cleanup

asmlinkage for sys_rt_sigreturn() no longer exists in arch/x86/kernel/signal.c.

Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-19 12:18:54 +01:00
Nicolas Pitre
3fd9825c42 [ARM] 5402/1: fix a case of wrap-around in sanity_check_meminfo()
In the non highmem case, if two memory banks of 1GB each are provided,
the second bank would evade suppression since its virtual base would
be 0.  Fix this by disallowing any memory bank which virtual base
address is found to be lower than PAGE_OFFSET.

Reported-by: Lennert Buytenhek <buytenh@marvell.com>

Signed-off-by: Nicolas Pitre <nico@marvell.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2009-02-19 09:49:45 +00:00
Jaswinder Singh Rajput
de5483029b x86: include/asm/processor.h remove double declaration of print_cpu_info
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-19 10:12:18 +01:00
KAMEZAWA Hiroyuki
cc2559bccc mm: fix memmap init for handling memory hole
Now, early_pfn_in_nid(PFN, NID) may returns false if PFN is a hole.
and memmap initialization was not done. This was a trouble for
sparc boot.

To fix this, the PFN should be initialized and marked as PG_reserved.
This patch changes early_pfn_in_nid() return true if PFN is a hole.

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Reported-by: David Miller <davem@davemlloft.net>
Tested-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: <stable@kernel.org>		[2.6.25.x, 2.6.26.x, 2.6.27.x, 2.6.28.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-02-18 15:37:55 -08:00
KAMEZAWA Hiroyuki
f2dbcfa738 mm: clean up for early_pfn_to_nid()
What's happening is that the assertion in mm/page_alloc.c:move_freepages()
is triggering:

	BUG_ON(page_zone(start_page) != page_zone(end_page));

Once I knew this is what was happening, I added some annotations:

	if (unlikely(page_zone(start_page) != page_zone(end_page))) {
		printk(KERN_ERR "move_freepages: Bogus zones: "
		       "start_page[%p] end_page[%p] zone[%p]\n",
		       start_page, end_page, zone);
		printk(KERN_ERR "move_freepages: "
		       "start_zone[%p] end_zone[%p]\n",
		       page_zone(start_page), page_zone(end_page));
		printk(KERN_ERR "move_freepages: "
		       "start_pfn[0x%lx] end_pfn[0x%lx]\n",
		       page_to_pfn(start_page), page_to_pfn(end_page));
		printk(KERN_ERR "move_freepages: "
		       "start_nid[%d] end_nid[%d]\n",
		       page_to_nid(start_page), page_to_nid(end_page));
 ...

And here's what I got:

	move_freepages: Bogus zones: start_page[2207d0000] end_page[2207dffc0] zone[fffff8103effcb00]
	move_freepages: start_zone[fffff8103effcb00] end_zone[fffff8003fffeb00]
	move_freepages: start_pfn[0x81f600] end_pfn[0x81f7ff]
	move_freepages: start_nid[1] end_nid[0]

My memory layout on this box is:

[    0.000000] Zone PFN ranges:
[    0.000000]   Normal   0x00000000 -> 0x0081ff5d
[    0.000000] Movable zone start PFN for each node
[    0.000000] early_node_map[8] active PFN ranges
[    0.000000]     0: 0x00000000 -> 0x00020000
[    0.000000]     1: 0x00800000 -> 0x0081f7ff
[    0.000000]     1: 0x0081f800 -> 0x0081fe50
[    0.000000]     1: 0x0081fed1 -> 0x0081fed8
[    0.000000]     1: 0x0081feda -> 0x0081fedb
[    0.000000]     1: 0x0081fedd -> 0x0081fee5
[    0.000000]     1: 0x0081fee7 -> 0x0081ff51
[    0.000000]     1: 0x0081ff59 -> 0x0081ff5d

So it's a block move in that 0x81f600-->0x81f7ff region which triggers
the problem.

This patch:

Declaration of early_pfn_to_nid() is scattered over per-arch include
files, and it seems it's complicated to know when the declaration is used.
 I think it makes fix-for-memmap-init not easy.

This patch moves all declaration to include/linux/mm.h

After this,
  if !CONFIG_NODES_POPULATES_NODE_MAP && !CONFIG_HAVE_ARCH_EARLY_PFN_TO_NID
     -> Use static definition in include/linux/mm.h
  else if !CONFIG_HAVE_ARCH_EARLY_PFN_TO_NID
     -> Use generic definition in mm/page_alloc.c
  else
     -> per-arch back end function will be called.

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Tested-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Reported-by: David Miller <davem@davemlloft.net>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: <stable@kernel.org>		[2.6.25.x, 2.6.26.x, 2.6.27.x, 2.6.28.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-02-18 15:37:55 -08:00
Andi Kleen
07db1c140e x86, mce: fix ifdef for 64bit thermal apic vector clear on shutdown
Impact: Bugfix

The ifdef for the apic clear on shutdown for the 64bit intel thermal
vector was incorrect and never triggered. Fix that.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-02-17 15:24:34 -08:00
Andi Kleen
380851bc6b x86, mce: use force_sig_info to kill process in machine check
Impact: bug fix (with tolerant == 3)

do_exit cannot be called directly from the exception handler because
it can sleep and the exception handler runs on the exception stack.
Use force_sig() instead.

Based on a earlier patch by Ying Huang who debugged the problem.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-02-17 15:24:31 -08:00
Andi Kleen
6ec68bff3c x86, mce: reinitialize per cpu features on resume
Impact: Bug fix

This fixes a long standing bug in the machine check code. On resume the
boot CPU wouldn't get its vendor specific state like thermal handling
reinitialized. This means the boot cpu wouldn't ever get any thermal
events reported again.

Call the respective initialization functions on resume

v2: Remove ancient init because they don't have a resume device anyways.
    Pointed out by Thomas Gleixner.
v3: Now fix the Subject too to reflect v2 change

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-02-17 15:24:28 -08:00
Nicolas Pitre
fd4b9b3650 [ARM] 5401/1: Orion: fix edge triggered GPIO interrupt support
The GPIO interrupts can be configured as either level triggered or edge
triggered, with a default of level triggered.  When an edge triggered
interrupt is requested, the gpio_irq_set_type method is called which
currently switches the given IRQ descriptor between two struct irq_chip
instances: orion_gpio_irq_level_chip and orion_gpio_irq_edge_chip. This
happens via __setup_irq() which also calls irq_chip_set_defaults() to
assign default methods to uninitialized ones.  The problem is that
irq_chip_set_defaults() is called before the irq_chip reference is
switched, leaving the new irq_chip (orion_gpio_irq_edge_chip in this
case) with uninitialized methods such as chip->startup() causing a kernel
oops.

Many solutions are possible, such as making irq_chip_set_defaults() global
and calling it from gpio_irq_set_type(), or calling __irq_set_trigger()
before irq_chip_set_defaults() in __setup_irq().  But those require
modifications to the generic IRQ code which might have adverse effect on
other architectures, and that would still be a fragile arrangement.
Manually copying the missing methods from within gpio_irq_set_type()
would be really ugly and it would break again the day new methods with
automatic defaults are added.

A better solution is to have a single irq_chip instance which can deal
with both edge and level triggered interrupts.  It is also a good idea
to switch the IRQ handler instead, as the edge IRQ handler allows for
one edge IRQ event to be queued as the IRQ is actually masked only when
that second IRQ is received, at which point the hardware can queue an
additional IRQ event, making edge triggered interrupts a bit more
reliable.

Tested-by: Martin Michlmayr <tbm@cyrius.com>

Signed-off-by: Nicolas Pitre <nico@marvell.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2009-02-17 22:37:09 +00:00
Linus Torvalds
f8effd1a4a Merge branch 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  doc: mmiotrace.txt, buffer size control change
  trace: mmiotrace to the tracer menu in Kconfig
  mmiotrace: count events lost due to not recording
2009-02-17 14:29:15 -08:00
Linus Torvalds
35010334aa Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86, vm86: fix preemption bug
  x86, olpc: fix model detection without OFW
  x86, hpet: fix for LS21 + HPET = boot hang
  x86: CPA avoid repeated lazy mmu flush
  x86: warn if arch_flush_lazy_mmu_cpu is called in preemptible context
  x86/paravirt: make arch_flush_lazy_mmu/cpu disable preemption
  x86, pat: fix warn_on_once() while mapping 0-1MB range with /dev/mem
  x86/cpa: make sure cpa is safe to call in lazy mmu mode
  x86, ptrace, mm: fix double-free on race
2009-02-17 14:27:39 -08:00
Linus Torvalds
b30b774930 Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
  powerpc/vsx: Fix VSX alignment handler for regs 32-63
  powerpc/ps3: Move ps3_mm_add_memory to device_initcall
  powerpc/mm: Fix numa reserve bootmem page selection
  powerpc/mm: Fix _PAGE_CHG_MASK to protect _PAGE_SPECIAL
2009-02-17 14:23:49 -08:00
Ingo Molnar
9be1b56a3e x86, apic: separate 32-bit setup functionality out of apic_32.c
Impact: build fix, cleanup

A couple of arch setup callbacks were mistakenly in apic_32.c, breaking
the build.

Also simplify the code a bit.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-17 23:12:48 +01:00
Linus Torvalds
39a65762d4 Merge branch 'kvm-updates/2.6.29' of git://git.kernel.org/pub/scm/virt/kvm/kvm
* 'kvm-updates/2.6.29' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: VMX: Flush volatile msrs before emulating rdmsr
  KVM: Fix assigned devices circular locking dependency
  KVM: x86: fix LAPIC pending count calculation
  KVM: Fix INTx for device assignment
  KVM: MMU: Map device MMIO as UC in EPT
  KVM: x86: disable kvmclock on non constant TSC hosts
  KVM: PIT: fix i8254 pending count read
  KVM: Fix racy in kvm_free_assigned_irq
  KVM: Add kvm_arch_sync_events to sync with asynchronize events
  KVM: mmu_notifiers release method
  KVM: Avoid using CONFIG_ in userspace visible headers
  KVM: ia64: fix fp fault/trap handler
2009-02-17 14:04:32 -08:00
Paul E. McKenney
bf51935f3e x86, rcu: fix strange load average and ksoftirqd behavior
Damien Wyart reported high ksoftirqd CPU usage (20%) on an
otherwise idle system.

The function-graph trace Damien provided:

>   799.521187 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.521371 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.521555 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.521738 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.521934 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.522068 |   1)  ksoftir-2324  |               |                rcu_check_callbacks() {
>   799.522208 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.522392 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.522575 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.522759 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.522956 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.523074 |   1)  ksoftir-2324  |               |                  rcu_check_callbacks() {
>   799.523214 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.523397 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.523579 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.523762 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.523960 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.524079 |   1)  ksoftir-2324  |               |                  rcu_check_callbacks() {
>   799.524220 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.524403 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.524587 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.524770 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
> [ . . . ]

Shows rcu_check_callbacks() being invoked way too often. It should be called
once per jiffy, and here it is called no less than 22 times in about
3.5 milliseconds, meaning one call every 160 microseconds or so.

Why do we need to call rcu_pending() and rcu_check_callbacks() from the
idle loop of 32-bit x86, especially given that no other architecture does
this?

The following patch removes the call to rcu_pending() and
rcu_check_callbacks() from the x86 32-bit idle loop in order to
reduce the softirq load on idle systems.

Reported-by: Damien Wyart <damien.wyart@free.fr>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-17 22:47:45 +01:00
H. Peter Anvin
a7eb518998 x86: truncate ISA addresses to unsigned int
Impact: Cleanup; fix inappropriate macro use

ISA addresses on x86 are mapped 1:1 with the physical address space.
Since the ISA address space is only 24 bits (32 for VLB or LPC) it
will always fit in an unsigned int, and at least in the aha1542 driver
using a wider type would cause an undesirable promotion.  Hence
explicitly cast the ISA bus addresses to unsigned int.

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: James Bottomley <James.Bottomley@hansenpartnership.com>
2009-02-17 13:01:51 -08:00
Ingo Molnar
2a05180fe2 x86, apic: move remaining APIC drivers to arch/x86/kernel/apic/*
Move the 32-bit extended-arch APIC drivers to arch/x86/kernel/apic/
too, and rename apic_64.c to probe_64.c.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-17 20:35:47 +01:00
Ingo Molnar
f62bae5009 x86, apic: move APIC drivers to arch/x86/kernel/apic/*
arch/x86/kernel/ is getting a bit crowded, and the APIC
drivers are scattered into various different files.

Move them to arch/x86/kernel/apic/*, and also remove
the 'gen' prefix from those which had it.

Also move APIC related functionality: the IO-APIC driver,
the NMI and the IPI code.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-17 18:17:36 +01:00
Ingo Molnar
be163a159b x86, apic: rename 'genapic' to 'apic'
Impact: cleanup

Now that all APIC code is consolidated there's nothing 'gen' about
apics anymore - so rename 'struct genapic' to 'struct apic'.

This shortens the code and is nicer to read as well.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-17 17:53:57 +01:00
Ingo Molnar
ab6fb7c0b0 x86, apic: remove ->store_NMI_vector()
Impact: cleanup

It's not used by anything anymore.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-17 17:53:56 +01:00
Ingo Molnar
cb81eaedf1 x86, numaq_32: clean up, misc
Impact: cleanup

 - misc other cleanups that change the md5 signature
 - consolidate global variables
 - remove unnecessary __numaq_mps_oem_check() wrapper
 - make numaq_mps_oem_check static
 - update copyrights
 - misc other cleanups pointed out by checkpatch

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-17 17:53:54 +01:00
Ingo Molnar
36afc3af04 x86, numaq_32: clean up
Impact: cleanup

- refactor smp_dump_qct()
- tidy up include files, remove duplicates
- misc other cleanups, pointed out by checkpatch

No code changed:

md5:
   9c0bc01a53558c77df0f2ebcda7e11a9  numaq_32.o.before.asm
   9c0bc01a53558c77df0f2ebcda7e11a9  numaq_32.o.after.asm

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-17 17:52:51 +01:00
Ingo Molnar
7da18ed924 x86, es7000: misc cleanups
These are cleanups that change the md5 signature:

 - asm/ => linux/ include conversion
 - simplify the code flow of find_unisys_acpi_oem_table()
 - move ACPI methods into one #ifdef block
 - remove 0/NULL initialization of statics
 - simplify/standardize printouts
 - update copyrights
 - more cleanups, pointed out by checkpatch

arch/x86/kernel/es7000_32.o:

   text	   data	    bss	    dec	    hex	filename
   2693	    192	     44	   2929	    b71	es7000_32.o.before
   2688	    192	     44	   2924	    b6c	es7000_32.o.after

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-17 17:52:50 +01:00
Ingo Molnar
352887d1c9 x86, es7000: remove dead code, clean up
Impact: cleanup

 - a number of structure definitions were stale
 - remove needless wrappers around apic definitions
 - fix details noticed by checkpatch

No code changed:

md5:
   029d8fde0aaf6e934ea63bd8b36430fd  es7000_32.o.before.asm
   029d8fde0aaf6e934ea63bd8b36430fd  es7000_32.o.after.asm

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-17 17:52:49 +01:00
Ingo Molnar
d3185b37df x86, es7000: remove externs
Impact: cleanup

In the subarch times there were a number of externs between
various bits of the ES7000 code. Now that there's a single
es7000-platform support file, the externs can be removed and
the functions can be changed the statics.

Beyond the cleanup factor, this also shrinks the size of the
kernel image a bit:

arch/x86/kernel/es7000_32.o:

   text	   data	    bss	    dec	    hex	filename
   2813	    192	     44	   3049	    be9	es7000_32.o.before
   2693	    192	     44	   2929	    b71	es7000_32.o.after

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-17 17:52:48 +01:00
Ingo Molnar
b9e0d1aa97 x86, apic: remove apicid_cluster()
There were multiple definitions of apicid_cluster() scattered around
in APIC drivers - but the definitions are equivalent to the already
existing generic APIC_CLUSTER() method.

So remove apicid_cluster() and change all users to APIC_CLUSTER().

No code changed:

md5:
   1b8244ba8d3d6a454593ce10f09dfa58  summit_32.o.before.asm
   1b8244ba8d3d6a454593ce10f09dfa58  summit_32.o.after.asm

md5:
   a593d98a882bf534622c70d9568497ac  es7000_32.o.before.asm
   a593d98a882bf534622c70d9568497ac  es7000_32.o.after.asm

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-17 17:52:47 +01:00
Ingo Molnar
2c4ce18c95 x86, es7000: clean up
No code changed:

arch/x86/kernel/es7000_32.o:

   text	   data	    bss	    dec	    hex	filename
   2813	    192	     44	   3049	    be9	es7000_32.o.before
   2813	    192	     44	   3049	    be9	es7000_32.o.after

md5:
   a593d98a882bf534622c70d9568497ac  es7000_32.o.before.asm
   a593d98a882bf534622c70d9568497ac  es7000_32.o.after.asm

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-17 17:52:46 +01:00
Ingo Molnar
2f205bc47f x86, apic: clean up the cpu_2_logical_apiciddeclaration
extern declarations were scattered in 4 files - consolidate them
into apic.h.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-17 17:52:46 +01:00
Ingo Molnar
77313190d1 x86, apic: clean up arch/x86/kernel/bigsmp_32.c
Impact: cleanup

- remove unnecessary indirections that were artifacts of the subarch code
- clean up include file section
- clean up various small details

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-17 17:52:45 +01:00
Ingo Molnar
5c615feb90 x86, apic: remove stale references to APIC_DEFINITION
Impact: cleanup

APIC_DEFINITION was a hack from the x86 subarch times, it has no
meaning anymore - remove it.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-17 17:52:45 +01:00
Ingo Molnar
e641f5f525 x86, apic: remove duplicate asm/apic.h inclusions
Impact: cleanup

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-17 17:52:44 +01:00
Ingo Molnar
7b6aa335ca x86, apic: remove genapic.h
Impact: cleanup

Remove genapic.h and remove all references to it.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-17 17:52:44 +01:00
Ingo Molnar
e2780a68f8 x86, apic: merge genapic.h into apic.h
Impact: cleanup

Reduce the number of include files to worry about.
Also, most of the users of APIC facilities had to
include genapic.h already, which embedded apic.h,
so the distinction was meaningless.

[ include apic.h from genapic.h for compatibility. ]

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-17 17:52:43 +01:00
Ingo Molnar
28aa29eeb3 remove: genapic prepare
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-17 17:52:42 +01:00
Ingo Molnar
7d01d32d3b x86, apic: fix build fallout of genapic changes
- make oprofile build
- select X86_X2APIC from X86_UV - it relies on it
- export genapic for oprofile modular build

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-17 13:13:25 +01:00
Yinghai Lu
c1eeb2de41 x86: fold apic_ops into genapic
Impact: cleanup

make it simpler, don't need have one extra struct.

v2: fix the sgi_uv build

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-17 12:22:20 +01:00
Yinghai Lu
06cd9a7dc8 x86: add x2apic config
Impact: cleanup

so could deselect x2apic
and INTR_REMAP will select x2apic

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-17 12:22:20 +01:00
Ingo Molnar
ee8b53c1cf x86: remove stale arch/x86/include/asm/page_64.h.rej file
Introduced by:

  51c78eb: x86: create _types.h counterparts for page*.h

Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-17 12:20:13 +01:00
Ingo Molnar
494df596f9 Merge branches 'x86/acpi', 'x86/apic', 'x86/cpudetect', 'x86/headers', 'x86/paravirt', 'x86/urgent' and 'x86/xen'; commit 'v2.6.29-rc5' into x86/core 2009-02-17 12:07:00 +01:00
Gregory CLEMENT
744f659272 [ARM] 5400/1: Add support for inverted rdy_busy pin for Atmel nand device controller
Add support for inverted rdy_busy pin for Atmel nand device controller
It will fix building error on NeoCore926 board.

Acked-by: Andrew Victor <linux@maxim.org.za>
Acked-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Gregory CLEMENT <gclement@adeneo.adetelgroup.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2009-02-16 21:40:39 +00:00
Yinghai Lu
98c061b6cf x86: make APIC_init_uniprocessor() more like smp_prepare_cpus()
Impact: cleanup

1. move localise_nmi_watchdog() later
2. change setup_boot_APIC_clock() to setup_boot_clock() for 64-bit

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-16 09:37:04 +01:00
Yinghai Lu
3bd25d0fa3 x86: pre init pirq_entries[]
Impact: cleanup

set default value early - this allows the removal of a number
of dynamic initialization codepaths, and an #ifdef.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-16 09:36:58 +01:00
Jeremy Fitzhardinge
c99608637e x86, xen: do multicall callbacks with interrupts disabled
We can't call the callbacks after enabling interrupts, as we may get a
nested multicall call, which would cause a great deal of havok.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-16 08:56:41 +01:00
Jeremy Fitzhardinge
3d39e9d07b x86, xen: degrade BUG to WARN when multicall fails
If one of the components of a multicall fails, WARN rather than BUG,
to help with debugging.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-16 08:56:24 +01:00
Ian Campbell
b93d51dc62 x86, xen: record and display initiator of each multicall when debugging
Store the caller for each multicall so we can report it on failure.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-16 08:56:11 +01:00
Jeremy Fitzhardinge
9033304a15 x86, xen: short-circuit tests for dom0
When testing for a dom0/initial/privileged domain, make sure the
predicate evaluates to a compile-time 0 if CONFIG_XEN_DOM0 isn't
enabled.  This will make most of the dom0 code evaporate without
much more effort.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-16 08:55:50 +01:00
Rusty Russell
1371be0f7c cpumask: Use cpu_*_mask accessors code: alpha
Impact: use new API, fix SMP bug.

Use the new accessors rather than frobbing bits directly.

This also removes the bug introduced in ee0c468b (alpha: compile
fixes) which had Alpha setting bits on an on-stack cpumask, not the
cpu_online_map.

Cc: Richard Henderson <rth@twiddle.net>
Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Mike Travis <travis@sgi.com>
Acked-by: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Acked-by: Ingo Molnar <mingo@elte.hu>
2009-02-16 17:32:00 +10:30
Rusty Russell
a0abd520fd cpumask: fix powernow-k8: partial revert of 2fdf66b491
Impact: fix powernow-k8 when acpi=off (or other error).

There was a spurious change introduced into powernow-k8 in this patch:
so that we try to "restore" the cpus_allowed we never saved.  We revert
that file.

See lkml "[PATCH] x86/powernow: fix cpus_allowed brokage when
acpi=off" from Yinghai for the bug report.

Cc: Mike Travis <travis@sgi.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Ingo Molnar <mingo@elte.hu>
2009-02-16 17:31:59 +10:30
Yinghai Lu
970ec1a821 [IA64] fix __apci_unmap_table
Impact: fix build error

to fix:

  tip/arch/ia64/kernel/acpi.c:203: error: conflicting types for '__acpi_unmap_table'
  tip/include/linux/acpi.h:82: error: previous declaration of '__acpi_unmap_table' was here
  tip/arch/ia64/kernel/acpi.c:203: error: conflicting types for '__acpi_unmap_table'
  tip/include/linux/acpi.h:82: error: previous declaration of '__acpi_unmap_table' was here

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-16 00:43:24 +01:00
Pekka Paalanen
6bc5c366b1 trace: mmiotrace to the tracer menu in Kconfig
Impact: cosmetic change in Kconfig menu layout

This patch was originally suggested by Peter Zijlstra, but seems it
was forgotten.

CONFIG_MMIOTRACE and CONFIG_MMIOTRACE_TEST were selectable
directly under the Kernel hacking / debugging menu in the kernel
configuration system. They were present only for x86 and x86_64.

Other tracers that use the ftrace tracing framework are in their own
sub-menu. This patch moves the mmiotrace configuration options there.
Since the Kconfig file, where the tracer menu is, is not architecture
specific, HAVE_MMIOTRACE_SUPPORT is introduced and provided only by
x86/x86_64. CONFIG_MMIOTRACE now depends on it.

Signed-off-by: Pekka Paalanen <pq@iki.fi>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-15 20:03:28 +01:00
Yinghai Lu
88d0f550d7 x86: make 32bit to call enable_IO_APIC early like 64bit
Impact: cleanup

So we remove some #ifdefs.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-15 13:23:46 +01:00
Thomas Gleixner
be716615fe x86, vm86: fix preemption bug
Commit 3d2a71a596 ("x86, traps: converge
do_debug handlers") changed the preemption disable logic of do_debug()
so vm86_handle_trap() is called with preemption disabled resulting in:

 BUG: sleeping function called from invalid context at include/linux/kernel.h:155
 in_atomic(): 1, irqs_disabled(): 0, pid: 3005, name: dosemu.bin
 Pid: 3005, comm: dosemu.bin Tainted: G        W  2.6.29-rc1 #51
 Call Trace:
  [<c050d669>] copy_to_user+0x33/0x108
  [<c04181f4>] save_v86_state+0x65/0x149
  [<c0418531>] handle_vm86_trap+0x20/0x8f
  [<c064e345>] do_debug+0x15b/0x1a4
  [<c064df1f>] debug_stack_correct+0x27/0x2c
  [<c040365b>] sysenter_do_call+0x12/0x2f
 BUG: scheduling while atomic: dosemu.bin/3005/0x10000001

Restore the original calling convention and reenable preemption before
calling handle_vm86_trap().

Reported-by: Michal Suchanek <hramrach@centrum.cz>
Cc: stable@kernel.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-15 10:46:13 +01:00
Yinghai Lu
f6db44df5b x86: fix typo in filter_cpuid_features()
Impact: fix wrong disabling of cpu features

an amd system got this strange output:

 CPU: CPU feature monitor disabled due to lack of CPUID level 0x5

but in /proc/cpuinfo I have:

 cpuid level	: 5

on intel system:

 CPU: CPU feature monitor disabled due to lack of CPUID level 0x5
 CPU: CPU feature dca disabled due to lack of CPUID level 0x9

but in /proc/cpuinfo i have:

 cpuid level     : 11

Tt turns out there is a typo, and we should use level member in df.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-15 09:03:29 +01:00
Avi Kivity
516a1a7e9d KVM: VMX: Flush volatile msrs before emulating rdmsr
Some msrs (notable MSR_KERNEL_GS_BASE) are held in the processor registers
and need to be flushed to the vcpu struture before they can be read.

This fixes cygwin longjmp() failure on Windows x64.

Signed-off-by: Avi Kivity <avi@redhat.com>
2009-02-15 02:47:39 +02:00
Marcelo Tosatti
b682b814e3 KVM: x86: fix LAPIC pending count calculation
Simplify LAPIC TMCCT calculation by using hrtimer provided
function to query remaining time until expiration.

Fixes host hang with nested ESX.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-02-15 02:47:38 +02:00
Sheng Yang
2aaf69dcee KVM: MMU: Map device MMIO as UC in EPT
Software are not allow to access device MMIO using cacheable memory type, the
patch limit MMIO region with UC and WC(guest can select WC using PAT and
PCD/PWT).

Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-02-15 02:47:37 +02:00
Marcelo Tosatti
abe6655dd6 KVM: x86: disable kvmclock on non constant TSC hosts
This is better.

Currently, this code path is posing us big troubles,
and we won't have a decent patch in time. So, temporarily
disable it.

Signed-off-by: Glauber Costa <glommer@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-02-15 02:47:36 +02:00
Marcelo Tosatti
d2a8284e8f KVM: PIT: fix i8254 pending count read
count_load_time assignment is bogus: its supposed to contain what it
means, not the expiration time.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-02-15 02:47:36 +02:00
Sheng Yang
ba4cef31d5 KVM: Fix racy in kvm_free_assigned_irq
In the past, kvm_get_kvm() and kvm_put_kvm() was called in assigned device irq
handler and interrupt_work, in order to prevent cancel_work_sync() in
kvm_free_assigned_irq got a illegal state when waiting for interrupt_work done.
But it's tricky and still got two problems:

1. A bug ignored two conditions that cancel_work_sync() would return true result
in a additional kvm_put_kvm().

2. If interrupt type is MSI, we would got a window between cancel_work_sync()
and free_irq(), which interrupt would be injected again...

This patch discard the reference count used for irq handler and interrupt_work,
and ensure the legal state by moving the free function at the very beginning of
kvm_destroy_vm(). And the patch fix the second bug by disable irq before
cancel_work_sync(), which may result in nested disable of irq but OK for we are
going to free it.

Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-02-15 02:47:36 +02:00
Sheng Yang
ad8ba2cd44 KVM: Add kvm_arch_sync_events to sync with asynchronize events
kvm_arch_sync_events is introduced to quiet down all other events may happen
contemporary with VM destroy process, like IRQ handler and work struct for
assigned device.

For kvm_arch_sync_events is called at the very beginning of kvm_destroy_vm(), so
the state of KVM here is legal and can provide a environment to quiet down other
events.

Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-02-15 02:47:36 +02:00
Avi Kivity
7a0eb1960e KVM: Avoid using CONFIG_ in userspace visible headers
Kconfig symbols are not available in userspace, and are not stripped by
headers-install.  Avoid their use by adding #defines in <asm/kvm.h> to
suit each architecture.

Signed-off-by: Avi Kivity <avi@redhat.com>
2009-02-15 02:47:35 +02:00
Yang Zhang
d39123a486 KVM: ia64: fix fp fault/trap handler
The floating-point registers f6-f11 is used by vmm and
saved in kvm-pt-regs, so should set the correct bit mask
and the pointer in fp_state, otherwise, fpswa may touch
vmm's fp registers instead of guests'.

In addition, for fp trap handling,  since the instruction
which leads to fp trap is completely executed, so can't
use retry machanism to re-execute it, because it may
pollute some registers.

Signed-off-by: Yang Zhang <yang.zhang@intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-02-15 02:47:35 +02:00
Chris Ball
e49590b6dd x86, olpc: fix model detection without OFW
Impact: fix "garbled display, laptop is unusable" bug

Commit e51a1ac2df ("x86, olpc: fix endian
bug in openfirmware workaround") breaks model comparison on OLPC; the value
0xc2 needs to be scaled up by olpc_board().

The pre-patch version was wrong, but accidentally worked anyway
(big-endian 0xc2 is big enough to satisfy all other board revisions,
but little endian 0xc2 is not).

Signed-off-by: Chris Ball <cjb@laptop.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Andres Salomon <dilinger@queued.net>
Cc: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-14 23:05:25 +01:00
Andrew Victor
2b768b6cdb [ARM] 5391/1: AT91: Enable GPIO clocks earlier
Enable the GPIO clocks earlier in the initialization sequence.  This
allow the board-setup code to read and set GPIO pins.

Signed-off-by: Marc Pignat <marc.pignat@hevs.ch>
Signed-off-by: Andrew Victor <linux@maxim.org.za>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2009-02-14 16:03:36 +00:00
Andrew Victor
2af29b7861 [ARM] 5390/1: AT91: Watchdog fixes
The recently merged AT91SAM9 watchdog driver uses the
AT91SAM9X_WATCHDOG config variable, whereas the original version of
the driver (and the platform support code) used AT91SAM9_WATCHDOG.
This causes the watchdog platform_device to never be registered, and
therefore the driver not to be initialized.

This patch:
- updates the platform support code to use AT91SAM9X_WATCHDOG.
- includes <linux/io.h> to fix compile error (same fix as was applied
to at91rm9200_wdt.c)
- fixes comment regarding watchdog clock-rates in at91rm9200.

Signed-off-by: Andrew Victor <linux@maxim.org.za>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2009-02-14 16:01:57 +00:00
Russell King
abf239657b [ARM] omap: fix _omap2_clksel_get_src_field()
_omap2_clksel_get_src_field() was returning the first entry which was
either the default _or_ applicable to the SoC.  This is wrong - we
should be returning the first default which is applicable to the SoC.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2009-02-14 13:25:38 +00:00
Russell King
9132f1b453 [ARM] omap: fix omap2_divisor_to_clksel() error return value
The error checks for omap2_divisor_to_clksel() and comment disagree with
the actual value returned on error.  Fix this to return the correct error
value.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2009-02-14 13:24:10 +00:00
Jeremy Fitzhardinge
8960f8c8e7 Merge commit 'tip/x86/headers' into x86/untangle2
* commit 'tip/x86/headers': (42 commits)
  x86: fix "__udivdi3" [drivers/scsi/aha1542.ko] undefined
  unconditionally include asm/types.h from linux/types.h
  make linux/types.h as assembly safe
  Neither asm/types.h nor linux/types.h is required for arch/ia64/include/asm/fpu.h
  headers_check fix cleanup: linux/reiserfs_fs.h
  headers_check fix cleanup: linux/nubus.h
  headers_check fix cleanup: linux/coda_psdev.h
  headers_check fix: x86, setup.h
  headers_check fix: x86, prctl.h
  headers_check fix: linux/reinserfs_fs.h
  headers_check fix: linux/socket.h
  headers_check fix: linux/nubus.h
  headers_check fix: linux/in6.h
  headers_check fix: linux/coda_psdev.h
  headers_check fix: xtensa, swab.h
  headers_check fix: powerpc, swab.h
  headers_check fix: powerpc, spu_info.h
  headers_check fix: powerpc, ps3fb.h
  headers_check fix: powerpc, kvm.h
  headers_check fix: powerpc, elf.h
  ...
2009-02-13 12:53:17 -08:00
Ingo Molnar
22796b1572 Merge branch 'core/header-fixes' into x86/headers
Conflicts:
	arch/x86/include/asm/setup.h
2009-02-13 21:05:03 +01:00
James Bottomley
bf33a70a73 x86: fix "__udivdi3" [drivers/scsi/aha1542.ko] undefined
Commit 976e8f677e ("x86: asm/io.h: unify
virt_to_phys/phys_to_virt") changed the return of virt_to_phys from long
to phys_addr_t which is unsigned long long on a PAE platform.

So, I could suggest a fix below since isa addresses may never be above
32 bits.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-13 21:02:02 +01:00
Jeremy Fitzhardinge
9b3651cbc2 x86: move more pagetable-related definitions into pgtable*.h
PAGETABLE_LEVELS and the PTE masks should be in pgtable*.h

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
2009-02-13 11:35:01 -08:00
Jeremy Fitzhardinge
0341c14da4 x86: use _types.h headers in asm where available
In general, the only definitions that assembly files can use
are in _types.S headers (where available), so convert them.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
2009-02-13 11:35:01 -08:00
Dimitri Sivanich
c466ed2e43 x86, UV: set full apicid in uv_hub_send_ipi
The uv_hub_send_ipi() function needs to set the full apicid in the
UVH_IPI_INT mmr.

Signed-off-by: Dimitri Sivanich <sivanich@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-13 19:13:13 +01:00
Ian Campbell
694aa96060 xen: fix xen_flush_tlb_others
The commit
    commit 4595f9620c
    Author: Rusty Russell <rusty@rustcorp.com.au>
    Date:   Sat Jan 10 21:58:09 2009 -0800

        x86: change flush_tlb_others to take a const struct cpumask

causes xen_flush_tlb_others to allocate a multicall and then issue it
without initializing it in the case where the cpumask is empty,
leading to:

        [    8.354898] 1 multicall(s) failed: cpu 1
        [    8.354921] Pid: 2213, comm: bootclean Not tainted 2.6.29-rc3-x86_32p-xenU-tip #135
        [    8.354937] Call Trace:
        [    8.354955]  [<c01036e3>] xen_mc_flush+0x133/0x1b0
        [    8.354971]  [<c0105d2a>] ? xen_force_evtchn_callback+0x1a/0x30
        [    8.354988]  [<c0105a60>] xen_flush_tlb_others+0xb0/0xd0
        [    8.355003]  [<c0126643>] flush_tlb_page+0x53/0xa0
        [    8.355018]  [<c0176a80>] do_wp_page+0x2a0/0x7c0
        [    8.355034]  [<c0238f0a>] ? notify_remote_via_irq+0x3a/0x70
        [    8.355049]  [<c0178950>] handle_mm_fault+0x7b0/0xa50
        [    8.355065]  [<c0131a3e>] ? wake_up_new_task+0x8e/0xb0
        [    8.355079]  [<c01337b5>] ? do_fork+0xe5/0x320
        [    8.355095]  [<c0121919>] do_page_fault+0xe9/0x240
        [    8.355109]  [<c0121830>] ? do_page_fault+0x0/0x240
        [    8.355125]  [<c032457a>] error_code+0x72/0x78
        [    8.355139]   call  1/1: op=2863311530 arg=[aaaaaaaa] result=-38     xen_flush_tlb_others+0x41/0xd0

Since empty cpumasks are rare and undoing an xen_mc_entry() is tricky
just issue such requests normally.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-13 13:54:14 +01:00
Ingo Molnar
beb6943d8d x86 headers: protect page_32.h via __ASSEMBLY__
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-13 13:36:47 +01:00
Ingo Molnar
e43623b4ed x86 headers: include page_types.h in pgtable_types.h
To properly pick up details like PTE_FLAGS_MASK.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-13 13:24:19 +01:00
Ingo Molnar
56cefcea7c x86 headers: include linux/types.h
To properly pick up types relied on by prototypes like 'bool'.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-13 13:23:02 +01:00
Ingo Molnar
999c7880cc x86 headers: remove duplicate pud_large() definition
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-13 13:15:55 +01:00
Ingo Molnar
b233969eaa Merge branch 'x86/untangle2' of git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen into x86/headers
Conflicts:
	arch/x86/include/asm/page.h
	arch/x86/include/asm/pgtable.h
	arch/x86/mach-voyager/voyager_smp.c
	arch/x86/mm/fault.c
2009-02-13 13:09:00 +01:00
Ingo Molnar
7032e86967 Merge branches 'x86/paravirt', 'x86/pat', 'x86/setup-v2', 'x86/subarch', 'x86/uaccess' and 'x86/urgent' into x86/core 2009-02-13 09:47:32 +01:00
Ingo Molnar
f268fe7333 Merge branch 'x86/mm' into x86/core 2009-02-13 09:47:24 +01:00
Ingo Molnar
a56cdcb662 Merge branches 'x86/acpi', 'x86/asm', 'x86/cpudetect', 'x86/crashdump', 'x86/debug', 'x86/defconfig', 'x86/doc', 'x86/header-fixes', 'x86/headers' and 'x86/minor-fixes' into x86/core 2009-02-13 09:46:36 +01:00
Ingo Molnar
881c47760b Merge branch 'x86/cleanups' into x86/core 2009-02-13 09:45:42 +01:00
Ingo Molnar
ab639f3593 Merge branch 'core/percpu' into x86/core 2009-02-13 09:45:09 +01:00
Ingo Molnar
f8a6b2b9ce Merge branch 'linus' into x86/apic
Conflicts:
	arch/x86/kernel/acpi/boot.c
	arch/x86/mm/fault.c
2009-02-13 09:44:22 +01:00
john stultz
b13e24644c x86, hpet: fix for LS21 + HPET = boot hang
Between 2.6.23 and 2.6.24-rc1 a change was made that broke IBM LS21
systems that had the HPET enabled in the BIOS, resulting in boot hangs
for x86_64.

Specifically commit b8ce335906, which
merges the i386 and x86_64 HPET code.

Prior to this commit, when we setup the HPET timers in x86_64, we did
the following:

	hpet_writel(HPET_TN_ENABLE | HPET_TN_PERIODIC | HPET_TN_SETVAL |
                    HPET_TN_32BIT, HPET_T0_CFG);

However after the i386/x86_64 HPET merge, we do the following:

	cfg = hpet_readl(HPET_Tn_CFG(timer));
	cfg |= HPET_TN_ENABLE | HPET_TN_PERIODIC |
			HPET_TN_SETVAL | HPET_TN_32BIT;
	hpet_writel(cfg, HPET_Tn_CFG(timer));

However on LS21s with HPET enabled in the BIOS, the HPET_T0_CFG register
boots with Level triggered interrupts (HPET_TN_LEVEL) enabled. This
causes the periodic interrupt to be not so periodic, and that results in
the boot time hang I reported earlier in the delay calibration.

My fix: Always disable HPET_TN_LEVEL when setting up periodic mode.

Signed-off-by: John Stultz <johnstul@us.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-13 09:15:46 +01:00
Michael Neuling
26456dcfb8 powerpc/vsx: Fix VSX alignment handler for regs 32-63
Fix the VSX alignment handler for VSX registers > 32.  32-63 are stored
in the VMX part of the thread_struct not the FPR part.

Signed-off-by: Michael Neuling <mikey@neuling.org>
CC: stable@kernel.org (2.6.27 & .28 please)
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-02-13 16:37:45 +11:00
Geoff Levand
0047656e2a powerpc/ps3: Move ps3_mm_add_memory to device_initcall
Change the PS3 hotplug memory routine ps3_mm_add_memory() from
a core_initcall to a device_initcall.

core_initcall routines run before the powerpc topology_init()
startup routine, which is a subsys_initcall, resulting in
failure of ps3_mm_add_memory() when CONFIG_NUMA=y.  When
ps3_mm_add_memory() fails the system will boot with just the
128 MiB of boot memory

Signed-off-by: Geoff Levand <geoffrey.levand@am.sony.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-02-13 16:37:45 +11:00
Dave Hansen
06eccea6c3 powerpc/mm: Fix numa reserve bootmem page selection
Fix the powerpc NUMA reserve bootmem page selection logic.

commit 8f64e1f2d1 (powerpc: Reserve
in bootmem lmb reserved regions that cross NUMA nodes) changed
the logic for how the powerpc LMB reserved regions were converted
to bootmen reserved regions.  As the folowing discussion reports,
the new logic was not correct.

mark_reserved_regions_for_nid() goes through each LMB on the
system that specifies a reserved area.  It searches for
active regions that intersect with that LMB and are on the
specified node.  It attempts to bootmem-reserve only the area
where the active region and the reserved LMB intersect.  We
can not reserve things on other nodes as they may not have
bootmem structures allocated, yet.

We base the size of the bootmem reservation on two possible
things.  Normally, we just make the reservation start and
stop exactly at the start and end of the LMB.

However, the LMB reservations are not aware of NUMA nodes and
on occasion a single LMB may cross into several adjacent
active regions.  Those may even be on different NUMA nodes
and will require separate calls to the bootmem reserve
functions.  So, the bootmem reservation must be trimmed to
fit inside the current active region.

That's all fine and dandy, but we trim the reservation
in a page-aligned fashion.  That's bad because we start the
reservation at a non-page-aligned address: physbase.

The reservation may only span 2 bytes, but that those bytes
may span two pfns and cause a reserve_size of 2*PAGE_SIZE.

Take the case where you reserve 0x2 bytes at 0x0fff and
where the active region ends at 0x1000.  You'll jump into
that if() statment, but node_ar.end_pfn=0x1 and
start_pfn=0x0.  You'll end up with a reserve_size=0x1000,
and then call

  reserve_bootmem_node(node, physbase=0xfff, size=0x1000);

0x1000 may not be on the same node as 0xfff.  Oops.

In almost all the vm code, end_<anything> is not inclusive.
If you have an end_pfn of 0x1234, page 0x1234 is not
included in the range.  Using PFN_UP instead of the
(>> >> PAGE_SHIFT) will make this consistent with the other VM
code.

We also need to do math for the reserved size with physbase
instead of start_pfn.  node_ar.end_pfn << PAGE_SHIFT is
*precisely* the end of the node.  However,
(start_pfn << PAGE_SHIFT) is *NOT* precisely the beginning
of the reserved area.  That is, of course, physbase.
If we don't use physbase here, the reserve_size can be
made too large.

From: Dave Hansen <dave@linux.vnet.ibm.com>
Tested-by: Geoff Levand <geoffrey.levand@am.sony.com>  Tested on PS3.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-02-13 16:37:45 +11:00
Philippe Gerum
fbc78b07ba powerpc/mm: Fix _PAGE_CHG_MASK to protect _PAGE_SPECIAL
Fix _PAGE_CHG_MASK so that pte_modify() does not affect the _PAGE_SPECIAL bit.

Signed-off-by: Philippe Gerum <rpm@xenomai.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-02-13 16:37:44 +11:00
Thomas Gleixner
7ad9de6ac8 x86: CPA avoid repeated lazy mmu flush
Impact: Flush the lazy MMU only once

Pending mmu updates only need to be flushed once to bring the
in-memory pagetable state up to date.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-02-12 23:11:58 +01:00
Thomas Gleixner
34b0900d32 x86: warn if arch_flush_lazy_mmu_cpu is called in preemptible context
Impact: Catch cases where lazy MMU state is active in a preemtible context

arch_flush_lazy_mmu_cpu() has been changed to disable preemption so
the checks in enter/leave will never trigger. Put the preemtible()
check into arch_flush_lazy_mmu_cpu() to catch such cases.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-02-12 23:11:58 +01:00
Jeremy Fitzhardinge
d85cf93da6 x86/paravirt: make arch_flush_lazy_mmu/cpu disable preemption
Impact: avoid access to percpu vars in preempible context

They are intended to be used whenever there's the possibility
that there's some stale state which is going to be overwritten
with a queued update, or to force a state change when we may be
in lazy mode.  Either way, we could end up calling it with
preemption enabled, so wrap the functions in their own little
preempt-disable section so they can be safely called in any
context (though preemption should never be enabled if we're actually
in a lazy state).

(Move out of line to avoid #include dependencies.)
    
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-02-12 23:11:58 +01:00
Ingo Molnar
d88316c243 x86, 32-bit: refactor find_low_pfn_range()
Impact: cleanup

Make the max_low_pfn logic a bit more standard between
lowmem_pfn_init() and highmem_pfn_init().

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-12 15:21:17 +01:00
Ingo Molnar
4769843bc2 x86, 32-bit: clean up find_low_pfn_range()
Impact: cleanup

Split find_low_pfn_range() into two functions:

 - lowmem_pfn_init()
 - highmem_pfn_init()

The former gets called if all of RAM fits into lowmem,
otherwise we call highmem_pfn_init().

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-12 15:21:16 +01:00
Ingo Molnar
3023533de4 x86: fix warning in find_low_pfn_range()
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-12 15:21:15 +01:00
Ingo Molnar
bd282422fe x86, defconfig: turn off CONFIG_SCSI_ISCSI_ATTRS=y
It was enabled by mistake - iscsi is not included in a typical
default PC, and no other architecture has it built-in (=y) either.

Turn it off.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-12 13:06:43 +01:00
Ingo Molnar
556831063b x86, defconfig: turn off CONFIG_ENABLE_WARN_DEPRECATED
deprecation warnings have become rather noisy lately:

drivers/i2c/i2c-core.c: In function ‘i2c_new_device’:
drivers/i2c/i2c-core.c:283: warning: ‘i2c_attach_client’ is deprecated (declared at include/linux/i2c.h:434)
drivers/i2c/i2c-core.c: In function ‘i2c_del_adapter’:
drivers/i2c/i2c-core.c:646: warning: ‘detach_client’ is deprecated (declared at include/linux/i2c.h:154)
drivers/i2c/i2c-core.c: In function ‘i2c_register_driver’:
drivers/i2c/i2c-core.c:713: warning: ‘detach_client’ is deprecated (declared at include/linux/i2c.h:154)
drivers/i2c/i2c-core.c: In function ‘__detach_adapter’:
drivers/i2c/i2c-core.c:780: warning: ‘detach_client’ is deprecated (declared at include/linux/i2c.h:154)
drivers/i2c/i2c-core.c: At top level:
drivers/i2c/i2c-core.c:876: warning: ‘i2c_attach_client’ is deprecated (declared at drivers/i2c/i2c-core.c:827)
drivers/i2c/i2c-core.c:876: warning: ‘i2c_attach_client’ is deprecated (declared at drivers/i2c/i2c-core.c:827)
drivers/i2c/i2c-core.c:904: warning: ‘i2c_detach_client’ is deprecated (declared at drivers/i2c/i2c-core.c:879)
drivers/i2c/i2c-core.c:904: warning: ‘i2c_detach_client’ is deprecated (declared at drivers/i2c/i2c-core.c:879)

So turn it off for now - these reminders can obscure critical warnings.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-12 12:53:50 +01:00
Ingo Molnar
dd5fc55449 x86, defconfig: update the 64-bit defconfig
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-12 12:48:48 +01:00
Ingo Molnar
bc8bd002b8 x86, defconfig: update the 32-bit defconfig
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-12 12:43:52 +01:00
Tobias Klauser
270c5609e2 sh: Storage class should be before const qualifier
The C99 specification states in section 6.11.5:

The placement of a storage-class specifier other than at the
beginning of the declaration specifiers in a declaration is an
obsolescent feature.

Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2009-02-12 17:26:09 +09:00
Suresh Siddha
be03d9e802 x86, pat: fix warn_on_once() while mapping 0-1MB range with /dev/mem
Jeff Mahoney reported:

> With Suse's hwinfo tool, on -tip:
> WARNING: at arch/x86/mm/pat.c:637 reserve_pfn_range+0x5b/0x26d()

reserve_pfn_range() is not tracking the memory range below 1MB
as non-RAM and as such is inconsistent with similar checks in
reserve_memtype() and free_memtype()

Rename the pagerange_is_ram() to pat_pagerange_is_ram() and add the
"track legacy 1MB region as non RAM" condition.

And also, fix reserve_pfn_range() to return -EINVAL, when the pfn
range is RAM. This is to be consistent with this API design.

Reported-and-tested-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-12 08:27:27 +01:00
Jeremy Fitzhardinge
4f06b0436b x86/cpa: make sure cpa is safe to call in lazy mmu mode
Impact: fix race leading to crash under KVM and Xen

The CPA code may be called while we're in lazy mmu update mode - for
example, when using DEBUG_PAGE_ALLOC and doing a slab allocation
in an interrupt handler which interrupted a lazy mmu update.  In this
case, the in-memory pagetable state may be out of date due to pending
queued updates.  We need to flush any pending updates before inspecting
the page table.  Similarly, we must explicitly flush any modifications
CPA may have made (which comes down to flushing queued operations when
flushing the TLB).

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Acked-by: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Stable Kernel <stable@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-12 08:27:26 +01:00
Randy Dunlap
58105ef185 x86: UV: fix header struct usage
Impact: Fixes warning

Fix uv.h struct usage:

arch/x86/include/asm/uv/uv.h:16: warning: 'struct mm_struct' declared inside parameter list
arch/x86/include/asm/uv/uv.h:16: warning: its scope is only this definition or declaration, which is probably not what you want

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2009-02-11 17:17:29 -08:00
H. Peter Anvin
7445250927 x86: merge sys_rt_sigreturn between 32 and 64 bits
Impact: cleanup

With the recent changes in the 32-bit code to make system calls which
use struct pt_regs take a pointer, sys_rt_sigreturn() have become
identical between 32 and 64 bits, and both are empty wrappers around
do_rt_sigreturn().  Remove both wrappers and rename both to
sys_rt_sigreturn().

Cc: Brian Gerst <brgerst@gmail.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2009-02-11 16:31:40 -08:00
Jeremy Fitzhardinge
54321d947a x86: move pte types into pgtable*.h
pgtable*.h is intended for definitions relating to actual pagetables
and their entries, so move all the definitions for
(pte|pmd|pud|pgd)(val)?_t to the appropriate pgtable*.h headers.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
2009-02-11 14:54:10 -08:00
Jeremy Fitzhardinge
e2f5bda941 x86: define pud_flags and pud_large properly to allow non-PAE builds 2009-02-11 14:54:10 -08:00
Jeremy Fitzhardinge
e42778de31 x86: move defs around to allow paravirt.h to just include page_types.h
Signed-off-by: Jeremy Fitzhardinge <jeremy@goop.org>
2009-02-11 14:54:10 -08:00
Jeremy Fitzhardinge
1dfc07aad5 x86: move 2 and 3 level asm-generic defs into page-defs
Signed-off-by: Jeremy Fitzhardinge <jeremy@goop.org>
2009-02-11 14:54:09 -08:00
Jeremy Fitzhardinge
51c78eb3f0 x86: create _types.h counterparts for page*.h
Signed-off-by: Jeremy Fitzhardinge <jeremy@goop.org>
2009-02-11 14:54:09 -08:00
Jeremy Fitzhardinge
1484096ceb x86: Include pgtable_32|64_types.h in pgtable_types.h
Signed-off-by: Jeremy Fitzhardinge <jeremy@goop.org>
2009-02-11 14:54:09 -08:00
Jeremy Fitzhardinge
fb3551491b x86: Split pgtable_64.h into pgtable_64_types.h and pgtable_64.h
Signed-off-by: Jeremy Fitzhardinge <jeremy@goop.org>
2009-02-11 14:54:09 -08:00
Jeremy Fitzhardinge
f402a65f93 x86: Split pgtable_32.h into pgtable_32.h and pgtable_32_types.h
Signed-off-by: Jeremy Fitzhardinge <jeremy@goop.org>
2009-02-11 14:54:09 -08:00
Jeremy Fitzhardinge
8d19c99faf Split pgtable.h into pgtable_types.h and pgtable.h
Signed-off-by: Jeremy Fitzhardinge <jeremy@goop.org>
2009-02-11 14:54:09 -08:00
Jeremy Fitzhardinge
b924a28138 x86: rename *-defs.h to *-_types.h for consistency
The kernel tends to call definition-only headers *_types.h, so rename
the x86 page/pgtable headers accordingly.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
2009-02-11 14:54:09 -08:00
Brian Gerst
b12bdaf11f x86: use regparm(3) for passed-in pt_regs pointer
Some syscalls need to access the pt_regs structure, either to copy
user register state or to modifiy it.  This patch adds stubs to load
the address of the pt_regs struct into the %eax register, and changes
the syscalls to take the pointer as an argument instead of relying on
the assumption that the pt_regs structure overlaps the function
arguments.

Drop the use of regparm(1) due to concern about gcc bugs, and to move
in the direction of the eventual removal of regparm(0) for asmlinkage.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2009-02-11 14:00:56 -08:00
Jaswinder Singh Rajput
ba1511bf7f x86: kernel/mpparse.c fix compilation warnings
arch/x86/kernel/mpparse.c: In function ‘smp_scan_config’:
 arch/x86/kernel/mpparse.c:696: warning: format ‘%08lx’ expects type ‘long unsigned int’, but argument 3 has type ‘phys_addr_t’
 arch/x86/kernel/mpparse.c: In function ‘update_mp_table’:
 arch/x86/kernel/mpparse.c:1014: warning: format ‘%lx’ expects type ‘long unsigned int’, but argument 2 has type ‘phys_addr_t’

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-11 21:01:08 +01:00
Jaswinder Singh Rajput
7651194fb7 x86: mm/init_32.c fix compilation warning
arch/x86/mm/init_32.c: In function ‘find_low_pfn_range’:
 arch/x86/mm/init_32.c:696: warning: format ‘%u’ expects type ‘unsigned int’, but

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-11 21:00:47 +01:00
Jeremy Fitzhardinge
9049a11de7 Merge commit 'remotes/tip/x86/paravirt' into x86/untangle2
* commit 'remotes/tip/x86/paravirt': (175 commits)
  xen: use direct ops on 64-bit
  xen: make direct versions of irq_enable/disable/save/restore to common code
  xen: setup percpu data pointers
  xen: fix 32-bit build resulting from mmu move
  x86/paravirt: return full 64-bit result
  x86, percpu: fix kexec with vmlinux
  x86/vmi: fix interrupt enable/disable/save/restore calling convention.
  x86/paravirt: don't restore second return reg
  xen: setup percpu data pointers
  x86: split loading percpu segments from loading gdt
  x86: pass in cpu number to switch_to_new_gdt()
  x86: UV fix uv_flush_send_and_wait()
  x86/paravirt: fix missing callee-save call on pud_val
  x86/paravirt: use callee-saved convention for pte_val/make_pte/etc
  x86/paravirt: implement PVOP_CALL macros for callee-save functions
  x86/paravirt: add register-saving thunks to reduce caller register pressure
  x86/paravirt: selectively save/restore regs around pvops calls
  x86: fix paravirt clobber in entry_64.S
  x86/pvops: add a paravirt_ident functions to allow special patching
  xen: move remaining mmu-related stuff into mmu.c
  ...

Conflicts:
	arch/x86/mach-voyager/voyager_smp.c
	arch/x86/mm/fault.c
2009-02-11 11:52:22 -08:00
Linus Torvalds
94dba89533 Merge branch 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  timers: fix TIMER_ABSTIME for process wide cpu timers
  timers: split process wide cpu clocks/timers, fix
  x86: clean up hpet timer reinit
  timers: split process wide cpu clocks/timers, remove spurious warning
  timers: split process wide cpu clocks/timers
  signal: re-add dead task accumulation stats.
  x86: fix hpet timer reinit for x86_64
  sched: fix nohz load balancer on cpu offline
2009-02-11 08:24:32 -08:00
Linus Torvalds
9ce04f9238 Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  ptrace, x86: fix the usage of ptrace_fork()
  i8327: fix outb() parameter order
  x86: fix math_emu register frame access
  x86: math_emu info cleanup
  x86: include correct %gs in a.out core dump
  x86, vmi: put a missing paravirt_release_pmd in pgd_dtor
  x86: find nr_irqs_gsi with mp_ioapic_routing
  x86: add clflush before monitor for Intel 7400 series
  x86: disable intel_iommu support by default
  x86: don't apply __supported_pte_mask to non-present ptes
  x86: fix grammar in user-visible BIOS warning
  x86/Kconfig.cpu: make Kconfig help readable in the console
  x86, 64-bit: print DMI info in the oops trace
2009-02-11 08:23:22 -08:00
Linus Torvalds
b3f2caaaa8 Merge branch 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  tracing, x86: fix constraint for parent variable
  tracing, x86: fix fixup section to return to original code
  profiling: fix broken profiling regression
2009-02-11 08:22:26 -08:00
Linus Torvalds
93431dd7af Merge branch 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6
* 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6:
  [S390] Update default configuration.
  [S390] dasd: fix race in dasd timer handling
  [S390] dasd: bus_id -> dev_name() conversion.
  [S390] Fix init irq proc build break.
  [S390] vdso: fix per cpu vdso pointer in lowcore
2009-02-11 08:21:29 -08:00
Linus Torvalds
da8dbb88db Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
  powerpc/mm: Fix _PAGE_COHERENT support on classic ppc32 HW
2009-02-11 08:21:11 -08:00
Ingo Molnar
17993b49b1 x86: make hibernation always-possible
This commit:

  aced3ce: x86/Voyager: remove HIBERNATION Kconfig quirk

Made hibernation only available on UP - instead of making it available
on all of x86. Fix it.

Reported-by: Jiri Slaby <jirislaby@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-11 17:20:51 +01:00
Markus Metzger
9f339e7028 x86, ptrace, mm: fix double-free on race
Ptrace_detach() races with __ptrace_unlink() if the traced task is
reaped while detaching. This might cause a double-free of the BTS
buffer.

Change the ptrace_detach() path to only do the memory accounting in
ptrace_bts_detach() and leave the buffer free to ptrace_bts_untrace()
which will be called from __ptrace_unlink().

The fix follows a proposal from Oleg Nesterov.

Reported-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Markus Metzger <markus.t.metzger@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-11 15:44:20 +01:00
Ravikiran G Thirumalai
c5c606d9dc x86: cleanup, rename CONFIG_X86_NON_STANDARD to CONFIG_X86_EXTENDED_PLATFORM
Patch to rename the CONFIG_X86_NON_STANDARD to CONFIG_X86_EXTENDED_PLATFORM.

The new name represents the subarches better. Also, default this to 'y'
so that many of the sub architectures that were not easily visible now
become visible.

Also re-organize the extended architecture platform and non standard
platform list alphabetically as suggested by Ingo.

Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org>
Signed-off-by: Shai Fultheim <shai@scalex86.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-11 14:17:40 +01:00
Dean Nelson
1c0040047d SGI IA64 UV: fix ia64 build error in the linux-next tree
Fix the ia64 build error that occurs in the linux-next tree by introducing
an ia64 version of uv.h.

Additionally, clean up the usage of is_uv_system().

Signed-off-by: Dean Nelson <dcn@sgi.com>
Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-11 13:31:47 +01:00
Brian Gerst
9c8bb6b534 x86: drop -fno-stack-protector annotations after pt_regs fixes
Now that no functions rely on struct pt_regs being passed by value,
various "no stack protector" annotations can be dropped.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-11 12:40:45 +01:00
Brian Gerst
253f29a4ae x86: pass in pt_regs pointer for syscalls that need it
Some syscalls need to access the pt_regs structure, either to copy
user register state or to modifiy it.  This patch adds stubs to load
the address of the pt_regs struct into the %eax register, and changes
the syscalls to regparm(1) to receive the pt_regs pointer as the
first argument.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-11 12:40:45 +01:00
Brian Gerst
aa78bcfa01 x86: use pt_regs pointer in do_device_not_available()
The generic exception handler (error_code) passes in the pt_regs
pointer and the error code (unused in this case).  The commit
"x86: fix math_emu register frame access" changed this to pass by
value, which doesn't work correctly with stack protector enabled.
Change it back to use the pt_regs pointer.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-11 12:40:44 +01:00
Ingo Molnar
891393745a Merge commit 'v2.6.29-rc4' into x86/cleanups 2009-02-11 11:38:55 +01:00
Tejun Heo
5c79d2a517 x86: fix x86_32 stack protector bugs
Impact: fix x86_32 stack protector

Brian Gerst found out that %gs was being initialized to stack_canary
instead of stack_canary - 20, which basically gave the same canary
value for all threads.  Fixing this also exposed the following bugs.

* cpu_idle() didn't call boot_init_stack_canary()

* stack canary switching in switch_to() was being done too late making
  the initial run of a new thread use the old stack canary value.

Fix all of them and while at it update comment in cpu_idle() about
calling boot_init_stack_canary().

Reported-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-11 11:33:49 +01:00
Ingo Molnar
160d8dac12 x86, apic: make generic_apic_probe() generally available
Impact: build fix

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-11 11:27:39 +01:00
Ingo Molnar
d5b5a232b2 Merge branch 'x86/apic' of git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen into x86/apic 2009-02-11 10:49:40 +01:00
Alok Kataria
0e81cb59c7 x86, apic: fix initialization of wakeup_cpu
With refactoring of wake_cpu macros the 32bit code in tip doesn't
execute generic_apic_probe if CONFIG_X86_32_NON_STANDARD is not set.

Even on a x86 STANDARD cpu we need to execute the generic_apic_probe
function, as we rely on this function to execute the update_genapic
quirk which initilizes apic->wakeup_cpu.

Failing to do so results in we making a call to a null function in do_boot_cpu.

The stack trace without the patch goes like this.

Booting processor 1 APIC 0x1 ip 0x6000
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [<(null)>] (null)
*pdpt = 0000000000839001 *pde = 0000000000c97067 *pte = 0000000000000163
Oops: 0000 [#1] SMP
last sysfs file:
Modules linked in:

Pid: 1, comm: swapper Not tainted (2.6.29-rc4-tip #18) VMware Virtual Platform
EIP: 0062:[<00000000>] EFLAGS: 00010293 CPU: 0
EIP is at 0x0
EAX: 00000001 EBX: 00006000 ECX: c077ed00 EDX: 00006000
ESI: 00000001 EDI: 00000001 EBP: ef04cf40 ESP: ef04cf1c
 DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 006a
Process swapper (pid: 1, ti=ef04c000 task=ef050000 task.ti=ef04c000)
Stack:
 c0644e52 00000000 ef04cf24 ef04cf24 c064468d c0886dc0 00000000 c0702aea
 ef055480 00000001 00000101 dead4ead ffffffff ffffffff c08af530 00000000
 c0709715 ef04cf60 ef04cf60 00000001 00000000 00000000 dead4ead ffffffff
Call Trace:
 [<c0644e52>] ? native_cpu_up+0x2de/0x45b
 [<c064468d>] ? do_fork_idle+0x0/0x19
 [<c0645c5e>] ? _cpu_up+0x88/0xe8
 [<c0645d20>] ? cpu_up+0x42/0x4e
 [<c07e7462>] ? kernel_init+0x99/0x14b
 [<c07e73c9>] ? kernel_init+0x0/0x14b
 [<c040375f>] ? kernel_thread_helper+0x7/0x10
Code:  Bad EIP value.
EIP: [<00000000>] 0x0 SS:ESP 006a:ef04cf1c

I think we should call generic_apic_probe unconditionally for 32 bit now.

Signed-off-by: Alok N Kataria <akataria@vmware.com>
Acked-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-11 10:48:14 +01:00
Martin Schwidefsky
95ec807e0a [S390] Update default configuration.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-02-11 10:37:40 +01:00
Sachin Sant
0addff8151 [S390] Fix init irq proc build break.
Embed init_irq_proc(s390) within CONFIG_PROC_FS to fix a build break.

Signed-off-by : Sachin Sant <sachinp@in.ibm.com>
2009-02-11 10:37:39 +01:00
Martin Schwidefsky
d5e842c4b7 [S390] vdso: fix per cpu vdso pointer in lowcore
The vdso_per_cpu_data entry in the lowcore structure uses __u32
instead of __u64. If the data page is above 4GB the pointer is
truncated and the kernel crashes.

Reported-by: Mijo Safradin <mijo@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-02-11 10:37:39 +01:00
Steven Rostedt
f47a454db9 tracing, x86: fix constraint for parent variable
The constraint used for retrieving and restoring the parent function
pointer is incorrect. The parent variable is a pointer, and the
address of the pointer is modified by the asm statement and not
the pointer itself. It is incorrect to pass it in as an output
constraint since the asm will never update the pointer.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-11 10:06:13 +01:00
David S. Miller
1b0e235cc9 sparc64: Fix crashes in jbusmc_print_dimm()
Return was missing for the case where there is no dimm
info match.

Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-11 00:54:07 -08:00
Kumar Gala
f99fb8a2cb powerpc/mm: Fix _PAGE_COHERENT support on classic ppc32 HW
The following commit:

commit 64b3d0e812
Author: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Date:   Thu Dec 18 19:13:51 2008 +0000

    powerpc/mm: Rework usage of _PAGE_COHERENT/NO_CACHE/GUARDED

broke setting of the _PAGE_COHERENT bit in the PPC HW PTE.  Since we now
actually set _PAGE_COHERENT in the Linux PTE we shouldn't be clearing it
out before we propogate it to the PPC HW PTE.

Reported-by: Martyn Welch <martyn.welch@gefanuc.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-02-11 16:07:02 +11:00
Linus Torvalds
1385a7ae65 Merge master.kernel.org:/home/rmk/linux-2.6-arm
* master.kernel.org:/home/rmk/linux-2.6-arm:
  [ARM] AACI: timeout will reach -1
  [ARM] Storage class should be before const qualifier
  [ARM] pxa: stop and disable IRQ for each DMA channels at startup
  [ARM] pxa: make more SSCR0 bit definitions visible on multiple processors
  [ARM] pxa: fix missing of __REG() definition for ac97 registers access
  [ARM] pxa: fix NAND and MMC clock initialization for pxa3xx
2009-02-10 15:54:50 -08:00
Linus Torvalds
c36c63c511 Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
  powerpc: Add missing sparsemem.h include
  powerpc/pci: mmap anonymous memory when legacy_mem doesn't exist
  powerpc/cell: Add missing #include for oprofile
  powerpc/ftrace: Fix math to calculate offset in TOC
  powerpc: Don't emulate mr. instructions
  powerpc/fsl-booke: Fix mapping functions to use phys_addr_t
  arch/powerpc: Eliminate double sizeof
  powerpc/cpm2: Fix set interrupt type
  powerpc/83xx: Fix TSEC0 workability on MPC8313E-RDB boards
  powerpc/83xx: Fix missing #{address,size}-cells in mpc8313erdb.dts
  powerpc/83xx: Build breakage for CONFIG_PM but no CONFIG_SUSPEND
2009-02-10 11:55:12 -08:00
Linus Torvalds
226b79104f Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6:
  sparc64: Fix probe_kernel_{read,write}().
  sparc64: Kill .fixup section bloat.
  sparc64: Don't hook up pcr_ops on spitfire chips.
  sparc64: Call dump_stack() in die_nmi().
2009-02-10 11:48:49 -08:00
Steven Rostedt
e3944bfac9 tracing, x86: fix fixup section to return to original code
Impact: fix to prevent a kernel crash on fault

If for some reason the pointer to the parent function on the
stack takes a fault, the fix up code will not return back to
the original faulting code. This can lead to unpredictable
results and perhaps even a kernel panic.

A fault should not happen, but if it does, we should simply
disable the tracer, warn, and continue running the kernel.
It should not lead to a kernel crash.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-02-10 13:07:13 -05:00
Clemens Ladisch
b52af40923 i8327: fix outb() parameter order
In i8237A_resume(), when resetting the DMA controller, the parameters to
dma_outb() were mixed up.

Signed-off-by: Clemens Ladisch <clemens@ladisch.de>
[ cleaned up the file a tiny bit. ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-10 13:13:23 +01:00
Tobias Klauser
e0fc4f97ab [ARM] Storage class should be before const qualifier
The C99 specification states in section 6.11.5:

The placement of a storage-class specifier other than at the beginning of the
declaration specifiers in a declaration is an obsolescent feature.

Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2009-02-10 09:59:19 +00:00
Michael Neuling
0b2f82872f powerpc: Add missing sparsemem.h include
arch/powerpc/platforms/pseries/hotplug-memory.c uses
remove_section_mapping() but doesn't include sparsemem.h which defines
it.  This can cause compilation fails for some configs.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-02-10 14:39:09 +11:00
Benjamin Herrenschmidt
5b11abfdb5 powerpc/pci: mmap anonymous memory when legacy_mem doesn't exist
The new legacy_mem file in sysfs is causing problems with X on machines
that don't support legacy memory access. The way I initially implemented
it, we would fail with -ENXIO when trying to mmap it, thus exposing to
X that we do support the API but there is no legacy memory.

Unfortunately, X poor error handling is causing it to fail to start when
it gets this error.

This implements a workaround hack that instead maps anonymous memory
instead (using shmem if VM_SHARED is set, just like /dev/zero does).

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-02-10 14:39:08 +11:00
Michael Neuling
d87bf76679 powerpc/cell: Add missing #include for oprofile
arch/powerpc/oprofile/cell/spu_profiler.c is missing a asm/time.h
include which is required for ppc_proc_freq.  This can cause compile
failures for some config combinations.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-02-10 14:39:08 +11:00
Steven Rostedt
f25f9074c2 powerpc/ftrace: Fix math to calculate offset in TOC
Impact: fix dynamic ftrace with large modules in PPC64

The math to calculate the offset into the TOC that is taken from reading
the trampoline is incorrect. The bottom half of the offset is a signed
extended short. The current code was using an OR to create the offset
when it should have been using an addition.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Acked-by: Geoff Levand <geoffrey.levand@am.sony.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-02-10 14:39:08 +11:00
Ananth N Mavinakayanahalli
eef336189b powerpc: Don't emulate mr. instructions
Currently emulate_step() emulates mr. instructions without updating cr0
and this can be disastrous. Don't emulate mr.

This bug has been around for a while, but I am not sure if its a worthy
-stable candidate. I'll leave it to Ben do decide.

Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-02-10 14:39:07 +11:00
Kumar Gala
6c24b17453 powerpc/fsl-booke: Fix mapping functions to use phys_addr_t
Fixed v_mapped_by_tlbcam() and p_mapped_by_tlbcam() to use phys_addr_t
instead of unsigned long.  In 36-bit physical mode we really need these
functions to deal with phys_addr_t when trying to match a physical
address or when returning one.

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2009-02-09 21:11:55 -06:00
Tejun Heo
60a5317ff0 x86: implement x86_32 stack protector
Impact: stack protector for x86_32

Implement stack protector for x86_32.  GDT entry 28 is used for it.
It's set to point to stack_canary-20 and have the length of 24 bytes.
CONFIG_CC_STACKPROTECTOR turns off CONFIG_X86_32_LAZY_GS and sets %gs
to the stack canary segment on entry.  As %gs is otherwise unused by
the kernel, the canary can be anywhere.  It's defined as a percpu
variable.

x86_32 exception handlers take register frame on stack directly as
struct pt_regs.  With -fstack-protector turned on, gcc copies the
whole structure after the stack canary and (of course) doesn't copy
back on return thus losing all changed.  For now, -fno-stack-protector
is added to all files which contain those functions.  We definitely
need something better.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-10 00:42:01 +01:00
Tejun Heo
ccbeed3a05 x86: make lazy %gs optional on x86_32
Impact: pt_regs changed, lazy gs handling made optional, add slight
        overhead to SAVE_ALL, simplifies error_code path a bit

On x86_32, %gs hasn't been used by kernel and handled lazily.  pt_regs
doesn't have place for it and gs is saved/loaded only when necessary.
In preparation for stack protector support, this patch makes lazy %gs
handling optional by doing the followings.

* Add CONFIG_X86_32_LAZY_GS and place for gs in pt_regs.

* Save and restore %gs along with other registers in entry_32.S unless
  LAZY_GS.  Note that this unfortunately adds "pushl $0" on SAVE_ALL
  even when LAZY_GS.  However, it adds no overhead to common exit path
  and simplifies entry path with error code.

* Define different user_gs accessors depending on LAZY_GS and add
  lazy_save_gs() and lazy_load_gs() which are noop if !LAZY_GS.  The
  lazy_*_gs() ops are used to save, load and clear %gs lazily.

* Define ELF_CORE_COPY_KERNEL_REGS() which always read %gs directly.

xen and lguest changes need to be verified.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-10 00:42:00 +01:00
Tejun Heo
d9a89a26e0 x86: add %gs accessors for x86_32
Impact: cleanup

On x86_32, %gs is handled lazily.  It's not saved and restored on
kernel entry/exit but only when necessary which usually is during task
switch but there are few other places.  Currently, it's done by
calling savesegment() and loadsegment() explicitly.  Define
get_user_gs(), set_user_gs() and task_user_gs() and use them instead.

While at it, clean up register access macros in signal.c.

This cleans up code a bit and will help future changes.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-10 00:41:58 +01:00
Tejun Heo
f0d96110f9 x86: use asm .macro instead of cpp #define in entry_32.S
Impact: cleanup

Use .macro instead of cpp #define where approriate.  This cleans up
code and will ease future changes.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-10 00:41:57 +01:00
Tejun Heo
d627ded5ab x86: no stack protector for vdso
Impact: avoid crash on vsyscall

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-10 00:41:56 +01:00
Tejun Heo
5d707e9c8e stackprotector: update make rules
Impact: no default -fno-stack-protector if stackp is enabled, cleanup

Stackprotector make rules had the following problems.

* cc support test and warning are scattered across makefile and
  kernel/panic.c.

* -fno-stack-protector was always added regardless of configuration.

Update such that cc support test and warning are contained in makefile
and -fno-stack-protector is added iff stackp is turned off.  While at
it, prepare for 32bit support.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-10 00:41:54 +01:00
Tejun Heo
76397f72fb x86: stackprotector.h misc update
Impact: misc udpate

* wrap content with CONFIG_CC_STACK_PROTECTOR so that other arch files
  can include it directly

* add missing includes

This will help future changes.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-10 00:41:29 +01:00