Commit Graph

85449 Commits

Author SHA1 Message Date
Roland Dreier
5128bdc97a IB/core: Remove unused struct ib_device.flags member
Avoid confusion about what it might mean, since it's never initialized.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-02-08 14:47:26 -08:00
Eli Cohen
e0605d9199 IB/core: Add IP checksum offload support
Add a device capability to show when it can handle checksum offload.
Also add a send flag for inserting checksums and a csum_ok field to
the completion record.

Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-02-08 14:37:56 -08:00
Eli Cohen
7143740d26 IPoIB: Add send gather support
This patch acts as a preparation for using checksum offload for IB
devices capable of inserting/verifying checksum in IP packets.  The
patch does not actaully turn on NETIF_F_SG - we defer that to the
patches adding checksum offload capabilities.

We only add support for send gathers for datagram mode, since existing
HW does not support checksum offload on connected QPs.

Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-02-08 14:32:37 -08:00
Eli Cohen
eb14032f9e IPoIB: Add high DMA feature flag
All current InfiniBand devices can handle all DMA addresses, and it's
hard to imagine anyone would be silly enough to build a new device
that couldn't.  Therefore, enable the NETIF_F_HIGHDMA feature for IPoIB.

This has no effect for no, but is needed when we enable gather/scatter
support and checksum stateless offloads.

Signed-off-by: Eli Cohen <eli@mellnaox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-02-08 13:39:26 -08:00
Jack Morgenstein
ea54b10c77 IB/mlx4: Use multiple WQ blocks to post smaller send WQEs
ConnectX HCA supports shrinking WQEs, so that a single work request
can be made of multiple units of wqe_shift.  This way, WRs can differ
in size, and do not have to be a power of 2 in size, saving memory and
speeding up send WR posting.  Unfortunately, if we do this then the
wqe_index field in CQEs can't be used to look up the WR ID anymore, so
our implementation does this only if selective signaling is off.

Further, on 32-bit platforms, we can't use vmap() to make the QP
buffer virtually contigious. Thus we have to use constant-sized WRs to
make sure a WR is always fully within a single page-sized chunk.

Finally, we use WRs with the NOP opcode to avoid wrapping around the
queue buffer in the middle of posting a WR, and we set the
NoErrorCompletion bit to avoid getting completions with error for NOP
WRs.  However, NEC is only supported starting with firmware 2.2.232,
so we use constant-sized WRs for older firmware.  And, since MLX QPs
only support SEND, we use constant-sized WRs in this case.

When stamping during NOP posting, do stamping following setting of the
NOP WQE valid bit.

Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-02-08 13:30:02 -08:00
Russ Anderson
785285fc8b [IA64] Fix large MCA bootmem allocation
The MCA code allocates bootmem memory for NR_CPUS, regardless
of how many cpus the system actually has.  This change allocates
memory only for cpus that actually exist.

On my test system with NR_CPUS = 1024, reserved memory was reduced by 130944k.

Before: Memory: 27886976k/28111168k available (8282k code, 242304k reserved, 5928k data, 1792k init)
After:  Memory: 28017920k/28111168k available (8282k code, 111360k reserved, 5928k data, 1792k init)

Signed-off-by: Russ Anderson <rja@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2008-02-08 12:01:53 -08:00
Tony Luck
427639354f [IA64] Simplify cpu_idle_wait
This is just Venki's patch[*] for x86 ported to ia64.

* http://marc.info/?l=linux-kernel&m=120249201318159&w=2

Acked-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2008-02-08 12:01:40 -08:00
Petr Tesarik
aa91a2e900 [IA64] Synchronize RBS on PTRACE_ATTACH
When attaching to a stopped process, the RSE must be explicitly
synced to user-space, so the debugger can read the correct values.

Signed-off-by: Petr Tesarik <ptesarik@suse.cz>
CC: Roland McGrath <roland@redhat.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2008-02-08 12:01:29 -08:00
Petr Tesarik
3b2ce0b178 [IA64] Synchronize kernel RSE to user-space and back
This is base kernel patch for ptrace RSE bug. It's basically a backport
from the utrace RSE patch I sent out several weeks ago. please review.

when a thread is stopped (ptraced), debugger might change thread's user
stack (change memory directly), and we must avoid the RSE stored in
kernel to override user stack (user space's RSE is newer than kernel's
in the case). To workaround the issue, we copy kernel RSE to user RSE
before the task is stopped, so user RSE has updated data.  we then copy
user RSE to kernel after the task is resummed from traced stop and
kernel will use the newer RSE to return to user.

Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Petr Tesarik <ptesarik@suse.cz>
CC: Roland McGrath <roland@redhat.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2008-02-08 12:01:18 -08:00
Petr Tesarik
5aa92ffda1 [IA64] Rename TIF_PERFMON_WORK back to TIF_NOTIFY_RESUME
Since the RSE synchronization will need a TIF_ flag, but all

work-to-be-done bits are already used, so we have to multiplex
TIF_NOTIFY_RESUME again.

Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Petr Tesarik <ptesarik@suse.cz>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2008-02-08 12:00:54 -08:00
Tony Luck
ad9e39c70f [IA64] Wire up timerfd_{create,settime,gettime} syscalls
Add ia64 hooks for the new syscalls that were added in
commit 4d672e7ac7

Signed-off-by: Tony Luck <tony.luck@intel.com>
2008-02-08 12:00:32 -08:00
Tony Lindgren
e27a93a944 ARM: OMAP1: Misc clean-up
This patch cleans up omap1 files to sync up with linux-omap tree:

- Remove omap-generic MMC config as it should be defined in board-*.c files
  instead of using board-generic.c
- New style I2C board_info from David Brownell <dbrownell@users.sourceforge.net>
- Init section fixes from Dirk Behme <dirk.behme@googlemail.com>

Signed-off-by: Tony Lindgren <tony@atomide.com>
2008-02-08 10:38:02 -08:00
Tony Lindgren
80dbfde54b ARM: OMAP1: Update defconfigs for omap1
Update defconfigs for omap1

Signed-off-by: Tony Lindgren <tony@atomide.com>
2008-02-08 10:38:02 -08:00
Andrzej Zaborowski
d7730cc01f ARM: OMAP1: Palm Tungsten E board clean-up
Mostly gpio clean-up.

Signed-off-by: Andrzej Zaborowski <balrog@zabor.org>
Signed-off-by: Tony Lindgren <tony@atomide.com>
2008-02-08 10:38:01 -08:00
Jarkko Nikula
1ed16a86b4 ARM: OMAP1: Use I2C bus registration helper for omap1
This patch starts using introduced I2C bus registration helper by cleaning
up registration currently done in various places and by doing necessary
board file modifications.

Signed-off-by: Jarkko Nikula <jarkko.nikula@nokia.com>
Acked-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Tony Lindgren <tony@atomide.com>
2008-02-08 10:38:01 -08:00
Vivek Kutal
feb72f3b31 ARM: OMAP1: Remove omap_sram_idle()
This patch removes omap_sram_idle() that is no longer used.

The function called in pm_idle is omap_sram_suspend, omap_sram_idle() is
not used anywhere in omap1.

Signed-off-by: Vivek Kutal <vivek.kutal@celunite.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
2008-02-08 10:38:01 -08:00
Vivek Kutal
010bb0cf42 ARM: OMAP1: PM fixes for OMAP1
This patch does the following:

- Fixes the omap_pm_idle() code so that we enter WFI mode in idle.
- /sys/power/sleep_while_idle is created only when 32k timer is used

Signed-off-by: Vivek Kutal <vivek.kutal@celunite.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
2008-02-08 10:38:01 -08:00
Carlos Eduardo Aguiar
087c50302f ARM: OMAP1: Use MMC multislot structures for Siemens SX1 board
Use MMC multislot structures for Siemens SX1 board

Signed-off-by: Carlos Eduardo Aguiar <carlos.aguiar@indt.org.br>
Signed-off-by: Tony Lindgren <tony@atomide.com>
2008-02-08 10:38:01 -08:00
Felipe Balbi
138ab9f832 ARM: OMAP1: Make omap1 use MMC multislot structures
Make omap1 use new MMC multislot structures. The related MMC
patches will be sent separately.

Signed-off-by: Felipe Balbi <felipe.lima@indt.org.br>
Signed-off-by: Anderson Briglia <anderson.briglia@indt.org.br>
Signed-off-by: Carlos Eduardo Aguiar <carlos.aguiar@indt.org.br>
Signed-off-by: David Cohen <david.cohen@indt.org.br>
Signed-off-by: Eduardo Valentin <eduardo.valentin@indt.org.br>
Signed-off-by: Tony Lindgren <tony@atomide.com>
2008-02-08 10:38:01 -08:00
David Cohen
6e2d410724 ARM: OMAP1: Change the comments to C style
Change the comments to C style

Signed-off-by: David Cohen <david.cohen@indt.org.br>
Signed-off-by: Tony Lindgren <tony@atomide.com>
2008-02-08 10:38:00 -08:00
Tony Lindgren
78be63252b ARM: OMAP1: Make omap1 boards to use omap_nand_platform_data
This patch adds omap_nand_platform data based on a patch
by Shahrom Sharif-Kashani <sshahrom@micron.com>, and makes
omap1 boards to use omap_nand_platform_data instead of
nand_platform_data used earlier.

Signed-off-by: Tony Lindgren <tony@atomide.com>
2008-02-08 10:38:00 -08:00
Jarkko Nikula
85d05fb3fd ARM: OMAP: Add helper module for board specific I2C bus registration
This helper module simplifies I2C bus registration for different OMAP
platforms by doing registration in one place only and to allow board
specific bus configuration like clock rate and number of busses configured.

Helper should cover OMAP processors from first to third generation.

This patch just adds the feature and current implementation cleanup and
board file modifications will be done in following patches.

Signed-off-by: Jarkko Nikula <jarkko.nikula@nokia.com>
Acked-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Tony Lindgren <tony@atomide.com>
2008-02-08 10:38:00 -08:00
Syed Mohammed, Khasim
ce2df9ca41 ARM: OMAP: Add dmtimer support for OMAP3
Add DM timer support for OMAP3.

Fixed source clocks for 3430 by Paul Walmsley <paul@pwsan.com>.

Signed-off-by: Syed Mohammed Khasim <x0khasim@ti.com>
Signed-off-by: Paul Walmsley <paul@pwsan.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
2008-02-08 10:38:00 -08:00
Syed Mohammed, Khasim
471b3aa70c ARM: OMAP: Pre-3430 clean-up for dmtimer.c
Cleanup DM timer list for OMAP2 and OMAP1 to allow
adding support for 3430.

Signed-off-by: Syed Mohammed Khasim  <x0khasim@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
2008-02-08 10:38:00 -08:00
Anand Gadiyar
f8151e5c32 ARM: OMAP: Add DMA support for chaining and 3430
Add DMA support for chaining and 3430.

Also remove old DEBUG_PRINTS as noted by Russell King.

Signed-off-by: Anand Gadiyar <gadiyar@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
2008-02-08 10:37:59 -08:00
Kevin Hilman
5eb3bb9c0d ARM: OMAP: Add 24xx GPIO debounce support
Add 24xx GPIO debounce support. Also minor formatting
clean-up.

Signed-off-by: Kevin Hilman <khilman@mvista.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
2008-02-08 10:37:59 -08:00
Tony Lindgren
d11ac9791b ARM: OMAP: Get rid of unnecessary ifdefs in GPIO code
Just use cpu_is_omapXXXX() instead. This does not increase
object size.

Signed-off-by: Tony Lindgren <tony@atomide.com>
2008-02-08 10:37:59 -08:00
Syed Mohammed, Khasim
5492fb1a46 ARM: OMAP: Add 3430 gpio support
This patch adds 3430 gpio support.

It also contains a fix by Paul Walmsley <paul@pwsan.com> to use the
correct clock names for OMAP3430.

Signed-off-by: Syed Mohammed Khasim <x0khasim@ti.com>
Signed-off-by: Paul Walmsley <paul@pwsan.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
2008-02-08 10:37:59 -08:00
Syed Mohammed Khasim
2c17f61599 ARM: OMAP: Add 3430 CPU identification macros
This patch adds omap3430 CPU identification macros.

Silicon revision check macros added by Girish S G <girishsg@ti.com>.

CPU identification macro and silicon revision check macros
cleaned up by Paul Walmsley <paul@pwsan.com>.

Signed-off-by: Syed Mohammed Khasim <x0khasim@ti.com>
Signed-off-by: Girish S G <girishsg@ti.com>
Signed-off-by: Paul Walmsley <paul@pwsan.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
2008-02-08 10:37:58 -08:00
Tony Lindgren
1cccd2a728 ARM: OMAP: Request DSP memory for McBSP
On OMAP1 some McBSP features depend on DSP. Also export
polling functions as suggested by Luis Cargnini.

Signed-off-by: Tony Lindgren <tony@atomide.com>
2008-02-08 10:37:57 -08:00
Byron Bradley
ee44391eae [ARM] Orion: Use the sata_mv driver for the TS-209 SATA
The TS-209 has a two port integrated SATA controller.
Use the sata_mv driver for this.

Signed-off-by: Byron Bradley <byron.bbradley@gmail.com>
Signed-off-by: Nicolas Pitre <nico@marvell.com>
2008-02-08 13:31:06 -05:00
Byron Bradley
3b277c2965 [ARM] Orion: Use the sata_mv driver for the Kurobox SATA
The Kurobox has a two port integrated SATA controller.
Use the sata_mv driver for this.

Signed-off-by: Byron Bradley <byron.bbradley@gmail.com>
Signed-off-by: Nicolas Pitre <nico@marvell.com>
2008-02-08 13:29:21 -05:00
Mike Frysinger
920e526f93 [Blackfin] arch: import defines for BF547 -- it is just like the BF548, but no CAN
Signed-off-by: Mike Frysinger <vapier.adi@gmail.com>
Signed-off-by: Bryan Wu <bryan.wu@analog.com>
2008-02-09 02:07:08 +08:00
Linus Torvalds
0cf975e169 Merge branch 'cris' of git://www.jni.nu/cris
* 'cris' of git://www.jni.nu/cris: (158 commits)
  CRIS v32: Remove hwregs/timer_defs.h, it is now architecture specific.
  CRIS v32: Change drivers/i2c.c locking.
  CRIS v32: Rewrite ARTPEC-3 gpio driver to avoid volatiles and general cleanup.
  CRIS: Add new timerfd syscall entries.
  MAINTAINERS: Add my information for the CRIS port.
  CRIS v32: Correct spelling of bandwidth in function name.
  CRIS v32: Clean up nandflash.c for ARTPEC-3 and ETRAX FS.
  CRIS v10: Cleanup of drivers/gpio.c
  CRIS v10: drivers/net/cris/eth_v10.c rename LED defines to CRIS_LED to avoid name clash.
  CRIS: Make io_pwm_set_period members unsigned in etraxgpio.h
  CRIS: Move ETRAX_AXISFLASHMAP to common Kconfig file.
  CRIS: Drop regs parameter from call to profile_tick in kernel/time.c
  CRIS v32: Fix minor formatting issue in mach-a3/io.c
  CRIS v32: Initialize GIO even if we're rambooting in kernel/head.S
  CRIS v32: Remove kernel/arbiter.c, it now exists in machine dependent directory.
  CRIS v32: Minor changes to avoid errors in asm-cris/arch-v32/hwregs/reg_rdwr.h
  CRIS v32: arch-v32/hwregs/intr_vect_defs.h moved to machine dependent directory.
  CRIS v32: Correct offset for TASK_pid in asm-cris/arch-v32/offset.h
  CRIS v32: Move register map header to machine dependent directory.
  CRIS v32: Let compiler know that memory is clobbered after a break op.
  ...
2008-02-08 10:01:28 -08:00
Mike Frysinger
67f2d33ec0 [Blackfin] arch: fix build fails only include header files when enabled
Signed-off-by: Mike Frysinger <vapier.adi@gmail.com>
Signed-off-by: Bryan Wu <bryan.wu@analog.com>
2008-02-09 01:49:23 +08:00
Linus Torvalds
03054de1e0 Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block
* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
  Enhanced partition statistics: documentation update
  Enhanced partition statistics: remove old partition statistics
  Enhanced partition statistics: procfs
  Enhanced partition statistics: sysfs
  Enhanced partition statistics: aoe fix
  Enhanced partition statistics: update partition statitics
  Enhanced partition statistics: core statistics
  block: fixup rq_init() a bit

Manually fixed conflict in drivers/block/aoe/aoecmd.c due to statistics
support.
2008-02-08 09:42:46 -08:00
Linus Torvalds
b5eb9513f7 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/dlm
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/dlm:
  dlm: add __init and __exit marks to init and exit functions
  dlm: eliminate astparam type casting
  dlm: proper types for asts and basts
  dlm: dlm/user.c input validation fixes
  dlm: fix dlm_dir_lookup() handling of too long names
  dlm: fix overflows when copying from ->m_extra to lvb
  dlm: make find_rsb() fail gracefully when namelen is too large
  dlm: receive_rcom_lock_args() overflow check
  dlm: verify that places expecting rcom_lock have packet long enough
  dlm: validate data in dlm_recover_directory()
  dlm: missing length check in check_config()
  dlm: use proper type for ->ls_recover_buf
  dlm: do not byteswap rcom_config
  dlm: do not byteswap rcom_lock
  dlm: dlm_process_incoming_buffer() fixes
  dlm: use proper C for dlm/requestqueue stuff (and fix alignment bug)
2008-02-08 09:33:02 -08:00
Linus Torvalds
dde0013782 Merge branch 'for-2.6.25' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc
* 'for-2.6.25' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc:
  [POWERPC] Add arch-specific walk_memory_remove() for 64-bit powerpc
  [POWERPC] Enable hotplug memory remove for 64-bit powerpc
  [POWERPC] Add remove_memory() for 64-bit powerpc
  [POWERPC] Make cell IOMMU fixed mapping printk more useful
  [POWERPC] Fix potential cell IOMMU bug when switching back to default DMA ops
  [POWERPC] Don't enable cell IOMMU fixed mapping if there are no dma-ranges
  [POWERPC] Fix cell IOMMU null pointer explosion on old firmwares
  [POWERPC] spufs: Fix timing dependent false return from spufs_run_spu
  [POWERPC] spufs: No need to have a runnable SPU for libassist update
  [POWERPC] spufs: Update SPU_Status[CISHP] in backing runcntl write
  [POWERPC] spufs: Fix state_mutex leaks
  [POWERPC] Disable G5 NAP mode during SMU commands on U3
2008-02-08 09:31:42 -08:00
Linus Torvalds
f3aafa6c25 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6:
  [SPARC64]: Make use of the new fs/compat_binfmt_elf.c
  [SPARC64]: Make use of compat_sys_ptrace()

Manually fixed trivial delete/modift conflict in arch/sparc64/kernel/binfmt_elf32.c
2008-02-08 09:29:39 -08:00
Linus Torvalds
3668805a54 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (21 commits)
  [IPSEC] flow: reorder "struct flow_cache_entry" and remove SLAB_HWCACHE_ALIGN
  [DECNET] ROUTE: remove unecessary alignment
  [IPSEC]: Add support for aes-ctr.
  [ISDN]: fix section mismatch warning in enpci_card_msg
  [TIPC]: declare proto_ops structures as 'const'.
  [TIPC]: Kill unused static inline (x5)
  [TC]: oops in em_meta
  [IPV6] Minor cleanup: remove unused definitions in net/ip6_fib.h
  [IPV6] Minor clenup: remove two unused definitions in net/ip6_route.h
  [AF_IUCV]: defensive programming of iucv_callback_txdone
  [AF_IUCV]: broken send_skb_q results in endless loop
  [IUCV]: wrong irq-disabling locking at module load time
  [CAN]: Minor clean-ups
  [CAN]: Move proto_{,un}register() out of spin-locked region
  [CAN]: Clean up module auto loading
  [IPSEC] flow: Remove an unnecessary ____cacheline_aligned
  [IPV4]: route: fix crash ip_route_input
  [NETFILTER]: xt_iprange: add missing #include
  [NETFILTER]: xt_iprange: fix typo in address family
  [NETFILTER]: nf_conntrack: fix ct_extend ->move operation
  ...
2008-02-08 09:27:06 -08:00
Linus Torvalds
7b791d4455 Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6:
  acer-wmi, tc1100-wmi: select ACPI_WMI
  ACPI: WMI: Improve Kconfig description
  ACPI: DMI: add Panasonic CF-52 and Thinpad X61
  ACPI: thermal: syntax, spelling, kernel-doc
  intel_menlo: build on X86 only
  ACPI: build WMI on X86 only
  ACPI: cpufreq: Print _PPC changes via cpufreq debug layer
  ACPI: add newline to printk
2008-02-08 09:25:58 -08:00
Jens Axboe
8811930dc7 splice: missing user pointer access verification
vmsplice_to_user() must always check the user pointer and length
with access_ok() before copying. Likewise, for the slow path of
copy_from_user_mmap_sem() we need to check that we may read from
the user region.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Cc: Wojciech Purczynski <cliph@research.coseinc.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-08 09:25:01 -08:00
Jan Kara
66191dc622 quota: turn quotas off when remounting read-only
Turn off quotas before filesystem is remounted read only.  Otherwise quota
will try to write to read-only filesystem which does no good...  We could
also just refuse to remount ro when quota is enabled but turning quota off
is consistent with what we do on umount.

Signed-off-by: Jan Kara <jack@suse.cz>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-08 09:22:44 -08:00
Neil Brown
28ae094c62 ext3 can fail badly when device stops accepting BIO_RW_BARRIER requests
Some devices - notably dm and md - can change their behaviour in response
to BIO_RW_BARRIER requests.  They might start out accepting such requests
but on reconfiguration, they find out that they cannot any more.

ext3 (and other filesystems) deal with this by always testing if
BIO_RW_BARRIER requests fail with EOPNOTSUPP, and retrying the write
requests without the barrier (probably after waiting for any pending writes
to complete).

However there is a bug in the handling for this for ext3.

When ext3 (jbd actually) decides to submit a BIO_RW_BARRIER request, it
sets the buffer_ordered flag on the buffer head.  If the request completes
successfully, the flag STAYS SET.

Other code might then write the same buffer_head after the device has been
reconfigured to not accept barriers.  This write will then fail, but the
"other code" is not ready to handle EOPNOTSUPP errors and the error will be
treated as fatal.

This can be seen without having to reconfigure a device at exactly the
wrong time by putting:

		if (buffer_ordered(bh))
			printk("OH DEAR, and ordered buffer\n");

in the while loop in "commit phase 5" of journal_commit_transaction.

If it ever prints the "OH DEAR ..." message (as it does sometimes for
me), then that request could (in different circumstances) have failed
with EOPNOTSUPP, but that isn't tested for.

My proposed fix is to clear the buffer_ordered flag after it has been
used, as in the following patch.

Signed-off-by: Neil Brown <neilb@suse.de>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-08 09:22:44 -08:00
Eric Sandeen
2dafe1c4d6 reduce large do_mount stack usage with noinlines
do_mount() uses a whopping 616 bytes of stack on x86_64 in 2.6.24-mm1,
largely thanks to gcc inlining the various helper functions.

noinlining these can slim it down a lot; on my box this patch gets it down
to 168, which is mostly the struct nameidata nd; left on the stack.

These functions are called only as do_mount() helpers; none of them should
be in any path that would see a performance benefit from inlining...

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Cc: Arjan van de Ven <arjan@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-08 09:22:44 -08:00
Jeff Dike
ac2a659968 uml: fix mm_context memory leak
[ Spotted by Miklos ]

Fix a memory leak in init_new_context.  The struct page ** buffer allocated
for install_special_mapping was never recorded, and thus leaked when the
mm_struct was freed.  Fix it by saving the pointer in mm_context_t and freeing
it in arch_exit_mmap.

Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Cc: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-08 09:22:43 -08:00
Jeff Dike
5aaf5f7b87 uml: x86_64 should copy %fs during fork
%fs needs to be copied from parent to child during fork.

Tidied up some whitespace while I was here.

Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-08 09:22:43 -08:00
Jim Meyering
11a7ac23a2 uml: improved error handling while locating temp dir
* arch/um/os-Linux/mem.c (make_tempfile): Don't deref NULL upon failed malloc.

* arch/um/os-Linux/mem.c (make_tempfile): Handle NULL tempdir.
Don't let a long tempdir (e.g., via TMPDIR) provoke heap corruption.

[ jdike - formatting cleanups, deleted obsolete comment ]

Signed-off-by: Jim Meyering <meyering@redhat.com>
Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-08 09:22:43 -08:00
Jeff Dike
5134d8fea0 uml: style fixes in arch/um/os-Linux
Style changes under arch/um/os-Linux:
	include trimming
	CodingStyle fixes
	some printks needed severity indicators

make_tempfile turns out not to be used outside of mem.c, so it is now static.
Its declaration in tempfile.h is no longer needed, and tempfile.h itself is no
longer needed.

create_tmp_file was also made static.

checkpatch moans about an EXPORT_SYMBOL in user_syms.c which is part of a
macro definition - this is copying a bit of kernel infrastructure into the
libc side of UML because the kernel headers can't be included there.

Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-08 09:22:42 -08:00
Jeff Dike
536788fe2d uml: runtime host VMSPLIT detection
Calculate TASK_SIZE at run-time by figuring out the host's VMSPLIT - this is
needed on i386 if UML is to run on hosts with varying VMSPLITs without
recompilation.

TASK_SIZE is now defined in terms of a variable, task_size.  This gets rid of
an include of pgtable.h from processor.h, which can cause include loops.

On i386, task_size is calculated early in boot by probing the address space in
a binary search to figure out where the boundary between usable and non-usable
memory is.  This tries to make sure that a page that is considered to be in
userspace is, or can be made, read-write.  I'm concerned about a system-global
VDSO page in kernel memory being hit and considered to be a userspace page.

On x86_64, task_size is just the old value of CONFIG_TOP_ADDR.

A bunch of config variable are gone now.  CONFIG_TOP_ADDR is directly replaced
by TASK_SIZE.  NEST_LEVEL is gone since the relocation of the stubs makes it
irrelevant.  All the HOST_VMSPLIT stuff is gone.  All references to these in
arch/um/Makefile are also gone.

I noticed and fixed a missing extern in os.h when adding os_get_task_size.

Note: This has been revised to fix the 32-bit UML on 64-bit host bug that
Miklos ran into.

Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Cc: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-08 09:22:42 -08:00