Commit Graph

10028 Commits

Author SHA1 Message Date
Tejun Heo
03faab7827 libata: implement ATA_QCFLAG_RETRY
Currently whether a command should be retried after failure is
determined inside ata_eh_finish().  Add ATA_QCFLAG_RETRY and move the
logic into ata_eh_autopsy().  This makes things clearer and helps
extending retry determination logic.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2008-04-17 15:44:20 -04:00
Tejun Heo
6fd3639011 libata: kill ata_chk_status()
ata_chk_status() just calls ops->check_status and it only adds
confusion with other status functions.  Kill it.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2008-04-17 15:44:18 -04:00
Tejun Heo
071ce34d57 libata: move ata_pci_default_filter() out of CONFIG_PCI
ata_pci_default_filter() doesn't really have anything to do with PCI.
It's generally applicable to BMDMA controllers.  Move it out of
CONFIG_PCI.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2008-04-17 15:44:18 -04:00
Tejun Heo
624d5c514e libata: reorganize SFF related stuff
* Move SFF related functions from libata-core.c to libata-sff.c.

  ata_[bmdma_]sff_port_ops, ata_devchk(), ata_dev_try_classify(),
  ata_std_dev_select(), ata_tf_to_host(), ata_busy_sleep(),
  ata_wait_after_reset(), ata_wait_ready(), ata_bus_post_reset(),
  ata_bus_softreset(), ata_bus_reset(), ata_std_softreset(),
  sata_std_hardreset(), ata_fill_sg(), ata_fill_sg_dumb(),
  ata_qc_prep(), ata_dump_qc_prep(), ata_data_xfer(),
  ata_data_xfer_noirq(), ata_pio_sector(), ata_pio_sectors(),
  atapi_send_cdb(), __atapi_pio_bytes(), atapi_pio_bytes(),
  ata_hsm_ok_in_wq(), ata_hsm_qc_complete(), ata_hsm_move(),
  ata_pio_task(), ata_qc_issue_prot(), ata_host_intr(),
  ata_interrupt(), ata_std_ports()

* Make ata_pio_queue_task() global as it's now called from
  libata-sff.c.

* Move SFF related stuff in include/linux/libata.h and
  drivers/ata/libata.h into one place.  While at it, move timing
  constants into the global enum definition and fortify comments a
  bit.

This patch strictly moves stuff around and as such doesn't cause any
functional difference.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2008-04-17 15:44:18 -04:00
Tejun Heo
a1efdaba2d libata: make reset related methods proper port operations
Currently reset methods are not specified directly in the
ata_port_operations table.  If a LLD wants to use custom reset
methods, it should construct and use a error_handler which uses those
reset methods.  It's done this way for two reasons.

First, the ops table already contained too many methods and adding
four more of them would noticeably increase the amount of necessary
boilerplate code all over low level drivers.

Second, as ->error_handler uses those reset methods, it can get
confusing.  ie. By overriding ->error_handler, those reset ops can be
made useless making layering a bit hazy.

Now that ops table uses inheritance, the first problem doesn't exist
anymore.  The second isn't completely solved but is relieved by
providing default values - most drivers can just override what it has
implemented and don't have to concern itself about higher level
callbacks.  In fact, there currently is no driver which actually
modifies error handling behavior.  Drivers which override
->error_handler just wraps the standard error handler only to prepare
the controller for EH.  I don't think making ops layering strict has
any noticeable benefit.

This patch makes ->prereset, ->softreset, ->hardreset, ->postreset and
their PMP counterparts propoer ops.  Default ops are provided in the
base ops tables and drivers are converted to override individual reset
methods instead of creating custom error_handler.

* ata_std_error_handler() doesn't use sata_std_hardreset() if SCRs
  aren't accessible.  sata_promise doesn't need to use separate
  error_handlers for PATA and SATA anymore.

* softreset is broken for sata_inic162x and sata_sx4.  As libata now
  always prefers hardreset, this doesn't really matter but the ops are
  forced to NULL using ATA_OP_NULL for documentation purpose.

* pata_hpt374 needs to use different prereset for the first and second
  PCI functions.  This used to be done by branching from
  hpt374_error_handler().  The proper way to do this is to use
  separate ops and port_info tables for each function.  Converted.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2008-04-17 15:44:18 -04:00
Tejun Heo
9594719362 libata: kill port_info->sht and ->irq_handler
libata core layer doesn't care about sht or ->irq_handler.  Those are
only of interest to the LLD during initialization.  This is confusing
and has caused several drivers to have duplicate unused initializers
for these fields.

Currently only sata_nv uses these fields.  Make sata_nv use
->private_data, which is supposed to carry LLD-specific information,
instead and kill ->sht and ->irq_handler.  nv_pi_priv structure is
defined and struct literals are used to initialize private_data.
Notational overhead is negligible.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2008-04-17 15:44:17 -04:00
Tejun Heo
887125e374 libata: stop overloading port_info->private_data
port_info->private_data is currently used for two purposes - to record
private data about the port_info or to specify host->private_data to
use when allocating ata_host.

This overloading is confusing and counter-intuitive in that
port_info->private_data becomes host->private_data instead of
port->private_data.  In addition, port_info and host don't correspond
to each other 1-to-1.  Currently, the first non-NULL
port_info->private_data is used.

This patch makes port_info->private_data just be what it is -
private_data for the port_info where LLD can jot down extra info.
libata no longer sets host->private_data to the first non-NULL
port_info->private_data, @host_priv argument is added to
ata_pci_init_one() instead.  LLDs which use ata_pci_init_one() can use
this argument to pass in pointer to host private data.  LLDs which
don't should use init-register model anyway and can initialize
host->private_data directly.

Adding @host_priv instead of using init-register model for LLDs which
use ata_pci_init_one() is suggested by Alan Cox.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
2008-04-17 15:44:17 -04:00
Tejun Heo
1bd5b715a3 libata: make ata_pci_init_one() not use ops->irq_handler and pi->sht
ata_pci_init_one() is the only function which uses ops->irq_handler
and pi->sht.  Other initialization functions take the same information
as arguments.  This causes confusion and duplicate unused entries in
structures.

Make ata_pci_init_one() take sht as an argument and use ata_interrupt
implicitly.  All current users use ata_interrupt and if different irq
handler is necessary open coding ata_pci_init_one() using
ata_prepare_sff_host() and ata_activate_sff_host can be done under ten
lines including error handling and driver which requires custom
interrupt handler is likely to require custom initialization anyway.

As ata_pci_init_one() was the last user of ops->irq_handler, this
patch also kills the field.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2008-04-17 15:44:17 -04:00
Tejun Heo
029cfd6b74 libata: implement and use ops inheritance
libata lets low level drivers build ata_port_operations table and
register it with libata core layer.  This allows low level drivers
high level of flexibility but also burdens them with lots of
boilerplate entries.

This becomes worse for drivers which support related similar
controllers which differ slightly.  They share most of the operations
except for a few.  However, the driver still needs to list all
operations for each variant.  This results in large number of
duplicate entries, which is not only inefficient but also error-prone
as it becomes very difficult to tell what the actual differences are.

This duplicate boilerplates all over the low level drivers also make
updating the core layer exteremely difficult and error-prone.  When
compounded with multi-branched development model, it ends up
accumulating inconsistencies over time.  Some of those inconsistencies
cause immediate problems and fixed.  Others just remain there dormant
making maintenance increasingly difficult.

To rectify the problem, this patch implements ata_port_operations
inheritance.  To allow LLDs to easily re-use their own ops tables
overriding only specific methods, this patch implements poor man's
class inheritance.  An ops table has ->inherits field which can be set
to any ops table as long as it doesn't create a loop.  When the host
is started, the inheritance chain is followed and any operation which
isn't specified is taken from the nearest ancestor which has it
specified.  This operation is called finalization and done only once
per an ops table and the LLD doesn't have to do anything special about
it other than making the ops table non-const such that libata can
update it.

libata provides four base ops tables lower drivers can inherit from -
base, sata, pmp, sff and bmdma.  To avoid overriding these ops
accidentaly, these ops are declared const and LLDs should always
inherit these instead of using them directly.

After finalization, all the ops table are identical before and after
the patch except for setting .irq_handler to ata_interrupt in drivers
which didn't use to.  The .irq_handler doesn't have any actual effect
and the field will soon be removed by later patch.

* sata_sx4 is still using old style EH and currently doesn't take
  advantage of ops inheritance.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2008-04-17 15:44:17 -04:00
Tejun Heo
68d1d07b51 libata: implement and use SHT initializers
libata lets low level drivers build scsi_host_template and register it
to the SCSI layer.  This allows low level drivers high level of
flexibility but also burdens them with lots of boilerplate entries.

This patch implements SHT initializers which can be used to initialize
all the boilerplate entries in a sht.  Three variants of them are
implemented - BASE, BMDMA and NCQ - for different types of drivers.
Note that entries can be overriden by putting individual initializers
after the helper macro.

All sht tables are identical before and after this patch.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2008-04-17 15:44:17 -04:00
Tejun Heo
358f9a77a6 libata: implement and use ata_noop_irq_clear()
->irq_clear() is used to clear IRQ bit of a SFF controller and isn't
useful for drivers which don't use libata SFF HSM implementation.
However, it's a required callback and many drivers implement their own
noop version as placeholder.  This patch implements ata_noop_irq_clear
and use it to replace those custom placeholders.

Also, SFF drivers which don't support BMDMA don't need to use
ata_bmdma_irq_clear().  It becomes noop if BMDMA address isn't
initialized.  Convert them to use ata_noop_irq_clear().

Signed-off-by: Tejun Heo <htejun@gmail.com>
2008-04-17 15:44:16 -04:00
Tejun Heo
c1bc899f58 libata: reorganize ata_port_operations
Over the time, ops in ata_port_operations has become a bit confusing.
Reorganize.  SFF/BMDMA ops are separated into separate a group as they
will be taken out of ata_port_operations later.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2008-04-17 15:44:16 -04:00
Tejun Heo
b558edddb1 libata: kill ata_ehi_schedule_probe()
ata_ehi_schedule_probe() was created to hide details of link-resuming
reset magic.  Now that all the softreset workarounds are gone,
scheduling probe is very simple - set probe_mask and request RESET.
Kill ata_ehi_schedule_probe() and open code it.  This also increases
consistency as ata_ehi_schedule_probe() couldn't cover individual
device probings so they were open-coded even when the helper existed.

While at it, define ATA_ALL_DEVICES as mask of all possible devices on
a link and always use it when requesting probe on link level for
simplicity and consistency.  Setting extra bits in the probe_mask
doesn't hurt anybody.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2008-04-17 15:44:16 -04:00
Tejun Heo
8cebf274dd libata: kill ATA_LFLAG_SKIP_D2H_BSY
Some controllers can't reliably record the initial D2H FIS after SATA
link is brought online for whatever reason.  Advanced controllers
which don't have traditional TF register based interface often have
this problem as they don't really have the TF registers to update
while the controller and link are being initialized.

SKIP_D2H_BSY works around the problem by skipping the wait for device
readiness before issuing SRST, so for such controllers libata issues
SRST blindly and hopes for the best.

Now that libata defaults to hardreset, this workaround is no longer
necessary.  For controllers which have support for hardreset, SRST is
never issued by itself.  It is only issued as follow-up SRST for
device classification and PMP initialization, so there's no need to
wait for it from prereset.

Kill ATA_LFLAG_SKIP_D2H_BSY.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2008-04-17 15:44:16 -04:00
Tejun Heo
672b2d65ba libata: kill ATA_EHI_RESUME_LINK
ATA_EHI_RESUME_LINK has two functions - promote reset to hardreset if
ATA_LFLAG_HRST_TO_RESUME is set and preventing EH from shortcutting
reset action when probing is requested.  The former is gone now and
the latter can easily be achieved by making EH to perform at least one
reset if reset is requested, which also makes more sense than
depending on RESUME_LINK flag.

As ATA_EHI_RESUME_LINK was the only EHI reset modifier, this also
kills reset modifier handling.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2008-04-17 15:44:16 -04:00
Tejun Heo
d692abd92f libata: kill ATA_LFLAG_HRST_TO_RESUME
Now that hardreset is the preferred method of resetting, there's no
need for ATA_LFLAG_HRST_TO_RESUME flag.  Kill it.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2008-04-17 15:44:15 -04:00
Tejun Heo
cf48062658 libata: prefer hardreset
When both soft and hard resets are available, libata preferred
softreset till now.  The logic behind it was to be softer to devices;
however, this doesn't really help much.  Rationales for the change:

* BIOS may freeze lock certain things during boot and softreset can't
  unlock those.  This by itself is okay but during operation PHY event
  or other error conditions can trigger hardreset and the device may
  end up with different configuration.

  For example, after a hardreset, previously unlockable HPA can be
  unlocked resulting in different device size and thus revalidation
  failure.  Similar condition can occur during or after resume.

* Certain ATAPI devices require hardreset to recover after certain
  error conditions.  On PATA, this is done by issuing the DEVICE RESET
  command.  On SATA, COMRESET has equivalent effect.  The problem is
  that DEVICE RESET needs its own execution protocol.

  For SFF controllers with bare TF access, it can be easily
  implemented but more advanced controllers (e.g. ahci and sata_sil24)
  require specialized implementations.  Simply using hardreset solves
  the problem nicely.

* COMRESET initialization sequence is the norm in SATA land and many
  SATA devices don't work properly if only SRST is used.  For example,
  some PMPs behave this way and libata works around by always issuing
  hardreset if the host supports PMP.

  Like the above example, libata has developed a number of mechanisms
  aiming to promote softreset to hardreset if softreset is not going
  to work.  This approach is time consuming and error prone.

  Also, note that, dependingon how you read the specs, it could be
  argued that PMP fan-out ports require COMRESET to start operation.
  In fact, all the PMPs on the market except one don't work properly
  if COMRESET is not issued to fan-out ports after PMP reset.

* COMRESET is an integral part of SATA connection and any working
  device should be able to handle COMRESET properly.  After all, it's
  the way to signal hardreset during reboot.  This is the most used
  and recommended (at least by the ahci spec) method of resetting
  devices.

So, this patch makes libata prefer hardreset over softreset by making
the following changes.

* Rename ATA_EH_RESET_MASK to ATA_EH_RESET and use it whereever
  ATA_EH_{SOFT|HARD}RESET used to be used.  ATA_EH_{SOFT|HARD}RESET is
  now only used to tell prereset whether soft or hard reset will be
  issued.

* Strip out now unneeded promote-to-hardreset logics from
  ata_eh_reset(), ata_std_prereset(), sata_pmp_std_prereset() and
  other places.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2008-04-17 15:44:15 -04:00
Paul Gortmaker
cac1f3c8a8 phylib: factor out get_phy_id from within get_phy_device
We were already doing what amounts to a get_phy_id from within
get_phy_device, and rather than duplicate this for the TBIPA
probing, we might as well just factor it out and make it available
instead.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Acked-by: Andy Fleming <afleming@freescale.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-04-17 15:31:33 -04:00
Jason Wessel
e3e2aaf7dc kgdb: add documentation
Add in the kgdb documentation for kgdb.

Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-17 20:05:42 +02:00
Jason Wessel
7c3078b637 kgdb: clocksource watchdog
In order to not trip the clocksource watchdog, kgdb must touch the
clocksource watchdog on the return to normal system run state.

Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-17 20:05:38 +02:00
Jason Wessel
f2d937f3bf consoles: polling support, kgdboc
polled console handling support, to access a console in an irq-less
way while in debug or irq context.

absolutely zero impact as long as CONFIG_CONSOLE_POLL is disabled.
(which is the default)

[ jan.kiszka@siemens.com: lots of cleanups ]
[ mingo@elte.hu: redesign, splitups, cleanups. ]

Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Jan Kiszka <jan.kiszka@web.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
2008-04-17 20:05:37 +02:00
Jason Wessel
dc7d552705 kgdb: core
kgdb core code. Handles the protocol and the arch details.

[ mingo@elte.hu: heavily modified, simplified and cleaned up. ]
[ xemul@openvz.org: use find_task_by_pid_ns ]

Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Jan Kiszka <jan.kiszka@web.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
2008-04-17 20:05:37 +02:00
Ingo Molnar
c33fa9f560 uaccess: add probe_kernel_write()
add probe_kernel_read() and probe_kernel_write().

Uninlined and restricted to kernel range memory only, as suggested
by Linus.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
2008-04-17 20:05:36 +02:00
Matthew Wilcox
714493cd54 Improve semaphore documentation
Move documentation from semaphore.h to semaphore.c as requested by
Andrew Morton.  Also reformat to kernel-doc style and add some more
notes about the implementation.

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
2008-04-17 10:43:01 -04:00
Matthew Wilcox
b17170b2fa Simplify semaphore implementation
By removing the negative values of 'count' and relying on the wait_list to
indicate whether we have any waiters, we can simplify the implementation
by removing the protection against an unlikely race condition.  Thanks to
David Howells for his suggestions.

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
2008-04-17 10:42:54 -04:00
Matthew Wilcox
f1241c87a1 Add down_timeout and change ACPI to use it
ACPI currently emulates a timeout for semaphores with calls to
down_trylock and sleep.  This produces horrible behaviour in terms of
fairness and excessive wakeups.  Now that we have a unified semaphore
implementation, adding a real down_trylock is almost trivial.

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
2008-04-17 10:42:46 -04:00
Matthew Wilcox
f06d968658 Introduce down_killable()
down_killable() is the functional counterpart of mutex_lock_killable.

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
2008-04-17 10:42:40 -04:00
Matthew Wilcox
64ac24e738 Generic semaphore implementation
Semaphores are no longer performance-critical, so a generic C
implementation is better for maintainability, debuggability and
extensibility.  Thanks to Peter Zijlstra for fixing the lockdep
warning.  Thanks to Harvey Harrison for pointing out that the
unlikely() was unnecessary.

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
2008-04-17 10:42:34 -04:00
Matthew Wilcox
8b91de2e58 Fix quota.h includes
quota.h currently relies on asm/semaphore.h (through some chain; it
doesn't actually include semaphore.h itself) to include wait.h.  As
well as being bad practice to rely on an implicit include, subsequent
patches will break this.  While I'm in this file, add atomic.h and
list.h, and sort the list of includes.

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
2008-04-17 10:42:14 -04:00
Christoph Hellwig
15aebd2866 udf: move headers out include/linux/
There's really no reason to keep udf headers in include/linux as they're
not used by anything but fs/udf/.

This patch merges most of include/linux/udf_fs_i.h into fs/udf/udf_i.h,
include/linux/udf_fs_sb.h into fs/udf/udf_sb.h and
include/linux/udf_fs.h into fs/udf/udfdecl.h.

The only thing remaining in include/linux/ is a stub of udf_fs_i.h
defining the four user-visible udf ioctls.  It's also moved from
unifdef-y to headers-y because it can be included unconditionally now.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>
2008-04-17 14:22:23 +02:00
Oleg Nesterov
3f3eafc921 locking: remove unused double_spin_lock()
double_spin_lock() has no callers, and it can't be used without additional
lockdep annotations, remove it.

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-04-17 12:22:31 +02:00
Oleg Nesterov
8e60e05fdc hrtimers: simplify lockdep handling
In order to avoid the false positive from lockdep, each per-cpu base->lock has
the separate lock class and migrate_hrtimers() uses double_spin_lock().

This is overcomplicated: except for migrate_hrtimers() we never take 2 locks
at once, and migrate_hrtimers() can use spin_lock_nested().

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-04-17 12:22:31 +02:00
Thomas Gleixner
a332d86d3c hrtimer: add nanosleep specific restart_block member
The back and forth typecasting of restart_block->args is horrible. We
added a separate union member for futex already. Do the same for
nanosleep.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-04-17 12:22:30 +02:00
Russell King
d7b906897e [S390] genirq/clockevents: move irq affinity prototypes/inlines to interrupt.h
> Generic code is not supposed to include irq.h. Replace this include
> by linux/hardirq.h instead and add/replace an include of linux/irq.h
> in asm header files where necessary.
> This change should only matter for architectures that make use of
> GENERIC_CLOCKEVENTS.
> Architectures in question are mips, x86, arm, sh, powerpc, uml and sparc64.
>
> I did some cross compile tests for mips, x86_64, arm, powerpc and sparc64.
> This patch fixes also build breakages caused by the include replacement in
> tick-common.h.

I generally dislike adding optional linux/* includes in asm/* includes -
I'm nervous about this causing include loops.

However, there's a separate point to be discussed here.

That is, what interfaces are expected of every architecture in the kernel.
If generic code wants to be able to set the affinity of interrupts, then
that needs to become part of the interfaces listed in linux/interrupt.h
rather than linux/irq.h.

So what I suggest is this approach instead (against Linus' tree of a
couple of days ago) - we move irq_set_affinity() and irq_can_set_affinity()
to linux/interrupt.h, change the linux/irq.h includes to linux/interrupt.h
and include asm/irq_regs.h where needed (asm/irq_regs.h is supposed to be
rarely used include since not much touches the stacked parent context
registers.)

Build tested on ARM PXA family kernels and ARM's Realview platform
kernels which both use genirq.

[ tglx@linutronix.de: add GENERIC_HARDIRQ dependencies ]

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2008-04-17 07:47:05 +02:00
Vladimir Sokolovsky
bbf8eed1a0 IB/mlx4: Add support for resizing CQs
Signed-off-by: Vladimir Sokolovsky <vlad@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-04-16 21:09:33 -07:00
Eli Cohen
3fdcb97f0b IB/mlx4: Add support for modifying CQ moderation parameters
Signed-off-by: Eli Cohen <eli@mellnaox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-04-16 21:09:33 -07:00
Eli Cohen
b832be1e40 IB/mlx4: Add IPoIB LSO support
Add TSO support to the mlx4_ib driver.

Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-04-16 21:09:27 -07:00
Eli Cohen
8ff095ec4b IB/mlx4: Add IPoIB checksum offload support
ConnectX devices support checksum generation and verification of TCP
and UDP packets for UD IPoIB messages.  This patch checks if the HCA
supports this and sets the IB_DEVICE_UD_IP_CSUM capability flag if it
does.  It implements support for handling the IB_SEND_IP_CSUM send
flag and setting the csum_ok field in receive work completions.

Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Ali Ayub <ali@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-04-16 21:01:10 -07:00
Roland Dreier
37608eea86 mlx4_core: Fix confusion between mlx4_event and mlx4_dev_event enums
The struct mlx4_interface.event() method was supposed to get an enum
mlx4_dev_event, but the driver code was actually passing in the
hardware enum mlx4_event values.  Fix up the callers of
mlx4_dispatch_event() so that they pass in the right type of value,
and fix up the event method in mlx4_ib so that it can handle the enum
mlx4_dev_event values.

This eliminates the need for the subtype parameter to the event
method, so remove it.

This also fixes the sparse warning

    drivers/net/mlx4/intf.c:127:48: warning: mixing different enum types
    drivers/net/mlx4/intf.c:127:48:     int enum mlx4_event  versus
    drivers/net/mlx4/intf.c:127:48:     int enum mlx4_dev_event

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-04-16 21:01:08 -07:00
Andy Fleming
c5e38a949b phy: Clean up header style
Multi-line comments weren't all CodingStyle compliant

Signed-off-by: Andy Fleming <afleming@freescale.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-04-16 20:09:35 -04:00
Andy Fleming
9d9326d3bc phy: Change mii_bus id field to a string
Having the id field be an int was making more complex bus topologies
excessively difficult.  For now, just convert it to a string, and
change all instances of "bus->id = val" to
snprintf(id, MII_BUS_ID_LEN, "%x", val).

Signed-off-by: Andy Fleming <afleming@freescale.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-04-16 20:09:35 -04:00
Jochen Friedrich
612212a3f2 [POWERPC] i2c: OF helpers for the i2c API
This implements various helpers to support OF bindings for the i2c
API.

Signed-off-by: Jochen Friedrich <jochen@scram.de>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-04-17 07:46:11 +10:00
Anton Vorontsov
863fbf4966 [POWERPC] OF helpers for the GPIO API
This implements various helpers to support OF bindings for the GPIO
LIB API.

Previously this was PowerPC specific, but it seems this code isn't
arch-dependent anyhow, so let's place it into of/.

SPARC will not see this addition yet, real hardware seem to not use
GPIOs at all. But this might change:

   http://www.leox.org/docs/faq_MLleon.html

"16-bit I/O port" sounds promising. :-)

Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Acked-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-04-17 07:46:11 +10:00
Linus Torvalds
6af74b03e0 Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block
* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
  block: update git url for blktrace
  io context: increment task attachment count in ioc_task_link()
2008-04-16 07:45:45 -07:00
Linus Torvalds
b4b8f57965 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
  [TCP]: Add return value indication to tcp_prune_ofo_queue().
  PS3: gelic: fix the oops on the broken IE returned from the hypervisor
  b43legacy: fix DMA mapping leakage
  mac80211: remove message on receiving unexpected unencrypted frames
  Update rt2x00 MAINTAINERS entry
  Add rfkill to MAINTAINERS file
  rfkill: Fix device type check when toggling states
  b43legacy: Fix usage of struct device used for DMAing
  ssb: Fix usage of struct device used for DMAing
  MAINTAINERS: move to generic repository for iwlwifi
  b43legacy: fix initvals loading on bcm4303
  rtl8187: Add missing priv->vif assignments
  netconsole: only set CON_PRINTBUFFER if the user specifies a netconsole
  [CAN]: Update documentation of struct sockaddr_can
  MAINTAINERS: isdn4linux@listserv.isdn4linux.de is subscribers-only
  [TCP]: Fix never pruned tcp out-of-order queue.
  [NET_SCHED] sch_api: fix qdisc_tree_decrease_qlen() loop
2008-04-16 07:44:27 -07:00
Denis V. Lunev
f3005d7f4a [NETNS]: Add netns refcnt debug for network devices.
dev_set_net is called for
- just allocated devices
- devices moving from one namespace to another
release_net has proper check inside to distinguish these cases.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-16 02:02:18 -07:00
Pavel Emelyanov
a9fde26078 [VLAN]: Tag vlan_group_device with net device, not ifindex.
Currently vlan group is searched using one key - the ifindex.
We'll have to lookup the vlan_group by two keys - ifindex and
net. Turning the vlan_group lookup key to struct net_device
pointer will make this process easier.

Besides, this will eliminate one more place in the networking,
that assumes that indexes are unique in the kernel.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-16 00:48:04 -07:00
Krzysztof Helt
5f1a3f2ac4 acpi thermal trip points increased to 12
The THERMAL_MAX_TRIPS value is set to 10.  It is too few for the Compaq AP550
machine which has 12 trip points.

Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Cc: Len Brown <lenb@kernel.org>
Cc: Zhang Rui <rui.zhang@intel.com>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-15 19:35:41 -07:00
Jan Kara
335e92e8a5 vfs: fix possible deadlock in ext2, ext3, ext4 when using xattrs
mb_cache_entry_alloc() was allocating cache entries with GFP_KERNEL.  But
filesystems are calling this function while holding xattr_sem so possible
recursion into the fs violates locking ordering of xattr_sem and transaction
start / i_mutex for ext2-4.  Change mb_cache_entry_alloc() so that filesystems
can specify desired gfp mask and use GFP_NOFS from all of them.

Signed-off-by: Jan Kara <jack@suse.cz>
Reported-by: Dave Jones <davej@redhat.com>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-15 19:35:41 -07:00
Michael Buesch
4ac58469f1 ssb: Fix usage of struct device used for DMAing
This fixes DMA on architectures where DMA is nontrivial, like PPC64.
We must use the host-device's (PCI) struct device for any DMA
operation instead of the SSB device. For this we add a new
struct device pointer to the SSB device structure that will always
point to the right device for DMAing.

Without this patch b43 and b44 drivers won't work on complex-DMA
architectures, that for example need dev->archdata for DMA operations.

Signed-off-by: Michael Buesch <mb@bu3sch.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-04-15 15:04:35 -04:00
David S. Miller
c50f68c8ae [LMB] Add lmb_alloc_nid()
A variant of lmb_alloc() that tries to allocate memory on a specified
NUMA node 'nid' but falls back to normal lmb_alloc() if that fails.

The caller provides a 'nid_range' function pointer which assists the
allocator.  It is given args 'start', 'end', and pointer to integer
'this_nid'.

It places at 'this_nid' the NUMA node id that corresponds to 'start',
and returns the end address within 'start' to 'end' at which memory
assosciated with 'nid' ends.

This callback allows a platform to use lmb_alloc_nid() in just
about any context, even ones in which early_pfn_to_nid() might
not be working yet.

This function will be used by the NUMA setup code on sparc64, and also
it can be used by powerpc, replacing it's hand crafted
"careful_allocation()" function in arch/powerpc/mm/numa.c

If x86 ever converts it's NUMA support over to using the LMB helpers,
it can use this too as it has something entirely similar.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-04-15 21:22:17 +10:00
Adrian Bunk
31efdf0530 [ISDN] include/linux/isdn.h: remove dead code
This patch remove the usage of a nonexisting kconfig variable.

Reported-by: Robert P. J. Day <rpjday@crashcourse.ca>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-15 00:30:16 -07:00
Adrian Bunk
99971e70fd [WANPIPE]: Forgotten bits of Sangoma drivers removal.
Robert P. J. Day spotted that my removal of the Sangoma drivers missed
a few bits.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-15 00:27:58 -07:00
Jens Axboe
d237e5c7ce io context: increment task attachment count in ioc_task_link()
Thanks to Nikanth Karthikesan <knikanth@suse.de> for reporting this.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-04-15 09:25:33 +02:00
David Brownell
79966fd9b4 ARM: OMAP: I2C: tps65010 driver converts to gpiolib
Make the tps65010 driver use gpiolib to expose its GPIOs.

Note: This patch will get merged via omap tree instead of I2C
as it will cause some board updates. This has been discussed
at on the I2C list:

http://lists.lm-sensors.org/pipermail/i2c/2008-March/003031.html

Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Cc: i2c@lm-sensors.org
Signed-off-by: Tony Lindgren <tony@atomide.com>
2008-04-14 09:57:06 -07:00
Christoph Lameter
0f389ec630 slub: No need for per node slab counters if !SLUB_DEBUG
The per node counters are used mainly for showing data through the sysfs API.
If that API is not compiled in then there is no point in keeping track of this
data. Disable counters for the number of slabs and the number of total slabs
if !SLUB_DEBUG. Incrementing the per node counters is also accessing a
potentially contended cacheline so this could actually be a performance
benefit to embedded systems.

SLABINFO support is also affected. It now must depends on SLUB_DEBUG (which
is on by default).

Patch also avoids a check for a NULL kmem_cache_node pointer in new_slab()
if the system is not compiled with NUMA support.

[penberg@cs.helsinki.fi: fix oops and move ->nr_slabs into CONFIG_SLUB_DEBUG]
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
2008-04-14 18:53:02 +03:00
Linus Torvalds
533bb8a4d7 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (31 commits)
  [BRIDGE]: Fix crash in __ip_route_output_key with bridge netfilter
  [NETFILTER]: ipt_CLUSTERIP: fix race between clusterip_config_find_get and _entry_put
  [IPV6] ADDRCONF: Don't generate temporary address for ip6-ip6 interface.
  [IPV6] ADDRCONF: Ensure disabling multicast RS even if privacy extensions are disabled.
  [IPV6]: Use appropriate sock tclass setting for routing lookup.
  [IPV6]: IPv6 extension header structures need to be packed.
  [IPV6]: Fix ipv6 address fetching in raw6_icmp_error().
  [NET]: Return more appropriate error from eth_validate_addr().
  [ISDN]: Do not validate ISDN net device address prior to interface-up
  [NET]: Fix kernel-doc for skb_segment
  [SOCK] sk_stamp: should be initialized to ktime_set(-1L, 0)
  net: check for underlength tap writes
  net: make struct tun_struct private to tun.c
  [SCTP]: IPv4 vs IPv6 addresses mess in sctp_inet[6]addr_event.
  [SCTP]: Fix compiler warning about const qualifiers
  [SCTP]: Fix protocol violation when receiving an error lenght INIT-ACK
  [SCTP]: Add check for hmac_algo parameter in sctp_verify_param()
  [NET_SCHED] cls_u32: refcounting fix for u32_delete()
  [DCCP]: Fix skb->cb conflicts with IP
  [AX25]: Potential ax25_uid_assoc-s leaks on module unload.
  ...
2008-04-14 07:56:24 -07:00
Paul Mackerras
ac7c5353b1 Merge branch 'linux-2.6' 2008-04-14 21:11:02 +10:00
David S. Miller
334f8b2afd Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-2.6.26 2008-04-14 03:50:43 -07:00
David S. Miller
df39e8ba56 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:

	drivers/net/ehea/ehea_main.c
	drivers/net/wireless/iwlwifi/Kconfig
	drivers/net/wireless/rt2x00/rt61pci.c
	net/ipv4/inet_timewait_sock.c
	net/ipv6/raw.c
	net/mac80211/ieee80211_sta.c
2008-04-14 02:30:23 -07:00
Peter Warasin
e7bfd0a1a6 [NETFILTER]: bridge: add ebt_nflog watcher
This patch adds the ebtables nflog watcher to the kernel in order to
allow ebtables log through the nfnetlink_log backend.

Signed-off-by: Peter Warasin <peter@endian.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-04-14 11:15:54 +02:00
Patrick McHardy
dd13b01036 [NETFILTER]: nf_nat: kill helper and seq_adjust hooks
Connection tracking helpers (specifically FTP) need to be called
before NAT sequence numbers adjustments are performed to be able
to compare them against previously seen ones. We've introduced
two new hooks around 2.6.11 to maintain this ordering when NAT
modules were changed to get called from conntrack helpers directly.

The cost of netfilter hooks is quite high and sequence number
adjustments are only rarely needed however. Add a RCU-protected
sequence number adjustment function pointer and call it from
IPv4 conntrack after calling the helper.

Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-04-14 11:15:52 +02:00
Patrick McHardy
2bc780499a [NETFILTER]: nf_conntrack: add DCCP protocol support
Add DCCP conntrack helper. Thanks to Gerrit Renker <gerrit@erg.abdn.ac.uk>
for review and testing.

Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-04-14 11:15:49 +02:00
Patrick McHardy
d63a650736 [NETFILTER]: Add partial checksum validation helper
Move the UDP-Lite conntrack checksum validation to a generic helper
similar to nf_checksum() and make it fall back to nf_checksum()
in case the full packet is to be checksummed and hardware checksums
are available. This is to be used by DCCP conntrack, which also
needs to verify partial checksums.

Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-04-14 11:15:49 +02:00
Jan Engelhardt
3bb0362d2f [NETFILTER]: remove arpt_(un)register_target indirection macros
Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-04-14 11:15:44 +02:00
Jan Engelhardt
95eea855af [NETFILTER]: remove arpt_target indirection macro
Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-04-14 11:15:43 +02:00
Jan Engelhardt
4abff0775d [NETFILTER]: remove arpt_table indirection macro
Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-04-14 11:15:43 +02:00
Jan Engelhardt
5452e425ad [NETFILTER]: annotate {arp,ip,ip6,x}tables with const
Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-04-14 11:15:35 +02:00
Jan Engelhardt
b9f61b1603 [NETFILTER]: xt_sctp: simplify xt_sctp.h
The use of xt_sctp.h flagged up -Wshadow warnings in userspace, which
prompted me to look at it and clean it up. Basic operations have been
directly replaced by library calls (memcpy, memset is both available
in the kernel and userspace, and usually faster than a self-made
loop). The is_set and is_clear functions now use a processing time
shortcut, too.

Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-04-14 09:56:04 +02:00
Alexey Dobriyan
666953df35 [NETFILTER]: ip_tables: per-netns FILTER/MANGLE/RAW tables for real
Commit 9335f047fe aka
"[NETFILTER]: ip_tables: per-netns FILTER, MANGLE, RAW"
added per-netns _view_ of iptables rules. They were shown to user, but
ignored by filtering code. Now that it's possible to at least ping loopback,
per-netns tables can affect filtering decisions.

netns is taken in case of
	PRE_ROUTING, LOCAL_IN -- from in device,
	POST_ROUTING, LOCAL_OUT -- from out device,
	FORWARD -- from in device which should be equal to out device's netns.
		   This code is relatively new, so BUG_ON was plugged.

Wrappers were added to a) keep code the same from CONFIG_NET_NS=n users
(overwhelming majority), b) consolidate code in one place -- similar
changes will be done in ipv6 and arp netfilter code.

Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-04-14 09:56:02 +02:00
Gerrit Renker
f5572855ec [SKB]: __skb_queue_tail = __skb_insert before
This expresses __skb_queue_tail() in terms of __skb_insert(),
using __skb_insert_before() as auxiliary function.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-14 00:05:28 -07:00
Gerrit Renker
7de6c03336 [SKB]: __skb_append = __skb_queue_after
This expresses __skb_append in terms of __skb_queue_after, exploiting that

  __skb_append(old, new, list) = __skb_queue_after(list, old, new).

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-14 00:05:09 -07:00
Gerrit Renker
bf29927588 [SKB]: __skb_queue_after(prev) = __skb_insert(prev, prev->next)
By reordering, __skb_queue_after() is expressed in terms of __skb_insert().

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-14 00:04:51 -07:00
Gerrit Renker
f525c06d12 [SKB]: __skb_dequeue = skb_peek + __skb_unlink
By rearranging the order of declarations, __skb_dequeue() is expressed in terms of

 * skb_peek() and
 * __skb_unlink(),

thus in effect mirroring the analogue implementation of __skb_dequeue_tail().

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-14 00:04:12 -07:00
YOSHIFUJI Hideaki
7cd636fe9c [IPV6]: IPv6 extension header structures need to be packed.
struct ipv6_opt_hdr is the common structure for IPv6 extension
headers, and it is common to increment the pointer to get
the real content.  On the other hand, since the structure
consists only of 1-byte next-header field and 1-byte length
field, size of that structure depends on architecture; 2 or 4.
Add "packed" attribute to get 2.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-13 23:33:52 -07:00
YOSHIFUJI Hideaki
cee8947338 [IPV6] MROUTE: Do not call ipv6_find_idev() directly.
Since NETDEV_REGISTER notifier chain is responsible for creating
inet6_dev{}, we do not need to call ipv6_find_idev() directly here.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-13 23:21:16 -07:00
David S. Miller
6fb9114e4b Merge branch 'net-2.6.26-misc-20080412b' of git://git.linux-ipv6.org/gitroot/yoshfuji/linux-2.6-dev 2008-04-12 19:19:46 -07:00
Paul Moore
03e1ad7b5d LSM: Make the Labeled IPsec hooks more stack friendly
The xfrm_get_policy() and xfrm_add_pol_expire() put some rather large structs
on the stack to work around the LSM API.  This patch attempts to fix that
problem by changing the LSM API to require only the relevant "security"
pointers instead of the entire SPD entry; we do this for all of the
security_xfrm_policy*() functions to keep things consistent.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-12 19:07:52 -07:00
Rusty Russell
14daa02139 net: make struct tun_struct private to tun.c
There's no reason for this to be in the header, and it just hurts
recompile time.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Max Krasnyanskiy <maxk@qualcomm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-12 18:48:58 -07:00
YOSHIFUJI Hideaki
f3ee4010e8 [IPV6]: Define constants for link-local multicast addresses.
- Define link-local all-node / all-router multicast addresses.
- Remove ipv6_addr_all_nodes() and ipv6_addr_all_routers().

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-04-12 13:43:19 +09:00
Linus Torvalds
14897e35fd Merge branch 'docs' of git://git.lwn.net/linux-2.6
* 'docs' of git://git.lwn.net/linux-2.6:
  Add additional examples in Documentation/spinlocks.txt
  Move sched-rt-group.txt to scheduler/
  Documentation: move rpc-cache.txt to filesystems/
  Documentation: move nfsroot.txt to filesystems/
  Spell out behavior of atomic_dec_and_lock() in kerneldoc
  Fix a typo in highres.txt
  Fixes to the seq_file document
  Fill out information on patch tags in SubmittingPatches
  Add the seq_file documentation
2008-04-11 13:24:16 -07:00
J. Bruce Fields
dc07e721a2 Spell out behavior of atomic_dec_and_lock() in kerneldoc
A little more detail here wouldn't hurt.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2008-04-11 13:17:46 -06:00
Heiko Carstens
b0fac02370 Fix "$(AS) -traditional" compile breakage caused by asmlinkage_protect
git commit 54a0151041 ("asmlinkage_protect
replaces prevent_tail_call") causes this build failure on s390:

    AS      arch/s390/kernel/entry64.o
  In file included from arch/s390/kernel/entry64.S:14:
  include/linux/linkage.h:34: error: syntax error in macro parameter list
  make[1]: *** [arch/s390/kernel/entry64.o] Error 1
  make: *** [arch/s390/kernel] Error 2

and some other architectures.  The reason is that some architectures add
the "-traditional" flag to the invocation of $(AS), which disables
variadic macro argument support.

So just surround the new define with an #ifndef __ASSEMBLY__ to prevent
any side effects on asm code.

Cc: Roland McGrath <roland@redhat.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-11 08:29:13 -07:00
Bjorn Helgaas
544451a1a3 pnp: increase number of devices supported per protocol
Increase the PNP "number of devices" limit.  We currently use an unsigned
char, which limits us to 256 devices per protocol.  This patch changes that to
an unsigned int.

Not all backends can take advantage of this: we limit ISAPNP to 10 devices in
isapnp_cfg_begin(), and PNPBIOS is limited to 256 devices because the BIOS
interfaces use a one-byte device node number.

But there is no limit on the number of PNPACPI devices we may have.  Large HP
Integrity machines have more than 256, which causes the current "unsigned char
number" to wrap around.  This causes errors like this:

    pnp: PnP ACPI init
    kobject_add failed for 00:00 with -EEXIST, don't try to register things with the same name in the same directory.

    Call Trace:
     [<a000000100010720>] show_stack+0x40/0xa0
     [<a0000001000107b0>] dump_stack+0x30/0x60
     [<a0000001001dbdf0>] kobject_add+0x290/0x2c0
     [<a0000001002bfd40>] device_add+0x160/0x860
     [<a0000001002c0470>] device_register+0x30/0x60
     [<a00000010026ba70>] __pnp_add_device+0x130/0x180
     [<a00000010026bb70>] pnp_add_device+0xb0/0xe0
     [<a0000001007f2730>] pnpacpi_add_device+0x510/0x5a0
     [<a0000001007f2810>] pnpacpi_add_device_handler+0x50/0x80

This patch increases the limit to fix this PNPACPI problem.  It should not
have any adverse effect on ISAPNP or PNPBIOS because their limits are still
enforced in the backends.

Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-11 08:06:44 -07:00
Linus Torvalds
d10d89ec78 Add commentary about the new "asmlinkage_protect()" macro
It's really a pretty ugly thing to need, and some day it will hopefully
be obviated by teaching gcc about the magic calling conventions for the
low-level system call code, but in the meantime we can at least add big
honking comments about why we need these insane and strange macros.

I took my comments from my version of the macro, but I ended up deciding
to just pick Roland's version of the actual code instead (with his
prettier syntax that uses vararg macros).  Thus the previous two commits
that actually implement it.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-10 17:35:23 -07:00
Roland McGrath
54a0151041 asmlinkage_protect replaces prevent_tail_call
The prevent_tail_call() macro works around the problem of the compiler
clobbering argument words on the stack, which for asmlinkage functions
is the caller's (user's) struct pt_regs.  The tail/sibling-call
optimization is not the only way that the compiler can decide to use
stack argument words as scratch space, which we have to prevent.
Other optimizations can do it too.

Until we have new compiler support to make "asmlinkage" binding on the
compiler's own use of the stack argument frame, we have work around all
the manifestations of this issue that crop up.

More cases seem to be prevented by also keeping the incoming argument
variables live at the end of the function.  This makes their original
stack slots attractive places to leave those variables, so the compiler
tends not clobber them for something else.  It's still no guarantee, but
it handles some observed cases that prevent_tail_call() did not.

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-10 17:28:26 -07:00
Patrick McHardy
4738c1db15 [SKFILTER]: Add SKF_ADF_NLATTR instruction
SKF_ADF_NLATTR searches for a netlink attribute, which avoids manually
parsing and walking attributes. It takes the offset at which to start
searching in the 'A' register and the attribute type in the 'X' register
and returns the offset in the 'A' register. When the attribute is not
found it returns zero.

A top-level attribute can be located using a filter like this
(example for nfnetlink, using struct nfgenmsg):

	...
	{
		/* A = offset of first attribute */
		.code	= BPF_LD | BPF_IMM,
		.k	= sizeof(struct nlmsghdr) + sizeof(struct nfgenmsg)
	},
	{
		/* X = CTA_PROTOINFO */
		.code	= BPF_LDX | BPF_IMM,
		.k	= CTA_PROTOINFO,
	},
	{
		/* A = netlink attribute offset */
		.code	= BPF_LD | BPF_B | BPF_ABS,
		.k	= SKF_AD_OFF + SKF_AD_NLATTR
	},
	{
		/* Exit if not found */
		.code   = BPF_JMP | BPF_JEQ | BPF_K,
		.k	= 0,
		.jt	= <error>
	},
	...

A nested attribute below the CTA_PROTOINFO attribute would then
be parsed like this:

	...
	{
		/* A += sizeof(struct nlattr) */
		.code	= BPF_ALU | BPF_ADD | BPF_K,
		.k	= sizeof(struct nlattr),
	},
	{
		/* X = CTA_PROTOINFO_TCP */
		.code	= BPF_LDX | BPF_IMM,
		.k	= CTA_PROTOINFO_TCP,
	},
	{
		/* A = netlink attribute offset */
		.code	= BPF_LD | BPF_B | BPF_ABS,
		.k	= SKF_AD_OFF + SKF_AD_NLATTR
	},
	...

The data of an attribute can be loaded into 'A' like this:

	...
	{
		/* X = A (attribute offset) */
		.code	= BPF_MISC | BPF_TAX,
	},
	{
		/* A = skb->data[X + k] */
		.code 	= BPF_LD | BPF_B | BPF_IND,
		.k	= sizeof(struct nlattr),
	},
	...

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-10 02:02:28 -07:00
Stephen Hemminger
43db6d65e0 socket: sk_filter deinline
The sk_filter function is too big to be inlined. This saves 2296 bytes
of text on allyesconfig.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-10 01:43:09 -07:00
Stephen Hemminger
b715631fad socket: sk_filter minor cleanups
Some minor style cleanups:
  * Move __KERNEL__ definitions to one place in filter.h
  * Use const for sk_filter_len
  * Line wrapping
  * Put EXPORT_SYMBOL next to function definition

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-10 01:33:47 -07:00
Michael Buesch
d625a29ba6 ssb: Add support for block-I/O
This adds support for block based I/O to SSB.
This is needed in order to efficiently support PIO data
transfers to the card.
The block-I/O support is only compiled, if it's selected by the
weird driver that needs it. So there's no overhead for sane devices.

Signed-off-by: Michael Buesch <mb@bu3sch.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-04-08 16:44:40 -04:00
Michael Buesch
8fe2b65a18 ssb: Turn suspend/resume upside down
Turn the SSB bus suspend mechanism upside down.
Instead of deciding by an internal reference count when to suspend/resume,
let the parent bus call us in their suspend/resume routine.

Signed-off-by: Michael Buesch <mb@bu3sch.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-04-08 15:05:57 -04:00
David S. Miller
8eefca4888 Merge branch 'net-2.6.26-isatap-20080403' of git://git.linux-ipv6.org/gitroot/yoshfuji/linux-2.6-dev 2008-04-08 02:33:36 -07:00
Rusty Russell
2557a933b7 virtio: remove overzealous BUG_ON.
The 'disable_cb' callback is designed as an optimization to tell the host
we don't need callbacks now.  As it is not reliable, the debug check is
overzealous: it can happen on two CPUs at the same time.  Document this.

Even if it were reliable, the virtio_net driver doesn't disable
callbacks on transmit so the START_USE/END_USE debugging reentrance
protection can be easily tripped even on UP.

Thanks to Balaji Rao for the bug report and testing.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
CC: Balaji Rao <balajirrao@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-07 13:14:22 -07:00
James Bottomley
2f3edc6936 [SCSI] transport_class: BUG if we can't release the attribute container
Every current transport class calls transport_container_release but
ignores the return value.  This is catastrophic if it returns an error
because the containers are part of a global list and the next action of
almost every transport class is to free the memory used by the
container.

Fix this by making transport_container_release a void, but making it BUG
if attribute_container_release returns an error ... this catches the
root cause of a system panic much earlier.  If we don't do this, we get
an eventual BUG when the attribute container list notices the corruption
caused by the freed memory it's still referencing.

Also made attribute_container_release __must_check as a reminder.

Cc: Greg KH <greg@kroah.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-04-07 12:19:10 -05:00
FUJITA Tomonori
b1adaf65ba [SCSI] block: add sg buffer copy helper functions
This patch adds new three helper functions to copy data between an SG
list and a linear buffer.

- sg_copy_from_buffer copies data from linear buffer to an SG list

- sg_copy_to_buffer copies data from an SG list to a linear buffer

When the APIs copy data from a linear buffer to an SG list,
flush_kernel_dcache_page is called. It's not necessary for everyone
but it's a no-op on most architectures and in general the API is not
used in performance critical path.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-04-07 12:15:45 -05:00
Kai Makisara
40f6b36c62 [SCSI] st: add option to use SILI in variable block reads
Add new option MT_ST_SILI to enable setting the SILI bit in reads in variable
block mode. If SILI is set, reading a block shorter than the byte count does
not result in CHECK CONDITION. The length of the block is determined using the
residual count from the HBA. Avoiding the REQUEST SENSE command for every
block speeds up some real applications considerably.

Signed-off-by: Kai Makisara <kai.makisara@kolumbus.fi>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-04-07 12:15:39 -05:00
Josh Boyer
834d97d452 [POWERPC] Add of_device_is_available function
IEEE 1275 defined a standard "status" property to indicate the operational
status of a device.  The property has four possible values: okay, disabled,
fail, fail-xxx.  The absence of this property means the operational status
of the device is unknown or okay.

This adds a function called of_device_is_available that checks the state
of the status property of a device.  If the property is absent or set to
either "okay" or "ok", it returns 1.  Otherwise it returns 0.

Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-04-07 13:49:23 +10:00
Stelian Pop
8d855317fc atmel_usba_udc: move endpoint declarations into platform data.
The atmel_usba_udc driver is being used by several platforms and arches
(avr32 and at91 ATM), and each platform may have different endpoint
settings.

The patch below moves the endpoint declarations into the platform
data and make the necessary adjustments for AVR32 (improved by
Haavard Skinnemoen <hskinnemoen@atmel.com>).

Signed-off-by: Stelian Pop <stelian@popies.net>
Acked-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Haavard Skinnemoen <hskinnemoen@atmel.com>
2008-04-06 17:15:08 -04:00
YOSHIFUJI Hideaki
12802d058a [IPV6]: Comment MRT6_xxx sockopts in include/linux/in6.h.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-04-05 22:33:40 +09:00
YOSHIFUJI Hideaki
14fb64e1f4 [IPV6] MROUTE: Support PIM-SM (SSM).
Based on ancient patch by Mickael Hoerdt
<hoerdt@clarinet.u-strasbg.fr>, which is available at
<http://www-r2.u-strasbg.fr/~hoerdt/dev/linux_ipv6_mforwarding/patch-linux-ipv6-mforwarding-0.1a>.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-04-05 22:33:39 +09:00
YOSHIFUJI Hideaki
7bc570c8b4 [IPV6] MROUTE: Support multicast forwarding.
Based on ancient patch by Mickael Hoerdt
<hoerdt@clarinet.u-strasbg.fr>, which is available at
<http://www-r2.u-strasbg.fr/~hoerdt/dev/linux_ipv6_mforwarding/patch-linux-ipv6-mforwarding-0.1a>.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-04-05 22:33:38 +09:00
Paul Menage
8bab8dded6 cgroups: add cgroup support for enabling controllers at boot time
The effects of cgroup_disable=foo are:

- foo isn't auto-mounted if you mount all cgroups in a single hierarchy
- foo isn't visible as an individually mountable subsystem

As a result there will only ever be one call to foo->create(), at init time;
all processes will stay in this group, and the group will never be mounted on
a visible hierarchy.  Any additional effects (e.g.  not allocating metadata)
are up to the foo subsystem.

This doesn't handle early_init subsystems (their "disabled" bit isn't set be,
but it could easily be extended to do so if any of the early_init systems
wanted it - I think it would just involve some nastier parameter processing
since it would occur before the command-line argument parser had been run.

Hugh said:

  Ballpark figures, I'm trying to get this question out rather than
  processing the exact numbers: CONFIG_CGROUP_MEM_RES_CTLR adds 15% overhead
  to the affected paths, booting with cgroup_disable=memory cuts that back to
  1% overhead (due to slightly bigger struct page).

  I'm no expert on distros, they may have no interest whatever in
  CONFIG_CGROUP_MEM_RES_CTLR=y; and the rest of us can easily build with or
  without it, or apply the cgroup_disable=memory patches.

Unix bench's execl test result on x86_64 was

== just after boot without mounting any cgroup fs.==
mem_cgorup=off : Execl Throughput       43.0     3150.1      732.6
mem_cgroup=on  : Execl Throughput       43.0     2932.6      682.0
==

[lizf@cn.fujitsu.com: fix boot option parsing]
Signed-off-by: Balbir Singh <balbir@linux.vnet.ibm.com>
Cc: Paul Menage <menage@google.com>
Cc: Balbir Singh <balbir@linux.vnet.ibm.com>
Cc: Pavel Emelyanov <xemul@openvz.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: Sudhir Kumar <skumar@linux.vnet.ibm.com>
Cc: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-04 14:46:26 -07:00
Linus Torvalds
3a143125dd Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86:
  x86: revert assign IRQs to hpet timer
  x86: tsc prevent time going backwards
  xen: Clear PG_pinned in release_{pt,pd}()
  xen: Do not pin/unpin PMD pages
  xen: refactor xen_{alloc,release}_{pt,pd}()
  x86, agpgart: scary messages are fortunately obsolete
  xen: fix grant table bug
  x86: fix breakage of vSMP irq operations
  x86: print message if nmi_watchdog=2 cannot be enabled
  x86: fix nmi_watchdog=2 on Pentium-D CPUs
2008-04-04 14:42:58 -07:00
Thomas Gleixner
5761d64b27 x86: revert assign IRQs to hpet timer
The commits:

commit 37a47db8d7
Author: Balaji Rao <balajirrao@gmail.com>
Date:   Wed Jan 30 13:30:03 2008 +0100

    x86: assign IRQs to HPET timers, fix

and

commit e3f37a54f6
Author: Balaji Rao <balajirrao@gmail.com>
Date:   Wed Jan 30 13:30:03 2008 +0100

    x86: assign IRQs to HPET timers

have been identified to cause a regression on some platforms due to
the assignement of legacy IRQs which makes the legacy devices
connected to those IRQs disfunctional.

Revert them.

This fixes http://bugzilla.kernel.org/show_bug.cgi?id=10382

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-04 18:36:49 +02:00
Tejun Heo
e52dcc4899 libata: ATA_12/16 doesn't fall into ATAPI_MISC
SAT passthrus don't really fit into ATAPI_MISC class.  SAT passthru
commands always transfer multiple of 512 bytes and variable length
response is not allowed.  This patch creates a separate category -
ATAPI_PASS_THRU - for these.

This fixes HSM violation on "hdparm -I".

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2008-04-04 02:43:36 -04:00
Tejun Heo
436d34b362 libata: uninline atapi_cmd_type()
Uninline atapi_cmd_type().  It doesn't really have to be inline and
more case will be added which need to access unexported libata
variable.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2008-04-04 02:43:35 -04:00
YOSHIFUJI Hideaki
80a9492a33 [IPV4] MROUTE: Adjust include files for user-space.
<linux/mroute.h> needs <linux/types.h>.
Avoid including <linux/in.h> in user-space, which conflicts with
standard <netinet/in.h>.
Add basic struct and constant in <linux/pim.h>.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-04-04 10:44:42 +09:00
YOSHIFUJI Hideaki
2e8046271f [IPV4] MROUTE: Move PIM definitions to <linux/pim.h>.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-04-04 10:44:42 +09:00
David S. Miller
3bb5da3837 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 2008-04-03 14:33:42 -07:00
Denis V. Lunev
a4aa834a91 [NETNS]: Declare init_net even without CONFIG_NET defined.
This does not look good, but there is no other choice. The compilation
without CONFIG_NET is broken and can not be fixed with ease.

After that there is no need for the following commits:
1567ca7eec
3edf8fa5cc
2d38f9a4f8

Revert them.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-03 13:04:33 -07:00
David S. Miller
e1ec1b8ccd Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:

	drivers/net/s2io.c
2008-04-02 22:35:23 -07:00
YOSHIFUJI Hideaki
de357cc013 [IPV6] NDISC: Don't rely on node-type hint from L2 unless required.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-04-03 10:06:01 +09:00
YOSHIFUJI Hideaki
300aaeeaab [IPV6] SIT: Add SIOCGETPRL ioctl to get/dump PRL.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-04-03 10:06:00 +09:00
Templin, Fred L
fadf6bf060 [IPV6] SIT: Add PRL management for ISATAP.
This patch updates the Linux the Intra-Site Automatic Tunnel Addressing
Protocol (ISATAP) implementation. It places the ISATAP potential router
list (PRL) in the kernel and adds three new private ioctls for PRL
management.

[Add several changes of structure name, constant names etc. - yoshfuji]

Signed-off-by: Fred L. Templin <fred.l.templin@boeing.com>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-04-03 10:05:58 +09:00
Christian Borntraeger
dd135ebbd2 kvm: provide kvm.h for all architecture: fixes headers_install
Currently include/linux/kvm.h is not considered by make headers_install,
because Kbuild cannot handle " unifdef-$(CONFIG_FOO) += foo.h.  This problem
was introduced by

commit fb56dbb31c
Author: Avi Kivity <avi@qumranet.com>
Date:   Sun Dec 2 10:50:06 2007 +0200

    KVM: Export include/linux/kvm.h only if $ARCH actually supports KVM

    Currently, make headers_check barfs due to <asm/kvm.h>, which <linux/kvm.h>
    includes, not existing.  Rather than add a zillion <asm/kvm.h>s, export kvm.
    only if the arch actually supports it.

    Signed-off-by: Avi Kivity <avi@qumranet.com>

which makes this an 2.6.25 regression.

One way of solving the issue is to enhance Kbuild, but Avi and David conviced
me, that changing headers_install is not the way to go.  This patch changes
the definition for linux/kvm.h to unifdef-y.

If  unifdef-y is used for linux/kvm.h "make headers_check" will fail on all
architectures without asm/kvm.h.  Therefore, this patch also provides
asm/kvm.h on all architectures.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Avi Kivity <avi@qumranet.com>
Cc: Sam Ravnborg <sam@ravnborg.org
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-02 15:28:18 -07:00
Linus Torvalds
2f819ae881 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (45 commits)
  [VLAN]: Proc entry is not renamed when vlan device name changes.
  [IPV6]: Fix ICMP relookup error path dst leak
  [ATM] drivers/atm/iphase.c: compilation warning fix
  IPv6: do not create temporary adresses with too short preferred lifetime
  IPv6: only update the lifetime of the relevant temporary address
  bluetooth : __rfcomm_dlc_close lock fix
  bluetooth : use lockdep sub-classes for diffrent bluetooth protocol
  [ROSE/AX25] af_rose: rose_release() fix
  mac80211: correct use_short_preamble handling
  b43: Fix PCMCIA IRQ routing
  b43: Add DMA mapping failure messages
  mac80211: trigger ieee80211_sta_work after opening interface
  [LLC]: skb allocation size for responses
  [IP] UDP: Use SEQ_START_TOKEN.
  [NET]: Remove Documentation/networking/sk98lin.txt
  [ATM] atm/idt77252.c: Make 2 functions static
  [ATM]: Make atm/he.c:read_prom_byte() static
  [IPV6] MCAST: Ensure to check multicast listener(s).
  [LLC]: Kill llc_station_mac_sa symbol export.
  forcedeth: fix locking bug with netconsole
  ...
2008-04-02 07:46:18 -07:00
Fabio Checconi
34e6bbf23c cfq-iosched: fix rcu freeing of cfq io contexts
SLAB_DESTROY_BY_RCU is not a direct substitute for normal call_rcu()
freeing, since it'll page freeing but NOT object freeing. So change
cfq to do the freeing on its own.

Signed-off-by: Fabio Checconi <fabio@gandalf.sssup.it>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-04-02 15:42:20 +02:00
Denis V. Lunev
c0f39322c3 [NETNS]: Do not include net/net_namespace.h from seq_file.h
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-02 00:10:28 -07:00
Dmitry Torokhov
a7097ff89c Input: make sure input interfaces pin parent input devices
Recent driver core change causes references to parent devices being
dropped early, at device_del() time, as opposed to when all children
are freed. This causes oops in evdev with grabbed devices. Take the
reference to the parent input device ourselves to ensure that it
stays around long enough.

Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
2008-04-01 00:22:53 -04:00
Benjamin Marzinski
58e9fee13e [GFS2] Invalidate cache at correct point
GFS2 wasn't invalidating its cache before it called into the lock manager
with a request that could potentially drop a lock.  This was leaving a
window where the lock could be actually be held by another node, but the
file's page cache would still appear valid, causing coherency problems.
This patch moves the cache invalidation to before the lock manager call
when dropping a lock. It also adds the option to the lock_dlm lock
manager to not use conversion mode deadlock avoidance, which, on a
conversion from shared to exclusive, could internally drop the lock, and
then reacquire in. GFS2 now asks lock_dlm to not do this.  Instead, GFS2
manually drops the lock and reacquires it.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-03-31 10:41:44 +01:00
David S. Miller
3edf8fa5cc [NET]: Fix allnoconfig build on powerpc and avr32
As reported by Haavard Skinnemoen and Stephen Rothwell:

> allnoconfig fails with
>
> include/linux/netdevice.h:843: error: implicit declaration of function 'dev_net'
>
> which seems to be because the definition of dev_net is inside #ifdef
> CONFIG_NET, while next_net_device, which calls it, is not.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-31 00:28:14 -07:00
Linus Torvalds
a77df5cd1c Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev:
  libata: ATA_EHI_LPM should be ATA_EH_LPM
  pata_sil680: only enable MMIO on Cell blades
2008-03-30 14:26:27 -07:00
Linus Torvalds
62ad36a8a6 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6:
  ide: fix defining SUPPORT_VLB_SYNC
  Revert "ide: change master/slave IDENTIFY order"
2008-03-30 14:24:32 -07:00
Al Viro
b2ddb9019e dma_page_list ->base_address is a userland pointer
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-30 14:20:23 -07:00
Al Viro
7d61c4596d compat_sys_wait4() prototype misannotation
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-30 14:20:23 -07:00
Bartlomiej Zolnierkiewicz
729d4de96a ide: fix defining SUPPORT_VLB_SYNC
We need to check for CONFIG_{CRIS,FRV} not {CRIS,FRV}.

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-03-29 19:55:17 +01:00
Tejun Heo
3ec25ebd69 libata: ATA_EHI_LPM should be ATA_EH_LPM
EH actions are ATA_EH_* not ATA_EHI_*.  Rename ATA_EHI_LPM to
ATA_EH_LPM.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2008-03-29 12:21:31 -04:00
S.Caglar Onur
9307b570a7 drivers/net/arcnet/arcnet.c: use time_* macros
The functions time_before, time_before_eq, time_after, and time_after_eq are
more robust for comparing jiffies against other values.

So use the time_after() macro, defined in linux/jiffies.h, which deals with
wrapping correctly.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: S.Caglar Onur <caglar@pardus.org.tr>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2008-03-28 22:14:15 -04:00
Pavel Emelyanov
095d911201 [LIB]: Drop the pcounter itself.
The knock-out. The pcounter abstraction is not used any longer in the
kernel.

Not sure whether this should go via netdev tree, but as far as I
remember it was added via this one, and besides Eric thinks that
Andrew shouldn't mind this.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-28 16:39:58 -07:00
Matti Linnanvuori
0ef4730927 net: Comment dev_kfree_skb_irq and dev_kfree_skb_any better
Comment dev_kfree_skb_irq and dev_kfree_skb_any better.

Signed-off-by: Matti Linnanvuori <mattilinnanvuori@yahoo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-28 16:33:00 -07:00
David S. Miller
1567ca7eec [NET]: Protect device namespace inlines with CONFIG_NET
Include sites should not be bothered by whether
CONFIG_NET is set or not when trying to include
benign files like linux/etherdevice.h et al.

From a report by Stephen Rothwell.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-28 15:53:11 -07:00
Linus Torvalds
af8be4e4b3 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
  [PATCH] mnt_expire is protected by namespace_sem, no need for vfsmount_lock
  [PATCH] do shrink_submounts() for all fs types
  [PATCH] sanitize locking in mark_mounts_for_expiry() and shrink_submounts()
  [PATCH] count ghost references to vfsmounts
  [PATCH] reduce stack footprint in namespace.c
2008-03-28 15:23:01 -07:00
Christoph Hellwig
5ac7ec85bc ext3: don't export ext3_fs.h and jbd.h
Neither of the headers actually compiles when included from userpsace nor
should it be made available as userspace tools should be using the libraries
or at least headers from e2fsprogs.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-28 14:45:22 -07:00
Harvey Harrison
3afe392598 kernel: add bit rotation helpers for 16 and 8 bit
Will replace open-coded variants elsewhere.  Done in the same
style as the 32-bit versions.

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Acked-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: John W. Linville <linville@tuxdriver.com>
Cc: Joe Perches <joe@perches.com>
Cc: Jiri Benc <jbenc@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-28 14:45:22 -07:00
Jonathan Corbet
8c703d35fa in_atomic(): document why it is unsuitable for general use
Discourage people from inappropriately using in_atomic()

Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-28 14:45:21 -07:00
David S. Miller
8e8e43843b Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:

	drivers/net/usb/rndis_host.c
	drivers/net/wireless/b43/dma.c
	net/ipv6/ndisc.c
2008-03-27 18:48:56 -07:00
Ilpo Järvinen
419ae74ecc [NET]: uninline skb_trim, de-bloats
Allyesconfig (v2.6.24-mm1):
-10976  209 funcs, 123 +, 11099 -, diff: -10976 --- skb_trim

Without number of debug related CONFIGs (v2.6.25-rc2-mm1):
-7360  192 funcs, 131 +, 7491 -, diff: -7360 --- skb_trim
skb_trim                      |  +42

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-27 17:54:01 -07:00
Ilpo Järvinen
c2aa270ad7 [NET]: uninline skb_push, de-bloats a lot
Allyesconfig (v2.6.24-mm1):

-21593  356 funcs, 2418 +, 24011 -, diff: -21593 --- skb_push

Without many debug related CONFIGs (v2.6.25-rc2-mm1):

-13890  341 funcs, 189 +, 14079 -, diff: -13890 --- skb_push
skb_push                      |  +46

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-27 17:52:40 -07:00
Ilpo Järvinen
f58518e678 [NET]: uninline dev_alloc_skb, de-bloats a lot
Allyesconfig (v2.6.24-mm1):

-23668  392 funcs, 104 +, 23772 -, diff: -23668 --- dev_alloc_skb

Without many debug CONFIGs (v2.6.25-rc2-mm1):

-12178  382 funcs, 157 +, 12335 -, diff: -12178 --- dev_alloc_skb
dev_alloc_skb                 |  +37

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-27 17:51:31 -07:00
Al Viro
c35038beca [PATCH] do shrink_submounts() for all fs types
... and take it out of ->umount_begin() instances.  Call with all locks
already taken (by do_umount()) and leave calling release_mounts() to
caller (it will do release_mounts() anyway, so we can just put into
the same list).

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-03-27 20:47:58 -04:00
Al Viro
7c4b93d826 [PATCH] count ghost references to vfsmounts
make propagate_mount_busy() exclude references from the vfsmounts
that had been isolated by umount_tree() and are just waiting for
release_mounts() to dispose of their ->mnt_parent/->mnt_mountpoint.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-03-27 20:47:46 -04:00
Ilpo Järvinen
6be8ac2fdc [NET]: uninline skb_pull, de-bloats a lot
Allyesconfig (v2.6.24-mm1):

-28162  354 funcs, 3005 +, 31167 -, diff: -28162 --- skb_pull

Without number of debug related CONFIGs (v2.6.25-rc2-mm1):

-9697  338 funcs, 221 +, 9918 -, diff: -9697 --- skb_pull
skb_pull                      |  +44

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-27 17:47:24 -07:00
Ilpo Järvinen
0dde3e1648 [NET]: uninline skb_put, de-bloats a lot
Allyesconfig (v2.6.24-mm1):

~500 files changed
...
 869 funcs, 198 +, 111003 -, diff: -110805 --- skb_put
  skb_put                       | +104

Without number of debug related CONFIGs (v2.6.25-rc2-mm1):

-60744  855 funcs, 861 +, 61605 -, diff: -60744 --- skb_put
  skb_put                       |  +57

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-27 17:43:41 -07:00
David S. Miller
50fd4407b8 [NET]: Use local_irq_{save,restore}() in napi_complete().
Based upon a lockdep report.

Since ->poll() can be invoked from netpoll with interrupts
disabled, we must not unconditionally enable interrupts
in napi_complete().

Instead we must use local_irq_{save,restore}().

Noticed by Peter Zijlstra:

<irqs disabled>

  netpoll_poll()
    poll_napi()
      spin_trylock(&napi->poll_lock)
      poll_one_napi()
        napi->poll() := sky2_poll()
          napi_complete()
            local_irq_disable()
            local_irq_enable() <--- *BUG*

  <irq>
    irq_exit()
      do_softirq()
        net_rx_action()
          spin_lock(&napi->poll_lock) <--- Deadlock!

Because we still hold the lock....

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-27 17:42:50 -07:00
Rusty Russell
a6bd8e1303 lguest: comment documentation update.
Took some cycles to re-read the Lguest Journey end-to-end, fix some
rot and tighten some phrases.

Only comments change.  No new jokes, but a couple of recycled old jokes.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-03-28 11:05:54 +11:00
Denis V. Lunev
0e5f8be138 [NETNS]: Compile NET /proc support only if CONFIG_NET is set.
This fix broken compilation for 'allnoconfig'. This was introduced by
Introduced by commit 1218854afa ("[NET]
NETNS: Omit seq_net_private->net without CONFIG_NET_NS.")

Signed-off-by: Denis V. Lunev <den@openvz.org>
Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-27 14:25:53 -07:00
Lennert Buytenhek
15a32632d9 sata_mv: mbus decode window support
Make it possible to pass mbus_dram_target_info to the sata_mv
driver via the platform data, make the sata_mv driver program
the window registers based on this data if it is passed in, and
make the Orion platform setup code use this method instead of
programming the SATA mbus window registers by hand.

Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
Reviewed-by: Tzachi Perelstein <tzachi@marvell.com>
Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Nicolas Pitre <nico@marvell.com>
2008-03-27 14:51:39 -04:00
Lennert Buytenhek
abc848c182 introduce mbus DRAM target info abstraction
Introduce struct mbus_dram_target_info, which will be used for
passing information about the mbus target ID of the DDR unit, and
mbus target attribute, base address and size for each of the DRAM
chip selects from the platform code to peripheral drivers.

Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
Reviewed-by: Tzachi Perelstein <tzachi@marvell.com>
Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Nicolas Pitre <nico@marvell.com>
2008-03-27 14:51:39 -04:00
Len Brown
86d9fc1293 Merge branches 'release', 'idle', 'redhat-bugzilla-436589', 'sbs' and 'video' into release 2008-03-26 22:50:09 -04:00
Pavel Emelyanov
67727184f2 [VLAN]: Reduce memory consumed by vlan_groups
Currently each vlan_groupd contains 8 pointers on arrays with 512
pointers on struct net_device each  :)  Such a construction "in many
cases ... wastes memory".

My proposal is to allow for some of these arrays pointers be NULL,
meaning that there are no devices in it. When a new device is added
to the vlan_group, the appropriate array is allocated.

The check in vlan_group_get_device's is safe, since the pointer
vg->vlan_devices_arrays[x] can only switch from NULL to not-NULL.
The vlan_group_prealloc_vid() is guarded with rtnl lock and is
also safe.

I've checked (I hope that) all the places, that use these arrays
and found, that the register_vlan_dev is the only place, that can
put a vlan device on an empty vlan_group.

Rough calculations shows, that after the patch a setup with a
single vlan dev (or up to 512 vlans with sequential vids) will
occupy approximately 8 times less memory.

The question I have is - does this patch makes sense, or a totally
new structures are required to store the vlan_devs?

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-03-26 16:27:22 -07:00
Denis V. Lunev
f5aa23fd49 [NETNS]: Compilation warnings under CONFIG_NET_NS.
Recent commits from YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
have been introduced a several compilation warnings
'assignment discards qualifiers from pointer target type'
due to extra const modifier in the inline call parameters of
{dev|sock|twsk}_net_set.

Drop it.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-26 00:48:17 -07:00
Denis V. Lunev
9c2f5746b9 [NETNS]: Compilation fix for include/linux/netdevice.h.
Commit commit c346dca108
([NET] NETNS: Omit net_device->nd_net without CONFIG_NET_NS)
breaks compilation with CONFIG_NET_NS set.

Fix the typo.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-26 00:47:14 -07:00
Thomas Gleixner
06d8308c61 NOHZ: reevaluate idle sleep length after add_timer_on()
add_timer_on() can add a timer on a CPU which is currently in a long
idle sleep, but the timer wheel is not reevaluated by the nohz code on
that CPU. So a timer can be delayed for quite a long time. This
triggered a false positive in the clocksource watchdog code.

To avoid this we need to wake up the idle CPU and enforce the
reevaluation of the timer wheel for the next timer event.

Add a function, which checks a given CPU for idle state, marks the
idle task with NEED_RESCHED and sends a reschedule IPI to notify the
other CPU of the change in the timer wheel.

Call this function from add_timer_on().

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: stable@kernel.org

--
 include/linux/sched.h |    6 ++++++
 kernel/sched.c        |   43 +++++++++++++++++++++++++++++++++++++++++++
 kernel/timer.c        |   10 +++++++++-
 3 files changed, 58 insertions(+), 1 deletion(-)
2008-03-26 08:28:55 +01:00
Yi Yang
8b78cf602f cpuidle: fix cpuidle time and usage overflow
cpuidle C-state sysfs node time and usage are very easy to overflow because
they are all of unsigned int type, time will overflow within about two hours,
usage will take longer time to overflow, but they are increasing for ever.

This patch will convert them to unsigned long long.

Signed-off-by: Yi Yang <yi.y.yang@intel.com>
Acked-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2008-03-26 00:45:26 -04:00
Patrick McHardy
c7f485abd6 [NETFILTER]: nf_conntrack_sip: RTP routing optimization
Optimize call routing between NATed endpoints: when an external
registrar sends a media description that contains an existing RTP
expectation from a different SNATed connection, the gatekeeper
is trying to route the call directly between the two endpoints.

We assume both endpoints can reach each other directly and
"un-NAT" the addresses, which makes the media stream go between
the two endpoints directly.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-25 20:26:43 -07:00
Patrick McHardy
0d0ab0378d [NETFILTER]: nf_conntrack_sip: support multiple media channels
Add support for multiple media channels and use it to create
expectations for video streams when present.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-25 20:26:24 -07:00
Patrick McHardy
4ab9e64e5e [NETFILTER]: nf_nat_sip: split up SDP mangling
The SDP connection addresses may be contained in the payload multiple
times (in the session description and/or once per media description),
currently only the session description is properly updated. Split up
SDP mangling so the function setting up expectations only updates the
media port, update connection addresses from media descriptions while
parsing them and at the end update the session description when the
final addresses are known.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-25 20:26:08 -07:00
Patrick McHardy
a9c1d35917 [NETFILTER]: nf_conntrack_sip: create RTCP expectations
Create expectations for the RTCP connections in addition to RTP connections.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-25 20:25:49 -07:00
Patrick McHardy
0f32a40fc9 [NETFILTER]: nf_conntrack_sip: create signalling expectations
Create expectations for incoming signalling connections when seeing
a REGISTER request. This is needed when the registrar uses a
different source port number for signalling messages and for receiving
incoming calls from other endpoints than the registrar.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-25 20:25:13 -07:00
Patrick McHardy
2bbb21168a [NETFILTER]: nf_conntrack_sip: introduce URI and header parameter parsing helpers
Introduce URI and header parameter parsing helpers. These are needed
by the conntrack helper to parse expiration values in Contact: header
parameters and by the NAT helper to properly update the Via-header
rport=, received= and maddr= parameters.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-25 20:24:24 -07:00
Patrick McHardy
30f33e6dee [NETFILTER]: nf_conntrack_sip: support method specific request/response handling
Add support for per-method request/response handlers and perform SDP
parsing for INVITE/UPDATE requests and for all informational and
successful responses.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-25 20:22:20 -07:00
Patrick McHardy
624f8b7bba [NETFILTER]: nf_nat_sip: get rid of text based header translation
Use the URI parsing helper to get the numerical addresses and get rid of the
text based header translation.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-25 20:19:30 -07:00
Patrick McHardy
05e3ced297 [NETFILTER]: nf_conntrack_sip: introduce SIP-URI parsing helper
Introduce a helper function to parse a SIP-URI in a header value, optionally
iterating through all headers of this kind.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-25 20:19:13 -07:00
Patrick McHardy
ea45f12a27 [NETFILTER]: nf_conntrack_sip: parse SIP headers properly
Introduce new function for SIP header parsing that properly deals with
continuation lines and whitespace in headers and use it.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-25 20:18:57 -07:00
Patrick McHardy
ac3677406d [NETFILTER]: nf_conntrack_sip: kill request URI "header" definitions
The request URI is not a header and needs to be treated differently than
real SIP headers. Add a seperate function for parsing it and get rid of
the POS_REQ_URI/POS_REG_REQ_URI definitions.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-25 20:18:40 -07:00
Patrick McHardy
3e9b4600b4 [NETFILTER]: nf_conntrack_sip: add seperate SDP header parsing function
SDP and SIP headers are quite different, SIP can have continuation lines,
leading and trailing whitespace after the colon and is mostly case-insensitive
while SDP headers always begin on a new line and are followed by an equal
sign and the value, without any whitespace.

Introduce new SDP header parsing function and convert all users that used
the SIP header parsing function. This will allow to properly deal with the
special SIP cases in the SIP header parsing function later.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-25 20:17:55 -07:00
Patrick McHardy
212440a7d0 [NETFILTER]: nf_conntrack_sip: remove redundant function arguments
The conntrack reference and ctinfo can be derived from the packet.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-25 20:17:13 -07:00
Patrick McHardy
2a6cfb22ae [NETFILTER]: nf_conntrack_sip: adjust dptr and datalen after packet mangling
After mangling the packet, the pointer to the data and the length of the data
portion may change and need to be adjusted.

Use double data pointers and a pointer to the length everywhere and add a
helper function to the NAT helper for performing the adjustments.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-25 20:16:54 -07:00
Patrick McHardy
b8beedd25d [NETFILTER]: Add nf_inet_addr_cmp()
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-25 20:09:33 -07:00
Paul Mackerras
54f53f2b94 Merge branch 'linux-2.6' 2008-03-26 08:44:18 +11:00
YOSHIFUJI Hideaki
1218854afa [NET] NETNS: Omit seq_net_private->net without CONFIG_NET_NS.
Without CONFIG_NET_NS, no namespace other than &init_net exists,
no need to store net in seq_net_private.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-03-26 04:39:56 +09:00
YOSHIFUJI Hideaki
3b1e0a655f [NET] NETNS: Omit sock->sk_net without CONFIG_NET_NS.
Introduce per-sock inlines: sock_net(), sock_net_set()
and per-inet_timewait_sock inlines: twsk_net(), twsk_net_set().
Without CONFIG_NET_NS, no namespace other than &init_net exists.
Let's explicitly define them to help compiler optimizations.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-03-26 04:39:55 +09:00
YOSHIFUJI Hideaki
c346dca108 [NET] NETNS: Omit net_device->nd_net without CONFIG_NET_NS.
Introduce per-net_device inlines: dev_net(), dev_net_set().
Without CONFIG_NET_NS, no namespace other than &init_net exists.
Let's explicitly define them to help compiler optimizations.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-03-26 04:39:53 +09:00
Linus Torvalds
a4083c9271 Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6:
  USB: Fix cut-and-paste error in rtl8150.c
  USB: ehci: stop vt6212 bus hogging
  USB: sierra: add another device id
  USB: sierra: dma fixes
  USB: add support for Motorola ROKR Z6 cellphone in mass storage mode
  USB: isd200: fix memory leak in isd200_get_inquiry_data
  USB: pl2303: another product ID
  USB: new quirk flag to avoid Set-Interface
  USB: fix gadgetfs class request delegation
2008-03-24 23:24:16 -07:00
Andrew Morton
49741c4d01 PCI: revert "pcie: utilize pcie transaction pending bit"
Revert as it is reported to cause problems for people.

commit 4348a2dc49
Author: Shaohua Li <shaohua.li@intel.com>
Date:   Wed Oct 24 10:45:08 2007 +0800

    pcie: utilize pcie transaction pending bit

    PCIE has a mechanism to wait for Non-Posted request to complete. I think
    pci_disable_device is a good place to do this.

    Signed-off-by: Shaohua Li <shaohua.li@intel.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

Due to the regression reported at
http://bugzilla.kernel.org/show_bug.cgi?id=10065

Cc: Shaohua Li <shaohua.li@intel.com>
Cc: Soeren Sonnenburg <kernel@nn7.de>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-03-24 22:38:44 -07:00
Constantin Baranov
cc36bdd47a USB: add support for Motorola ROKR Z6 cellphone in mass storage mode
Motorola ROKR Z6 cellphone has bugs in its USB, so it is impossible to use
it as mass storage. Patch describes new "unusual" USB device for it with
FIX_INQUIRY and FIX_CAPACITY flags and new BULK_IGNORE_TAG flag.
Last flag relaxes check for equality of bcs->Tag and us->tag in
usb_stor_Bulk_transport routine.

Signed-off-by: Constantin Baranov <const@tltsu.ru>
Signed-off-by: Matthew Dharm <mdharm-usb@one-eyed-alien.net>
Signed-off-by: Daniel Drake <dsd@gentoo.org>
Cc: stable <stable@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-03-24 22:26:14 -07:00
Alan Stern
392e1d9817 USB: new quirk flag to avoid Set-Interface
This patch (as1057) fixes a problem with the X-Rite/Gretag-Macbeth
Eye-One Pro display colorimeter; the device crashes when it receives a
Set-Interface request.  A new quirk (USB_QUIRK_NO_SET_INTF) is
introduced and a quirks entry is created for this device.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Cc: stable <stable@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-03-24 22:26:14 -07:00
Tejun Heo
aacda37538 libata: implement ata_qc_raw_nbytes()
Implement ata_qc_raw_nbytes() which determines the raw user-requested
size of a PC command.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2008-03-24 22:09:12 -04:00
YOSHIFUJI Hideaki
7cbca67c07 [IPV6]: Support Source Address Selection API (RFC5014).
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-03-25 10:24:01 +09:00
YOSHIFUJI Hideaki
1d5d236d30 [IPV6]: Use bitfields for hop_limit and mcast_hops.
Save some bits for future extensions.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-03-25 10:24:01 +09:00
YOSHIFUJI Hideaki
4725474584 [IPV6]: Convert cork.hop_limit and cork.tclass into u8 instead of int.
Values of those fields are always between 0 and 255 (inclusive),
so use u8 and save some memory on 32bit systems.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-03-25 10:23:59 +09:00
YOSHIFUJI Hideaki
c8cdaf998d [IPV4,IPV6]: Share cork.rt between IPv4 and IPv6.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-03-25 10:23:59 +09:00
Joe Perches
310afe86af [NET]: include/linux/udp.h - remove duplicate include
Remove duplicate #include <linux/types.h>
Combine #ifdef __KERNEL__ blocks

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-23 22:06:51 -07:00
Joe Perches
cc32e05416 [NET]: include/linux/igmp.h - remove duplicate include
Removed duplicate #include <linux/skbuff.h>	
Combined #ifdef __KERNEL__ blocks

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-23 22:05:44 -07:00
Joe Perches
414f69d8a6 [NET]: include/linux/atalk.h - remove duplicate include
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-23 22:04:31 -07:00
David S. Miller
76fef2b6bf Merge branch 'upstream-net26' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6
Conflicts:

	drivers/s390/net/qeth_main.c
2008-03-22 18:22:42 -07:00
Rusty Russell
0098b7273e [NET]: NPROTO is redundant; it's equal to AF_MAX/PF_MAX.
DaveM pointed out NPROTO exposed to userspace, so keep it around,
just make sure it stays in sync.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-22 17:18:47 -07:00
Darren Salt
03c086a747 PNP: increase the number of PnP memory resources from 12 to 24
Increase the number of PnP memory resources from 12 to 24.

This removes an "exceeded the max num of mem resources" warning on boot. I
also noticed the reservation of two more iomem ranges on the computer on
which this was tested.

Signed-off-by: Darren Salt <linux@youmustbejoking.demon.co.uk>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-22 17:00:57 -07:00
Patrick McManus
ec3c0982a2 [TCP]: TCP_DEFER_ACCEPT updates - process as established
Change TCP_DEFER_ACCEPT implementation so that it transitions a
connection to ESTABLISHED after handshake is complete instead of
leaving it in SYN-RECV until some data arrvies. Place connection in
accept queue when first data packet arrives from slow path.

Benefits:
  - established connection is now reset if it never makes it
   to the accept queue

 - diagnostic state of established matches with the packet traces
   showing completed handshake

 - TCP_DEFER_ACCEPT timeouts are expressed in seconds and can now be
   enforced with reasonable accuracy instead of rounding up to next
   exponential back-off of syn-ack retry.

Signed-off-by: Patrick McManus <mcmanus@ducksong.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-21 16:33:01 -07:00
Heiko Carstens
22e52b072d sched: add arch_update_cpu_topology hook.
Will be called each time the scheduling domains are rebuild.
Needed for architectures that don't have a static cpu topology.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-03-21 16:43:48 +01:00
Heiko Carstens
9aefd0abd8 sched: add exported arch_reinit_sched_domains() to header file.
Needed so it can be called from outside of sched.c.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-03-21 16:43:47 +01:00
Linus Torvalds
7d3628b230 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (46 commits)
  [NET] ifb: set separate lockdep classes for queue locks
  [IPV6] KCONFIG: Fix description about IPV6_TUNNEL.
  [TCP]: Fix shrinking windows with window scaling
  netpoll: zap_completion_queue: adjust skb->users counter
  bridge: use time_before() in br_fdb_cleanup()
  [TG3]: Fix build warning on sparc32.
  MAINTAINERS: bluez-devel is subscribers-only
  audit: netlink socket can be auto-bound to pid other than current->pid (v2)
  [NET]: Fix permissions of /proc/net
  [SCTP]: Fix a race between module load and protosw access
  [NETFILTER]: ipt_recent: sanity check hit count
  [NETFILTER]: nf_conntrack_h323: logical-bitwise & confusion in process_setup()
  [RT2X00] drivers/net/wireless/rt2x00/rt2x00dev.c: remove dead code, fix warning
  [IPV4]: esp_output() misannotations
  [8021Q]: vlan_dev misannotations
  xfrm: ->eth_proto is __be16
  [IPV4]: ipv4_is_lbcast() misannotations
  [SUNRPC]: net/* NULL noise
  [SCTP]: fix misannotated __sctp_rcv_asconf_lookup()
  [PKT_SCHED]: annotate cls_u32
  ...
2008-03-21 07:57:45 -07:00
Peter P Waskiewicz Jr
82cc1a7a56 [NET]: Add per-connection option to set max TSO frame size
Update: My mailer ate one of Jarek's feedback mails...  Fixed the
parameter in netif_set_gso_max_size() to be u32, not u16.  Fixed the
whitespace issue due to a patch import botch.  Changed the types from
u32 to unsigned int to be more consistent with other variables in the
area.  Also brought the patch up to the latest net-2.6.26 tree.

Update: Made gso_max_size container 32 bits, not 16.  Moved the
location of gso_max_size within netdev to be less hotpath.  Made more
consistent names between the sock and netdev layers, and added a
define for the max GSO size.

Update: Respun for net-2.6.26 tree.

Update: changed max_gso_frame_size and sk_gso_max_size from signed to
unsigned - thanks Stephen!

This patch adds the ability for device drivers to control the size of
the TSO frames being sent to them, per TCP connection.  By setting the
netdevice's gso_max_size value, the socket layer will set the GSO
frame size based on that value.  This will propogate into the TCP
layer, and send TSO's of that size to the hardware.

This can be desirable to help tune the bursty nature of TSO on a
per-adapter basis, where one may have 1 GbE and 10 GbE devices
coexisting in a system, one running multiqueue and the other not, etc.

This can also be desirable for devices that cannot support full 64 KB
TSO's, but still want to benefit from some level of segmentation
offloading.

Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-21 03:43:19 -07:00
Serge Hallyn
aedb60a67c file capabilities: remove cap_task_kill()
The original justification for cap_task_kill() was as follows:

	check_kill_permission() does appropriate uid equivalence checks.
	However with file capabilities it becomes possible for an
	unprivileged user to execute a file with file capabilities
	resulting in a more privileged task with the same uid.

However now that cap_task_kill() always returns 0 (permission
granted) when p->uid==current->uid, the whole hook is worthless,
and only likely to create more subtle problems in the corner cases
where it might still be called but return -EPERM.  Those cases
are basically when uids are different but euid/suid is equivalent
as per the check in check_kill_permission().

One example of a still-broken application is 'at' for non-root users.

This patch removes cap_task_kill().

Signed-off-by: Serge Hallyn <serge@hallyn.com>
Acked-by: Andrew G. Morgan <morgan@kernel.org>
Earlier-version-tested-by: Luiz Fernando N. Capitulino <lcapitulino@mandriva.com.br>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-20 09:46:36 -07:00
Alex Dubov
ead7077360 memstick: automatically retrieve "INT" value from command response
MemoryStick storage cards, when in parallel mode, send several meaningful bits
of their "INT" register as part of command response.  This data is stored by
host and can be used to spare invocation of "GET_INT" TPC on each data page
transferred between host and card.

Signed-off-by: Alex Dubov <oakad@yahoo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-19 18:53:37 -07:00
Randy Dunlap
a6b91919e0 fs: fix kernel-doc notation warnings
Fix kernel-doc notation warnings in fs/.

Warning(mmotm-2008-0314-1449//fs/super.c:560): missing initial short description on line:
 *	mark_files_ro
Warning(mmotm-2008-0314-1449//fs/locks.c:1277): missing initial short description on line:
 *	lease_get_mtime
Warning(mmotm-2008-0314-1449//fs/locks.c:1277): missing initial short description on line:
 *	lease_get_mtime
Warning(mmotm-2008-0314-1449//fs/namei.c:1368): missing initial short description on line:
 * lookup_one_len:  filesystem helper to lookup single pathname component
Warning(mmotm-2008-0314-1449//fs/buffer.c:3221): missing initial short description on line:
 * bh_uptodate_or_lock: Test whether the buffer is uptodate
Warning(mmotm-2008-0314-1449//fs/buffer.c:3240): missing initial short description on line:
 * bh_submit_read: Submit a locked buffer for reading
Warning(mmotm-2008-0314-1449//fs/fs-writeback.c:30): missing initial short description on line:
 * writeback_acquire: attempt to get exclusive writeback access to a device
Warning(mmotm-2008-0314-1449//fs/fs-writeback.c:47): missing initial short description on line:
 * writeback_in_progress: determine whether there is writeback in progress
Warning(mmotm-2008-0314-1449//fs/fs-writeback.c:58): missing initial short description on line:
 * writeback_release: relinquish exclusive writeback access against a device.
Warning(mmotm-2008-0314-1449//include/linux/jbd.h:351): contents before sections
Warning(mmotm-2008-0314-1449//include/linux/jbd.h:561): contents before sections
Warning(mmotm-2008-0314-1449//fs/jbd/transaction.c:1935): missing initial short description on line:
 * void journal_invalidatepage()

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-19 18:53:36 -07:00
Paul E. McKenney
ae66be9b71 rcu: fix misplaced mb() in rcu_enter/exit_nohz()
In the process of writing up the mechanical proof of correctness for the
dynticks/preemptable-RCU interface, I noticed misplaced memory barriers in
rcu_enter_nohz() and rcu_exit_nohz().

This patch puts them in the right place and adds a comment.  The key thing to
keep in mind is that rcu_enter_nohz() is -exiting- the mode that can legally
execute RCU read-side critical sections.

The memory barrier must be between any potential RCU read-side critical
sections and the increment of the per-CPU dynticks_progress_counter, and thus
must come -before- this increment.  And vice versa for rcu_exit_nohz().

The locking in the scheduler is probably saving us for the moment.

Also, switch to smp_mb() - we don't need a barrier for uniprocessor kernels.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Steven Rostedt <srostedt@redhat.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-19 18:53:36 -07:00
Ingo Molnar
33b0c4217d sched: tune multi-core idle balancing
WAKE_IDLE is too agressive on multi-core CPUs with the new
wake-affine code, keep it on for SMT/HT balancing alone
(where there's no cache affinity at all between logical CPUs).

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-03-19 04:27:53 +01:00
Ingo Molnar
4ae7d5cefd sched: improve affine wakeups
improve affine wakeups. Maintain the 'overlap' metric based on CFS's
sum_exec_runtime - which means the amount of time a task executes
after it wakes up some other task.

Use the 'overlap' for the wakeup decisions: if the 'overlap' is short,
it means there's strong workload coupling between this task and the
woken up task. If the 'overlap' is large then the workload is decoupled
and the scheduler will move them to separate CPUs more easily.

( Also slightly move the preempt_check within try_to_wake_up() - this has
  no effect on functionality but allows 'early wakeups' (for still-on-rq
  tasks) to be correctly accounted as well.)

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-03-19 04:27:53 +01:00
Linus Torvalds
92f53c6f1e Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block
* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
  Revert "unexport bio_{,un}map_user"
  relay: fix subbuf_splice_actor() adding too many pages
  The ps2esdi driver was marked as BROKEN more than two years ago due to being
2008-03-18 07:43:14 -07:00