Hi,
Below is a patch for the Large Receive Offload feature.
Please review and let us know your comments.
LRO algorithm was described in an OLS 2005 presentation, located at
ftp.s2io.com
user: linuxdocs
password: HALdocs
The same ftp site has Programming Manual for Xframe-I ASIC.
LRO feature is supported on Neterion Xframe-I, Xframe-II and
Xframe-Express 10GbE NICs.
Brief description:
The Large Receive Offload(LRO) feature is a stateless offload
that is complementary to TSO feature but on the receive path.
The idea is to combine and collapse(upto 64K maximum) in the
driver, in-sequence TCP packets belonging to the same session.
It is mainly designed to improve 1500 mtu receive performance,
since Jumbo frame performance is already close to 10GbE line
rate. Some performance numbers are attached below.
Implementation details:
1. Handle packet chains from multiple sessions(current default
MAX_LRO_SESSSIONS=32).
2. Examine each packet for eligiblity to aggregate. A packet is
considered eligible if it meets all the below criteria.
a. It is a TCP/IP packet and L2 type is not LLC or SNAP.
b. The packet has no checksum errors(L3 and L4).
c. There are no IP options. The only TCP option supported is timestamps.
d. Search and locate the LRO object corresponding to this
socket and ensure packet is in TCP sequence.
e. It's not a special packet(SYN, FIN, RST, URG, PSH etc. flags are not set).
f. TCP payload is non-zero(It's not a pure ACK).
g. It's not an IP-fragmented packet.
3. If a packet is found eligible, the LRO object is updated with
information such as next sequence number expected, current length
of aggregated packet and so on. If not eligible or max packets
reached, update IP and TCP headers of first packet in the chain
and pass it up to stack.
4. The frag_list in skb structure is used to chain packets into one
large packet.
Kernel changes required: None
Performance results:
Main focus of the initial testing was on 1500 mtu receiver, since this
is a bottleneck not covered by the existing stateless offloads.
There are couple disclaimers about the performance results below:
1. Your mileage will vary!!!! We initially concentrated on couple pci-x
2.0 platforms that are powerful enough to push 10 GbE NIC and do not
have bottlenecks other than cpu%; testing on other platforms is still
in progress. On some lower end systems we are seeing lower gains.
2. Current LRO implementation is still (for the most part) software based,
and therefore performance potential of the feature is far from being realized.
Full hw implementation of LRO is expected in the next version of Xframe ASIC.
Performance delta(with MTU=1500) going from LRO disabled to enabled:
IBM 2-way Xeon (x366) : 3.5 to 7.1 Gbps
2-way Opteron : 4.5 to 6.1 Gbps
Signed-off-by: Ravinandan Arakali <ravinandan.arakali@neterion.com>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
There is a problem with fragmented skb in s2io driver version 2.0.9.4
available in 2.6.16-rc1 kernel. The adapter will fail to transmit if
any scatter-gather skb arrives. This patch provides fix for the above
described problem.
Signed-off-by: Ananda Raju <ananda.raju@neterion.com>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
On my laptop, the b44 device is created and the carrier state defaults
to ON when created by alloc_etherdev. This means tools like NetworkManager
see the carrier as On and try and bring the device up. The correct thing
to do is mark the carrier as Off when device is created.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
Since get_settings() returns a signed int and it gets checked
for < 0 to catch an error, res should be a signed int too.
Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
*) mark as "EXPERIMENTAL" various items that either aren't very stable or
that are actively crashing the setup of users which don't really need them
(i.e. HIGHMEM and 3-level pagetables on x86 - nobody needs either,
everybody reports "I'm using it and getting trouble").
*) move net/Kconfig near to the rest of network configurations, and
drivers/block/Kconfig near "Block layer" submenu.
*) it's useless and doesn't work well to force NETDEVICES on and to disable
the prompt like it's done. Better remove the attempt, and change that to a
simple "default y if UML".
*) drop the warning about "report problems about HPPFS" - it's redundant
anyway, as that's the usual procedure, and HPPFS users are especially
technical (i.e. they know reporting bugs is _good_).
Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Cc: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Remove page refcount manipulations from cassini driver by using
another field in struct page. Needed for lockless pagecache.
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
in attempting to not send the "prefetch" patch, we broke the receive code,
this patch fixes that issue.
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: John Ronciak <john.ronciak@intel.com>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: John Ronciak <john.ronciak@intel.com>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: John Ronciak <john.ronciak@intel.com>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
Added e1000_mc_addr_list_update
Added e1000_read_reg_io
Added e1000_enable_pciex_master
These are not static functions, that is why we have them declared in the header.
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: John Ronciak <john.ronciak@intel.com>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
These functions help restore the driver to active configuration when coming out of resume for power management.
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: John Ronciak <john.ronciak@intel.com>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
Align the prefetches to a dword to help speed them up.
Recycle skb's and early replenish.
Force memory writes to complete before fetching more descriptors.
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: John Ronciak <john.ronciak@intel.com>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
Adds the ability to disability packet split at compile time and use the legacy receive path on PCI express hardware. Made this a CONFIG option and modified the Kconfig, to reflect the new option.
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: John Ronciak <john.ronciak@intel.com>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
There are a couple of problems in the DMA setup code for skge.
* In the 64 bit case, it doesn't set the consistent mask.
* In the 32 bit case, the error check is backwards!
It likely will only be visible as a bug on 64 bit platforms.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
sizeof() return is not an int, so use max_t to get the types right.
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
Be more careful about transmit locking, this solves a possible race
between tx_complete and transmit, that would cause a tx timeout.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
Don't need to inline quite so many routines, let the compiler
decide
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
Make sure and rate limit all the error messages that might occur. If a problem
occurs then a few messages are enough.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
Small optimization, if dma addresses are 32 bits, then high
bits are always zero.
Signed-off-by: Stephen Hemminger <shemminger@osdl.or>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
Don't need to zero out the status ring entries after processing.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
Be more careful about memory barriers. The only place we really
need them is before and after updating the chip's ring interface.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
Fix problems with Yukon FE rev 2 chipset. Don't cut and paste bugs in from
sk98lin driver. Change how the ram buffer is divided up, and make the math
clearer. Also, set the thresholds where rx takes precedence. The threshold
values are just guesses at this point, it might be worth tuning them later.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
Need to call pci_set_consistent_dma_mask in the case of 64 bit
DMA.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
Need to make sure that sky2 receive buffers are 64 bit
aligned. Also, don't need to start off with GFP_ATOMIC
on initial setup.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
e100: e100 whitespace fixes
These are whitespace only fixes.
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: John Ronciak <john.ronciak@intel.com>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
e100: Handle the return values from pci_* functions
This is to resolve warnings during compile time.
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: John Ronciak <john.ronciak@intel.com>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
e100: Fix TX hang and RMCP Ping issue (due to a microcode loading issue)
Set the end of list bit to cause the hardware's transmit state machine to
work correctly and not prevent management (BMC) traffic.
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: John Ronciak <john.ronciak@intel.com>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
drivers/net/cassini.c:1930: warning: long unsigned int format, different type arg (arg 4)
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Improves small packet performance with large amounts of reassembly being done in the stack.
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: John Ronciak <john.ronciak@intel.com>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
This was to fix compilation warnings. Also added log messages when pci_enable_* functions return with an error.
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: John Ronciak <john.ronciak@intel.com>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
This is two patches, the first is adding additional bus information for the 8257{1|2|3} controllers. The second patch was orginally a community patch to print bus type/speed/width, and enhanced by us.
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: John Ronciak <john.ronciak@intel.com>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
The EEPROM image version is reported as a firmware version for these controllers.
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: John Ronciak <john.ronciak@intel.com>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
Added 82571 fiber to WOL fix for dual port adapters.
Added support for 82546GB (Quad Copper).
Added PCIe typedef for x2, igp cable length 115, and extended TX CTRL registers.
Added parity error detection and PCIe CTRL registers.
Added EEPROM config registers.
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: John Ronciak <john.ronciak@intel.com>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
Fixed an issue netpoll would error out during communication, generating the following error:
--netdump[14973]: Got toomany timeouts in handshaking, ...
Even after a kernel panic, netpoll requires two way communication to successfully transfer the crash log to the remote server.
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: John Ronciak <john.ronciak@intel.com>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
Simplified the logic used to assign the frame_size.
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: John Ronciak <john.ronciak@intel.com>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
Fixed VLAN support by switching control over to the firmware.
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: John Ronciak <john.ronciak@intel.com>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
Fixed by moving code to correct location (for 82572 and 82571 controllers).
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: John Ronciak <john.ronciak@intel.com>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
Fixed the collision distance for 82543 controllers and newer.
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: John Ronciak <john.ronciak@intel.com>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>