android_kernel_xiaomi_sm8450

xiaomi-sm8450/android_kernel_xiaomi_sm8450

Author	SHA1	Message	Date
Alexander Duyck	b4ded8327f	ixgbe: Update adaptive ITR algorithm The following change is meant to update the adaptive ITR algorithm to better support the needs of the network. Specifically with this change what I have done is make it so that our ITR algorithm will try to prevent either starving a socket buffer for memory in the case of Tx, or overrunning an Rx socket buffer on receive. In addition a side effect of the calculations used is that we should function better with new features such as XDP which can handle small packets at high rates without needing to lock us into NAPI polling mode. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-09 10:07:50 -07:00
Emil Tantilov	c3aec05dfe	ixgbe: fix the FWSM.PT check in ixgbe_mng_present() Bits other than FWSM.PT can be set in IXGBE_SWFW_MODE_MASK making the previous check invalid. Change the check for MNG present to be only based on FWSM.PT bit. Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-09 10:05:19 -07:00
Emil Tantilov	dcfd6b839c	ixgbe: fix use of uninitialized padding This patch is resolving Coverity hits where padding in a structure could be used uninitialized. - Initialize fwd_cmd.pad/2 before ixgbe_calculate_checksum() - Initialize buffer.pad2/3 before ixgbe_hic_unlocked() Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-09 10:04:06 -07:00
Jesper Dangaard Brouer	86e2349422	ixgbe: add counter for times Rx pages gets allocated, not recycled The ixgbe driver have page recycle scheme based around the RX-ring queue, where a RX page is shared between two packets. Based on the refcnt, the driver can determine if the RX-page is currently only used by a single packet, if so it can then directly refill/recycle the RX-slot by with the opposite "side" of the page. While this is a clever trick, it is hard to determine when this recycling is successful and when it fails. Adding a counter, which is available via ethtool --statistics as 'alloc_rx_page'. Which counts the number of times the recycle fails and the real page allocator is invoked. When interpreting the stats, do remember that every alloc will serve two packets. The counter is collected per rx_ring, but is summed and ethtool exported as 'alloc_rx_page'. It would be relevant to know what rx_ring that cannot keep up, but that can be exported later if someone experience a need for this. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-09 10:02:38 -07:00
Emil Tantilov	761c2a48c7	ixgbe: split Tx/Rx ring clearing for ethtool loopback test Commit: fed21bcee7a5 ("ixgbe: Don't bother clearing buffer memory for descriptor rings) exposed some issues with the logic in the current implementation of ixgbe_clean_test_rings() that are being addressed in this patch: - Split the clearing of the Tx and Rx rings in separate loops. Previously both Tx and Rx rings were cleared in a rx_desc->wb.upper.length based loop which could lead to issues if for w/e reason packets were received outside of the frames transmitted for the loopback test. - Add check for IXGBE_TXD_STAT_DD to avoid clearing the rings if the transmits have not comlpeted by the time we enter ixgbe_clean_test_rings() - Exit early on ixgbe_check_lbtest_frame() failure. This change fixes a crash during ethtool diagnostic (ethtool -t). Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-09 09:41:25 -07:00
Emil Tantilov	c69be946d6	ixgbe: add error checks when initializing the PHY Ignoring errors when attempting to identify the PHY can lead to a crash. Specifically in the case of FW controlled PHYs where the PHY read/write operations are set to NULL. Removed redundant comment. Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-09 09:38:11 -07:00
Shannon Nelson	f5a71caa17	ixgbe: restore normal RSS after last macvlan offload is removed Just like when the last VF is removed, we need to restore normal operations after the last macvlan offload is removed, else we get stuck in single queue operations. To test: ethtool -l eth1 # note the number of queues in use, ~= cpus ethtool -K eth1 l2-fwd-offload on ip link add mv1 link eth1 type macvlan mode bridge ip link set dev mv1 up ip link del mv1 ethtool -l eth1 # are we back to the same # of queues, or stuck on 1? Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-09 09:36:31 -07:00
Bhumika Goyal	2e033eace7	ixgbe: declare ixgbe_mac_operations structures as const Declare ixgbe_mac_operations structures as const as they are only stored in the mac_ops field of ixgbe_info structure. This field is of type const and therefore ixgbe_mac_operations structure can be made const too. Signed-off-by: Bhumika Goyal <bhumirks@gmail.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-09 09:34:27 -07:00
Emil Tantilov	2e22a75c55	ixgbe: Clear SWFW_SYNC register during init Added clearing of SW resource bits in the SW/FW synchronization register to ixgbe_init_swfw_sync_X540(). Updated ixgbe_acquire_swfw_sync_X540 SW Manageability host interface resource bit error case to match the error handling of the other SW resource bits. Which is to release the SW resource bits if SW times out while attempting to acquire the resource. This allows the driver to load in cases where the semaphore bits could be stuck after a reset or a crash. Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-09 09:28:23 -07:00
John Fastabend	8e679021c5	ixgbe: incorrect XDP ring accounting in ethtool tx_frame param Changing the TX ring parameters with an XDP program attached may cause the XDP queues to be cleared and the TX rings to be incorrectly configured. Fix by doing correct ring accounting in setup call. Fixes: `33fdc82f08` ("ixgbe: add support for XDP_TX action") Signed-off-by: John Fastabend <john.fastabend@gmail.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-09 08:02:47 -07:00
Ding Tianhong	5e0fac63a6	net: ixgbe: Use new PCI_DEV_FLAGS_NO_RELAXED_ORDERING flag The ixgbe driver use the compile check to determine if it can send TLPs to Root Port with the Relaxed Ordering Attribute set, this is too inconvenient, now the new flag PCI_DEV_FLAGS_NO_RELAXED_ORDERING has been added to the kernel and we could check the bit4 in the PCIe Device Control register to determine whether we should use the Relaxed Ordering Attributes or not, so use this new way in the ixgbe driver. Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Acked-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-09 07:43:06 -07:00
Ding Tianhong	f4986d250a	Revert commit `1a8b6d76dc` ("net:add one common config...") The new flag PCI_DEV_FLAGS_NO_RELAXED_ORDERING has been added to indicate that Relaxed Ordering Attributes (RO) should not be used for Transaction Layer Packets (TLP) targeted toward these affected Root Port, it will clear the bit4 in the PCIe Device Control register, so the PCIe device drivers could query PCIe configuration space to determine if it can send TLPs to Root Port with the Relaxed Ordering Attributes set. With this new flag we don't need the config ARCH_WANT_RELAX_ORDER to control the Relaxed Ordering Attributes for the ixgbe drivers just like the commit `1a8b6d76dc` ("net:add one common config...") did, so revert this commit. Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-09 07:43:06 -07:00
Sabrina Dubroca	a39221ce96	ixgbe: fix masking of bits read from IXGBE_VXLANCTRL register In ixgbe_clear_udp_tunnel_port(), we read the IXGBE_VXLANCTRL register and then try to mask some bits out of the value, using the logical instead of bitwise and operator. Fixes: `a21d0822ff` ("ixgbe: add support for geneve Rx offload") Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-09 07:43:06 -07:00
Mark D Rustad	e0f06bba96	ixgbe: Return error when getting PHY address if PHY access is not supported In cases where PHY register access is not supported, don't mislead a caller into thinking that it is supported by returning a PHY address. Instead, return -EOPNOTSUPP when PHY access is not supported. Signed-off-by: Mark Rustad <mark.d.rustad@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-09 07:43:06 -07:00
Jacob Keller	b74f571f59	i40e/i40evf: organize and re-number feature flags Now that we've reduced the number of flags, organize similar flags together and re-number them accordingly. Since we don't yet have more than 32 flags, we'll use a u32 for both the hw_features and flag field. Should we gain more flags in the future, we may need to convert to a u64 or separate flags out into two fields. One alternative approach considered, but not implemented here, was to use an enumeration for the flag variables, and create a macro I40E_FLAG() which used string concatenation to generate BIT_ULL values. This has the advantage of making the actual bit values compile-time dynamic so that we do not need to worry about matching the order to the bit value. However, this does produce a high level of code churn, and makes it more difficult to read a dumped flags value when debugging. Change-ID: I8653fff69453cd547d6fe98d29dfa9d8710387d1 Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Mitch Williams <mitch.a.williams@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-06 08:11:32 -07:00
Jacob Keller	a5340d933e	i40e: ignore skb->xmit_more when deciding to set RS bit Since commit `6a7fded776` ("i40e: Fix RS bit update in Tx path and disable force WB workaround") we've tried to "optimize" setting the RS bit based around skb->xmit_more. This same logic was refactored in commit `1dc8b53879` ("i40e: Reorder logic for coalescing RS bits"), but ultimately was not functionally changed. Using skb->xmit_more in this way is incorrect, because in certain circumstances we may see a large number of skbs in sequence with xmit_more set. This leads to a performance loss as the hardware does not writeback anything for those packets, which delays the time it takes for us to respond to the stack transmit requests. This significantly impacts UDP performance, especially when layered with multiple devices, such as bonding, VLANs, and vnet setups. This was not noticed until now because it is difficult to create a setup which reproduces the issue. It was discovered in a UDP_STREAM test in a VM, connected using a vnet device to a bridge, which is connected to a bonded pair of X710 ports in active-backup mode with a VLAN. These layered devices seem to compound the number of skbs transmitted at once by the qdisc. Additionally, the problem can be masked by reducing the ITR value. Since the original commit does not provide strong justification for this RS bit "optimization", revert to the previous behavior of setting the RS bit every 4th packet. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-06 08:11:32 -07:00
Jacob Keller	0a3b4f702f	i40evf: enable support for VF VLAN tag stripping control A recent commit 809481484e5d ("i40e/i40evf: support for VF VLAN tag stripping control") added support for VFs to negotiate the control of VLAN tag stripping. This should have allowed VFs to disable the feature. Unfortunately, the flag was set only in netdev->feature flags and not in netdev->hw_features. This ultimately causes the stack to assume that it cannot change the flag, so it was unchangeable and marked as [fixed] in the ethtool -k output. Fix this by setting the feature in hw_features first, just as we do for the PF code. This enables ethtool -K to disable the feature correctly, and fully enables user control of the VLAN tag stripping feature. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-06 08:11:32 -07:00
Mariusz Stachura	052b93d0c2	i40e: do not enter PHY debug mode while setting LEDs behaviour Previous implementation of LED set/get functions required to enter PHY debug mode, in order to prevent access to it from FW and SW at the same time. Reset of all ports was a unwanted side effect. Signed-off-by: Mariusz Stachura <mariusz.stachura@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-06 08:11:32 -07:00
Alan Brady	19b7960b2d	i40e: implement split PCI error reset handler This patch implements the PCI error handler reset_prepare and reset_done. This allows us to handle function level reset. Without this patch we are unable to perform and recover from an FLR correctly and this will cause VFs to be unable to recover from an FLR on the PF. Signed-off-by: Alan Brady <alan.brady@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-06 08:11:32 -07:00
Filip Sadowski	013df598d6	i40e: Properly maintain flow director filters list When there is no space for more flow director filters and user requested to add a new one it is rejected by firmware and automatically removed from the filter list maintained by driver. This behaviour is correct. Afterwards existing filter can be removed making free slot for the new one. This however causes the newly added filter to be accepted by firmware but removed from driver filter list resulting in not showing after issuing 'ethtool -n <dev_name>'. This happened due to not clearing the variable pf->fd_inv which stores filter number to be removed from the list when firmware refused to add the requested filter. It caused the filter with this specific ID to be constantly removed once it was added to the list although it has been accepted by firmware and effectively applied to the NIC. It was fixed by clearing pf->fd_inv variable after removal of the filter from the list when it was rejected by firmware. Signed-off-by: Filip Sadowski <filip.sadowski@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-06 08:11:32 -07:00
Filip Sadowski	9a858178ef	i40e: Display error message if module does not meet thermal requirements This patch causes error message to be displayed when NIC detects insertion of module that does not meet thermal requirements. Signed-off-by: Filip Sadowski <filip.sadowski@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-06 08:11:31 -07:00
Alice Michael	7f66182263	i40e: fix merge error This patch removes some code that was accidentally added to the wrong function with a merge error. Fixes: `c53934c6d1` ("i40e: fix: do not sleep in netdev_ops") Signed-off-by: Alice Michael <alice.michael@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-06 08:11:31 -07:00
Jesse Brandeburg	bd6cd4e6dd	i40e/i40evf: use DECLARE_BITMAP for state When using set_bit and friends, we should be using actual bitmaps, and fix all the locations where we might access it. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-06 08:11:31 -07:00
Mitch Williams	0a0d9af5bc	i40e: fix incorrect register definition This register was defined incorrectly. Fix the increment value to 8, and replace the iterator with _i to make the definition consistent with other statistics registers. Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-06 08:11:31 -07:00
Mitch Williams	60518a0489	i40e: redfine I40E_PHY_TYPE_MAX Since I40E_PHY_TYPE_MAX is used as an iterator, usually combined with some sort of bit-shifting, it should only include actual PHY types and not error cases. Move it up in the enum declaration so that loops only iterate across valid PHY types. Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-06 08:11:31 -07:00
Alan Brady	c3d26b75c2	i40e: re-enable PTP L4 capabilities for XL710 if FW >6.0 Starting with XL710 FW 5.3 PTP L4 was disabled for XL710 due to a bug. The bug has since been resolved in XL710 FW >6.0 and PTP L4 can now be re-enabled on those devices with updated firmware. Signed-off-by: Alan Brady <alan.brady@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-06 08:11:31 -07:00
Jacob Keller	be664cbefc	i40e/i40evf: spread CPU affinity hints across online CPUs only Currently, when setting up the IRQ for a q_vector, we set an affinity hint based on the v_idx of that q_vector. Meaning a loop iterates on v_idx, which is an incremental value, and the cpumask is created based on this value. This is a problem in systems with multiple logical CPUs per core (like in simultaneous multithreading (SMT) scenarios). If we disable some logical CPUs, by turning SMT off for example, we will end up with a sparse cpu_online_mask, i.e., only the first CPU in a core is online, and incremental filling in q_vector cpumask might lead to multiple offline CPUs being assigned to q_vectors. Example: if we have a system with 8 cores each one containing 8 logical CPUs (SMT == 8 in this case), we have 64 CPUs in total. But if SMT is disabled, only the 1st CPU in each core remains online, so the cpu_online_mask in this case would have only 8 bits set, in a sparse way. In general case, when SMT is off the cpu_online_mask has only C bits set: 0, 1N, 2N, ..., C*(N-1) where C == # of cores; N == # of logical CPUs per core. In our example, only bits 0, 8, 16, 24, 32, 40, 48, 56 would be set. Instead, we should only assign hints for CPUs which are online. Even better, the kernel already provides a function, cpumask_local_spread() which takes an index and returns a CPU, spreading the interrupts across local NUMA nodes first, and then remote ones if necessary. Since we generally have a 1:1 mapping between vectors and CPUs, there is no real advantage to spreading vectors to local CPUs first. In order to avoid mismatch of the default XPS hints, we'll pass -1 so that it spreads across all CPUs without regard to the node locality. Note that we don't need to change the q_vector->affinity_mask as this is initialized to cpu_possible_mask, until an actual affinity is set and then notified back to us. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-06 08:11:31 -07:00
Mitch Williams	64615b5418	i40e: add private flag to control source pruning By default, our devices do source pruning, that is, they drop receive packets that have the source MAC matching one of the receive filters. Unfortunately, this breaks ARP monitoring in channel bonding, as the bonding driver expects devices to receive ARPs containing their own source address. Add an ethtool private flag to control this feature. Also, remove the netif_running() check when we process our private flags. It's OK to reset when the device is closed and in most cases we need the reset the apply these changes. Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-06 08:11:31 -07:00
Rami Rosen	ec2f25d203	i40e: fix a typo in i40e_pf documentation This patch fixes a typo in i40e_pf object documentation; num_req_vfs refers to the number of VFs requested for the PF. Signed-off-by: Rami Rosen <rami.rosen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-06 08:11:31 -07:00
Jacob Keller	3e256ac5b1	fm10k: fix mis-ordered parameters in declaration for .ndo_set_vf_bw We've had support for setting both a minimum and maximum bandwidth via .ndo_set_vf_bw since commit `883a9ccbae` ("fm10k: Add support for SR-IOV to driver", 2014-09-20). Likely because we do not support minimum rates, the declaration mis-ordered the "unused" parameter, which causes warnings when analyzed with cppcheck. Fix this warning by properly declaring the min_rate and max_rate variables in the declaration and definition (rather than using "unused"). Also rename "rate" to max_rate so as to clarify that we only support setting the maximum rate. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-03 09:00:04 -07:00
Jacob Keller	87be98927e	fm10k: prefer %s and __func__ for diagnostic prints Don't hard code the function names in the diagnostic output when these reset related routines fail. Instead, use %s and __func__ so that future refactors don't need to change the print outs. Additionally, while we are here, add missing function header comments for the new reset_prepare and reset_done function handlers. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-03 08:55:20 -07:00
Joe Perches	c0ad8ef3df	fm10k: Fix misuse of net_ratelimit() Correct the backward logic using !net_ratelimit() Miscellanea: o Add a blank line before the error return label Signed-off-by: Joe Perches <joe@perches.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-03 08:51:27 -07:00
Jacob Keller	ef57ab791c	fm10k: bump version number Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-03 08:49:52 -07:00
Jacob Keller	1f5c27e528	fm10k: use the MAC/VLAN queue for VF<->PF MAC/VLAN requests Now that we have a working MAC/VLAN queue for handling MAC/VLAN messages from the netdev, replace the default handler for the VF<->PF messages. This new handler is very similar to the default code, but uses the MAC/VLAN queue instead of sending the message directly. Unfortunately we can't easily re-use the default code, so we'll just replace the entire function. This ensures that a VF requesting a large number of VLANs or MAC addresses does not start a reset cycle, as explained in the commit which introduced the message queue. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Ngai-mint Kwan <ngai-mint.kwan@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-03 08:39:17 -07:00
Jacob Keller	fc9173682d	fm10k: introduce a message queue for MAC/VLAN messages Under some circumstances, when dealing with a large number of MAC address or VLAN updates at once, the fm10k driver, particularly the VFs can overload the mailbox with too many messages at once. This results in a mailbox timeout, which causes the driver to initiate a reset. During the reset, we re-send all the same messages that originally caused the timeout. This results in a cycle of resets each triggering a future reset. To fix or avoid this, we introduce a workqueue item which monitors a queue of MAC and VLAN requests. These requests are queued to the end of the list, and we process as a FIFO periodically. Initially we only handle requests for the netdev, but we do handle unicast MAC addresses, multicast MAC addresses, and update VLAN requests. A future patch will add support to use this queue for handling MAC update requests from the VF<->PF mailbox. The MAC/VLAN work item will keep checking to make sure that each request does not overflow the mailbox and cause a timeout. If it might, then the work item will reschedule itself a short time later. This avoids any reset cycle, since we never send the message if the mailbox is not ready. As an alternative, we tried increasing the mailbox message FIFO, but this just delays the problem and results in needless memory waste on the system. Our new message queue is dynamically allocated so only uses as much memory as it needs. Additionally, it need not be contiguous like the Tx and Rx FIFOs. Note that this patch chose to only create a queue for MAC and VLAN messages, since these are the only messages sent in a large enough volume to cause the reset loop. Other messages are very unlikely to overflow the mailbox Tx FIFO so easily. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-03 08:25:36 -07:00
Jacob Keller	8249c47c6b	fm10k: use generic PM hooks instead of legacy PCIe power hooks Replace the PCI specific legacy power management hooks with the new generic power management hooks which work properly for both suspend and hibernate. The new generic system is better and properly handles the lower level PCIe power management rather than forcing the driver to handle it. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-03 08:19:33 -07:00
Jacob Keller	b4fcd43661	fm10k: use spinlock to implement mailbox lock Lets not re-invent the locking wheel. Remove our bitlock and use a proper spinlock instead. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-03 08:12:44 -07:00
Jacob Keller	0b40f45748	fm10k: prepare_for_reset() when we lose PCIe Link If we lose PCIe link, such as when an unannounced PFLR event occurs, or when a device is surprise removed, we currently detach the device and close the netdev. This unfortunately leaves a lot of things still active, such as the msix_mbx_pf IRQ, and Tx/Rx resources. This can cause problems because the register reads will return potentially invalid values which may result in unknown driver behavior. Begin the process of resetting using fm10k_prepare_for_reset(), much in the same way as the suspend and resume cycle does. This will attempt to shutdown as much as possible, in order to prevent possible issues. A naive implementation for this has issues, because there are now multiple flows calling the reset logic and setting a reset bit. This would cause problems, because the "re-attach" routine might call fm10k_handle_reset() prior to the reset actually finishing. Instead, we'll add state bits to indicate which flow actually initiated the reset. For the general reset flow, we'll assume that if someone else is resetting that we do not need to handle it at all, so it does not need its own state bit. For the suspend case, we will simply issue a warning indicating that we are attempting to recover from this case when resuming. For the detached subtask, we'll simply refuse to re-attach until we've actually initiated a reset as part of that flow. Finally, we'll stop attempting to manage the mailbox subtask when we're detached, since there's nothing we can do if we don't have a PCIe address. Overall this produces a much cleaner shutdown and recovery cycle for a PCIe surprise remove event. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-03 08:06:44 -07:00
David S. Miller	4efac6ff4d	Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue Jeff Kirsher says: ==================== 40GbE Intel Wired LAN Driver Updates 2017-10-02 This series contains updates to i40e and i40evf. Shannon Nelson fixes an issue where when a machine has more CPUs than queue pairs, the counting gets a "little funky" and turns off Flow Director. So to correct it, limit the number of LAN queues initially allocated to be sure there are some left for Flow Director and other features. Lihong cleans up dead code by removing a condition check which cannot ever be true. Christophe Jaillet fixes a potential NULL pointer dereference, which could happen if kzalloc() fails. Filip corrects the reporting of supported link modes, which was incorrect for some NICs. Added support for 'ethtool -m' command, which displays information about QSFP+ modules. Mariusz adds functions to read/write the LED registers to control the LEDS, instead of accessing the registers directly whenever the LEDs need to be controlled. Jake fixes a regression where we introduced a scheduling while atomic, so introduce a separate helper function which will manage its own need for the mac_filter_hash_lock. Also cleaned up the "PF" parameter in i40e_vc_disable_vf() since it is never used and is not needed. Fixed a rare case where it is possible that a reset does not occur when i40e_vc_disable_vf() is called, so modify i40e_reset_vf() to return a bool to indicate whether it reset or not so that i40e_vc_disable_vf() can wait until a reset actually occurs. Alan adds the ability for the VF to request more or less underlying allocated queues from the PF. Fixes the incorrect method for clearing the vf_states variable with a NULL assignment, when we should be using atomic bitops since we don't actually want to clear all the flags. Fixed a resource leak, where the PF driver fails to inform clients of a VF reset because we were incorrectly checking the I40E_VF_STATE_PRE_ENABLE bit. Mitch converts i40evf_map_rings_to_vectors() to a void function since it cannot fail and allows us to clean up the checks for the function return value. Scott enables the driver(s) to pass traffic with VLAN tags using the 802.1ad Ethernet protocol. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-02 15:16:03 -07:00
Scott Peterson	ab243ec940	i40e: Stop dropping 802.1ad tags - eth proto 0x88a8 Enable i40e to pass traffic with VLAN tags using the 802.1ad ethernet protocol ID (0x88a8). This requires NIC firmware providing version 1.7 of the API. With older NIC firmware 802.1ad tagged packets will continue to be dropped. No VLAN offloads nor RSS are supported for 802.1ad VLANs. Signed-off-by: Scott Peterson <scott.d.peterson@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-02 12:46:36 -07:00
Alan Brady	c53d11f669	i40e: fix client notify of VF reset Currently there is a bug in which the PF driver fails to inform clients of a VF reset which then causes clients to leak resources. The bug exists because we were incorrectly checking the I40E_VF_STATE_PRE_ENABLE bit. When a VF is first init we go through a reset to initialize variables and allocate resources but we don't want to inform clients of this first reset since the client isn't fully enabled yet so we set a state bit signifying we're in a "pre-enabled" client state. During the first reset we should be clearing the bit, allowing all following resets to notify the client of the reset when the bit is not set. This patch fixes the issue by negating the 'test_and_clear_bit' check to accurately reflect the behavior we want. Signed-off-by: Alan Brady <alan.brady@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-02 12:46:36 -07:00
Alan Brady	41d0a4d0c8	i40e: fix handling of vf_states variable Currently we inappropriately clear the vf_states variable with a null assignment. This is problematic because we should be using atomic bitops on this variable and we don't actually want to clear all the flags. We should just clear the ones we know we want to clear. Additionally remove the I40E_VF_STATE_FCOEENA bit because it is no longer being used. Signed-off-by: Alan Brady <alan.brady@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-02 12:46:35 -07:00
Mitch Williams	1b7b7596ae	i40e: make i40evf_map_rings_to_vectors void This function cannot fail, so why is it returning a value? And why are we checking it? Why shouldn't we just make it void? Why is this commit message made up of only questions? Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-02 12:46:35 -07:00
Alan Brady	5b36e8d04b	i40evf: Enable VF to request an alternate queue allocation Currently the VF gets a default number of allocated queues from HW on init and it could choose to enable or disable those allocated queues. This makes it such that the VF can request more or less underlying allocated queues from the PF. First the VF negotiates the number of queues it wants that can be supported by the PF and if successful asks for a reset. During reset the PF will reallocate the HW queues for the VF and will then remap the new queues. Signed-off-by: Alan Brady <alan.brady@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-02 12:46:35 -07:00
Jacob Keller	d43d60e5eb	i40e: ensure reset occurs when disabling VF It is possible although rare that we may not reset when i40e_vc_disable_vf() is called. This can lead to some weird circumstances with some values not being properly set. Modify i40e_reset_vf() to return a code indicating whether it reset or not. Now, i40e_vc_disable_vf() can wait until a reset actually occurs. If it fails to free up within a reasonable time frame we'll display a warning message. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-02 12:46:35 -07:00
Jacob Keller	f18d20218a	i40e: make use of i40e_vc_disable_vf Replace i40e_vc_notify_vf_reset and i40e_reset_vf with a call to i40e_vc_disable_vf which does this exact thing. This matches similar code patterns throughout the driver. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-02 12:46:35 -07:00
Jacob Keller	eeeddbb806	i40e: drop i40e_pf *pf from i40e_vc_disable_vf() It's never used, and the vf structure could get back to the PF if necessary. Lets just drop the extra unneeded parameter. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-02 12:46:35 -07:00
Jacob Keller	ba4e003d29	i40e: don't hold spinlock while resetting VF When we refactored handling of the PVID in commit `9af52f60b2` ("i40e: use (add\|rm)_vlan_all_mac helper functions when changing PVID") we introduced a scheduling while atomic regression. This occurred because we now held the spinlock across a call to i40e_reset_vf(), which results in a usleep_range() call that triggers a scheduling while atomic bug. This was rare as it only occurred if the user configured a VLAN on a VF and also attempted to reconfigure the VF from the host system with a port VLAN. We do need to hold the lock while calling i40e_is_vsi_in_vlan(), but we should not be holding it while we reset the VF. We'll fix this by introducing a separate helper function i40e_vsi_has_vlans which checks whether we have a PVID and whether the VSI has configured VLANs. This helper function will manage its own need for the mac_filter_hash_lock. Then, we can move the acquiring of the spinlock until after we reset the VF, which ensures that we do not sleep while holding the lock. Using a separate function like this makes the code more clear and is easier to read than attempting to release and re-acquire the spinlock when we reset the VF. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-02 12:46:35 -07:00
Mariusz Stachura	00f6c2f5e2	i40e: use admin queue for setting LEDs behavior Instead of accessing register directly, use newly added AQC in order to blink LEDs. Introduce and utilize a new flag to prevent excessive API version checking. Signed-off-by: Mariusz Stachura <mariusz.stachura@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-02 12:46:35 -07:00
Filip Sadowski	9c0e5caf63	i40e: Add support for 'ethtool -m' This patch adds support for 'ethtool -m' command which displays information about (Q)SFP+ module plugged into NIC's cage. Signed-off-by: Filip Sadowski <filip.sadowski@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-02 12:46:35 -07:00

... 33 34 35 36 37 ...

6229 Commits