Enabled number of VFs is key for eswich manager to do flow steering
initialization and vport configurations. However, the number of
enabled VFs may come from two sources as below.
PF: num of VFs is provided by enabled SR-IOV of itself.
ECPF: num of VFs is provided by enabled SR-IOV from its peer PF. And
SR-IOV can't be enabled from ECPF itself.
Current driver handles the two cases in different stages and passing
the number of enabled VFs among a large scope of internal functions.
It is usually hard to find out where is the real number of VFs from
due to layers of argument pass-in.
This patch consolidated that number from the entry point of doing
eswitch setup, and maintained a copy so that eswitch functions can
refer to it directly.
Eswitch driver shall always use this number when referring to enabled
number of VFs, don't use other numbers such as from SR-IOV.
Signed-off-by: Bodong Wang <bodong@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Devlink eswitch mode is not necessarily related to SR-IOV, e.g, ECPF
can be at offload mode when SR-IOV is not enabled.
Rename the interface and eswitch mode names to decouple from SR-IOV,
and cleanup eswitch messages accordingly.
Signed-off-by: Bodong Wang <bodong@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
When ECPF is eswitch manager, it has the privilege to query and
configure the mac and node guid of host PF.
While vport number of host PF is 0, the vport command should be
issued with other_vport set in this case as the cmd is issued by
ECPF vport(0xfffe).
Add a specific function to query own vport mac. Low level functions
are used by vport manager to query/modify any vport mac and node guid.
Signed-off-by: Bodong Wang <bodong@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Before the offending commit, vlan will be configured if either vlan
or qos is set. After the change with new set flags, function callers
should provide flags accordingly.
Fixes: e33dfe316c ("net/mlx5: E-Switch, Allow fine tuning of eswitch vport push/pop vlan")
Signed-off-by: Bodong Wang <bodong@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
While enabling SR-IOV, PCI core already checks that if SR-IOV is already
enabled, it returns failure error code.
Hence, remove such duplicate check from mlx5_core driver.
While at it, make mlx5_device_disable_sriov() to perform cleanup of VFs in
reverse order of mlx5_device_enable_sriov().
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
When ECPF eswitch manager is at offloads mode, it monitors functions
changed event from host PF side and acts according to the number of
VFs enabled/disabled.
As ECPF and host PF work in two independent hosts, it's possible that
host PF OS reboots but ECPF system is still kept on and continues
monitoring events from host PF. When kernel from host PF side is
booting, PCI iov driver does sriov_init and compute_max_vf_buses by
iterating over all valid num of VFs. This triggers FLR and generates
functions changed events, even though host PF HCA is not enabled at
this time. However, ECPF is not aware of this information, and still
handles these events as usual. ECPF system will see massive number of
reps are created, but destroyed immediately once creation finished.
To eliminate this noise, a bit is added to host parameter context to
indicate host PF is disabled. ECPF will not handle the VF changed
event if this bit is set.
Signed-off-by: Bodong Wang <bodong@mellanox.com>
Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
As mlx5_get_next_phys_dev is used only for PCI PF devices use case,
limit it to search only for PCI devices.
Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Vu Pham <vuhuong@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
mlx5_pci_init() performs pci specific initialization of the
mlx5_core_dev struct.
Hence move pci_status_mutex to pci initialization routine
mlx5_pci_init().
This allows reusing mlx5_mdev_init() to non PCI devices.
Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Vu Pham <vuhuong@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
In the single IB device mode, the mapping between vport number and
rep relies on a counter. However for dynamic vport allocation, it is
desired to keep consistent map of eswitch vport and IB port.
Hence, simplify code to remove the free running counter and instead
use the available vport index during load/unload sequence from the
eswitch.
Signed-off-by: Bodong Wang <bodong@mellanox.com>
Suggested-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Driver is referring to the array index when doing rep initialization,
using vport is confusing as it's normally interpreted as vport number.
This patch doesn't change any functionality.
Signed-off-by: Bodong Wang <bodong@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Saeed Mahameed says:
====================
mlx5e-updates-2019-06-28
This series adds some misc updates for mlx5e driver
1) Allow adding the same mac more than once in MPFS table
2) Move to HW checksumming advertising
3) Report netdevice MPLS features
4) Correct physical port name of the PF representor
5) Reduce stack usage in mlx5_eswitch_termtbl_create
6) Refresh TIR improvement for representors
7) Expose same physical switch_id for all representors
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Jeff Kirsher says:
====================
Intel Wired LAN Driver Updates 2019-06-28
This series contains a smorgasbord of updates to many of the Intel
drivers.
Gustavo A. R. Silva updates the ice and iavf drivers to use the
strcut_size() helper where possible.
Miguel increases the pause and refresh time for flow control in the
e1000e driver during reset for certain devices.
Dann Frazier fixes a potential NULL pointer dereference in ixgbe driver
when using non-IPSec enabled devices.
Colin Ian King fixes a potential overflow during a shift in the ixgbe
driver. Also fixes a potential NULL pointer dereference in the iavf
driver by adding a check.
Venkatesh Srinivas converts the e1000 driver to use dma_wmb() instead of
wmb() for doorbell writes to avoid SFENCEs in the transmit and receive
paths.
Arjan updates the e1000e driver to improve boot time by over 100 msec by
reducing the usleep ranges suring system startup.
Artem updates the igb driver register dump in ethtool, first prepares
the register dump for future additions of registers in the dump, then
secondly, adds the RR2DCDELAY register to the dump. When dealing with
time-sensitive networks, this register is helpful in determining your
latency from the device to the ring.
Alex fixes the ixgbevf driver to use the current cached link state,
rather than trying to re-check the value from the PF.
Harshitha adds support for MACVLAN offloads in i40e by using channels as
MACVLAN interfaces.
Detlev Casanova updates the e1000e driver to use delayed work instead of
timers to run the watchdog.
Vitaly fixes an issue in e1000e, where when disconnecting and
reconnecting the physical cable connection, the NIC enters a DMoff
state. This state causes a mismatch in link and duplexing, so check the
PCIm function state and perform a PHY reset when in this state to
resolve the issue.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Some firmware versions do not support this so use the silent variant
to send the message to firmware to suppress the harmless error. This
error message is unnecessarily alarming the user.
Fixes: afdc8a8484 ("bnxt_en: Add DCBNL DSCP application protocol support.")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In an earlier commit to improve NQ reservations on 57500 chips, we
set the resv_irqs on the 57500 VFs to the fixed value assigned by
the PF regardless of how many are actually used. The current
code assumes that resv_irqs minus the ones used by the network driver
must be the ones for the RDMA driver. This is no longer true and
we may return more MSIX vectors than requested, causing inconsistency.
Fix it by capping the value.
Fixes: 01989c6b69 ("bnxt_en: Improve NQ reservations.")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The current logic assumes that the RDMA driver uses one statistics
context adjacent to the ones used by the network driver. This
assumption is not true and the statistics context used by the
RDMA driver is tied to its MSIX base vector. This wrong assumption
can cause RDMA driver failure after changing ethtool rings on the
network side. Fix the statistics reservation logic accordingly.
Fixes: 780baad44f ("bnxt_en: Reserve 1 stat_ctx for RDMA driver.")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
After ethtool loopback packet tests, we re-open the nic for the next
IRQ test. If the open fails, we must not proceed with the IRQ test
or we will crash with NULL pointer dereference. Fix it by checking
the bnxt_open_nic() return code before proceeding.
Reported-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Fixes: 67fea463fd ("bnxt_en: Add interrupt test to ethtool -t selftest.")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some chips with older firmware can continue to perform DMA read from
context memory even after the memory has been freed. In the PCI shutdown
method, we need to call pci_disable_device() to shutdown DMA to prevent
this DMA before we put the device into D3hot. DMA memory request in
D3hot state will generate PCI fatal error. Similarly, in the driver
remove method, the context memory should only be freed after DMA has
been shutdown for correctness.
Fixes: 98f04cf0f1 ("bnxt_en: Check context memory requirements from firmware.")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patches intended for v5.3
* Work on the new debugging framework continues;
* Update the FW API for CSI;
* Special SAR implementation for South Korea;
* Fixes in the module init error paths;
* Debugging infra work continues;
* A bunch of RF-kill fixes by Emmanuel;
* A fix for AP mode, also related to RF-kill, by Johannes.
* A few clean-ups;
* Other small fixes and improvements;
mt76 patches for 5.3
* use NAPI polling for tx cleanup on mt7603/mt7615
* various fixes for mt7615
* unify some code between mt7603 and mt7615
* fix locking issues on mt76x02
* add support for toggling edcca on mt7603
* fix reading target tx power with ext PA on mt7603/mt7615
* fix initalizing channel maximum power
* fix rate control / tx status reporting issues on mt76x02/mt7603
* add support for eeprom calibration data from mtd on mt7615
* support configuring tx power on mt7615
* fix external PA support on mt76x0
* per-chain signal reporting on mt7615
* rx/tx buffer fixes for USB devices
DMA_API_HOWTO.txt includes an example explaining when
dma_sync_single_for_device() is not needed, and that example matches
our use case. The buffer isn't changed by the CPU and direction is
DMA_FROM_DEVICE, so we can remove the call to
dma_sync_single_for_device().
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Documentation/DMA-API-HOWTO.txt states:
By default, the kernel assumes that your device can address 32-bits of
DMA addressing. For a 64-bit capable device, this needs to be increased,
and for a device with limitations, it needs to be decreased.
Therefore we don't need the 32 Bit DMA fallback configuration and can
remove it.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The VLAN tag is stored in the descriptor in network byte order.
Using swab16 works on little endian host systems only. Better play safe
and use ntohs or htons respectively.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a 1ms delay after reset deactivation. Otherwise the chip returns
bogus ID value. This is observed with 88E6390 (Peridot) chip.
Signed-off-by: Baruch Siach <baruch@tkos.co.il>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently bnx2x ptp worker tries to read a register with timestamp
information in case of TX packet timestamping and in case it fails,
the routine reschedules itself indefinitely. This was reported as a
kworker always at 100% of CPU usage, which was narrowed down to be
bnx2x ptp_task.
By following the ioctl handler, we could narrow down the problem to
an NTP tool (chrony) requesting HW timestamping from bnx2x NIC with
RX filter zeroed; this isn't reproducible for example with ptp4l
(from linuxptp) since this tool requests a supported RX filter.
It seems NIC FW timestamp mechanism cannot work well with
RX_FILTER_NONE - driver's PTP filter init routine skips a register
write to the adapter if there's not a supported filter request.
This patch addresses the problem of bnx2x ptp thread's everlasting
reschedule by retrying the register read 10 times; between the read
attempts the thread sleeps for an increasing amount of time starting
in 1ms to give FW some time to perform the timestamping. If it still
fails after all retries, we bail out in order to prevent an unbound
resource consumption from bnx2x.
The patch also adds an ethtool statistic for accounting the skipped
TX timestamp packets and it reduces the priority of timestamping
error messages to prevent log flooding. The code was tested using
both linuxptp and chrony.
Reported-and-tested-by: Przemyslaw Hausman <przemyslaw.hausman@canonical.com>
Suggested-by: Sudarsana Reddy Kalluru <skalluru@marvell.com>
Signed-off-by: Guilherme G. Piccoli <gpiccoli@canonical.com>
Acked-by: Sudarsana Reddy Kalluru <skalluru@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The subns increment register has 24 bits as follows:
RegBit[15:0] = Subns[23:8]; RegBit[31:24] = Subns[7:0]
Fix the same in the driver and increase sub ns resolution to the
best capable, 24 bits. This should be the case on all GEM versions
that this PTP driver supports.
Signed-off-by: Harini Katakam <harini.katakam@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The scaled ppm parameter passed to _adjfine() contains a 16 bit
fraction. This just happens to be the same as SUBNSINCR_SIZE now.
Hence define this separately.
Signed-off-by: Harini Katakam <harini.katakam@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds vlan offload support for the HINIC driver.
Signed-off-by: Xue Chaojing <xuechaojing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
iwl_mvm_send_cmd returns 0 when the command won't be sent
because RF-Kill is asserted. Do the same when we call
iwl_get_shared_mem_conf since it is not sent through
iwl_mvm_send_cmd but directly calls the transport layer.
Cc: stable@vger.kernel.org
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Sometimes the register status can include interrupts that
were masked. We can, for example, get the RF-Kill bit set
in the interrupt status register although this interrupt
was masked. Then if we get the ALIVE interrupt (for example)
that was not masked, we need to *not* service the RF-Kill
interrupt.
Fix this in the MSI-X interrupt handler.
Cc: stable@vger.kernel.org
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Newest devices have a new firmware load mechanism. This
mechanism is called the context info. It means that the
driver doesn't need to load the sections of the firmware.
The driver rather prepares a place in DRAM, with pointers
to the relevant sections of the firmware, and the firmware
loads itself.
At the end of the process, the firmware sends the ALIVE
interrupt. This is different from the previous scheme in
which the driver expected the FH_TX interrupt after each
section being transferred over the DMA.
In order to support this new flow, we enabled all the
interrupts. This broke the assumption that we have in the
code that the RF-Kill interrupt can't interrupt the firmware
load flow.
Change the context info flow to enable only the ALIVE
interrupt, and re-enable all the other interrupts only
after the firmware is alive. Then, we won't see the RF-Kill
interrupt until then. Getting the RF-Kill interrupt while
loading the firmware made us kill the firmware while it is
loading and we ended up dumping garbage instead of the firmware
state.
Re-enable the ALIVE | RX interrupts from the ISR when we
get the ALIVE interrupt to be able to get the RX interrupt
that comes immediately afterwards for the ALIVE
notification. This is needed for non MSI-X only.
Cc: stable@vger.kernel.org
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
We added code to restock the buffer upon ALIVE interrupt
when MSI-X is disabled. This was added as part of the context
info code. This code was added only if the ISR debug level
is set which is very unlikely to be related.
Move this code to run even when the ISR debug level is not
set.
Note that gen2 devices work with MSI-X in most cases so that
this path is seldom used.
Cc: stable@vger.kernel.org
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
In AP (and IBSS) mode, we can only set GTKs to firmware after we have
sent down the multicast station, but this we can only do after we've
enabled beaconing, etc.
However, during rfkill exit, hostapd will configure the keys before
starting the AP, and cfg80211/mac80211 accept it happily.
On earlier devices, this didn't bother us as GTK TX wasn't really
handled in firmware, we just put the key material into the TX cmd
and thus it only mattered when we actually transmitted a frame.
On newer devices, however, the firmware needs to track all of this
and that doesn't work if we add the key before the (multicast) sta
it belongs to.
To fix this, keep a list of keys to add during AP enable, and call
the function there.
Cc: stable@vger.kernel.org
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
The FW API was clarified saying that this flag should only be set in
BSS client mode. Remove it from the MAC_CTXT command we send in AP
and GO modes.
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Fixes: 3b5ee8dd8b ("iwlwifi: mvm: set MAC_FILTER_IN_11AX in AP mode")
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
The 0xF6 command used to start and stop the recording from 22560 devices
was removed. This is causing an assert when the driver tries to alter
the recording state.
Remove the use of the command.
Signed-off-by: Shahar S Matityahu <shahar.s.matityahu@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
From 9000 device family the FW automatically stops the debug
recording and the driver should not stop it as well.
Signed-off-by: Shahar S Matityahu <shahar.s.matityahu@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
In ini debug mode the recording does not restart unless legacy monitor
configuration is also given.
Add dbg_ini_dest field to trans to indicate the debug monitor
destination to solve this.
Signed-off-by: Shahar S Matityahu <shahar.s.matityahu@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
TWT is still very new and we expect issues. Make its usage
configurable and disable it by default.
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Unite iwl_trans debug related fields under iwl_trans_debug struct to
increase readability and keep iwl_trans clean.
Signed-off-by: Shahar S Matityahu <shahar.s.matityahu@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
There are several flows where the driver checks if it runs in ini mode.
Some of these flows are no longer used in ini mode or there is another
condition that check the ini mode in the same flow. Either way, those
conditions are redundant. Remove the redundant conditions.
Signed-off-by: Shahar S Matityahu <shahar.s.matityahu@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
The driver should delay only in recording stop flow between writing to
DBGC_IN_SAMPLE register and DBGC_OUT_CTRL register. Any other delay is
not needed.
Change the following:
1. Remove any unnecessary delays in the flow
2. Increase the delay in the stop recording flow since 100 micro is
not enough
3. Use usleep_range instead of delay since the driver is allowed to
sleep in this flow.
Signed-off-by: Shahar S Matityahu <shahar.s.matityahu@intel.com>
Fixes: 5cfe79c8d9 ("iwlwifi: fw: stop and start debugging using host command")
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Currently we dump only the first 64 bytes of the PCI config space,
which leaves out some important things, such as the base address
registers.
Increase it to 352 for the PCI device and to 524 for the rootport to
make sure we include everything we need.
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
In some buggy scenarios we could possible attempt to transmit frames larger
than maximum MSDU size. Since our devices don't know how to handle this,
it may result in asserts, hangs etc.
This can happen, for example, when we receive a large multicast frame
and try to transmit it back to the air in AP mode.
Since in a legal scenario this should never happen, drop such frames and
warn about it.
Signed-off-by: Andrei Otcheretianski <andrei.otcheretianski@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
South Korea is adding a more strict SAR limit called "Limb SAR".
Currently, WGDS SAR offset group 3 is not used (not mapped to any country).
In order to be able to comply with South Korea new restriction:
- OEM will use WGDS SAR offset group 3 to South Korea limitation.
- OEM will change WGDS revision to 1 (currently latest revision is 0)
to notify that Korea Limb SAR applied.
- Driver will read the WGDS table and pass the values to FW (as usual)
- Driver will pass to FW an indication that Korea Limb SAR is applied
in case table revision is 1.
Signed-off-by: Haim Dreyfuss <haim.dreyfuss@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
When the module fails to initialize for some reason, it
doesn't clean up properly. Fix that.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>