935158 Commits

Author SHA1 Message Date
Petr Machata
faad0525c0 mlxsw: core_acl_flex_actions: Add L4_PORT_ACTION
Add fields related to L4_PORT_ACTION, which is used for changing TCP and
UDP port numbers.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-22 16:32:11 -07:00
Petr Machata
3cc9a15a0b mlxsw: spectrum: Split handling of pedit mangle by chip type
Certain ACL actions are only available on some Spectrum revisions. In
particular, L4_PORT_ACTION is not available on Spectrum-1. Introduce a
new ops struct intended to hold these differences, mlxsw_sp_rulei_ops.
Prime it with a sole member, act_mangle_field, meant for handling of
pedit mangles.

Create two ops structures, one for Spectrum-1, the other for Spectrum-2
and above. Add callbacks for act_mangle_field and dispatch to the common
handler.

Invoke mlxsw_sp_rulei_ops.act_mangle_field from the field mangler
instead of calling the common handler directly.
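
The per-chip dispatch described above can be sketched in plain C as follows. This is a minimal illustration, not the driver's code: every `demo_*` name, the field enum, and the choice of -EINVAL for unsupported fields are assumptions made for the example. In the commit itself both callbacks still forward to the common handler; the sketch additionally assumes that the newer chip accepts an L4 port field, to show where such a difference would live.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical field identifiers, not the driver's real enum. */
enum demo_field {
	DEMO_FIELD_DSCP,
	DEMO_FIELD_L4_PORT,
};

/* Per-chip operations with a single callback, as in the commit above. */
struct demo_rulei_ops {
	int (*act_mangle_field)(enum demo_field field);
};

/* Common handler for fields that every chip revision can mangle. */
int demo_mangle_common(enum demo_field field)
{
	return field == DEMO_FIELD_DSCP ? 0 : -EINVAL;
}

/* Spectrum-1 style callback: only the common fields. */
int demo_sp1_mangle(enum demo_field field)
{
	return demo_mangle_common(field);
}

/* Spectrum-2 style callback: assumed extra support for L4 ports. */
int demo_sp2_mangle(enum demo_field field)
{
	if (field == DEMO_FIELD_L4_PORT)
		return 0;
	return demo_mangle_common(field);
}

const struct demo_rulei_ops demo_sp1_ops = {
	.act_mangle_field = demo_sp1_mangle,
};

const struct demo_rulei_ops demo_sp2_ops = {
	.act_mangle_field = demo_sp2_mangle,
};

/* The field mangler calls through the ops, not the common handler. */
int demo_mangle(const struct demo_rulei_ops *ops, enum demo_field field)
{
	return ops->act_mangle_field(field);
}
```

Calling `demo_mangle(&demo_sp1_ops, DEMO_FIELD_L4_PORT)` is rejected while the same call with `demo_sp2_ops` succeeds, so chip differences stay inside the ops tables and the caller needs no chip checks.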

Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-22 16:32:11 -07:00
Ido Schimmel
f3fe412b0a mlxsw: spectrum: Do not rely on machine endianness
The second commit cited below performed a cast of 'u32 buffsize' to
'(u16 *)' when calling mlxsw_sp_port_headroom_8x_adjust():

mlxsw_sp_port_headroom_8x_adjust(mlxsw_sp_port, (u16 *) &buffsize);

Colin noted that this will behave differently on big endian
architectures compared to little endian architectures.

Fix this by following Colin's suggestion and having the function accept
and return 'u32' instead of passing the current size by reference.
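
The endianness problem can be reproduced outside the kernel with a small sketch. All function names below are hypothetical, and the doubling in `demo_headroom_adjust()` is a placeholder rather than the driver's real adjustment. `memcpy()` stands in for the `(u16 *)` cast so that the demo reads the same two bytes without relying on pointer aliasing.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 on a little-endian machine, 0 on a big-endian one. */
int demo_is_little_endian(void)
{
	const uint32_t probe = 1;
	uint8_t first_byte;

	memcpy(&first_byte, &probe, sizeof(first_byte));
	return first_byte == 1;
}

/*
 * Problematic pattern: treat the first two bytes of a 32-bit value as a
 * 16-bit value. Which half of the value those bytes hold depends on the
 * machine's byte order.
 */
uint16_t demo_first_half(const uint32_t *val)
{
	uint16_t half;

	memcpy(&half, val, sizeof(half));
	return half;
}

/*
 * Fixed pattern: accept and return the full 32-bit value by value, so
 * the in-memory byte order never matters. The doubling is illustrative.
 */
uint32_t demo_headroom_adjust(uint32_t size)
{
	return size * 2;
}
```

With `buffsize = 0x1234`, `demo_first_half()` sees 0x1234 on a little-endian machine and 0 on a big-endian one, whereas `demo_headroom_adjust()` gives the same result on both.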

Fixes: da382875c6 ("mlxsw: spectrum: Extend to support Spectrum-3 ASIC")
Fixes: 60833d54d5 ("mlxsw: spectrum: Adjust headroom buffers for 8x ports")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reported-by: Colin Ian King <colin.king@canonical.com>
Suggested-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-22 16:29:51 -07:00
David S. Miller
73f782d523 Merge branch 'Add-Marvell-88E1340S-88E1548P-support'
Maxim Kochetkov says:

====================
Add Marvell 88E1340S, 88E1548P support

This patch series adds support for new PHY IDs.
Russell King asked to use a single style for referencing functions.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-22 16:28:34 -07:00
Maxim Kochetkov
f59babf95e net: phy: marvell: Add Marvell 88E1548P support
Add support for this new phy ID.

Signed-off-by: Maxim Kochetkov <fido_max@inbox.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-22 16:28:34 -07:00
Maxim Kochetkov
a602ea86e9 net: phy: marvell: Add Marvell 88E1340S support
Add support for this new phy ID.

Signed-off-by: Maxim Kochetkov <fido_max@inbox.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-22 16:28:34 -07:00
Maxim Kochetkov
ef0f9545cb net: phy: marvell: use a single style for referencing functions
The kernel in general does not use &func referencing format.
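
For context, both spellings yield the same function pointer in C, so the change is purely stylistic. A minimal sketch, where the struct and function names are hypothetical and not taken from the Marvell driver:

```c
#include <assert.h>

/* Hypothetical callback, for illustration only. */
int demo_config_init(int phy_addr)
{
	return phy_addr + 1;
}

/* Hypothetical driver descriptor with a single callback slot. */
struct demo_phy_driver {
	int (*config_init)(int phy_addr);
};

/* Preferred style: the bare function name converts to a pointer. */
const struct demo_phy_driver demo_plain = {
	.config_init = demo_config_init,
};

/* Equivalent but discouraged style: explicit address-of operator. */
const struct demo_phy_driver demo_ampersand = {
	.config_init = &demo_config_init,
};

/* Both descriptors end up holding the very same pointer. */
void demo_check_styles(void)
{
	assert(demo_plain.config_init == demo_ampersand.config_init);
	assert(demo_plain.config_init(4) == 5);
}
```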

Signed-off-by: Maxim Kochetkov <fido_max@inbox.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-22 16:28:34 -07:00
David S. Miller
3b87cfefab Merge branch 'r8169-mark-device-as-detached-in-PCI-D3-and-improve-locking'
Heiner Kallweit says:

====================
r8169: mark device as detached in PCI D3 and improve locking

Mark the netdevice as detached whenever parent is in PCI D3hot and not
accessible. This mainly applies to runtime-suspend state.
In addition, take the RTNL lock in suspend calls. This allows removing
the driver-specific mutex and improving PM callbacks in general.
====================

Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-22 16:26:02 -07:00
Heiner Kallweit
288302dab3 r8169: improve rtl8169_runtime_resume
Simplify rtl8169_runtime_resume() by calling rtl8169_resume().

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-22 16:26:02 -07:00
Heiner Kallweit
06a14ab852 r8169: remove driver-specific mutex
Now that the critical sections are protected with RTNL lock, we don't
need a separate mutex any longer.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-22 16:26:02 -07:00
Heiner Kallweit
abe5fc42f9 r8169: use RTNL to protect critical sections
Most relevant ops (open, close, ethtool ops) are protected with RTNL
lock by net core. Make sure that such ops can't be interrupted by
e.g. (runtime-)suspending by taking the RTNL lock in suspend ops
and the PCI error handler.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-22 16:26:02 -07:00
Heiner Kallweit
567ca57faa r8169: add rtl8169_up
Factor out bringing device up to a new function rtl8169_up(), similar
to rtl8169_down() for bringing the device down.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-22 16:26:02 -07:00
Heiner Kallweit
ec2f204bdd r8169: remove no longer needed checks for device being runtime-active
Because the netdevice is now marked as detached when the parent is not
accessible, we can remove quite a few checks.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-22 16:26:01 -07:00
Heiner Kallweit
476c4f5de3 r8169: mark device as not present when in PCI D3
Mark the netdevice as detached whenever we go into PCI D3hot.
This allows removing some checks, e.g. from ethtool ops, because
dev_ethtool() checks for netif_device_present() at the beginning.

In this context move waking up the queue out of rtl_reset_work()
because in cases where netif_device_attach() is called afterwards
the queue should be woken up by the latter function only.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-22 16:26:01 -07:00
Heiner Kallweit
bd869245a3 net: core: try to runtime-resume detached device in __dev_open
A netdevice may be marked as detached because the parent is
runtime-suspended and not accessible whilst interface or link is down.
Examples are PCI network devices that go into PCI D3hot, see e.g.
__igc_shutdown() or rtl8169_net_suspend().
If the netdevice is down and marked as detached, we can only open it if
we runtime-resume it before __dev_open() calls netif_device_present().

Therefore, if netdevice is detached, try to runtime-resume the parent
and only return with an error if it's still detached.
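
The control flow described above can be modelled in user space roughly as follows. This is a sketch under stated assumptions: the `demo_*` types and helpers are hypothetical stand-ins for the netdevice, netif_device_present() and the runtime-PM resume call, and -ENODEV is assumed as the error for a device that stays detached.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of a netdevice and its parent's PM state. */
struct demo_netdev {
	bool present;		/* false while marked as detached */
	bool parent_resumable;	/* whether a runtime-resume would succeed */
};

bool demo_device_present(const struct demo_netdev *dev)
{
	return dev->present;
}

/* Stand-in for runtime-resuming the parent; re-attaches on success. */
void demo_runtime_resume_parent(struct demo_netdev *dev)
{
	if (dev->parent_resumable)
		dev->present = true;
}

/* Open path: try a resume first, fail only if still detached. */
int demo_open(struct demo_netdev *dev)
{
	if (!demo_device_present(dev)) {
		demo_runtime_resume_parent(dev);
		if (!demo_device_present(dev))
			return -ENODEV;
	}
	return 0;
}
```

A device whose parent was merely runtime-suspended opens successfully, while a device that is really gone still fails.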

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-22 16:26:01 -07:00
David S. Miller
8878adba6a Merge branch 'prepare-dwmac-meson8b-for-G12A-specific-initialization'
Martin Blumenstingl says:

====================
prepare dwmac-meson8b for G12A specific initialization

Some users are reporting that RGMII (and sometimes also RMII) Ethernet
is not working for them on G12A/G12B/SM1 boards. Upon closer inspection
of the vendor code for these SoCs new register bits are found.

It's not clear yet how these registers work. Add a new compatible string
as the first preparation step to improve Ethernet support on these SoCs.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-22 16:24:10 -07:00
Martin Blumenstingl
a4f63342d0 net: stmmac: dwmac-meson8b: add a compatible string for G12A SoCs
Amlogic Meson G12A, G12B and SM1 have the same (at least as far as we
know at the time of writing) PRG_ETHERNET glue register implementation.
This implementation however is slightly different from AXG as it now has
an undocumented "auto cali idx val" register in PRG_ETH1[17:16] which
seems to be related to RGMII Ethernet.

Add a new compatible string for G12A SoCs so the logic for this new
register can be implemented in the future.

Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-22 16:24:10 -07:00
Martin Blumenstingl
3efdb92426 dt-bindings: net: dwmac-meson: Add a compatible string for G12A onwards
Amlogic Meson G12A, G12B and SM1 have the same (at least as far as we
know at the time of writing) PRG_ETHERNET glue register implementation.
This implementation however is slightly different from AXG as it now has
an undocumented "auto cali idx val" register in PRG_ETH1[17:16] which
seems to be related to RGMII Ethernet.

Add a compatible string for G12A and newer so the new registers can be
used.

Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-22 16:24:10 -07:00
David S. Miller
58d844e860 Merge branch 'devlink-Add-board-serial_number-field-to-info_get-cb'
Vasundhara Volam says:

====================
devlink: Add board.serial_number field to info_get cb.

This patchset adds support for board.serial_number to devlink info_get
cb and also uses it in the bnxt_en driver.

Sample output:

$ devlink dev info pci/0000:af:00.1
pci/0000:af:00.1:
  driver bnxt_en
  serial_number 00-10-18-FF-FE-AD-1A-00
  board.serial_number 433551F+172300000
  versions:
      fixed:
        board.id 7339763 Rev 0.
        asic.id 16D7
        asic.rev 1
      running:
        fw 216.1.216.0
        fw.psid 0.0.0
        fw.mgmt 216.1.192.0
        fw.mgmt.api 1.10.1
        fw.ncsi 0.0.0.0
        fw.roce 216.1.16.0

v2:
- Modify board_serial_number to board.serial_number for maintaining
consistency.
- Combine 2 lines in second patchset as column limit is 100 now
====================

Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-22 16:15:43 -07:00
Vasundhara Volam
9bf88b9fc8 bnxt_en: Add board.serial_number field to info_get cb
Add board.serial_number field info to info_get cb via devlink,
if driver can fetch the information from the device.

Cc: Jiri Pirko <jiri@mellanox.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-22 16:15:05 -07:00
Vasundhara Volam
b5872cd0e8 devlink: Add support for board.serial_number to info_get cb.
Board serial number is a serial number, often available in PCI
*Vital Product Data*.

Also, update devlink-info.rst documentation file.

Cc: Jiri Pirko <jiri@mellanox.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-22 16:15:04 -07:00
Dejin Zheng
6d61f483f1 net: phy: smsc: fix printing too many logs
Commit 7ae7ad2f11 ("net: phy: smsc: use phy_read_poll_timeout()
to simplify the code") prints many log messages like the following when the
Ethernet cable is not connected:

[    4.473105] SMSC LAN8710/LAN8720 2188000.ethernet-1:00: lan87xx_read_status failed: -110

When waiting 640 ms to check the ENERGYON bit, the timeout should not be
regarded as an actual error and an error message should not be
printed. Due to a hardware bug in LAN87XX devices, detection of a
plugged-in Ethernet cable is unstable when the LAN87xx is in Energy Detect
Power-Down (EDPD) mode. The workaround involves, when the link is down,
at each read_status() call:

- disable EDPD mode, forcing the PHY out of low-power mode
- wait 640ms to see if we have any energy detected from the media
- re-enable entry to EDPD mode

This is presumably enough to allow the PHY to notice that a cable is
connected, and resume normal operations to negotiate with the partner.
The problem is that when no media is detected, the 640ms wait times
out and, since the cited commit, an error message is printed. Converting
the code to phy_read_poll_timeout() was inappropriate and introduced
this bug. Fix this issue by using read_poll_timeout() instead of
phy_read_poll_timeout().
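
The distinction between a genuine read failure and an expected timeout can be sketched as follows. This is not the driver's code and does not use the kernel's polling macros: the `demo_*` names, the callback type, and the poll count of 64 are assumptions chosen for illustration. The -110 in the log line above corresponds to -ETIMEDOUT on Linux.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical register read: returns a value >= 0 or a negative errno. */
typedef int (*demo_read_fn)(void *ctx);

/* Poll until a bit in mask is set, a read fails, or the polls run out. */
int demo_poll_bit(demo_read_fn read_reg, void *ctx, int mask, int max_polls)
{
	for (int i = 0; i < max_polls; i++) {
		int val = read_reg(ctx);

		if (val < 0)
			return val;	/* genuine read failure */
		if (val & mask)
			return 0;	/* bit observed */
	}
	return -ETIMEDOUT;
}

/* Workaround step: a timeout only means no energy was detected. */
int demo_wait_for_energy(demo_read_fn read_reg, void *ctx, int mask)
{
	int rc = demo_poll_bit(read_reg, ctx, mask, 64);

	if (rc == -ETIMEDOUT)
		return 0;	/* expected with no cable, stay silent */
	return rc;		/* 0 on success, or a real error */
}

/* Test doubles standing in for the PHY register. */
int demo_read_never_set(void *ctx)
{
	(void)ctx;
	return 0;
}

int demo_read_fails(void *ctx)
{
	(void)ctx;
	return -EIO;
}

int demo_read_countdown(void *ctx)
{
	int *polls_left = ctx;

	return (*polls_left)-- > 0 ? 0 : 0x2;
}
```

Only the caller knows that a timeout is a normal outcome here, which is why a generic helper that logs every timeout is a poor fit for this workaround.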

Fixes: 7ae7ad2f11 ("net: phy: smsc: use phy_read_poll_timeout() to simplify the code")
Reported-by: Kevin Groeneveld <kgroeneveld@gmail.com>
Signed-off-by: Dejin Zheng <zhengdejin5@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-22 16:08:48 -07:00
David S. Miller
406fcb5bae Merge branch 'Cosmetic-cleanup-in-SJA1105-DSA-driver'
Vladimir Oltean says:

====================
Cosmetic cleanup in SJA1105 DSA driver

This removes the sparse warnings from the sja1105 driver and makes some
structures constant.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-22 16:01:29 -07:00
Vladimir Oltean
13c832a41d net: dsa: sja1105: make the instantiations of struct sja1105_info constant
Since struct sja1105_private only holds a const pointer to one of these
structures based on device tree compatible string, the structures
themselves can be made const.

Also add an empty line between each structure definition, to appease
checkpatch.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-22 16:01:29 -07:00
Vladimir Oltean
718e44b6ea net: dsa: sja1105: make config table operation structures constant
The per-chip instantiations of struct sja1105_table_ops and struct
sja1105_dynamic_table_ops can be made constant, so do that.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-22 16:01:29 -07:00
Vladimir Oltean
be3fb56d6a net: dsa: sja1105: remove empty structures from config table ops
Sparse is complaining and giving the following warning message:
'Using plain integer as NULL pointer'.

This is not what's going on, instead {0} is used as a zero initializer
for the structure members, to indicate that the particular chip revision
does not support those particular config tables.

But since the config tables are declared globally, the unpopulated
elements are zero-initialized anyway. So, to make sparse shut up, let's
remove the zero initializers.
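
A small standalone sketch of the language rule relied on here: elements of an array that are left out of a designated initializer list are zero-initialized. The table identifiers, struct layout, and sizes below are hypothetical and not taken from the sja1105 driver.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical config table identifiers. */
enum demo_table_id {
	DEMO_TBL_L2_LOOKUP,
	DEMO_TBL_VLAN,
	DEMO_TBL_RETAGGING,
	DEMO_TBL_MAX,
};

/* Hypothetical per-table operations descriptor. */
struct demo_table_ops {
	size_t entry_size;
	size_t max_entries;
};

/*
 * DEMO_TBL_RETAGGING has no initializer at all. Its members are still
 * zero, so no explicit {0} entry is needed to mark it as unsupported.
 */
const struct demo_table_ops demo_ops[DEMO_TBL_MAX] = {
	[DEMO_TBL_L2_LOOKUP] = { .entry_size = 12, .max_entries = 1024 },
	[DEMO_TBL_VLAN] = { .entry_size = 8, .max_entries = 4096 },
};

/* A table counts as supported when it can hold at least one entry. */
int demo_table_supported(enum demo_table_id id)
{
	return demo_ops[id].max_entries != 0;
}
```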

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-22 16:01:28 -07:00
David S. Miller
717dd44c5b Merge branch 'net-dsa-qca8k-Improve-SGMII-interface-handling'
Jonathan McDowell says:

====================
net: dsa: qca8k: Improve SGMII interface handling

This 3 patch series migrates the qca8k switch driver over to PHYLINK,
and then adds the SGMII clean-ups (i.e. the missing initialisation) on
top of that as a second patch. The final patch is a simple spelling fix
in a comment.

As before, tested with a device where the CPU connection is RGMII (i.e.
the common current use case) + one where the CPU connection is SGMII. I
don't have any devices where the SGMII interface is brought out to
something other than the CPU.

v5:
- Move spelling fix to separate patch
- Use ds directly rather than ds->priv
v4:
- Enable pcs_poll so we keep phylink updated when doing in-band
  negotiation
- Explicitly check for PHY_INTERFACE_MODE_1000BASEX when setting SGMII
  port mode.
- Address Vladimir's review comments
v3:
- Move phylink changes to separate patch
- Address rmk review comments
v2:
- Switch to phylink
- Avoid need for device tree configuration options
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-22 15:54:34 -07:00
Jonathan McDowell
a997b33701 net: dsa: qca8k: Minor comment spelling fix
Signed-off-by: Jonathan McDowell <noodles@earth.li>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-22 15:54:34 -07:00
Jonathan McDowell
f6dadd5598 net: dsa: qca8k: Improve SGMII interface handling
This patch improves the handling of the SGMII interface on the QCA8K
devices. Previously the driver did no configuration of the port, even if
it was selected. We now configure it up in the appropriate
PHY/MAC/Base-X mode depending on what phylink tells us we are connected
to and ensure it is enabled.

Tested with a device where the CPU connection is RGMII (i.e. the common
current use case) + one where the CPU connection is SGMII. I don't have
any devices where the SGMII interface is brought out to something other
than the CPU.

Signed-off-by: Jonathan McDowell <noodles@earth.li>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-22 15:54:34 -07:00
Jonathan McDowell
b3591c2a36 net: dsa: qca8k: Switch to PHYLINK instead of PHYLIB
Update the driver to use the new PHYLINK callbacks, removing the
legacy adjust_link callback.

Signed-off-by: Jonathan McDowell <noodles@earth.li>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-22 15:54:33 -07:00
David S. Miller
2b3445e814 Merge branch 'bonding-initial-support-for-hardware-crypto-offload'
Jarod Wilson says:

====================
bonding: initial support for hardware crypto offload

This is an initial functional implementation for doing pass-through of
hardware encryption from bonding device to capable slaves, in active-backup
bond setups. This was developed and tested using ixgbe-driven Intel x520
interfaces with libreswan and a transport mode connection, primarily using
netperf, with assorted connection failures forced during transmission. The
failover works quite well in my testing, and overall performance is right
on par with offload when running on a bare interface, no bond involved.

Caveats: this is ONLY enabled for active-backup, because I'm not sure
how one would manage multiple offload handles for different devices all
running at the same time in the same xfrm, and it relies on some minor
changes to both the xfrm code and slave device driver code to get things
to behave, and I don't have immediate access to any other hardware that
could function similarly, but the NIC driver changes are minimal and
straight-forward enough that I've included what I think ought to be
enough for mlx5 devices too.

v2: reordered patches, switched (back) to using CONFIG_XFRM_OFFLOAD
to wrap the code additions and wrapped overlooked additions.
v3: rebase w/net-next open, add proper cc list to cover letter
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-22 15:38:57 -07:00
Jarod Wilson
18cb261afd bonding: support hardware encryption offload to slaves
Currently, this support is limited to active-backup mode, as I'm not sure
about the feasibility of mapping an xfrm_state's offload handle to
multiple hardware devices simultaneously, and we rely on being able to
pass some hints to both the xfrm and NIC driver about whether or not
they're operating on a slave device.

I've tested this atop an Intel x520 device (ixgbe) using libreswan in
transport mode, successfully achieving ~4.3Gbps throughput with netperf
(more or less identical to throughput on a bare NIC in this system),
as well as successful failover and recovery mid-netperf.

v2: just use CONFIG_XFRM_OFFLOAD for wrapping, isolate more code with it

CC: Jay Vosburgh <j.vosburgh@gmail.com>
CC: Veaceslav Falico <vfalico@gmail.com>
CC: Andy Gospodarek <andy@greyhouse.net>
CC: "David S. Miller" <davem@davemloft.net>
CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
CC: Jakub Kicinski <kuba@kernel.org>
CC: Steffen Klassert <steffen.klassert@secunet.com>
CC: Herbert Xu <herbert@gondor.apana.org.au>
CC: netdev@vger.kernel.org
CC: intel-wired-lan@lists.osuosl.org
Signed-off-by: Jarod Wilson <jarod@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-22 15:38:57 -07:00
Jarod Wilson
bf3a058de5 mlx5: become aware of when running as a bonding slave
I've been unable to get my hands on suitable supported hardware to date,
but I believe this ought to be all that is needed to enable the mlx5
driver to also work with bonding active-backup crypto offload passthru.

CC: Boris Pismenny <borisp@mellanox.com>
CC: Saeed Mahameed <saeedm@mellanox.com>
CC: Leon Romanovsky <leon@kernel.org>
CC: Jay Vosburgh <j.vosburgh@gmail.com>
CC: Veaceslav Falico <vfalico@gmail.com>
CC: Andy Gospodarek <andy@greyhouse.net>
CC: "David S. Miller" <davem@davemloft.net>
CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
CC: Jakub Kicinski <kuba@kernel.org>
CC: Steffen Klassert <steffen.klassert@secunet.com>
CC: Herbert Xu <herbert@gondor.apana.org.au>
CC: netdev@vger.kernel.org
Signed-off-by: Jarod Wilson <jarod@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-22 15:38:57 -07:00
Jarod Wilson
0dea9ea97e ixgbe_ipsec: become aware of when running as a bonding slave
Slave devices in a bond doing hardware encryption also need to be aware
that they're slaves, so we operate on the slave instead of the bonding
master to do the actual hardware encryption offload bits.

CC: Jay Vosburgh <j.vosburgh@gmail.com>
CC: Veaceslav Falico <vfalico@gmail.com>
CC: Andy Gospodarek <andy@greyhouse.net>
CC: "David S. Miller" <davem@davemloft.net>
CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
CC: Jakub Kicinski <kuba@kernel.org>
CC: Steffen Klassert <steffen.klassert@secunet.com>
CC: Herbert Xu <herbert@gondor.apana.org.au>
CC: netdev@vger.kernel.org
CC: intel-wired-lan@lists.osuosl.org
Acked-by: Jeff Kirsher <Jeffrey.t.kirsher@intel.com>
Signed-off-by: Jarod Wilson <jarod@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-22 15:38:57 -07:00
Jarod Wilson
272c2330ad xfrm: bail early on slave pass over skb
This is prep work for initial support of bonding hardware encryption
pass-through support. The bonding driver will fill in the slave_dev
pointer, and we use that to know not to skb_push() again on a given
skb that was already processed on the bond device.

CC: Jay Vosburgh <j.vosburgh@gmail.com>
CC: Veaceslav Falico <vfalico@gmail.com>
CC: Andy Gospodarek <andy@greyhouse.net>
CC: "David S. Miller" <davem@davemloft.net>
CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
CC: Jakub Kicinski <kuba@kernel.org>
CC: Steffen Klassert <steffen.klassert@secunet.com>
CC: Herbert Xu <herbert@gondor.apana.org.au>
CC: netdev@vger.kernel.org
CC: intel-wired-lan@lists.osuosl.org
Signed-off-by: Jarod Wilson <jarod@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-22 15:38:56 -07:00
David S. Miller
389cc2f326 Merge branch 'devlink-Support-get-set-mac-address-of-a-port-function'
Parav Pandit says:

====================
devlink: Support get,set mac address of a port function

Currently, ip link set dev <pfndev> vf <vf_num> <param> <value> has
the following limitations.

1. The command is limited to setting VF parameters only.
It cannot set the default MAC address for the PCI PF.

2. It can be used only on systems where the PCI SR-IOV capability exists.
In a smartnic based system, the eswitch of a NIC resides on a different
embedded cpu which has the VF and PF representors for the SR-IOV
functions of the host system into which this smartnic is plugged.

3. It cannot set up the function attributes of a sub-function, described
in detail in the comprehensive RFCs [1] and [2].

This series covers the first small part: letting the user query and set the
MAC address (hardware address) of a PCI PF/VF, which is represented by
the devlink port pcipf and pcivf port flavours respectively.

Whenever a devlink port manages a function connected to a devlink port,
it allows querying and setting its hardware address.

A driver implements the necessary get/set callback functions if it supports
a port function for a given port type.

Patch summary:
Patch-1 Prepares devlink port fill routines for extack
Patch-2 and 3 extend the devlink interface to get/set port function
attributes, mainly hardware address to start with.

Patch-2 Extends the port dump command to query port function hardware
address
Patch-3 Introduces a command to set the hardware address of a port
function

Patch-4 to 9 refactor and implement devlink callbacks in the mlx5_core
driver.
Patch-4 Constifies the mac address pointer in set routines
Patch-5 Introduces an eswitch check helper to use in devlink facing
callbacks
Patch-6 Moves port index, port number conversion routine to eswitch
header file
Patch-7 Implements port function query devlink callback
Patch-8 Refactors mac address setting routine to uniformly use
state_lock
Patch-9 Implements port function set devlink callback

[1] https://lore.kernel.org/netdev/20200519092258.GF4655@nanopsycho/
[2] https://marc.info/?l=linux-netdev&m=158555928517777&w=2
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-22 15:30:08 -07:00
Parav Pandit
330077d14d net/mlx5: E-switch, Supporting setting devlink port function mac address
Enable the user to set the MAC address of the PCI PF and VF port functions.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-22 15:29:19 -07:00
Parav Pandit
1094795ce4 net/mlx5: Split mac address setting function for using state_lock
Refactor the mac address setting function to let the caller hold the necessary
state_lock mutex, so that a subsequent patch can use this helper routine.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-22 15:29:19 -07:00
Parav Pandit
f099fde16d net/mlx5: E-switch, Support querying port function mac address
Support querying mac address of the eswitch devlink port function.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-22 15:29:19 -07:00
Parav Pandit
443bf36eb5 net/mlx5: Move helper to eswitch layer
To use port number to port index conversion at eswitch level, move it to
eswitch header.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-22 15:29:19 -07:00
Parav Pandit
bd93975353 net/mlx5: E-switch, Introduce and use eswitch support check helper
Introduce a helper routine to get esw from a devlink device and use it
at eswitch callbacks and in a subsequent patch.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-22 15:29:19 -07:00
Parav Pandit
fa997825eb net/mlx5: Constify mac address pointer
Since none of the functions need to modify the input mac address,
constify them.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-22 15:29:19 -07:00
Parav Pandit
a1e8ae907c net/devlink: Support setting hardware address of port function
PCI PF and VF devlink port can manage the function represented by a
devlink port.

Allow users to set port function's hardware address.

Example of a PCI VF port which supports a port function:
$ devlink port show pci/0000:06:00.0/2
pci/0000:06:00.0/2: type eth netdev enp6s0pf0vf1 flavour pcivf pfnum 0 vfnum 1
  function:
    hw_addr 00:00:00:00:00:00

$ devlink port function set pci/0000:06:00.0/2 hw_addr 00:11:22:33:44:55

$ devlink port show pci/0000:06:00.0/2
pci/0000:06:00.0/2: type eth netdev enp6s0pf0vf1 flavour pcivf pfnum 0 vfnum 1
  function:
    hw_addr 00:11:22:33:44:55

Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-22 15:29:19 -07:00
Parav Pandit
2a916ecc40 net/devlink: Support querying hardware address of port function
PCI PF and VF devlink port can manage the function represented by
a devlink port.

Enable users to query port function's hardware address.

Example of a PCI VF port which supports a port function:
$ devlink port show pci/0000:06:00.0/2
pci/0000:06:00.0/2: type eth netdev enp6s0pf0vf1 flavour pcivf pfnum 0 vfnum 1
  function:
    hw_addr 00:11:22:33:44:66

$ devlink port show pci/0000:06:00.0/2 -jp
{
    "port": {
        "pci/0000:06:00.0/2": {
            "type": "eth",
            "netdev": "enp6s0pf0vf1",
            "flavour": "pcivf",
            "pfnum": 0,
            "vfnum": 1,
            "function": {
                "hw_addr": "00:11:22:33:44:66"
            }
        }
    }
}

Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-22 15:29:19 -07:00
Parav Pandit
a829eb0d5d net/devlink: Prepare devlink port functions to fill extack
Prepare devlink port related functions to optionally fill up
the extack information, which will be used in a subsequent patch by port
function attribute(s).

Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-22 15:29:18 -07:00
Sean Christopherson
2dbebf7ae1 KVM: nVMX: Plumb L2 GPA through to PML emulation
Explicitly pass the L2 GPA to kvm_arch_write_log_dirty(), which for all
intents and purposes is vmx_write_pml_buffer(), instead of having the
latter pull the GPA from vmcs.GUEST_PHYSICAL_ADDRESS.  If the dirty bit
update is the result of KVM emulation (rare for L2), then the GPA in the
VMCS may be stale and/or hold a completely unrelated GPA.

Fixes: c5f983f6e8 ("nVMX: Implement emulated Page Modification Logging")
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200622215832.22090-2-sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-06-22 18:23:03 -04:00
Andrii Nakryiko
1bdb6c9a1c libbpf: Add a bunch of attribute getters/setters for map definitions
Add a bunch of getters for various aspects of a BPF map. Some of these attributes
(e.g., key_size, value_size, type, etc) are available right now in struct
bpf_map_def, but this patch adds getters allowing to fetch them individually.
The bpf_map_def approach isn't very scalable when ABI stability requirements are
taken into account. It's much easier to extend libbpf and add support for new
features when each aspect of a BPF map has a separate getter/setter.

Getters follow the common naming convention of not explicitly having "get" in
their names: bpf_map__type() returns map type, bpf_map__key_size() returns
key_size. Setters, though, explicitly have "set" in their names:
bpf_map__set_type(), bpf_map__set_key_size().

This patch ensures we now have a getter and a setter for the following
map attributes:
  - type;
  - max_entries;
  - map_flags;
  - numa_node;
  - key_size;
  - value_size;
  - ifindex.

bpf_map__resize() enforces an unnecessary restriction of max_entries > 0. It is
unnecessary, because libbpf actually supports zero max_entries for some cases
(e.g., for PERF_EVENT_ARRAY map) and treats it specially during map creation
time. To allow setting max_entries=0, a new bpf_map__set_max_entries() setter is
added. bpf_map__resize()'s behavior is preserved for backwards compatibility
reasons.

A map ifindex getter is added as well. There is a setter already, but no
corresponding getter. Fix this asymmetry as well. bpf_map__set_ifindex()
itself is converted from a void function into an error-returning one, similar to
other setters. The only error returned right now is -EBUSY, if the BPF map is
already loaded and has a corresponding FD.

One attribute that so far could not be gotten, set, or even specified
declaratively is numa_node. This patch fixes this gap and adds both a
programmatic getter/setter and support for the numa_node field in
BTF-defined maps.
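
The getter/setter convention described above can be illustrated with a standalone sketch. It deliberately does not use libbpf: `struct demo_map` and the `demo_map__*` functions are hypothetical stand-ins that only mirror the naming scheme, the -EBUSY rule for already-loaded maps, and the fact that a zero max_entries is accepted.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical map object, not libbpf's struct bpf_map. */
struct demo_map {
	int fd;			/* -1 until the map has been loaded */
	unsigned int max_entries;
	unsigned int numa_node;
};

/* Getter: no "get" in the name, returns the attribute directly. */
unsigned int demo_map__max_entries(const struct demo_map *map)
{
	return map->max_entries;
}

/* Setter: "set" in the name, refuses changes once the map is loaded. */
int demo_map__set_max_entries(struct demo_map *map, unsigned int max_entries)
{
	if (map->fd >= 0)
		return -EBUSY;
	map->max_entries = max_entries;	/* zero is allowed here */
	return 0;
}

unsigned int demo_map__numa_node(const struct demo_map *map)
{
	return map->numa_node;
}

int demo_map__set_numa_node(struct demo_map *map, unsigned int numa_node)
{
	if (map->fd >= 0)
		return -EBUSY;
	map->numa_node = numa_node;
	return 0;
}
```

Keeping one accessor pair per attribute means a new attribute can be added later without changing the layout of a struct that callers depend on.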

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/bpf/20200621062112.3006313-1-andriin@fb.com
2020-06-23 00:01:32 +02:00
Andrii Nakryiko
4e15507fea libbpf: Forward-declare bpf_stats_type for systems with outdated UAPI headers
On systems that don't yet have the very latest linux/bpf.h header, enum
bpf_stats_type will be undefined, causing compilation warnings. Prevent this
by forward-declaring the enum.

Fixes: 0bee106716 ("libbpf: Add support for command BPF_ENABLE_STATS")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20200621031159.2279101-1-andriin@fb.com
2020-06-22 23:23:49 +02:00
Andrey Ignatov
b1b53d413f selftests/bpf: Test access to bpf map pointer
Add selftests to test access to map pointers from bpf program for all
map types except struct_ops (that one would need additional work).

verifier test focuses mostly on scenarios that must be rejected.

prog_tests test focuses on accessing multiple fields both scalar and a
nested struct from bpf program and verifies that those fields have
expected values.

Signed-off-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/139a6a17f8016491e39347849b951525335c6eb4.1592600985.git.rdna@fb.com
2020-06-22 22:22:59 +02:00
Andrey Ignatov
2872e9ac33 bpf: Set map_btf_{name, id} for all map types
Set map_btf_name and map_btf_id for all map types so that map fields can
be accessed by bpf programs.

Signed-off-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/a825f808f22af52b018dbe82f1c7d29dab5fc978.1592600985.git.rdna@fb.com
2020-06-22 22:22:58 +02:00