Commit Graph

82594 Commits

Author SHA1 Message Date
Shalom Toledo
1dc3c0a248 mlxsw: reg: 80 columns wrapping change
80 columns wrapping change in mlxsw_reg_ptys_eth_unpack function.

Signed-off-by: Shalom Toledo <shalomt@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-23 13:54:36 -08:00
Shalom Toledo
e6f66f50bf mlxsw: reg: Rename p_eth_proto_adm to full name p_eth_proto_admin
Rename p_eth_proto_adm to p_eth_proto_admin in mlxsw_reg_ptys_eth_unpack
function.

Signed-off-by: Shalom Toledo <shalomt@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-23 13:54:36 -08:00
Shalom Toledo
c5b870df69 mlxsw: spectrum: Add port type-speed operations
Add port type-speed operations in order to have different operations for
different ASICs. For now, both ASICs use the same pointer.

Signed-off-by: Shalom Toledo <shalomt@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-23 13:54:36 -08:00
Shalom Toledo
88a4281200 mlxsw: spectrum: Rename port type-speed functions to ASIC specific
Rename port type-speed functions to be Spectrum-1 ASIC specific.

Signed-off-by: Shalom Toledo <shalomt@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-23 13:54:36 -08:00
Shalom Toledo
1e2f66eceb mlxsw: spectrum: Query port connector type from firmware
Instead of deriving the port connector type from port admin state, query it
from firmware.

Signed-off-by: Shalom Toledo <shalomt@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-23 13:54:36 -08:00
Shalom Toledo
475b33cb66 mlxsw: spectrum: Remove unsupported eth_proto_lp_advertise field in PTYS
Remove eth_proto_lp_advertise field in PTYS register since it is not
supported by the firmware.

Signed-off-by: Shalom Toledo <shalomt@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-23 13:54:36 -08:00
Shalom Toledo
1531be3197 mlxsw: spectrum: Remove duplicate port link mode entry
Remove duplicate port link mode entry from mlxsw_sp_port_link_mode.

Signed-off-by: Shalom Toledo <shalomt@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-23 13:54:35 -08:00
Mao Wenan
4593403fa5 net: set static variable an initial value in atl2_probe()
cards_found is a static variable, but it is reset to zero every time
atl2_probe() is entered, so its value is not carried over from the
previous probe and the subsequent behavior is not what we expect.

Signed-off-by: Mao Wenan <maowenan@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-23 13:47:13 -08:00
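
The difference described in the atl2_probe() commit above can be sketched in plain C. This is a minimal illustration, not the driver code: probe_resetting() and probe_persistent() are hypothetical names, and only the handling of the static counter is modeled.

```c
/* Pattern described as buggy: the static counter is reset on every call,
 * so it never reflects earlier probes. */
static int probe_resetting(void)
{
	static int cards_found;

	cards_found = 0;	/* executed on every call */
	return cards_found++;
}

/* Pattern after the fix (assumed shape): the initializer is applied once,
 * before the first call, and the value then persists across calls. */
static int probe_persistent(void)
{
	static int cards_found = 0;

	return cards_found++;
}
```

Calling probe_resetting() repeatedly always yields 0, while probe_persistent() yields 0, 1, 2, and so on.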
Florian Fainelli
abdf47aab4 veth: Fix -Wformat-truncation
Provide a precision hint to snprintf() in order to eliminate a
-Wformat-truncation warning provided below. A maximum of 11 characters
is allowed to reach a maximum of 32 - 1 characters given a possible
maximum value of queues using up to UINT_MAX which occupies 10
characters. Incidentally 11 is the number of characters for
"xdp_packets" which is the largest string we append.

drivers/net/veth.c: In function 'veth_get_strings':
drivers/net/veth.c:118:47: warning: '%s' directive output may be
truncated writing up to 31 bytes into a region of size between 12 and 21
[-Wformat-truncation=]
     snprintf(p, ETH_GSTRING_LEN, "rx_queue_%u_%s",
                                               ^~
drivers/net/veth.c:118:5: note: 'snprintf' output between 12 and 52
bytes into a destination of size 32
     snprintf(p, ETH_GSTRING_LEN, "rx_queue_%u_%s",
     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
       i, veth_rq_stats_desc[j].desc);
       ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-23 13:44:58 -08:00
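
The arithmetic in the veth commit above can be checked with a small userspace sketch. GSTRING_LEN is a stand-in for ETH_GSTRING_LEN (32), and format_rq_stat() is a hypothetical helper; the "%.11s" precision is an assumption about how the hint might be written, based on the 11-character limit the message describes.

```c
#include <stdio.h>

#define GSTRING_LEN 32	/* stand-in for ETH_GSTRING_LEN */

/* Format "rx_queue_<i>_<desc>", capping <desc> at 11 characters.
 * Worst case: 9 ("rx_queue_") + 10 (UINT_MAX digits) + 1 ("_") + 11
 * = 31 characters, which fits in 32 bytes with the terminating NUL. */
static int format_rq_stat(char *buf, unsigned int i, const char *desc)
{
	return snprintf(buf, GSTRING_LEN, "rx_queue_%u_%.11s", i, desc);
}
```

With the precision in place the compiler can bound the output at 31 characters, so the truncation warning no longer applies; longer descriptors are cut at 11 characters rather than overflowing the budget.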
Florian Fainelli
135e724547 e1000e: Fix -Wformat-truncation warnings
Provide precision hints to snprintf() since we know the destination
buffer size of the RX/TX ring names is IFNAMSIZ + 5 - 1. This fixes the
following warnings:

drivers/net/ethernet/intel/e1000e/netdev.c: In function
'e1000_request_msix':
drivers/net/ethernet/intel/e1000e/netdev.c:2109:13: warning: 'snprintf'
output may be truncated before the last format character
[-Wformat-truncation=]
     "%s-rx-0", netdev->name);
             ^
drivers/net/ethernet/intel/e1000e/netdev.c:2107:3: note: 'snprintf'
output between 6 and 21 bytes into a destination of size 20
   snprintf(adapter->rx_ring->name,
   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
     sizeof(adapter->rx_ring->name) - 1,
     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
     "%s-rx-0", netdev->name);
     ~~~~~~~~~~~~~~~~~~~~~~~~
drivers/net/ethernet/intel/e1000e/netdev.c:2125:13: warning: 'snprintf'
output may be truncated before the last format character
[-Wformat-truncation=]
     "%s-tx-0", netdev->name);
             ^
drivers/net/ethernet/intel/e1000e/netdev.c:2123:3: note: 'snprintf'
output between 6 and 21 bytes into a destination of size 20
   snprintf(adapter->tx_ring->name,
   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
     sizeof(adapter->tx_ring->name) - 1,
     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
     "%s-tx-0", netdev->name);
     ~~~~~~~~~~~~~~~~~~~~~~~~

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-23 13:44:57 -08:00
Florian Fainelli
3f8b86964e net: dsa: mv88e6xxx: Fix -Wformat-security warnings
We are not specifying an explicit format argument but instead passing a
non-literal string as the format, which causes these two warnings to
show up:

drivers/net/dsa/mv88e6xxx/chip.c: In function
'mv88e6xxx_irq_poll_setup':
drivers/net/dsa/mv88e6xxx/chip.c:483:2: warning: format not a string
literal and no format arguments [-Wformat-security]
  chip->kworker = kthread_create_worker(0, dev_name(chip->dev));
  ^~~~
drivers/net/dsa/mv88e6xxx/ptp.c: In function 'mv88e6xxx_ptp_setup':
drivers/net/dsa/mv88e6xxx/ptp.c:403:4: warning: format not a string
literal and no format arguments [-Wformat-security]
    dev_name(chip->dev));
    ^~~~~~~~
  LD [M]  drivers/net/dsa/mv88e6xxx/mv88e6xxx.o

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-23 13:44:57 -08:00
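
The pattern behind the -Wformat-security commit above can be sketched in userspace C. make_name() is a hypothetical printf-style helper standing in for a function such as kthread_create_worker(); the point is only that the device name is passed as data behind an explicit "%s" rather than as the format itself.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical printf-style helper (stand-in for a kernel API that takes
 * a format string followed by arguments). */
static int make_name(char *buf, size_t len, const char *fmt, ...)
{
	va_list ap;
	int n;

	va_start(ap, fmt);
	n = vsnprintf(buf, len, fmt, ap);
	va_end(ap);
	return n;
}

/* The name is treated as data, so any '%' it contains is copied verbatim
 * instead of being interpreted as a conversion specifier. */
static int name_from_device(char *buf, size_t len, const char *dev_name)
{
	return make_name(buf, len, "%s", dev_name);
}
```

Passing dev_name directly as fmt would compile, but a name containing '%' would then be parsed as a format, which is what the warning guards against.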
Florian Fainelli
ab2c4e2581 mlxsw: spectrum: Avoid -Wformat-truncation warnings
Give precision identifiers to the two snprintf() calls formatting the
priority and TC strings to avoid producing these two warnings:

drivers/net/ethernet/mellanox/mlxsw/spectrum.c: In function
'mlxsw_sp_port_get_prio_strings':
drivers/net/ethernet/mellanox/mlxsw/spectrum.c:2132:37: warning: '%d'
directive output may be truncated writing between 1 and 3 bytes into a
region of size between 0 and 31 [-Wformat-truncation=]
   snprintf(*p, ETH_GSTRING_LEN, "%s_%d",
                                     ^~
drivers/net/ethernet/mellanox/mlxsw/spectrum.c:2132:3: note: 'snprintf'
output between 3 and 36 bytes into a destination of size 32
   snprintf(*p, ETH_GSTRING_LEN, "%s_%d",
   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
     mlxsw_sp_port_hw_prio_stats[i].str, prio);
     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/net/ethernet/mellanox/mlxsw/spectrum.c: In function
'mlxsw_sp_port_get_tc_strings':
drivers/net/ethernet/mellanox/mlxsw/spectrum.c:2143:37: warning: '%d'
directive output may be truncated writing between 1 and 11 bytes into a
region of size between 0 and 31 [-Wformat-truncation=]
   snprintf(*p, ETH_GSTRING_LEN, "%s_%d",
                                     ^~
drivers/net/ethernet/mellanox/mlxsw/spectrum.c:2143:3: note: 'snprintf'
output between 3 and 44 bytes into a destination of size 32
   snprintf(*p, ETH_GSTRING_LEN, "%s_%d",
   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
     mlxsw_sp_port_hw_tc_stats[i].str, tc);
     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-23 13:44:57 -08:00
Maxime Chevallier
61a65d32fe net: phy: marvell10g: Fix Multi-G advertisement to only advertise 10G
Some Marvell Alaska PHYs support 2.5G, 5G and 10G BaseT links. Their
default behaviour is to advertise all of these modes, but at the moment,
only 10GBaseT is supported. To prevent link partners from establishing
a link at the unsupported 2.5G and 5G speeds, clear these modes upon
configuring aneg parameters.

Fixes: 20b2af32ff ("net: phy: add Marvell Alaska X 88X3310 10Gigabit PHY support")
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reported-by: Russell King <linux@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-23 13:27:51 -08:00
David S. Miller
ea34a00364 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says:

====================
pull-request: bpf 2019-02-23

The following pull-request contains BPF updates for your *net* tree.

The main changes are:

1) Fix a bug in BPF's LPM deletion logic to match correct prefix
   length, from Alban.

2) Fix AF_XDP teardown by not destroying umem prematurely as it
   is still needed till all outstanding skbs are freed, from Björn.

3) Fix unkillable BPF_PROG_TEST_RUN under preempt kernel by checking
   signal_pending() outside need_resched() condition which is never
   triggered there, from Stanislav.

4) Fix two nfp JIT bugs, one in code emission for K-based xor, and
   another one to explicitly clear upper bits in alu32, from Jiong.

5) Add bpf list address to maintainers file, from Daniel.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-22 20:45:38 -08:00
YueHaibing
6ff7b06053 mdio_bus: Fix use-after-free on device_register fails
KASAN has found a use-after-free in fixed_mdio_bus_init.
Commit 0c692d0784 ("drivers/net/phy/mdio_bus.c: call
put_device on device_register() failure") calls put_device()
when device_register() fails, giving up the last reference
to the device and allowing mdiobus_release to be executed,
which kfrees the bus. However, in most drivers mdiobus_free
is called to free the bus when mdiobus_register fails, so a
use-after-free occurs when the bus is accessed again. This
patch reverts that commit to let mdiobus_free free the bus.

KASAN report details as below:

BUG: KASAN: use-after-free in mdiobus_free+0x85/0x90 drivers/net/phy/mdio_bus.c:482
Read of size 4 at addr ffff8881dc824d78 by task syz-executor.0/3524

CPU: 1 PID: 3524 Comm: syz-executor.0 Not tainted 5.0.0-rc7+ #45
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0xfa/0x1ce lib/dump_stack.c:113
 print_address_description+0x65/0x270 mm/kasan/report.c:187
 kasan_report+0x149/0x18d mm/kasan/report.c:317
 mdiobus_free+0x85/0x90 drivers/net/phy/mdio_bus.c:482
 fixed_mdio_bus_init+0x283/0x1000 [fixed_phy]
 ? 0xffffffffc0e40000
 ? 0xffffffffc0e40000
 ? 0xffffffffc0e40000
 do_one_initcall+0xfa/0x5ca init/main.c:887
 do_init_module+0x204/0x5f6 kernel/module.c:3460
 load_module+0x66b2/0x8570 kernel/module.c:3808
 __do_sys_finit_module+0x238/0x2a0 kernel/module.c:3902
 do_syscall_64+0x147/0x600 arch/x86/entry/common.c:290
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x462e99
Code: f7 d8 64 89 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f6215c19c58 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
RAX: ffffffffffffffda RBX: 000000000073bf00 RCX: 0000000000462e99
RDX: 0000000000000000 RSI: 0000000020000080 RDI: 0000000000000003
RBP: 00007f6215c19c70 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00007f6215c1a6bc
R13: 00000000004bcefb R14: 00000000006f7030 R15: 0000000000000004

Allocated by task 3524:
 set_track mm/kasan/common.c:85 [inline]
 __kasan_kmalloc.constprop.3+0xa0/0xd0 mm/kasan/common.c:496
 kmalloc include/linux/slab.h:545 [inline]
 kzalloc include/linux/slab.h:740 [inline]
 mdiobus_alloc_size+0x54/0x1b0 drivers/net/phy/mdio_bus.c:143
 fixed_mdio_bus_init+0x163/0x1000 [fixed_phy]
 do_one_initcall+0xfa/0x5ca init/main.c:887
 do_init_module+0x204/0x5f6 kernel/module.c:3460
 load_module+0x66b2/0x8570 kernel/module.c:3808
 __do_sys_finit_module+0x238/0x2a0 kernel/module.c:3902
 do_syscall_64+0x147/0x600 arch/x86/entry/common.c:290
 entry_SYSCALL_64_after_hwframe+0x49/0xbe

Freed by task 3524:
 set_track mm/kasan/common.c:85 [inline]
 __kasan_slab_free+0x130/0x180 mm/kasan/common.c:458
 slab_free_hook mm/slub.c:1409 [inline]
 slab_free_freelist_hook mm/slub.c:1436 [inline]
 slab_free mm/slub.c:2986 [inline]
 kfree+0xe1/0x270 mm/slub.c:3938
 device_release+0x78/0x200 drivers/base/core.c:919
 kobject_cleanup lib/kobject.c:662 [inline]
 kobject_release lib/kobject.c:691 [inline]
 kref_put include/linux/kref.h:67 [inline]
 kobject_put+0x146/0x240 lib/kobject.c:708
 put_device+0x1c/0x30 drivers/base/core.c:2060
 __mdiobus_register+0x483/0x560 drivers/net/phy/mdio_bus.c:382
 fixed_mdio_bus_init+0x26b/0x1000 [fixed_phy]
 do_one_initcall+0xfa/0x5ca init/main.c:887
 do_init_module+0x204/0x5f6 kernel/module.c:3460
 load_module+0x66b2/0x8570 kernel/module.c:3808
 __do_sys_finit_module+0x238/0x2a0 kernel/module.c:3902
 do_syscall_64+0x147/0x600 arch/x86/entry/common.c:290
 entry_SYSCALL_64_after_hwframe+0x49/0xbe

The buggy address belongs to the object at ffff8881dc824c80
 which belongs to the cache kmalloc-2k of size 2048
The buggy address is located 248 bytes inside of
 2048-byte region [ffff8881dc824c80, ffff8881dc825480)
The buggy address belongs to the page:
page:ffffea0007720800 count:1 mapcount:0 mapping:ffff8881f6c02800 index:0x0 compound_mapcount: 0
flags: 0x2fffc0000010200(slab|head)
raw: 02fffc0000010200 0000000000000000 0000000500000001 ffff8881f6c02800
raw: 0000000000000000 00000000800f000f 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 ffff8881dc824c00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 ffff8881dc824c80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>ffff8881dc824d00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                                                                ^
 ffff8881dc824d80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff8881dc824e00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb

Fixes: 0c692d0784 ("drivers/net/phy/mdio_bus.c: call put_device on device_register() failure")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-22 15:34:07 -08:00
Vinod Koul
6d4cd041f0 net: phy: at803x: disable delay only for RGMII mode
Per "Documentation/devicetree/bindings/net/ethernet.txt" RGMII mode
should not have delay in PHY whereas RGMII_ID and RGMII_RXID/RGMII_TXID
can have delay in PHY.

So disable the delay only for RGMII mode and enable for other modes.
Also treat the default case as disabled delays.

Fixes: cd28d1d6e5: ("net: phy: at803x: Disable phy delay for RGMII mode")
Reported-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Reviewed-by: Niklas Cassel <niklas.cassel@linaro.org>
Tested-by: Peter Ujfalusi <peter.ujflausi@ti.com>
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-22 15:30:03 -08:00
Vinod Koul
43f2ebd557 net: phy: at803x: don't inline helpers
Some helpers were declared with the "inline" function specifier.
It is preferable to let the compiler pick the right optimizations,
so drop the specifier for at803x_disable_rx_delay() and
at803x_disable_tx_delay()

Reviewed-by: Niklas Cassel <niklas.cassel@linaro.org>
Tested-by: Peter Ujfalusi <peter.ujflausi@ti.com>
Reviewed-by: Marc Gonzalez <marc.w.gonzalez@free.fr>
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-22 15:30:03 -08:00
Michael Chan
0000b81a06 bnxt_en: Wait longer for the firmware message response to complete.
The code waits up to 20 usec for the firmware response to complete
once we've seen the valid response header in the buffer.  It turns
out that in some scenarios, this wait time is not long enough.
Extend it to 150 usec and use usleep_range() instead of udelay().

Fixes: 9751e8e714 ("bnxt_en: reduce timeout on initial HWRM calls")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-22 15:16:56 -08:00
Michael Chan
67681d02aa bnxt_en: Fix typo in firmware message timeout logic.
The logic that polls for the firmware message response uses a shorter
sleep interval for the first few passes.  But there was a typo so it
was using the wrong counter (larger counter) for these short sleep
passes.  The result is a slightly shorter timeout period for these
firmware messages than intended.  Fix it by using the proper counter.

Fixes: 9751e8e714 ("bnxt_en: reduce timeout on initial HWRM calls")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-22 15:16:56 -08:00
Jiong Wang
f036ebd9bf nfp: bpf: fix ALU32 high bits clearance bug
The NFP BPF JIT compiler performs a couple of small optimizations when
jitting ALU imm instructions. Some of these optimizations can skip
code-gen entirely, for example:

  A & -1 =  A
  A |  0 =  A
  A ^  0 =  A

However, for ALU32, the high 32 bits of the 64-bit register should still
be cleared according to ISA semantics.

Fixes: cd7df56ed3 ("nfp: add BPF to NFP code translator")
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-02-23 00:07:47 +01:00
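
The ALU32 issue described in the commit above can be illustrated with a small model of a 32-bit OR-immediate on a 64-bit register. This is a sketch of the semantics only, not the NFP JIT code; both function names are hypothetical.

```c
#include <stdint.h>

/* 32-bit ALU OR with an immediate: operate on the low 32 bits and
 * zero-extend the result into the 64-bit register. */
static uint64_t alu32_or_imm(uint64_t reg, uint32_t imm)
{
	return (uint64_t)((uint32_t)reg | imm);
}

/* Over-eager shortcut: treating "A | 0" as a no-op skips the
 * zero-extension and leaves bits 63..32 untouched. */
static uint64_t alu32_or_imm_skipped(uint64_t reg, uint32_t imm)
{
	if (imm == 0)
		return reg;
	return (uint64_t)((uint32_t)reg | imm);
}
```

For a register holding 0xdeadbeef00000001, the first variant yields 0x1 while the shortcut leaves the stale upper half in place, which is the kind of mismatch the fix addresses.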
Jiong Wang
71c190249f nfp: bpf: fix code-gen bug on BPF_ALU | BPF_XOR | BPF_K
The intended optimization should be A ^ 0 = A, not A ^ -1 = A.

Fixes: cd7df56ed3 ("nfp: add BPF to NFP code translator")
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-02-23 00:07:47 +01:00
Huy Nguyen
4b89251de0 net/mlx5: Support ndo bridge_setlink and getlink
Allow enabling VEPA mode on the HCA's port in legacy devlink mode.

Example:
bridge link set dev ens1f0 hwmode vepa
will turn on VEPA mode on the netdev ens1f0.

Signed-off-by: Huy Nguyen <huyn@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-02-22 13:38:25 -08:00
Huy Nguyen
8da202b249 net/mlx5: E-Switch, Add support for VEPA in legacy mode.
In Virtual Ethernet Port Aggregator (VEPA) mode, packets skip the
system's internal virtual switch and are forwarded to an external network
switch. In the Mellanox HCA case, the virtual switch is the HCA's Eswitch.

To support this, a new FDB flow table is created with level 0 and
linked to the existing FDB flow table in legacy mode. By default,
VEPA is turned off and this FDB flow table is empty. When VEPA is
turned on, two rules are created. One rule forwards traffic arriving on
the uplink vport to the legacy FDB. The other rule forwards all other
traffic to the uplink vport.

Other design alternatives were not chosen, as explained below:
1. Create a forward rule in the ACL flow table (most efficient design).
This approach was not chosen because firmware does not support a
forward rule to the uplink vport (0xffff) for the ACL flow table.
2. Add an additional source port criterion to all the FDB rules to make
them receive-only rules. This approach was not chosen because it is not
efficient, as there can be many rules in the FDB, and VEPA mode
cannot be controlled per vport.
3. Add a highest priority flow group in the existing legacy FDB flow
table instead of a new flow table. This approach does not work because the
new flow group has the same match criteria as the promiscuous flow group
and mlx5_add_flow_rules does not allow specifying a flow group.

Signed-off-by: Huy Nguyen <huyn@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-02-22 13:38:24 -08:00
Eran Ben Elisha
2e5b053462 net/mlx5e: Fix mlx5e_tx_reporter_create return value
If the reporter is an ERR_PTR or NULL, an error code shall be returned.
In all other cases, success shall be returned. Fix that.

Fixes: de8650a820 ("net/mlx5e: Add tx reporter support")
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-02-22 13:38:24 -08:00
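
The return-value rule stated in the mlx5e_tx_reporter_create commit above can be sketched with userspace stand-ins for the kernel's ERR_PTR helpers. err_ptr(), is_err() and reporter_status() are hypothetical names, and the choice of -EOPNOTSUPP for the NULL case is an assumption made for illustration; the commit message does not say which code is returned.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Userspace stand-ins modeled on the kernel's error-pointer convention:
 * the top MAX_ERRNO addresses encode negative errno values. */
static void *err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

static int is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

/* Error code for NULL or an error pointer, 0 (success) otherwise. */
static int reporter_status(const void *reporter)
{
	if (reporter == NULL)
		return -EOPNOTSUPP;	/* assumed code for "not available" */
	if (is_err(reporter))
		return (int)(intptr_t)reporter;
	return 0;
}
```

A valid pointer maps to 0, NULL maps to the assumed error code, and an encoded error such as err_ptr(-ENOMEM) is passed through unchanged.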
Eran Ben Elisha
c7981bea48 net/mlx5e: Fix return status of TX reporter timeout recover
In case of recovery from a lost interrupt, we shall return success. Fix that.

Fixes: 7d91126b1a ("net/mlx5e: Add tx timeout support for mlx5e tx reporter")
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Reported-by: Maria Pasechnik <mariap@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-02-22 13:38:24 -08:00
Eran Ben Elisha
2c493ae03a net/mlx5e: Re-add support for TX timeout when TX reporter is not valid
When the TX reporter was introduced, it took ownership of TX timeout error
handling. This introduced a regression in case the TX reporter is not valid
(NET_DEVLINK is not set, or devlink_health_reporter_create failure).

Fix mlx5e_tx_reporter_timeout function so it can be called at all times.

In addition, remove a warning print that indicates that a TX timeout won't
be handled in case of no valid TX reporter.

Fixes: 7d91126b1a ("net/mlx5e: Add tx timeout support for mlx5e tx reporter")
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-02-22 13:38:24 -08:00
Eran Ben Elisha
772ac5e284 net/mlx5e: Fix warn print in case of TX reporter creation failure
Print warning message in case of TX reporter creation failure, only if the
return value is ERR_PTR type. NULL pointer return indicates that
NET_DEVLINK is not set, and the warning print can be skipped.

Fixes: de8650a820 ("net/mlx5e: Add tx reporter support")
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-02-22 13:38:24 -08:00
Eli Britstein
97417f6182 net/mlx5e: Fix GRE key by controlling port tunnel entropy calculation
Flow entropy is calculated on the inner packet headers and used for
flow distribution in processing, routing etc. For GRE-type
encapsulations the entropy value is placed in the eight LSB of the key
field in the GRE header as defined in NVGRE RFC 7637. For UDP based
encapsulations the entropy value is placed in the source port of the
UDP header.
The hardware may support entropy calculation specifically for GRE and
for all tunneling protocols. With commit df2ef3bff1 ("net/mlx5e: Add
GRE protocol offloading") GRE is offloaded, but the hardware is
configured by default to calculate flow entropy so packets transmitted
on the wire have a wrong key. To support UDP based tunnels (e.g. VXLAN),
GRE (i.e. no flow entropy) and NVGRE (i.e. with flow entropy) the
hardware behaviour must be controlled by the driver.

Ensure port entropy calculation is enabled for offloaded VXLAN tunnels
and disable port entropy calculation in the presence of offloaded GRE
tunnels by monitoring the presence of entropy enabling tunnels (e.g.
VXLAN) and entropy disabling tunnels (e.g. GRE).

Fixes: df2ef3bff1 ("net/mlx5e: Add GRE protocol offloading")
Signed-off-by: Eli Britstein <elibr@mellanox.com>
Reviewed-by: Oz Shlomo <ozsh@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-02-22 13:38:23 -08:00
Eli Britstein
bfedc645de net/mlx5: Use read-modify-write when changing PCMR register values
Currently changing a PCMR field is done by setting the field in a
zeroed buffer, zeroing other unrelated fields.
Fix this behaviour by modifying only the required field after first
reading the current register values, as a pre-step towards using more
fields in PCMR register.

Signed-off-by: Eli Britstein <elibr@mellanox.com>
Reviewed-by: Oz Shlomo <ozsh@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-02-22 13:38:23 -08:00
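
The difference between the two update styles in the PCMR commit above can be sketched on a plain 32-bit value. FIELD_A and FIELD_B are hypothetical single-bit fields, not the real PCMR layout, and both functions are illustrative only.

```c
#include <stdint.h>

/* Hypothetical register image with two independent 1-bit fields. */
#define FIELD_A (1u << 0)
#define FIELD_B (1u << 1)

/* Old pattern: build the new value from a zeroed buffer. The current
 * contents are ignored, so every unrelated field is written back as 0. */
static uint32_t write_from_zero(uint32_t current, uint32_t mask, int on)
{
	(void)current;
	return on ? mask : 0;
}

/* Read-modify-write: start from the current value and change only the
 * bits selected by mask. */
static uint32_t read_modify_write(uint32_t current, uint32_t mask, int on)
{
	return on ? (current | mask) : (current & ~mask);
}
```

Starting from a register with FIELD_A set, enabling FIELD_B through write_from_zero() drops FIELD_A, while read_modify_write() keeps it.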
Arnd Bergmann
2547635054 Merge tag 'omap-for-v5.1/cpsw-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into arm/late
One change to deprecate old CPSW Ethernet PHY mode selection driver

With the device tree changes configuring CPSW with a proper PHY driver,
we want to deprecate the old driver to avoid new users for it.

Note that this driver is based on the related dts changes.

* tag 'omap-for-v5.1/cpsw-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
  net: ethernet: ti: cpsw: deprecate cpsw-phy-sel driver

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2019-02-22 22:13:46 +01:00
David S. Miller
1a25660856 Merge tag 'wireless-drivers-next-for-davem-2019-02-22' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next
Kalle Valo says:

====================
wireless-drivers-next patches for 5.1

Most likely the last set of patches for 5.1. WPA3 support to ath10k
and qtnfmac. FTM support to iwlwifi and ath10k. And of course other
new features and bugfixes.

wireless-drivers was merged due to dependency in mt76.

Major changes:

iwlwifi

* HE radiotap

* FTM (Fine Timing Measurement) initiator and responder implementation

* bump supported firmware API to 46

* VHT extended NSS support

* new PCI IDs for 9260 and 22000 series

ath10k

* change QMI interface to support the new (and backwards incompatible)
  interface from HL3.1 and used in recent HL2.0 branch firmware
  releases

* support WPA3 with WCN3990

* support for mac80211 airtime fairness based on transmit rate
  estimation, the firmware needs to support WMI_SERVICE_PEER_STATS to
  enable this

* report transmit airtime to mac80211 with firmwares having
  WMI_SERVICE_REPORT_AIRTIME feature, this to have more accurate
  airtime fairness based on real transmit time (instead of just
  estimated from transmit rate)

* support Fine Timing Measurement (FTM) responder role

* add dynamic VLAN support with firmware having WMI_SERVICE_PER_PACKET_SW_ENCRYPT

* switch to use SPDX license identifiers

ath

* add new country codes for US

brcmfmac

* support monitor frames with the hardware/ucode header

qtnfmac

* enable WPA3 SAE and OWE support

mt76

* beacon support for USB devices (mesh+ad-hoc only)

rtlwifi

* convert to use SPDX license identifiers

libertas_tf

* get the MAC address before registering the device
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-22 12:56:24 -08:00
Maxim Mikityanskiy
41f5f63cd1 net/mlx5e: Trust kernel regarding transport offset
After AF_PACKET is fixed to calculate the transport header offset
correctly, trust the value set by the kernel. If the offset wasn't set,
it means there is no transport header in the packet.

Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-22 12:55:32 -08:00
Maxim Mikityanskiy
3517dfe6f2 net/mlx5e: Remove the wrong assumption about transport offset
skb_transport_offset() == 0 is not a special value. The only special
value is when skb->transport_header is ~0U, and it's checked by
skb_transport_header_was_set().

Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-22 12:55:32 -08:00
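
The point made in the commit above, that offset 0 is an ordinary value and only the all-ones sentinel means "unset", can be sketched as follows. struct packet is a hypothetical stand-in for the relevant sk_buff field, assumed here to be 16 bits wide, and transport_header_was_set() only mirrors the idea of skb_transport_header_was_set().

```c
#include <stdbool.h>
#include <stdint.h>

/* All-ones sentinel, truncated to the (assumed) 16-bit field width. */
#define OFFSET_UNSET ((uint16_t)~0U)

/* Hypothetical stand-in for the part of sk_buff that matters here. */
struct packet {
	uint16_t transport_header;
};

/* Only the sentinel means "no transport header"; 0 is a valid offset. */
static bool transport_header_was_set(const struct packet *p)
{
	return p->transport_header != OFFSET_UNSET;
}
```

A packet with transport_header == 0 is therefore reported as set, and only one initialized to OFFSET_UNSET is reported as unset.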
Maxim Mikityanskiy
d2aa125d62 net: Don't set transport offset to invalid value
If the socket was created with socket(AF_PACKET, SOCK_RAW, 0),
skb->protocol will be unset, __skb_flow_dissect() will fail, and
skb_probe_transport_header() will fall back to the offset_hint, making
the resulting skb_transport_offset incorrect.

If, however, there is no transport header in the packet,
transport_header shouldn't be set to an arbitrary value.

Fix it by leaving the transport offset unset if it couldn't be found, to
be explicit rather than to fill it with some wrong value. It changes the
behavior, but if some code relied on the old behavior, it would be
broken anyway, as the old one is incorrect.

Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-22 12:55:31 -08:00
David S. Miller
5328b633c9 Merge tag 'mac80211-next-for-davem-2019-02-22' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next
Johannes Berg says:

====================
This time we have, of note:
 * the massive patch series for multi-BSSID support, I ended up
   applying that through a side branch to record some details
 * CSA improvements
 * HE (802.11ax) updates to Draft 3.3
 * strongly typed element iteration/etc. to make such code more
   readable - this came up in particular in multi-BSSID
 * rhashtable conversion patches from Herbert
Along, as usual, with various fixes and improvements.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-22 12:52:23 -08:00
David S. Miller
ab01f251c9 Merge tag 'mac80211-for-davem-2019-02-22' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211
Johannes Berg says:

====================
Three more fixes:
 * mac80211 mesh code wasn't allocating SKB tailroom properly
   in some cases
 * tx_sk_pacing_shift should be 7 for better performance
 * mac80211_hwsim wasn't propagating genlmsg_reply() errors
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-22 12:51:21 -08:00
Vadim Lomovtsev
2e1c3fff5e net: thunderx: remove link change polling code and info from nicpf
Since the link change polling routine was moved to the nicvf side,
we no longer need the polling function on the nicpf side, nor the
link status info for all enabled VFs, as this info is already
tracked on the VF side.

This commit removes the unnecessary code and fields from the
nicpf structure.

Signed-off-by: Vadim Lomovtsev <vlomovtsev@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-22 11:43:45 -08:00
Vadim Lomovtsev
2c632ad8bc net: thunderx: move link state polling function to VF
Move the link change polling task to the VF side in order to
prevent races between VF and PF while sending link change
message(s). With this commit, the link change request is
initiated by the VF.

Signed-off-by: Vadim Lomovtsev <vlomovtsev@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-22 11:43:45 -08:00
Vadim Lomovtsev
609ea65c65 net: thunderx: add mutex to protect mailbox from concurrent calls for same VF
In some cases nicvf_send_msg_to_pf() could be called concurrently for the
same NIC VF, overwriting the mailbox contents and the NICVF data and thus
breaking the messaging sequence with the PF.

This commit implements a mutex for NICVF to protect the mailbox registers
and NICVF messaging control data from concurrent access.

Signed-off-by: Vadim Lomovtsev <vlomovtsev@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-22 11:43:45 -08:00
Vadim Lomovtsev
5354439612 net: thunderx: rework xcast message structure to make it fit into 64 bit
To communicate with the PF, each ThunderX NIC VF uses a mailbox, which is
a pair of 64-bit registers available to both the VFn and the PF.

This commit changes the xcast message structure so that it fits into
64 bits.

Signed-off-by: Vadim Lomovtsev <vlomovtsev@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-22 11:43:44 -08:00
Vadim Lomovtsev
7db730d9d2 net: thunderx: add nicvf_send_msg_to_pf result check for set_rx_mode_task
The rx_set_mode invokes a number of messages to be sent to the PF for
receive mode configuration. If there are any issues, we need to stop
sending messages and release the allocated memory.

This commit implements a check of the nicvf_send_msg_to_pf() result.
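
The pattern described above (stop at the first failed send, free the
memory in every case) can be sketched as follows. All names here
(send_all(), stub_send(), sent_count) are hypothetical and exist only
for illustration.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical sender type: returns 0 on success, non-zero on failure. */
typedef int (*send_fn)(int msg);

/* Stub sender for illustration: negative messages fail. */
static size_t sent_count;

static int stub_send(int msg)
{
	if (msg < 0)
		return -1;
	sent_count++;
	return 0;
}

/*
 * Send each message in order, stop at the first failure and release
 * the heap-allocated list in every case. Returns the first error, or 0.
 */
static int send_all(int *msgs, size_t count, send_fn send)
{
	int err = 0;

	for (size_t i = 0; i < count; i++) {
		err = send(msgs[i]);
		if (err)
			break;
	}
	free(msgs);
	return err;
}
```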

Signed-off-by: Vadim Lomovtsev <vlomovtsev@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-22 11:43:44 -08:00
Vadim Lomovtsev
0dd563b9a6 net: thunderx: make CFG_DONE message to run through generic send-ack sequence
At the end of NIC VF initialization, the VF sends the CFG_DONE message to
the PF without using the nicvf_send_msg_to_pf() routine. This could
potentially overwrite data in the mailbox. This commit sends the CFG_DONE
message the same way as other configuration messages, by using the
nicvf_send_msg_to_pf() routine.

Signed-off-by: Vadim Lomovtsev <vlomovtsev@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-22 11:43:44 -08:00
Vadim Lomovtsev
2ecbe4f4a0 net: thunderx: replace global nicvf_rx_mode_wq work queue for all VFs to private for each of them.
Having one work queue for the receive mode configuration ndo_set_rx_mode()
call for all VFs makes each of them wait until the set_rx_mode() call
completes for another VF if any of the close, set receive mode or change
flags calls has already been invoked. Potentially this could cause a
device state change before the corresponding receive mode configuration
call completes, so the call itself becomes meaningless, corrupts data or
breaks the configuration sequence.

We don't need any delays in the NIC VF configuration sequence, so a delayed
work call with 0 delay makes no sense.

This commit implements one work queue per NIC VF for the set_rx_mode
task so that they work independently, and replaces delayed_work
with work_struct.

Signed-off-by: Vadim Lomovtsev <vlomovtsev@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-22 11:43:44 -08:00
Vadim Lomovtsev
f6d25aca1b net: thunderx: correct typo in macro name
Correct STREERING to STEERING in the macro name for the BGX steering register.

Signed-off-by: Vadim Lomovtsev <vlomovtsev@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-22 11:43:44 -08:00
George Wilkie
8c7a77267e team: use operstate consistently for linkup
When a port is added to a team, its initial state is derived
from netif_carrier_ok rather than netif_oper_up.
If it is carrier up but operationally down at the time of being
added, the port state.linkup will be set prematurely.
port state.linkup should be set consistently using
netif_oper_up rather than netif_carrier_ok.
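
The difference between the two checks can be modeled in userspace C.
This is a sketch under assumptions: struct port_model and both helpers
are hypothetical stand-ins, and treating an unknown operstate as up is
an assumption about how netif_oper_up behaves.

```c
#include <stdbool.h>

/* Hypothetical model of the two link indicators on a port. */
enum oper_state { OPER_UNKNOWN, OPER_DOWN, OPER_DORMANT, OPER_UP };

struct port_model {
	bool carrier;			/* physical carrier detected */
	enum oper_state operstate;	/* operational state */
};

/* Stand-in for a carrier-only check. */
static bool carrier_ok(const struct port_model *p)
{
	return p->carrier;
}

/* Stand-in for an operstate check (assumes UNKNOWN counts as up). */
static bool oper_up(const struct port_model *p)
{
	return p->operstate == OPER_UP || p->operstate == OPER_UNKNOWN;
}
```

A port with carrier but operstate OPER_DOWN passes carrier_ok() and
fails oper_up(), which is the case the commit message describes.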

Fixes: f1d22a1e05 ("team: account for oper state")
Signed-off-by: George Wilkie <gwilkie@vyatta.att-mail.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-22 11:40:23 -08:00
Andrew Lunn
023fb4b51f net: phy: aquantia: Use get_features for the PHYs abilities
Use the new PHY driver call to get the PHYs supported features.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
[hkallweit1@gmail.com: removed new config_init callback from patch]
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-22 11:39:44 -08:00
David Chen
c286909fe5 r8152: Fix an error on RTL8153-BD MAC Address Passthrough support
RTL8153-BD is used in Dell DA300 type-C dongle.
Added RTL8153-BD support to activate MAC address pass through on DA300.
Apply correction on previously submitted patch in net.git tree.

Signed-off-by: David Chen <david.chen7@dell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-22 11:36:55 -08:00
Daniel Borkmann
7cc9f7003a ipvlan: disallow userns cap_net_admin to change global mode/flags
When running Docker with userns isolation e.g. --userns-remap="default"
and spawning up some containers with CAP_NET_ADMIN under this realm, I
noticed that link changes on ipvlan slave device inside that container
can affect all devices from this ipvlan group which are in other net
namespaces where the container should have no permission to make changes
to, such as the init netns, for example.

This effectively allows undoing ipvlan private mode and switching globally
to bridge mode, where slaves can communicate directly without going through
hostns, or switching the global operation mode (l2/l3/l3s) for everyone
bound to the given ipvlan master device. The libnetwork plugin here creates
an ipvlan master and an ipvlan slave in hostns, plus one slave per container
that is moved into the container's netns upon the creation event.

* In hostns:

  # ip -d a
  [...]
  8: cilium_host@bond0: <BROADCAST,MULTICAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
     link/ether 0c:c4:7a:e1:3d:cc brd ff:ff:ff:ff:ff:ff promiscuity 0 minmtu 68 maxmtu 65535
     ipvlan  mode l3 bridge numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535
     inet 10.41.0.1/32 scope link cilium_host
       valid_lft forever preferred_lft forever
  [...]

* Spawn container & change ipvlan mode setting inside of it:

  # docker run -dt --cap-add=NET_ADMIN --network cilium-net --name client -l app=test cilium/netperf
  9fff485d69dcb5ce37c9e33ca20a11ccafc236d690105aadbfb77e4f4170879c

  # docker exec -ti client ip -d a
  [...]
  10: cilium0@if4: <BROADCAST,MULTICAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
      link/ether 0c:c4:7a:e1:3d:cc brd ff:ff:ff:ff:ff:ff promiscuity 0 minmtu 68 maxmtu 65535
      ipvlan  mode l3 bridge numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535
      inet 10.41.197.43/32 brd 10.41.197.43 scope global cilium0
         valid_lft forever preferred_lft forever

  # docker exec -ti client ip link change link cilium0 name cilium0 type ipvlan mode l2

  # docker exec -ti client ip -d a
  [...]
  10: cilium0@if4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
      link/ether 0c:c4:7a:e1:3d:cc brd ff:ff:ff:ff:ff:ff promiscuity 0 minmtu 68 maxmtu 65535
      ipvlan  mode l2 bridge numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535
      inet 10.41.197.43/32 brd 10.41.197.43 scope global cilium0
         valid_lft forever preferred_lft forever

* In hostns (mode switched to l2):

  # ip -d a
  [...]
  8: cilium_host@bond0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
      link/ether 0c:c4:7a:e1:3d:cc brd ff:ff:ff:ff:ff:ff promiscuity 0 minmtu 68 maxmtu 65535
      ipvlan  mode l2 bridge numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535
      inet 10.41.0.1/32 scope link cilium_host
         valid_lft forever preferred_lft forever
  [...]

The same l3 -> l2 switch also happens when creating another slave inside
the container's network namespace and specifying the existing cilium0
link to derive the actual (bond0) master:

  # docker exec -ti client ip link add link cilium0 name cilium1 type ipvlan mode l2

  # docker exec -ti client ip -d a
  [...]
  2: cilium1@if4: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
      link/ether 0c:c4:7a:e1:3d:cc brd ff:ff:ff:ff:ff:ff promiscuity 0 minmtu 68 maxmtu 65535
      ipvlan  mode l2 bridge numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535
  10: cilium0@if4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
      link/ether 0c:c4:7a:e1:3d:cc brd ff:ff:ff:ff:ff:ff promiscuity 0 minmtu 68 maxmtu 65535
      ipvlan  mode l2 bridge numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535
      inet 10.41.197.43/32 brd 10.41.197.43 scope global cilium0
         valid_lft forever preferred_lft forever

* In hostns:

  # ip -d a
  [...]
  8: cilium_host@bond0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
      link/ether 0c:c4:7a:e1:3d:cc brd ff:ff:ff:ff:ff:ff promiscuity 0 minmtu 68 maxmtu 65535
      ipvlan  mode l2 bridge numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535
      inet 10.41.0.1/32 scope link cilium_host
         valid_lft forever preferred_lft forever
  [...]

One way to mitigate this is to check CAP_NET_ADMIN permissions in
the ipvlan master device's ns, and only then allow changing
mode or flags for all devices bound to it. The two cases above
are then disallowed after the patch.
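
The check can be modeled in userspace C as a sketch. This is not the
patch itself: the structs and change_mode() are hypothetical, and
namespaces are reduced to plain ids, which ignores the hierarchy of
real user namespaces.

```c
#include <errno.h>

/* Hypothetical model: a namespace is just an id here. */
struct caller {
	int admin_ns;	/* ns in which the caller holds CAP_NET_ADMIN */
};

struct ipvlan_port_model {
	int master_ns;	/* ns owning the ipvlan master device */
	int mode;	/* shared by every slave bound to the master */
};

/*
 * Change the shared mode only if the caller is privileged in the
 * master device's ns; otherwise refuse and leave the mode untouched.
 */
static int change_mode(const struct caller *c,
		       struct ipvlan_port_model *port, int new_mode)
{
	if (c->admin_ns != port->master_ns)
		return -EPERM;
	port->mode = new_mode;
	return 0;
}
```

A caller privileged only inside the container's ns gets -EPERM, while
one privileged in the master's ns can still change the mode.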

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Mahesh Bandewar <maheshb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-22 11:27:19 -08:00
Linus Torvalds
168bd29830 Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
Pull rdma fixes from Jason Gunthorpe:
 "Small set of three regression fixing patches, things are looking
  pretty good here.

   - Fix cxgb4 to work again with non-4k page sizes

   - NULL pointer oops in SRP during sg_reset"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
  iw_cxgb4: cq/qp mask depends on bar2 pages in a host page
  cxgb4: Export sge_host_page_size to ulds
  RDMA/srp: Rework SCSI device reset handling
2019-02-22 10:32:26 -08:00
Johannes Berg
b7b14ec1eb Merge remote-tracking branch 'net-next/master' into mac80211-next
Merge net-next to resolve a conflict and to get the mac80211
rhashtable fixes so further patches can be applied on top.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-02-22 13:48:13 +01:00