ice: Add support for Tx hang, Tx timeout and malicious driver detection

When a malicious operation is detected, the firmware triggers an
interrupt, which is then picked up by the service task (specifically by
ice_handle_mdd_event). A reset is scheduled if required.

Tx hang detection works in a similar way, except the logic here monitors
the VSI's Tx queues and tries to revive them if stalled. If the hang is
not resolved, the kernel eventually calls ndo_tx_timeout, which is
handled by ice_tx_timeout.

Signed-off-by: Sudheer Mogilappagari <sudheer.mogilappagari@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This commit is contained in:
Sudheer Mogilappagari
2018-08-09 06:29:53 -07:00
committed by Jeff Kirsher
parent f80eaa4210
commit b3969fd727
5 changed files with 331 additions and 0 deletions

View File

@@ -134,6 +134,7 @@ enum ice_state {
__ICE_SUSPENDED, /* set on module remove path */
__ICE_RESET_FAILED, /* set by reset/rebuild */
__ICE_ADMINQ_EVENT_PENDING,
__ICE_MDD_EVENT_PENDING,
__ICE_FLTR_OVERFLOW_PROMISC,
__ICE_CFG_BUSY,
__ICE_SERVICE_SCHED,
@@ -272,6 +273,9 @@ struct ice_pf {
struct ice_hw_port_stats stats_prev;
struct ice_hw hw;
u8 stat_prev_loaded; /* has previous stats been loaded */
u32 tx_timeout_count;
unsigned long tx_timeout_last_recovery;
u32 tx_timeout_recovery_level;
char int_name[ICE_INT_NAME_STR_LEN];
};