Commit Graph

976 Commits

Author SHA1 Message Date
Ben Skeggs
92f673a12d drm/nouveau/sec2/gv100-: add missing MODULE_FIRMWARE()
ASB was failing to load on Turing GPUs when firmware is being loaded
from initramfs, leaving the GPU in an odd state and causing suspend/
resume to fail.

Add missing MODULE_FIRMWARE() lines for initramfs generators.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Cc: <stable@vger.kernel.org> # 5.6
2020-04-16 15:34:12 +10:00
Ben Skeggs
028a12f5aa drm/nouveau/gr/gp107,gp108: implement workaround for HW hanging during init
Certain boards with GP107/GP108 chipsets hang (often, but randomly) for
unknown reasons during GR initialisation.

The first tell-tale symptom of this issue is:

nouveau 0000:01:00.0: bus: MMIO read of 00000000 FAULT at 409800 [ TIMEOUT ]

appearing in dmesg, likely followed by many other failures being logged.

Karol found this WAR for the issue a while back, but efforts to isolate
the root cause and proper fix have not yielded success so far.  I've
modified the original patch to include a few more details, limit it to
GP107/GP108 by default, and added a config option to override this choice.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
2020-04-07 14:37:50 +10:00
Ben Skeggs
b99ef12b80 drm/nouveau/gr/tu11x: initial support
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-02-17 17:19:00 +10:00
Ben Skeggs
072663f86d drm/nouveau/acr/tu11x: initial support
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-02-17 17:19:00 +10:00
Ben Skeggs
58ae5284f6 drm/nouveau/disp/gv100-: halt NV_PDISP_FE_RM_INTR_STAT_CTRL_DISP_ERROR storms
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-02-03 21:36:54 +10:00
Ben Skeggs
86e18ebd87 drm/nouveau/disp/gv100-: not all channel types support reporting error codes
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-01-29 15:49:56 +10:00
Ben Skeggs
0e6176c6d2 drm/nouveau/disp/nv50-: prevent oops when no channel method map provided
The implementations for most channel types contains a map of methods to
priv registers in order to provide debugging info when a disp exception
has been raised.

This info is missing from the implementation of PIO channels as they're
rather simplistic already, however, if an exception is raised by one of
them, we'd end up triggering a NULL-pointer deref.  Not ideal...

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=206299
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-01-29 15:49:56 +10:00
Thierry Reding
90e2e96ea3 drm/nouveau/gr/gp10b: Use gp100_grctx and gp100_gr_zbc
gp10b doesn't have all the registers that gp102_gr_zbc wants to access,
which causes IBUS MMIO faults to occur. Avoid this by using the gp100
variants of grctx and gr_zbc.

Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-01-23 08:56:51 +10:00
Ben Skeggs
afa3b96b05 drm/nouveau/gr/tu10x: initial support
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-01-15 10:50:30 +10:00
Ben Skeggs
3fa8fe1572 drm/nouveau/acr/tu10x: initial support
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-01-15 10:50:30 +10:00
Ben Skeggs
9d350c5e51 drm/nouveau/secboot: remove
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-01-15 10:50:30 +10:00
Ben Skeggs
22dcda45a3 drm/nouveau/acr: implement new subdev to replace "secure boot"
ACR is responsible for managing the firmware for LS (Low Secure) falcons,
this was previously handled in the driver by SECBOOT.

This rewrite started from some test code that attempted to replicate the
procedure RM uses in order to debug early Turing ACR firmwares that were
provided by NVIDIA for development.

Compared with SECBOOT, the code is structured into more individual steps,
with the aim of making the process easier to follow/debug, whilst making
it possible to support newer firmware versions that may have a different
binary format or API interface.

The HS (High Secure) binary(s) are now booted earlier in device init, to
match the behaviour of RM, whereas SECBOOT would delay this until we try
to boot the first LS falcon.

There's also additional debugging features available, with the intention
of making it easier to solve issues during FW/HW bring-up in the future.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-01-15 10:50:29 +10:00
Ben Skeggs
7a4dde711b drm/nouveau/secboot: move code to boot LS falcons to subdevs
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-01-15 10:50:29 +10:00
Ben Skeggs
e1cc579898 drm/nouveau/flcn/msgq: pass explicit message queue pointer to recv()
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-01-15 10:50:29 +10:00
Ben Skeggs
d114a1393f drm/nouveau/flcn/msgq: move handling of init message to subdevs
When the PMU/SEC2 LS FWs have booted, they'll send a message to the host
with various information, including the configuration of message/command
queues that are available.

Move the handling for this to the relevant subdevs.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-01-15 10:50:29 +10:00
Ben Skeggs
86ce2a7153 drm/nouveau/flcn/cmdq: move command generation to subdevs
This moves the code to generate commands for the ACR unit of the PMU/SEC2 LS
firmwares to those subdevs.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-01-15 10:50:29 +10:00
Ben Skeggs
22431189d6 drm/nouveau/flcn/msgq: explicitly create message queue from subdevs
Code to interface with LS firmwares is being moved to the subdevs where it
belongs, rather than living in the common falcon code.

This is an incremental step towards that goal.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-01-15 10:50:28 +10:00
Ben Skeggs
acc466ab46 drm/nouveau/flcn/cmdq: explicitly create command queue(s) from subdevs
Code to interface with LS firmwares is being moved to the subdevs where it
belongs, rather than living in the common falcon code.

This is an incremental step towards that goal.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-01-15 10:50:28 +10:00
Ben Skeggs
8763955ba7 drm/nouveau/flcn/qmgr: explicitly create queue manager from subdevs
Code to interface with LS firmwares is being moved to the subdevs where it
belongs, rather than living in the common falcon code.

This is an incremental step towards that goal.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-01-15 10:50:28 +10:00
Ben Skeggs
af696a61a2 drm/nouveau/flcn: reset sec2/gsp falcons harder
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-01-15 10:50:28 +10:00
Ben Skeggs
b826f48a1c drm/nouveau/flcn: specify queue register offsets from subdev
Also fixes the values for Turing, even though we don't use it yet.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-01-15 10:50:28 +10:00
Ben Skeggs
e938c4e723 drm/nouveau/flcn: specify debug/production register offset from subdev
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-01-15 10:50:28 +10:00
Ben Skeggs
bc3cfd18ac drm/nouveau/flcn: specify EMEM address from subdev
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-01-15 10:50:27 +10:00
Ben Skeggs
ca3190e3c7 drm/nouveau/flcn: move bind_context WAR out of common code
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-01-15 10:50:27 +10:00
Ben Skeggs
fb0a5bbe31 drm/nouveau/flcn: specify FBIF offset from subdev
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-01-15 10:50:27 +10:00
Ben Skeggs
10e43bfd2f drm/nouveau/nvenc: add a stub implementation for the GPUs where it should be supported
Mostly so we don't lose info hidden in falcon.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-01-15 10:50:27 +10:00
Ben Skeggs
a5482b9ff1 drm/nouveau/nvdec/gm107-: add missing engine instances
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-01-15 10:50:27 +10:00
Ben Skeggs
68f0244494 drm/nouveau/nvdec/gm107: rename from gp102 implementation
NVDEC is available from GM107, and we currently only have a stub
implementation anyway, let's make it explicit.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-01-15 10:50:27 +10:00
Ben Skeggs
3a900a5d9c drm/nouveau/nvdec: initialise SW state for falcon from constructor
This will allow us to register the falcon with ACR, and further customise
its behaviour by providing the nvkm_falcon_func structure directly.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-01-15 10:50:27 +10:00
Ben Skeggs
98a34d9950 drm/nouveau/nvdec: select implementation based on available fw
This will allow for further customisation of the subdev depending on what
firmware is available.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-01-15 10:50:27 +10:00
Ben Skeggs
c9af47bcbd drm/nouveau/sec2: move interrupt handler to hw-specific module
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-01-15 10:50:27 +10:00
Ben Skeggs
555a0002d3 drm/nouveau/sec2: use falcon funcs
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-01-15 10:50:27 +10:00
Ben Skeggs
edd757d178 drm/nouveau/sec2: initialise SW state for falcon from constructor
This will allow us to register the falcon with ACR, and further customise
its behaviour by providing the nvkm_falcon_func structure directly.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-01-15 10:50:27 +10:00
Ben Skeggs
7adc40c593 drm/nouveau/sec2: select implementation based on available firmware
This will allow for further customisation of the subdev depending on what
firmware is available.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-01-15 10:50:27 +10:00
Ben Skeggs
e14e5e6c33 drm/nouveau/sec2/gp108: split from gp102 implementation
ACR LS FW loading is moving out of SECBOOT and into their specific subdevs,
and the available GP108/GV100 FWs differ from the other GP10x boards.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-01-15 10:50:27 +10:00
Ben Skeggs
4f556362a3 drm/nouveau/gr/gf100-: initialise SW state for falcon from constructor
This will allow us to register the falcon with ACR, and further customise
its behaviour by providing the nvkm_falcon_func structure directly.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-01-15 10:50:27 +10:00
Ben Skeggs
ef16dc278e drm/nouveau/gr/gf100-: select implementation based on available FW
This will allow for further customisation of the subdev depending on what
firmware is available.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-01-15 10:50:27 +10:00
Ben Skeggs
a096ff1981 drm/nouveau/gr/gp108: split from gp107
ACR LS FW loading is moving out of SECBOOT and into their specific subdevs,
and the available GP107/GP108 FWs have interface differences.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-01-15 10:50:27 +10:00
Ben Skeggs
00e1b5dcf7 drm/nouveau/gr/gf100-: move fecs/gpccs ucode into their substructures
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-01-15 10:50:26 +10:00
Ben Skeggs
0033f15b44 drm/nouveau/gr/gf100-: drop fuc_ prefix on sw init
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-01-15 10:50:26 +10:00
Ben Skeggs
a2bfb50e72 drm/nouveau/gr/gk20a,gm200-: use nvkm_firmware_load_blob for sw init
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-01-15 10:50:26 +10:00
Ben Skeggs
6f0add0ad6 drm/nouveau/gr/gf100-: use nvkm_blob structure for fecs/gpccs fw
It serves the exact same purpose.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-01-15 10:50:26 +10:00
Ben Skeggs
e905736c6d drm/nouveau/pmu/gp10b: split from gm20b implementation
ACR LS FW loading is moving out of SECBOOT and into their specific subdevs,
and the available GM20B/GP10B FWs have interface differences.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-01-15 10:50:26 +10:00
Ben Skeggs
67e7c6cf8f drm/nouveau/acr: add stub implementation for all GPUs currently supported by SECBOOT
PMU, SEC2 and GR will be modified to register their falcons with ACR before
the main commit switching everything over.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-01-15 10:50:26 +10:00
Ben Skeggs
31bef57f6c drm/nouveau/core: define ACR subdev
This will replace the current SECBOOT subdev for handling firmware on
secure falcons.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-01-15 10:50:26 +10:00
Ben Skeggs
409d659fe1 drm/nouveau/disp/dp: fix typo when determining failsafe link configuration
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-01-15 10:50:25 +10:00
Ben Skeggs
3c47e381d6 drm/nouveau/gr/gv100-: modify gr init to match newer version of RM
Will be used as a basis for implementing changes needed for Turing.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-01-15 10:50:25 +10:00
Ben Skeggs
7adc77aa0e drm/nouveau/gr/gk20a,gm200-: add terminators to method lists read from fw
Method init is typically ordered by class in the FW image as ThreeD,
TwoD, Compute.

Due to a bug in parsing the FW into our internal format, we've been
accidentally sending Twod + Compute methods to the ThreeD class, as
well as Compute methods to the TwoD class - oops.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-01-15 10:50:25 +10:00
Ben Skeggs
fef1c0ef70 drm/nouveau/gr/gf100-: remove dtor
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-01-15 10:50:25 +10:00
Thierry Reding
d7ca5ddf58 drm/nouveau/ce/gp10b: Use correct copy engine
gp10b uses the new engine enumeration mechanism introduced in the Pascal
architecture. As a result, the copy engine, which used to be at index 2
for prior Tegra GPU instantiations, has now moved to index 0. Fix up the
index and also use the gp100 variant of the copy engine class because on
gp10b the PASCAL_DMA_COPY_B class is not supported.

Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-01-15 10:49:59 +10:00