[why]
Removing connector reusage from DM to match the rest of the tree ended
up revealing an issue that was surprisingly subtle. The original amdgpu
code for DC that was submitted appears to have left a chunk in
dm_dp_create_fake_mst_encoder() that tries to find a "master encoder",
the likes of which isn't actually used or stored anywhere. It does so at
the wrong time as well by trying to access parts of the drm_connector
from the encoder init before it's actually been initialized. This
results in a NULL pointer deref on MST hotplugs:
[ 160.696613] BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
[ 160.697234] PGD 0 P4D 0
[ 160.697814] Oops: 0010 [#1] SMP PTI
[ 160.698430] CPU: 2 PID: 64 Comm: kworker/2:1 Kdump: loaded Tainted: G O 4.19.0Lyude-Test+ #2
[ 160.699020] Hardware name: HP HP ZBook 15 G4/8275, BIOS P70 Ver. 01.22 05/17/2018
[ 160.699672] Workqueue: events_long drm_dp_mst_link_probe_work [drm_kms_helper]
[ 160.700322] RIP: 0010: (null)
[ 160.700920] Code: Bad RIP value.
[ 160.701541] RSP: 0018:ffffc9000029fc78 EFLAGS: 00010206
[ 160.702183] RAX: 0000000000000000 RBX: ffff8804440ed468 RCX: ffff8804440e9158
[ 160.702778] RDX: 0000000000000000 RSI: ffff8804556c5700 RDI: ffff8804440ed000
[ 160.703408] RBP: ffff880458e21800 R08: 0000000000000002 R09: 000000005fca0a25
[ 160.704002] R10: ffff88045a077a3d R11: ffff88045a077a3c R12: ffff8804440ed000
[ 160.704614] R13: ffff880458e21800 R14: ffff8804440e9000 R15: ffff8804440e9000
[ 160.705260] FS: 0000000000000000(0000) GS:ffff88045f280000(0000) knlGS:0000000000000000
[ 160.705854] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 160.706478] CR2: ffffffffffffffd6 CR3: 000000000200a001 CR4: 00000000003606e0
[ 160.707124] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 160.707724] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 160.708372] Call Trace:
[ 160.708998] ? dm_dp_add_mst_connector+0xed/0x1d0 [amdgpu]
[ 160.709625] ? drm_dp_add_port+0x2fa/0x470 [drm_kms_helper]
[ 160.710284] ? wake_up_q+0x54/0x70
[ 160.710877] ? __mutex_unlock_slowpath.isra.18+0xb3/0x110
[ 160.711512] ? drm_dp_dpcd_access+0xe7/0x110 [drm_kms_helper]
[ 160.712161] ? drm_dp_send_link_address+0x155/0x1e0 [drm_kms_helper]
[ 160.712762] ? drm_dp_check_and_send_link_address+0xa3/0xd0 [drm_kms_helper]
[ 160.713408] ? drm_dp_mst_link_probe_work+0x4b/0x80 [drm_kms_helper]
[ 160.714013] ? process_one_work+0x1a1/0x3a0
[ 160.714667] ? worker_thread+0x30/0x380
[ 160.715326] ? wq_update_unbound_numa+0x10/0x10
[ 160.715939] ? kthread+0x112/0x130
[ 160.716591] ? kthread_create_worker_on_cpu+0x70/0x70
[ 160.717262] ? ret_from_fork+0x35/0x40
[ 160.717886] Modules linked in: amdgpu(O) vfat fat snd_hda_codec_generic joydev i915 chash gpu_sched ttm i2c_algo_bit drm_kms_helper snd_hda_codec_hdmi hp_wmi syscopyarea iTCO_wdt sysfillrect sparse_keymap sysimgblt fb_sys_fops snd_hda_intel usbhid wmi_bmof drm snd_hda_codec btusb snd_hda_core intel_rapl btrtl x86_pkg_temp_thermal btbcm btintel coretemp snd_pcm crc32_pclmul bluetooth psmouse snd_timer snd pcspkr i2c_i801 mei_me i2c_core soundcore mei tpm_tis wmi tpm_tis_core hp_accel ecdh_generic lis3lv02d tpm video rfkill acpi_pad input_polldev hp_wireless pcc_cpufreq crc32c_intel serio_raw tg3 xhci_pci xhci_hcd [last unloaded: amdgpu]
[ 160.720141] CR2: 0000000000000000
Somehow the connector reusage DM was using for MST connectors managed to
paper over this issue entirely; hence why this was never caught until
now.
[how]
Since this code isn't used anywhere and seems useless anyway, we can
just drop it entirely. This appears to fix the issue on my HP ZBook with
an AMD WX4150.
Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[why]
It is not safe to keep existing connector while entire topology
has been removed. Could lead potential impact to uapi.
Entirely unregister all the connectors on the topology,
and use a new set of connectors when the topology is plugged back
on.
[How]
Remove the drm connector entirely each time when the
corresponding MST topology is gone.
When hotunplug a connector (e.g., DP2)
1. Remove connector from userspace.
2. Drop it's reference.
When hotplug back on:
1. Detect new topology, and create new connectors.
2. Notify userspace with sysfs hotplug event.
3. Reprobe new connectors, and reassign CRTC from old (e.g., DP2)
to new (e.g., DP3) connector.
Signed-off-by: Jerry (Fangzhi) Zuo <Jerry.Zuo@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[why]
It is not correct to touch aconnector within atomic_check.
[How]
It was added as workaround before, and no longer needed.
Signed-off-by: Jerry (Fangzhi) Zuo <Jerry.Zuo@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Similar to ppfeaturemask. Allows you to selectively enable/disable
DC features.
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The value is dependent on whether fbc is available.
v2: only check if num_pipes is valid
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Currently, SDMA page queue is not used under SR-IOV VF, and this queue will
cause ring test failure in amdgpu module reload case. So just disable it.
Signed-off-by: Trigger Huang <Trigger.Huang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[why]
Removing connector reusage from DM to match the rest of the tree ended
up revealing an issue that was surprisingly subtle. The original amdgpu
code for DC that was submitted appears to have left a chunk in
dm_dp_create_fake_mst_encoder() that tries to find a "master encoder",
the likes of which isn't actually used or stored anywhere. It does so at
the wrong time as well by trying to access parts of the drm_connector
from the encoder init before it's actually been initialized. This
results in a NULL pointer deref on MST hotplugs:
[ 160.696613] BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
[ 160.697234] PGD 0 P4D 0
[ 160.697814] Oops: 0010 [#1] SMP PTI
[ 160.698430] CPU: 2 PID: 64 Comm: kworker/2:1 Kdump: loaded Tainted: G O 4.19.0Lyude-Test+ #2
[ 160.699020] Hardware name: HP HP ZBook 15 G4/8275, BIOS P70 Ver. 01.22 05/17/2018
[ 160.699672] Workqueue: events_long drm_dp_mst_link_probe_work [drm_kms_helper]
[ 160.700322] RIP: 0010: (null)
[ 160.700920] Code: Bad RIP value.
[ 160.701541] RSP: 0018:ffffc9000029fc78 EFLAGS: 00010206
[ 160.702183] RAX: 0000000000000000 RBX: ffff8804440ed468 RCX: ffff8804440e9158
[ 160.702778] RDX: 0000000000000000 RSI: ffff8804556c5700 RDI: ffff8804440ed000
[ 160.703408] RBP: ffff880458e21800 R08: 0000000000000002 R09: 000000005fca0a25
[ 160.704002] R10: ffff88045a077a3d R11: ffff88045a077a3c R12: ffff8804440ed000
[ 160.704614] R13: ffff880458e21800 R14: ffff8804440e9000 R15: ffff8804440e9000
[ 160.705260] FS: 0000000000000000(0000) GS:ffff88045f280000(0000) knlGS:0000000000000000
[ 160.705854] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 160.706478] CR2: ffffffffffffffd6 CR3: 000000000200a001 CR4: 00000000003606e0
[ 160.707124] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 160.707724] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 160.708372] Call Trace:
[ 160.708998] ? dm_dp_add_mst_connector+0xed/0x1d0 [amdgpu]
[ 160.709625] ? drm_dp_add_port+0x2fa/0x470 [drm_kms_helper]
[ 160.710284] ? wake_up_q+0x54/0x70
[ 160.710877] ? __mutex_unlock_slowpath.isra.18+0xb3/0x110
[ 160.711512] ? drm_dp_dpcd_access+0xe7/0x110 [drm_kms_helper]
[ 160.712161] ? drm_dp_send_link_address+0x155/0x1e0 [drm_kms_helper]
[ 160.712762] ? drm_dp_check_and_send_link_address+0xa3/0xd0 [drm_kms_helper]
[ 160.713408] ? drm_dp_mst_link_probe_work+0x4b/0x80 [drm_kms_helper]
[ 160.714013] ? process_one_work+0x1a1/0x3a0
[ 160.714667] ? worker_thread+0x30/0x380
[ 160.715326] ? wq_update_unbound_numa+0x10/0x10
[ 160.715939] ? kthread+0x112/0x130
[ 160.716591] ? kthread_create_worker_on_cpu+0x70/0x70
[ 160.717262] ? ret_from_fork+0x35/0x40
[ 160.717886] Modules linked in: amdgpu(O) vfat fat snd_hda_codec_generic joydev i915 chash gpu_sched ttm i2c_algo_bit drm_kms_helper snd_hda_codec_hdmi hp_wmi syscopyarea iTCO_wdt sysfillrect sparse_keymap sysimgblt fb_sys_fops snd_hda_intel usbhid wmi_bmof drm snd_hda_codec btusb snd_hda_core intel_rapl btrtl x86_pkg_temp_thermal btbcm btintel coretemp snd_pcm crc32_pclmul bluetooth psmouse snd_timer snd pcspkr i2c_i801 mei_me i2c_core soundcore mei tpm_tis wmi tpm_tis_core hp_accel ecdh_generic lis3lv02d tpm video rfkill acpi_pad input_polldev hp_wireless pcc_cpufreq crc32c_intel serio_raw tg3 xhci_pci xhci_hcd [last unloaded: amdgpu]
[ 160.720141] CR2: 0000000000000000
Somehow the connector reusage DM was using for MST connectors managed to
paper over this issue entirely; hence why this was never caught until
now.
[how]
Since this code isn't used anywhere and seems useless anyway, we can
just drop it entirely. This appears to fix the issue on my HP ZBook with
an AMD WX4150.
Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[why]
It is not safe to keep existing connector while entire topology
has been removed. Could lead potential impact to uapi.
Entirely unregister all the connectors on the topology,
and use a new set of connectors when the topology is plugged back
on.
[How]
Remove the drm connector entirely each time when the
corresponding MST topology is gone.
When hotunplug a connector (e.g., DP2)
1. Remove connector from userspace.
2. Drop it's reference.
When hotplug back on:
1. Detect new topology, and create new connectors.
2. Notify userspace with sysfs hotplug event.
3. Reprobe new connectors, and reassign CRTC from old (e.g., DP2)
to new (e.g., DP3) connector.
Signed-off-by: Jerry (Fangzhi) Zuo <Jerry.Zuo@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[why]
It is not correct to touch aconnector within atomic_check.
[How]
It was added as workaround before, and no longer needed.
Signed-off-by: Jerry (Fangzhi) Zuo <Jerry.Zuo@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Instead of stack-allocated psp_xgmi_topology_info in function
amdgpu_xgmi_add_device, dynamically allocated this structure to
avoid the frame size of this function excceed 1024 bytes
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Similar to ppfeaturemask. Allows you to selectively enable/disable
DC features.
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The value is dependent on whether fbc is available.
v2: only check if num_pipes is valid
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
RETIMER_REDRIVER_INFO shows the buffer as a decimal value with a '0x'
prefix, which is somewhat misleading.
Fix it to print hexadecimal, as was intended.
Fixes: 2f14bc89("drm/amd/display: add retimer log for HWQ tuning use.")
Cc: Charlene Liu <charlene.liu@amd.com>
Cc: Dmytro Laktyushkin <Dmytro.Laktyushkin@amd.com>
Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Reviewed-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This reverts commit 0cafc82fae.
This breaks some apps that assume 0 is minimum brightness.
Revert for 4.20. This is fixed properly for drm-next/4.21 in:
"drm/amd: Don't fail on backlight = 0"
However, that patch depends on more extensive changes to the
backlight interface which are too invasive for -fixes.
Fixes: Bugzilla: https://bugs.freedesktop.org/108668
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Setup and tear down xgmi as part of psp.
v2:
- make psp_xgmi_terminate static
- squash in:
drm/amdgpu: only issue xgmi cmd when it is enabled
drm/amdgpu/psp: terminate xgmi ta in suspend and hw_fini phase
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
RETIMER_REDRIVER_INFO shows the buffer as a decimal value with a '0x'
prefix, which is somewhat misleading.
Fix it to print hexadecimal, as was intended.
Fixes: 2f14bc89("drm/amd/display: add retimer log for HWQ tuning use.")
Cc: Charlene Liu <charlene.liu@amd.com>
Cc: Dmytro Laktyushkin <Dmytro.Laktyushkin@amd.com>
Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Reviewed-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Amgpu's backlight update status function was
returning 1 (an error value) when the backlight
property was 0. This breaks users that assume
0 is a valid backlight value (which is a
correct assumption)
If the user passes in a backlight value of 0,
tell them everything is fine, then write a value of
1 to hardware.
Signed-off-by: David Francis <David.Francis@amd.com>
Bugzilla: https://bugs.freedesktop.org/108668
Fixes: 416615ea9578 ("drm/amd/display: set backlight level limit to 1")
Cc: Suresh.Guttula@amd.com
Cc: Harry.Wentland@amd.com
Cc: Samantham@posteo.net
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>