android_kernel_xiaomi_sm8450

xiaomi-sm8450/android_kernel_xiaomi_sm8450

Author	SHA1	Message	Date
Linus Torvalds	e6412f9833	Merge tag 'efi-core-2020-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull EFI changes from Ingo Molnar: - Preliminary RISC-V enablement - the bulk of it will arrive via the RISCV tree. - Relax decompressed image placement rules for 32-bit ARM - Add support for passing MOK certificate table contents via a config table rather than a EFI variable. - Add support for 18 bit DIMM row IDs in the CPER records. - Work around broken Dell firmware that passes the entire Boot#### variable contents as the command line - Add definition of the EFI_MEMORY_CPU_CRYPTO memory attribute so we can identify it in the memory map listings. - Don't abort the boot on arm64 if the EFI RNG protocol is available but returns with an error - Replace slashes with exclamation marks in efivarfs file names - Split efi-pstore from the deprecated efivars sysfs code, so we can disable the latter on !x86. - Misc fixes, cleanups and updates. * tag 'efi-core-2020-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (26 commits) efi: mokvar: add missing include of asm/early_ioremap.h efi: efivars: limit availability to X86 builds efi: remove some false dependencies on CONFIG_EFI_VARS efi: gsmi: fix false dependency on CONFIG_EFI_VARS efi: efivars: un-export efivars_sysfs_init() efi: pstore: move workqueue handling out of efivars efi: pstore: disentangle from deprecated efivars module efi: mokvar-table: fix some issues in new code efi/arm64: libstub: Deal gracefully with EFI_RNG_PROTOCOL failure efivarfs: Replace invalid slashes with exclamation marks in dentries. efi: Delete deprecated parameter comments efi/libstub: Fix missing-prototypes in string.c efi: Add definition of EFI_MEMORY_CPU_CRYPTO and ability to report it cper,edac,efi: Memory Error Record: bank group/address and chip id edac,ghes,cper: Add Row Extension to Memory Error Record efi/x86: Add a quirk to support command line arguments on Dell EFI firmware efi/libstub: Add efi_warn and *_once logging helpers integrity: Load certs from the EFI MOK config table integrity: Move import of MokListRT certs to a separate routine efi: Support for MOK variable config table ...	2020-10-12 13:26:49 -07:00
Linus Torvalds	ca1b66922a	Merge tag 'ras_updates_for_v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull RAS updates from Borislav Petkov: - Extend the recovery from MCE in kernel space also to processes which encounter an MCE in kernel space but while copying from user memory by sending them a SIGBUS on return to user space and umapping the faulty memory, by Tony Luck and Youquan Song. - memcpy_mcsafe() rework by splitting the functionality into copy_mc_to_user() and copy_mc_to_kernel(). This, as a result, enables support for new hardware which can recover from a machine check encountered during a fast string copy and makes that the default and lets the older hardware which does not support that advance recovery, opt in to use the old, fragile, slow variant, by Dan Williams. - New AMD hw enablement, by Yazen Ghannam and Akshay Gupta. - Do not use MSR-tracing accessors in #MC context and flag any fault while accessing MCA architectural MSRs as an architectural violation with the hope that such hw/fw misdesigns are caught early during the hw eval phase and they don't make it into production. - Misc fixes, improvements and cleanups, as always. * tag 'ras_updates_for_v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mce: Allow for copy_mc_fragile symbol checksum to be generated x86/mce: Decode a kernel instruction to determine if it is copying from user x86/mce: Recover from poison found while copying from user space x86/mce: Avoid tail copy when machine check terminated a copy from user x86/mce: Add _ASM_EXTABLE_CPY for copy user access x86/mce: Provide method to find out the type of an exception handler x86/mce: Pass pointer to saved pt_regs to severity calculation routines x86/copy_mc: Introduce copy_mc_enhanced_fast_string() x86, powerpc: Rename memcpy_mcsafe() to copy_mc_to_{user, kernel}() x86/mce: Drop AMD-specific "DEFERRED" case from Intel severity rule list x86/mce: Add Skylake quirk for patrol scrub reported errors RAS/CEC: Convert to DEFINE_SHOW_ATTRIBUTE() x86/mce: Annotate mce_rd/wrmsrl() with noinstr x86/mce/dev-mcelog: Do not update kflags on AMD systems x86/mce: Stop mce_reign() from re-computing severity for every CPU x86/mce: Make mce_rdmsrl() panic on an inaccessible MSR x86/mce: Increase maximum number of banks to 64 x86/mce: Delay clearing IA32_MCG_STATUS to the end of do_machine_check() x86/MCE/AMD, EDAC/mce_amd: Remove struct smca_hwid.xec_bitmap RAS/CEC: Fix cec_init() prototype	2020-10-12 10:14:38 -07:00
Linus Torvalds	a9a4b7d9a6	Merge tag 'edac_updates_for_v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras Pull EDAC updates from Borislav Petkov: - Add Amazon's Annapurna Labs memory controller EDAC driver (Talel Shenhar) - New AMD CPUs support (Yazen Ghannam) - The usual misc fixes and cleanups all over the subsystem * tag 'edac_updates_for_v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras: EDAC/amd64: Set proper family type for Family 19h Models 20h-2Fh EDAC/mc_sysfs: Add missing newlines when printing {max,dimm}_location EDAC/aspeed: Use module_platform_driver() to simplify EDAC, sb_edac: Simplify switch statement EDAC/ti: Fix handling of platform_get_irq() error EDAC/aspeed: Fix handling of platform_get_irq() error EDAC/i5100: Fix error handling order in i5100_init_one() EDAC/highbank: Handover Calxeda Highbank maintenance to Andre Przywara EDAC/socfpga: Transfer SoCFPGA EDAC maintainership EDAC/thunderx: Make symbol lmc_dfs_ents static EDAC/al-mc-edac: Add Amazon's Annapurna Labs Memory Controller driver dt-bindings: EDAC: Add Amazon's Annapurna Labs Memory Controller binding EDAC/mce_amd: Add new error descriptions for existing types EDAC: Replace HTTP links with HTTPS ones	2020-10-12 10:12:26 -07:00
Borislav Petkov	1dc32628d6	Merge branch 'edac-drivers' into edac-updates-for-v5.10 Signed-off-by: Borislav Petkov <bp@suse.de>	2020-10-12 11:05:42 +02:00
Yazen Ghannam	b4210eab91	EDAC/amd64: Set proper family type for Family 19h Models 20h-2Fh AMD Family 19h Models 20h-2Fh use the same PCI IDs as Family 17h Models 70h-7Fh. The same family ops and number of channels also apply. Use the Family17h Model 70h family_type and ops for Family 19h Models 20h-2Fh. Update the controller name to match the system. Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20201009171803.3214354-1-Yazen.Ghannam@amd.com	2020-10-09 19:28:14 +02:00
Xiongfeng Wang	e6bbde8b2b	EDAC/mc_sysfs: Add missing newlines when printing {max,dimm}_location Reading those sysfs entries gives: [root@localhost /]# cat /sys/devices/system/edac/mc/mc0/max_location memory 3 [root@localhost /]# cat /sys/devices/system/edac/mc/mc0/dimm0/dimm_location memory 0 [root@localhost /]# Add newlines after the value it prints for better readability. [ bp: Make len a signed int and change the check to catch wraparound. Increment the pointer p only when the length check passes. Use scnprintf(). ] Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/1600051734-8993-1-git-send-email-wangxiongfeng2@huawei.com	2020-09-18 09:14:01 +02:00
Liu Shixin	07def58717	EDAC/aspeed: Use module_platform_driver() to simplify Use module_platform_driver() which makes the code simpler by eliminating boilerplate code. Signed-off-by: Liu Shixin <liushixin2@huawei.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Joel Stanley <joel@jms.id.au> Link: https://lkml.kernel.org/r/20200914065358.3726216-1-liushixin2@huawei.com	2020-09-18 09:14:01 +02:00
Alex Kluver	612b5d506d	cper,edac,efi: Memory Error Record: bank group/address and chip id Updates to the UEFI 2.8 Memory Error Record allow splitting the bank field into bank address and bank group, and using the last 3 bits of the extended field as a chip identifier. When needed, print correct version of bank field, bank group, and chip identification. Based on UEFI 2.8 Table 299. Memory Error Record. Signed-off-by: Alex Kluver <alex.kluver@hpe.com> Reviewed-by: Russ Anderson <russ.anderson@hpe.com> Reviewed-by: Kyle Meyer <kyle.meyer@hpe.com> Reviewed-by: Steve Wahl <steve.wahl@hpe.com> Acked-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/20200819143544.155096-3-alex.kluver@hpe.com Signed-off-by: Ard Biesheuvel <ardb@kernel.org>	2020-09-17 10:19:52 +03:00
Alex Kluver	9baf68cc45	edac,ghes,cper: Add Row Extension to Memory Error Record Memory errors could be printed with incorrect row values since the DIMM size has outgrown the 16 bit row field in the CPER structure. UEFI Specification Version 2.8 has increased the size of row by allowing it to use the first 2 bits from a previously reserved space within the structure. When needed, add the extension bits to the row value printed. Based on UEFI 2.8 Table 299. Memory Error Record Signed-off-by: Alex Kluver <alex.kluver@hpe.com> Tested-by: Russ Anderson <russ.anderson@hpe.com> Reviewed-by: Steve Wahl <steve.wahl@hpe.com> Reviewed-by: Kyle Meyer <kyle.meyer@hpe.com> Acked-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/20200819143544.155096-2-alex.kluver@hpe.com Signed-off-by: Ard Biesheuvel <ardb@kernel.org>	2020-09-17 10:19:52 +03:00
Borislav Petkov	251c54ea26	EDAC/ghes: Check whether the driver is on the safe list correctly With CONFIG_DEBUG_TEST_DRIVER_REMOVE=y, a system would try to probe, unregister and probe again a driver. When ghes_edac is attempted to be loaded on a system which is not on the safe platforms list, ghes_edac_register() would return early. The unregister counterpart ghes_edac_unregister() would still attempt to unregister and exit early at the refcount test, leading to the refcount underflow below. In order to not do anything on the unregister path too, reuse the force_load parameter and check it on that path too, before fumbling with the refcount. ghes_edac: ghes_edac_register: entry ghes_edac: ghes_edac_register: return -ENODEV ------------[ cut here ]------------ refcount_t: underflow; use-after-free. WARNING: CPU: 10 PID: 1 at lib/refcount.c:28 refcount_warn_saturate+0xb9/0x100 Modules linked in: CPU: 10 PID: 1 Comm: swapper/0 Not tainted 5.9.0-rc4+ #12 Hardware name: GIGABYTE MZ01-CE1-00/MZ01-CE1-00, BIOS F02 08/29/2018 RIP: 0010:refcount_warn_saturate+0xb9/0x100 Code: 82 e8 fb 8f 4d 00 90 0f 0b 90 90 c3 80 3d 55 4c f5 00 00 75 88 c6 05 4c 4c f5 00 01 90 48 c7 c7 d0 8a 10 82 e8 d8 8f 4d 00 90 <0f> 0b 90 90 c3 80 3d 30 4c f5 00 00 0f 85 61 ff ff ff c6 05 23 4c RSP: 0018:ffffc90000037d58 EFLAGS: 00010292 RAX: 0000000000000026 RBX: ffff88840b8da000 RCX: 0000000000000000 RDX: 0000000000000001 RSI: ffffffff8216b24f RDI: 00000000ffffffff RBP: ffff88840c662e00 R08: 0000000000000001 R09: 0000000000000001 R10: 0000000000000001 R11: 0000000000000046 R12: 0000000000000000 R13: 0000000000000001 R14: 0000000000000000 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff88840ee80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000800002211000 CR4: 00000000003506e0 Call Trace: ghes_edac_unregister ghes_remove platform_drv_remove really_probe driver_probe_device device_driver_attach __driver_attach ? device_driver_attach ? device_driver_attach bus_for_each_dev bus_add_driver driver_register ? bert_init ghes_init do_one_initcall ? rcu_read_lock_sched_held kernel_init_freeable ? rest_init kernel_init ret_from_fork ... ghes_edac: ghes_edac_unregister: FALSE, refcount: -1073741824 Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20200911164950.GB19320@zn.tnic	2020-09-15 09:42:15 +02:00
Borislav Petkov	cd8100f1f3	EDAC/ghes: Clear scanned data on unload Commit `b972fdba86` ("EDAC/ghes: Fix NULL pointer dereference in ghes_edac_register()") didn't clear all the information from the scanned system and, more specifically, left ghes_hw.num_dimms to its previous value. On a second load (CONFIG_DEBUG_TEST_DRIVER_REMOVE=y), the driver would use the leftover num_dimms value which is not 0 and thus the 0 check in enumerate_dimms() will get bypassed and it would go directly to the pointer deref: d = &hw->dimms[hw->num_dimms]; which is, of course, NULL: #PF: supervisor write access in kernel mode #PF: error_code(0x0002) - not-present page PGD 0 P4D 0 Oops: 0002 [#1] PREEMPT SMP CPU: 7 PID: 1 Comm: swapper/0 Not tainted 5.9.0-rc4+ #7 Hardware name: GIGABYTE MZ01-CE1-00/MZ01-CE1-00, BIOS F02 08/29/2018 RIP: 0010:enumerate_dimms.cold+0x7b/0x375 Reset the whole ghes_hw on driver unregister so that no stale values are used on a second system scan. Fixes: `b972fdba86` ("EDAC/ghes: Fix NULL pointer dereference in ghes_edac_register()") Cc: Shiju Jose <shiju.jose@huawei.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20200911164817.GA19320@zn.tnic	2020-09-15 09:41:28 +02:00
Tom Rix	fbd4ab7802	EDAC, sb_edac: Simplify switch statement clang static analyzer reports this problem sb_edac.c:959:2: warning: Undefined or garbage value returned to caller return type; ^~~~~~~~~~~ This is a false positive. However by initializing the type to DEV_UNKNOWN the 3 case can be removed from the switch, saving a comparison and jump. Signed-off-by: Tom Rix <trix@redhat.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Link: https://lore.kernel.org/r/20200907153225.7294-1-trix@redhat.com	2020-09-08 14:56:17 -07:00
Krzysztof Kozlowski	66077adb70	EDAC/ti: Fix handling of platform_get_irq() error platform_get_irq() returns a negative error number on error. In such a case, comparison to 0 would pass the check therefore check the return value properly, whether it is negative. [ bp: Massage commit message. ] Fixes: `86a18ee21e` ("EDAC, ti: Add support for TI keystone and DRA7xx EDAC") Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Tero Kristo <t-kristo@ti.com> Link: https://lkml.kernel.org/r/20200827070743.26628-2-krzk@kernel.org	2020-09-01 20:43:20 +02:00
Krzysztof Kozlowski	afce699694	EDAC/aspeed: Fix handling of platform_get_irq() error platform_get_irq() returns a negative error number on error. In such a case, comparison to 0 would pass the check therefore check the return value properly, whether it is negative. [ bp: Massage commit message. ] Fixes: `9b7e6242ee` ("EDAC, aspeed: Add an Aspeed AST2500 EDAC driver") Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Stefan Schaeckeler <schaecsn@gmx.net> Link: https://lkml.kernel.org/r/20200827070743.26628-1-krzk@kernel.org	2020-09-01 20:41:27 +02:00
Dinghao Liu	857a3139bd	EDAC/i5100: Fix error handling order in i5100_init_one() When pci_get_device_func() fails, the driver doesn't need to execute pci_dev_put(). mci should still be freed, though, to prevent a memory leak. When pci_enable_device() fails, the error injection PCI device "einj" doesn't need to be disabled either. [ bp: Massage commit message, rename label to "bail_mc_free". ] Fixes: `52608ba205` ("i5100_edac: probe for device 19 function 0") Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20200826121437.31606-1-dinghao.liu@zju.edu.cn	2020-09-01 12:10:19 +02:00
Linus Torvalds	42df60fcdf	Merge tag 'edac_urgent_for_v5.9_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras Pull EDAC fix from Borislav Petkov: "A fix to properly clear ghes_edac driver state on driver remove so that a subsequent load can probe the system properly (Shiju Jose)" * tag 'edac_urgent_for_v5.9_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras: EDAC/ghes: Fix NULL pointer dereference in ghes_edac_register()	2020-08-30 10:47:23 -07:00
Shiju Jose	b972fdba86	EDAC/ghes: Fix NULL pointer dereference in ghes_edac_register() After `b9cae27728` ("EDAC/ghes: Scan the system once on driver init") and with CONFIG_DEBUG_TEST_DRIVER_REMOVE enabled, ghes_hw.dimms becomes a NULL pointer after the second ->probe() (aka ghes_edac_register()) which the config option causes to be called. This happens because the static variable which holds down whether the system has been scanned already, doesn't get reset in ghes_edac_unregister(). Then, on the second probe, ghes_scan_system() doesn't get to enumerate the DIMMs, leading to ghes_hw.dimms remaining NULL. Clear the variable and rename it to something more descriptive so that a second probe succeeds. [ bp: Rewrite commit message. ] Fixes: `b9cae27728` ("EDAC/ghes: Scan the system once on driver init") Suggested-by: Borislav Petkov <bp@suse.de> Signed-off-by: Shiju Jose <shiju.jose@huawei.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20200827140450.1620-1-shiju.jose@huawei.com	2020-08-27 18:04:07 +02:00
Gustavo A. R. Silva	df561f6688	treewide: Use fallthrough pseudo-keyword Replace the existing /* fall through */ comments and its variants with the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary fall-through markings when it is the case. [1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>	2020-08-23 17:36:59 -05:00
Yazen Ghannam	368d188720	x86/MCE/AMD, EDAC/mce_amd: Remove struct smca_hwid.xec_bitmap The Extended Error Code Bitmap (xec_bitmap) for a Scalable MCA bank type was intended to be used by the kernel to filter out invalid error codes on a system. However, this is unnecessary after a few product releases because the hardware will only report valid error codes. Thus, there's no need for it with future systems. Remove the xec_bitmap field and all references to it. Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20200720145353.43924-1-Yazen.Ghannam@amd.com	2020-08-20 10:34:38 +02:00
Tony Luck	45bc6098a3	EDAC/{i7core,sb,pnd2,skx}: Fix error event severity IA32_MCG_STATUS.RIPV indicates whether the return RIP value pushed onto the stack as part of machine check delivery is valid or not. Various drivers copied a code fragment that uses the RIPV bit to determine the severity of the error as either HW_EVENT_ERR_UNCORRECTED or HW_EVENT_ERR_FATAL, but this check is reversed (marking errors where RIPV is set as "FATAL"). Reverse the tests so that the error is marked fatal when RIPV is not set. Reported-by: Gabriele Paoloni <gabriele.paoloni@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: <stable@vger.kernel.org> Link: https://lkml.kernel.org/r/20200707194324.14884-1-tony.luck@intel.com	2020-08-18 15:40:30 +02:00
Wei Yongjun	bd17e0b771	EDAC/thunderx: Make symbol lmc_dfs_ents static Symbol 'lmc_dfs_ents' is not used outside of thunderx_edac.c, so make it static: drivers/edac/thunderx_edac.c:457:22: warning: symbol 'lmc_dfs_ents' was not declared. Should it be static? Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Robert Richter <rrichter@marvell.com> Link: https://lkml.kernel.org/r/20200714142308.46612-1-weiyongjun1@huawei.com	2020-08-17 10:35:46 +02:00
Talel Shenhar	e23a7cdeb3	EDAC/al-mc-edac: Add Amazon's Annapurna Labs Memory Controller driver The Amazon's Annapurna Labs Memory Controller EDAC supports ECC capability for error detection and correction (Single bit error correction, Double detection). This driver introduces EDAC driver for that capability. [ bp: Remove "EDAC" string from Kconfig tristate as it is redundant. ] Signed-off-by: Talel Shenhar <talel@amazon.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: James Morse <james.morse@arm.com> Link: https://lkml.kernel.org/r/20200816185551.19108-3-talel@amazon.com	2020-08-17 10:10:29 +02:00
Yazen Ghannam	dc7a8476cf	EDAC/mce_amd: Add new error descriptions for existing types A few existing MCA bank types will have new error types in future SMCA systems. Add the descriptions for the new error types. Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20200708153515.1911642-1-Yazen.Ghannam@amd.com	2020-08-17 09:36:00 +02:00
Alexander A. Klimov	7d4c1ea2be	EDAC: Replace HTTP links with HTTPS ones Rationale: Reduces attack surface on kernel devs opening the links for MITM as HTTPS traffic is much harder to manipulate. Deterministic algorithm: For each file: If not .svg: For each line: If doesn't contain `\bxmlns\b`: For each link, `\bhttp://[^# \t\r\n]*(?:\w\|/)`: If neither `\bgnu\.org/license`, nor `\bmozilla\.org/MPL\b`: If both the HTTP and HTTPS versions return 200 OK and serve the same content: Replace HTTP with HTTPS. [ bp: Merge all EDAC patches into a single one. ] Signed-off-by: Alexander A. Klimov <grandmaster@al2klimov.de> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Tero Kristo <t-kristo@ti.com> # ti_edac Link: https://lkml.kernel.org/r/20200708113546.14135-1-grandmaster@al2klimov.de	2020-08-17 09:31:19 +02:00
Linus Torvalds	6ffdcde4ee	Merge tag 'edac_updates_for_5.9_pt2' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras Pull edac fix from Tony Luck: "Fix for the ie31200 driver that missed the first pull" * tag 'edac_updates_for_5.9_pt2' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras: EDAC/ie31200: Fallback if host bridge device is already initialized	2020-08-15 08:25:41 -07:00
Jason Baron	709ed1bcef	EDAC/ie31200: Fallback if host bridge device is already initialized The Intel uncore driver may claim some of the pci ids from ie31200 which means that the ie31200 edac driver will not initialize them as part of pci_register_driver(). Let's add a fallback for this case to 'pci_get_device()' to get a reference on the device such that it can still be configured. This is similar in approach to other edac drivers. Signed-off-by: Jason Baron <jbaron@akamai.com> Cc: Borislav Petkov <bp@suse.de> Cc: Mauro Carvalho Chehab <mchehab@kernel.org> Cc: linux-edac <linux-edac@vger.kernel.org> Signed-off-by: Tony Luck <tony.luck@intel.com> Link: https://lore.kernel.org/r/1594923911-10885-1-git-send-email-jbaron@akamai.com	2020-08-10 11:13:06 -07:00
Linus Torvalds	f8851cb2d0	Merge tag 'edac_updates_for_5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras Pull EDAC updates from Tony Luck: "Boris is on vacation and aske me to send you the EDAC changes" * tag 'edac_updates_for_5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras: EDAC: Fix reference count leaks EDAC: Remove edac_get_dimm_by_index() EDAC/ghes: Scan the system once on driver init EDAC/ghes: Remove unused members of struct ghes_edac_pvt, rename it to ghes_pvt EDAC/ghes: Setup DIMM label from DMI and use it in error reports EDAC, {skx,i10nm}: Use CPU stepping macro to pass configurations EDAC/mc: Call edac_inc_ue_error() before panic EDAC, pnd2: Set MCE_PRIO_EDAC priority for pnd2_mce_dec notifier	2020-08-03 20:01:00 -07:00
Linus Torvalds	e53bc3ff99	Merge tag 'ras-core-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 RAS updates from Ingo Molnar: "Boris is on vacation and he asked us to send you the pending RAS bits: - Print the PPIN field on CPUs that fill them out - Fix an MCE injection bug - Simplify a kzalloc in dev_mcelog_init_device()" * tag 'ras-core-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mce, EDAC/mce_amd: Print PPIN in machine check records x86/mce/dev-mcelog: Use struct_size() helper in kzalloc() x86/mce/inject: Fix a wrong assignment of i_mce.status	2020-08-03 17:42:23 -07:00
Smita Koralahalli	bb2de0adca	x86/mce, EDAC/mce_amd: Print PPIN in machine check records Print the Protected Processor Identification Number (PPIN) on processors which support it. [ bp: Massage. ] Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20200623130059.8870-1-Smita.KoralahalliChannabasappa@amd.com	2020-06-23 17:27:53 +02:00
Borislav Petkov	0f959e19fa	Merge branch 'edac-ghes' into edac-for-next	2020-06-22 15:28:01 +02:00
Borislav Petkov	ee470bb25d	EDAC/amd64: Read back the scrub rate PCI register on F15h Commit: `da92110dfd` ("EDAC, amd64_edac: Extend scrub rate support to F15hM60h") added support for F15h, model 0x60 CPUs but in doing so, missed to read back SCRCTRL PCI config register on F15h CPUs which are not model 0x60. Add that read so that doing $ cat /sys/devices/system/edac/mc/mc0/sdram_scrub_rate can show the previously set DRAM scrub rate. Fixes: `da92110dfd` ("EDAC, amd64_edac: Extend scrub rate support to F15hM60h") Reported-by: Anders Andersson <pipatron@gmail.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: <stable@vger.kernel.org> #v4.4.. Link: https://lkml.kernel.org/r/CAKkunMbNWppx_i6xSdDHLseA2QQmGJqj_crY=NF-GZML5np4Vw@mail.gmail.com	2020-06-18 20:25:25 +02:00
Qiushi Wu	17ed808ad2	EDAC: Fix reference count leaks When kobject_init_and_add() returns an error, it should be handled because kobject_init_and_add() takes a reference even when it fails. If this function returns an error, kobject_put() must be called to properly clean up the memory associated with the object. Therefore, replace calling kfree() and call kobject_put() and add a missing kobject_put() in the edac_device_register_sysfs_main_kobj() error path. [ bp: Massage and merge into a single patch. ] Fixes: `b2ed215a33` ("Kobject: change drivers/edac to use kobject_init_and_add") Signed-off-by: Qiushi Wu <wu000273@umn.edu> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20200528202238.18078-1-wu000273@umn.edu Link: https://lkml.kernel.org/r/20200528203526.20908-1-wu000273@umn.edu	2020-06-17 15:38:35 +02:00
Borislav Petkov	b9cae27728	EDAC/ghes: Scan the system once on driver init Change the hardware scanning and figuring out how many DIMMs a machine has to a single, one-time thing which happens once on driver init. After that scanning completes, struct ghes_hw_desc contains a representation of the hardware which the driver can then use for later initialization. Then, copy the DIMM information into the respective EDAC core representation of those. Get rid of ghes_edac_dimm_fill and use a struct dimm_info array directly. This way, hw detection and further driver initialization is nicely and logically split. Further additions should all be added to ghes_scan_system() and the hw representation extended as needed. There should be no functionality change resulting from this patch. Signed-off-by: Borislav Petkov <bp@suse.de>	2020-06-16 19:25:15 +02:00
Robert Richter	b001694d60	EDAC/ghes: Remove unused members of struct ghes_edac_pvt, rename it to ghes_pvt The struct members list and ghes of struct ghes_edac_pvt are unused, remove them. On that occasion, rename it to the shorter name struct ghes_pvt. Signed-off-by: Robert Richter <rrichter@marvell.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20200519104443.15673-2-rrichter@marvell.com	2020-06-16 15:32:18 +02:00
Robert Richter	cb51a371d0	EDAC/ghes: Setup DIMM label from DMI and use it in error reports The ghes driver reports errors with 'unknown label' even if the actual DIMM label is known, e.g.: EDAC MC0: 1 CE Single-bit ECC on unknown label (node:0 card:0 module:0 rank:1 bank:0 col:13 bit_pos:16 DIMM location:N0 DIMM_A0 page:0x966a9b3 offset:0x0 grain:1 syndrome:0x0 - APEI location: node:0 card:0 module:0 rank:1 bank:0 col:13 bit_pos:16 DIMM location:N0 DIMM_A0 status(0x0000000000000400): Storage error in DRAM memory) Fix this by using struct dimm_info's label string in error reports: EDAC MC0: 1 CE Single-bit ECC on N0 DIMM_A0 (node:0 card:0 module:0 rank:1 bank:515 col:14 bit_pos:16 DIMM location:N0 DIMM_A0 page:0x99223d8 offset:0x0 grain:1 syndrome:0x0 - APEI location: node:0 card:0 module:0 rank:1 bank:515 col:14 bit_pos:16 DIMM location:N0 DIMM_A0 status(0x0000000000000400): Storage error in DRAM memory) The labels are initialized by reading the bank and device strings from DMI. Now, the label information can also read from sysfs. E.g. a ThunderX2 system will show the following: /sys/devices/system/edac/mc/mc0/dimm0/dimm_label:N0 DIMM_A0 /sys/devices/system/edac/mc/mc0/dimm1/dimm_label:N0 DIMM_B0 /sys/devices/system/edac/mc/mc0/dimm2/dimm_label:N0 DIMM_C0 /sys/devices/system/edac/mc/mc0/dimm3/dimm_label:N0 DIMM_D0 /sys/devices/system/edac/mc/mc0/dimm4/dimm_label:N0 DIMM_E0 /sys/devices/system/edac/mc/mc0/dimm5/dimm_label:N0 DIMM_F0 /sys/devices/system/edac/mc/mc0/dimm6/dimm_label:N0 DIMM_G0 /sys/devices/system/edac/mc/mc0/dimm7/dimm_label:N0 DIMM_H0 /sys/devices/system/edac/mc/mc0/dimm8/dimm_label:N1 DIMM_I0 /sys/devices/system/edac/mc/mc0/dimm9/dimm_label:N1 DIMM_J0 /sys/devices/system/edac/mc/mc0/dimm10/dimm_label:N1 DIMM_K0 /sys/devices/system/edac/mc/mc0/dimm11/dimm_label:N1 DIMM_L0 /sys/devices/system/edac/mc/mc0/dimm12/dimm_label:N1 DIMM_M0 /sys/devices/system/edac/mc/mc0/dimm13/dimm_label:N1 DIMM_N0 /sys/devices/system/edac/mc/mc0/dimm14/dimm_label:N1 DIMM_O0 /sys/devices/system/edac/mc/mc0/dimm15/dimm_label:N1 DIMM_P0 Since dimm_labels can be rewritten, that label will be used in a later error report: # echo foobar >/sys/devices/system/edac/mc/mc0/dimm0/dimm_label # # some error injection here # dmesg \| grep foobar [ 751.383533] EDAC MC0: 1 CE Single-bit ECC on foobar (node:0 card:0 module:0 rank:1 bank:259 col:3 bit_pos:16 DIMM location:N0 DIMM_A0 page:0x8c8dc74 offset:0x0 grain:1 syndrome:0x0 - APEI location: node:0 card:0 module:0 rank:1 bank:259 col:3 bit_pos:16 DIMM location:N0 DIMM_A0 status(0x0000000000000400): Storage error in DRAM memory) [ bp: Remove curly brackets around a single if-statement in dimm_setup_label(). ] Signed-off-by: Robert Richter <rrichter@marvell.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20200528101307.23245-1-rrichter@marvell.com	2020-06-16 15:22:04 +02:00
Qiuxu Zhuo	8807e15597	EDAC, {skx,i10nm}: Use CPU stepping macro to pass configurations Use the X86_MATCH_INTEL_FAM6_MODEL_STEPPINGS() macro to pass CPU stepping specific configurations to {skx,i10nm}_init(), so can delete the CPU stepping check from 10nm_init(). Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Link: https://lore.kernel.org/r/20200509010822.76331-1-qiuxu.zhuo@intel.com	2020-06-15 14:50:39 -07:00
Zhenzhong Duan	e9ff6636d3	EDAC/mc: Call edac_inc_ue_error() before panic By calling edac_inc_ue_error() before panic, we get a correct UE error count for core dump analysis. Signed-off-by: Zhenzhong Duan <zhenzhong.duan@gmail.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Link: https://lore.kernel.org/r/20200610065846.3626-2-zhenzhong.duan@gmail.com	2020-06-15 11:19:52 -07:00
Zhenzhong Duan	30bf38e434	EDAC, pnd2: Set MCE_PRIO_EDAC priority for pnd2_mce_dec notifier Avoid giving it MCE_PRIO_LOWEST priority by default. Signed-off-by: Zhenzhong Duan <zhenzhong.duan@gmail.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Link: https://lore.kernel.org/r/20200610065846.3626-1-zhenzhong.duan@gmail.com	2020-06-15 11:19:39 -07:00
Linus Torvalds	6adc19fd13	Merge tag 'kbuild-v5.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull more Kbuild updates from Masahiro Yamada: - fix build rules in binderfs sample - fix build errors when Kbuild recurses to the top Makefile - covert '---help---' in Kconfig to 'help' * tag 'kbuild-v5.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: treewide: replace '---help---' in Kconfig files with 'help' kbuild: fix broken builds because of GZIP,BZIP2,LZOP variables samples: binderfs: really compile this sample and fix build issues	2020-06-13 13:29:16 -07:00
Masahiro Yamada	a7f7f6248d	treewide: replace '---help---' in Kconfig files with 'help' Since commit `84af7a6194` ("checkpatch: kconfig: prefer 'help' over '---help---'"), the number of '---help---' has been gradually decreasing, but there are still more than 2400 instances. This commit finishes the conversion. While I touched the lines, I also fixed the indentation. There are a variety of indentation styles found. a) 4 spaces + '---help---' b) 7 spaces + '---help---' c) 8 spaces + '---help---' d) 1 space + 1 tab + '---help---' e) 1 tab + '---help---' (correct indentation) f) 1 tab + 1 space + '---help---' g) 1 tab + 2 spaces + '---help---' In order to convert all of them to 1 tab + 'help', I ran the following commend: $ find . -name 'Kconfig' \| xargs sed -i 's/^[[:space:]]---help---/\thelp/' Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>	2020-06-14 01:57:21 +09:00
Thomas Gleixner	f77d26a9fc	Merge branch 'x86/entry' into ras/core to fixup conflicts in arch/x86/kernel/cpu/mce/core.c so MCE specific follow up patches can be applied without creating a horrible merge conflict afterwards.	2020-06-11 15:17:57 +02:00
Borislav Petkov	2a02ca0428	Merge branches 'edac-i10nm' and 'edac-misc' into edac-updates-for-5.8 Signed-off-by: Borislav Petkov <bp@suse.de>	2020-06-01 11:39:15 +02:00
Colin Ian King	f00eb5ff2f	EDAC/amd64: Remove redundant assignment to variable ret in hw_info_get() The variable ret is being assigned with a value that is never read and it is being updated later with a new value. The initialization is redundant so remove it. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20200429154847.287001-1-colin.king@canonical.com	2020-05-29 15:15:02 +02:00
Alexander Monakov	b6bea24d41	EDAC/amd64: Add AMD family 17h model 60h PCI IDs Add support for AMD Renoir (4000-series Ryzen CPUs). Signed-off-by: Alexander Monakov <amonakov@ispras.ru> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Yazen Ghannam <yazen.ghannam@amd.com> Link: https://lkml.kernel.org/r/20200510204842.2603-4-amonakov@ispras.ru	2020-05-22 18:43:13 +02:00
Qiuxu Zhuo	1032095053	EDAC/skx: Use the mcmtr register to retrieve close_pg/bank_xor_enable The skx_edac driver wrongly uses the mtr register to retrieve two fields close_pg and bank_xor_enable. Fix it by using the correct mcmtr register to get the two fields. Cc: <stable@vger.kernel.org> Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Reported-by: Matthew Riley <mattdr@google.com> Acked-by: Aristeu Rozanski <aris@redhat.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Link: https://lore.kernel.org/r/20200515210146.1337-1-tony.luck@intel.com	2020-05-19 15:11:29 -07:00
Qiuxu Zhuo	ce20670828	EDAC/i10nm: Update driver to support different bus number config register offsets The i10nm_edac driver failed to load on Ice Lake and Tremont/Jacobsville servers if their CPU stepping >= 4 and failed on Ice Lake-D servers from stepping 0. The root cause was that for Ice Lake and Tremont/Jacobsville servers with CPU stepping >=4, the offset for bus number configuration register was updated from 0xcc to 0xd0. For Ice Lake-D servers, all the steppings use the updated 0xd0 offset. Fix the issue by using the appropriate offset for bus number configuration register according to the CPU model number and stepping. Reported-by: Jerry Chen <jerry.t.chen@intel.com> Reported-and-tested-by: Jin Wen <wen.jin@intel.com> Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/linux-edac/20200427084022.GC11036@zn.tnic	2020-04-27 09:40:49 -07:00
Qiuxu Zhuo	ee5340abab	EDAC, {skx,i10nm}: Make some configurations CPU model specific The device ID for configuration agent PCI device and the offset for bus number configuration register can be CPU model specific. So add a new structure res_config to make them configurable and pass res_config to {skx,i10nm}_init() and skx_get_all_bus_mappings() for use. Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/20200427083246.GB11036@zn.tnic	2020-04-27 09:29:41 -07:00
Jason Yan	b2f9fb0d67	EDAC/amd8131: Remove defined but not used bridge_str Fix the following gcc warning: drivers/edac/amd8131_edac.c:47:21: warning: ‘bridge_str’ defined but not used [-Wunused-const-variable=] static char * const bridge_str[] = { ^~~~~~~~~~ Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Jason Yan <yanaijie@huawei.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Robert Richter <rrichter@marvell.com> Link: https://lkml.kernel.org/r/20200415085006.6732-1-yanaijie@huawei.com	2020-04-24 09:08:47 +02:00
Zou Wei	58d66175d4	EDAC/thunderx: Make symbols static Make a couple of symbols static, as reported by sparse. [ bp: Massage. ] Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Zou Wei <zou_wei@huawei.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/1587624744-97240-1-git-send-email-zou_wei@huawei.com	2020-04-23 12:07:24 +02:00
Tony Luck	7fc0b9b995	EDAC: Drop the EDAC report status checks When acpi_extlog was added, we were worried that the same error would be reported more than once by different subsystems. But in the ensuing years I've seen complaints that people could not find an error log (because this mechanism suppressed the log they were looking for). Rip it all out. People are smart enough to notice the same address from different reporting mechanisms. Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Tested-by: Tony Luck <tony.luck@intel.com> Link: https://lkml.kernel.org/r/20200214222720.13168-8-tony.luck@intel.com	2020-04-14 16:01:01 +02:00

1 2 3 4 5 ...

1682 Commits