commit 54097309620ef0dc2d7083783dc521c6a5fef957 upstream.
Christian reported spurious module load crashes after some of Song's
module memory layout patches.
Turns out that if the very last instruction on the very last page of the
module is a 'JMP __x86_return_thunk' then __static_call_fixup() will
trip a fault and die.
And while the module rework made this slightly more likely to happen,
it's always been possible.
Fixes: ee88d363d156 ("x86,static_call: Use alternative RET encoding")
Reported-by: Christian Bricart <christian@bricart.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@kernel.org>
Link: https://lkml.kernel.org/r/20230816104419.GA982867@hirez.programming.kicks-ass.net
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c27c753ea6fd1237f4f96abf8b623d7bab505513 upstream.
__static_call_fixup() invokes __static_call_transform() without holding
text_mutex, which causes lockdep to complain in text_poke_bp().
Adding the proper locking cures that, but as this is either used during
early boot or during module finalizing, it's not required to use
text_poke_bp(). Add an argument to __static_call_transform() which tells
it to use text_poke_early() for it.
Fixes: ee88d363d156 ("x86,static_call: Use alternative RET encoding")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f43b9876e857c739d407bc56df288b0ebe1a9164 upstream.
Do fine-grained Kconfig for all the various retbleed parts.
NOTE: if your compiler doesn't support return thunks this will
silently 'upgrade' your mitigation to IBPB, you might not like this.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
[cascardo: there is no CONFIG_OBJTOOL]
[cascardo: objtool calling and option parsing has changed]
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
[bwh: Backported to 5.10:
- In scripts/Makefile.build, add the objtool option with an ifdef
block, same as for other options
- Adjust filename, context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ee88d363d15617ff50ac24fab0ffec11113b2aeb upstream.
In addition to teaching static_call about the new way to spell 'RET',
there is an added complication in that static_call() is allowed to
rewrite text before it is known which particular spelling is required.
In order to deal with this; have a static_call specific fixup in the
apply_return() 'alternative' patching routine that will rewrite the
static_call trampoline to match the definite sequence.
This in turn creates the problem of uniquely identifying static call
trampolines. Currently trampolines are 8 bytes, the first 5 being the
jmp.d32/ret sequence and the final 3 a byte sequence that spells out
'SCT'.
This sequence is used in __static_call_validate() to ensure it is
patching a trampoline and not a random other jmp.d32. That is,
false-positives shouldn't be plenty, but aren't a big concern.
OTOH the new __static_call_fixup() must not have false-positives, and
'SCT' decodes to the somewhat weird but semi plausible sequence:
push %rbx
rex.XB push %r12
Additionally, there are SLS concerns with immediate jumps. Combined it
seems like a good moment to change the signature to a single 3 byte
trap instruction that is unique to this usage and will not ever get
generated by accident.
As such, change the signature to: '0x0f, 0xb9, 0xcc', which decodes
to:
ud1 %esp, %ecx
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
[cascardo: skip validation as introduced by 2105a92748e8 ("static_call,x86: Robustify trampoline patching")]
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
[bwh: Backported to 5.10: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In order to use static_call() to wire up x86_pmu, we need to
initialize earlier, specifically before memory allocation works; copy
some of the tricks from jump_label to enable this.
Primarily we overload key->next to store a sites pointer when there
are no modules, this avoids having to use kmalloc() to initialize the
sites and allows us to run much earlier.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Link: https://lore.kernel.org/r/20200818135805.220737930@infradead.org
Extend the static_call infrastructure to optimize the following common
pattern:
if (func_ptr)
func_ptr(args...)
For the trampoline (which is in effect a tail-call), we patch the
JMP.d32 into a RET, which then directly consumes the trampoline call.
For the in-line sites we replace the CALL with a NOP5.
NOTE: this is 'obviously' limited to functions with a 'void' return type.
NOTE: DEFINE_STATIC_COND_CALL() only requires a typename, as opposed
to a full function.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20200818135805.042977182@infradead.org
Add the inline static call implementation for x86-64. The generated code
is identical to the out-of-line case, except we move the trampoline into
it's own section.
Objtool uses the trampoline naming convention to detect all the call
sites. It then annotates those call sites in the .static_call_sites
section.
During boot (and module init), the call sites are patched to call
directly into the destination function. The temporary trampoline is
then no longer used.
[peterz: merged trampolines, put trampoline in section]
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20200818135804.864271425@infradead.org
Add the x86 out-of-line static call implementation. For each key, a
permanent trampoline is created which is the destination for all static
calls for the given key. The trampoline has a direct jump which gets
patched by static_call_update() when the destination function changes.
[peterz: fixed trampoline, rewrote patching code]
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20200818135804.804315175@infradead.org