preempt_schedule_common() is marked notrace, but it does not use
_notrace() preempt_count functions and __schedule() is also not marked
notrace, which means that its perfectly possible to end up in the
tracer from preempt_schedule_common().
Steve says:
| Yep, there's some history to this. This was originally the issue that
| caused function tracing to go into infinite recursion. But now we have
| preempt_schedule_notrace(), which is used by the function tracer, and
| that function must not be traced till preemption is disabled.
|
| Now if function tracing is running and we take an interrupt when
| NEED_RESCHED is set, it calls
|
| preempt_schedule_common() (not traced)
|
| But then that calls preempt_disable() (traced)
|
| function tracer calls preempt_disable_notrace() followed by
| preempt_enable_notrace() which will see NEED_RESCHED set, and it will
| call preempt_schedule_notrace(), which stops the recursion, but
| still calls __schedule() here, and that means when we return, we call
| the __schedule() from preempt_schedule_common().
|
| That said, I prefer this patch. Preemption is disabled before calling
| __schedule(), and we get rid of a one round recursion with the
| scheduler.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
When we warn about a preempt_count leak; reset the preempt_count to
the known good value such that the problem does not ripple forward.
This is most important on x86 which has a per cpu preempt_count that is
not saved/restored (after this series). So if you schedule with an
invalid (!2*PREEMPT_DISABLE_OFFSET) preempt_count the next task is
messed up too.
Enforcing this invariant limits the borkage to just the one task.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Frederic Weisbecker <fweisbec@gmail.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Assuming units of PREEMPT_DISABLE_OFFSET for preempt_count() numbers.
Now that TASK_DEAD no longer results in preempt_count() == 3 during
scheduling, we will always call context_switch() with preempt_count()
== 2.
However, we don't always end up with preempt_count() == 2 in
finish_task_switch() because new tasks get created with
preempt_count() == 1.
Create FORK_PREEMPT_COUNT and set it to 2 and use that in the right
places. Note that we cannot use INIT_PREEMPT_COUNT as that serves
another purpose (boot).
After this, preempt_count() is invariant across the context switch,
with exception of PREEMPT_ACTIVE.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
TASK_DEAD is special in that the final schedule call from do_exit()
must be done with preemption disabled.
This means we end up scheduling with a preempt_count() higher than
usual (3 instead of the 'expected' 2).
Since future patches will want to rely on an invariant
preempt_count() value during schedule, fix this up.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Frederic Weisbecker <fweisbec@gmail.com>
Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
So the problem this patch is trying to address is as follows:
CPU0 CPU1
context_switch(A, B)
ttwu(A)
LOCK A->pi_lock
A->on_cpu == 0
finish_task_switch(A)
prev_state = A->state <-.
WMB |
A->on_cpu = 0; |
UNLOCK rq0->lock |
| context_switch(C, A)
`-- A->state = TASK_DEAD
prev_state == TASK_DEAD
put_task_struct(A)
context_switch(A, C)
finish_task_switch(A)
A->state == TASK_DEAD
put_task_struct(A)
The argument being that the WMB will allow the load of A->state on CPU0
to cross over and observe CPU1's store of A->state, which will then
result in a double-drop and use-after-free.
Now the comment states (and this was true once upon a long time ago)
that we need to observe A->state while holding rq->lock because that
will order us against the wakeup; however the wakeup will not in fact
acquire (that) rq->lock; it takes A->pi_lock these days.
We can obviously fix this by upgrading the WMB to an MB, but that is
expensive, so we'd rather avoid that.
The alternative this patch takes is: smp_store_release(&A->on_cpu, 0),
which avoids the MB on some archs, but not important ones like ARM.
Reported-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: <stable@vger.kernel.org> # v3.1+
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Cc: manfred@colorfullife.com
Cc: will.deacon@arm.com
Fixes: e4a52bcb9a ("sched: Remove rq->lock from the first half of ttwu()")
Link: http://lkml.kernel.org/r/20150929124509.GG3816@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Commit ea317b267e ("bpf: Add new bpf map type to store the pointer
to struct perf_event") added perf_event.h to the main eBPF header, so
it gets included for all users. perf_event.h is actually only needed
from array map side, so lets sanitize this a bit.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Kaixu Xia <xiakaixu@huawei.com>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The current ongoing effort to dump existing cBPF seccomp filters back
to user space requires to hold the pre-transformed instructions like
we do in case of socket filters from sk_attach_filter() side, so they
can be reloaded in original form at a later point in time by utilities
such as criu.
To prepare for this, simply extend the bpf_prog_create_from_user()
API to hold a flag that tells whether we should store the original
or not. Also, fanout filters could make use of that in future for
things like diag. While fanout filters already use bpf_prog_destroy(),
move seccomp over to them as well to handle original programs when
present.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Tycho Andersen <tycho.andersen@canonical.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Tested-by: Tycho Andersen <tycho.andersen@canonical.com>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull irq fixes from Thomas Gleixner:
"This update contains:
- Fix for a long standing race affecting /proc/irq/NNN
- One line fix for ARM GICV3-ITS counting the wrong data
- Warning silencing in ARM GICV3-ITS. Another GCC trying to be
overly clever issue"
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irqchip/gic-v3-its: Count additional LPIs for the aliased devices
irqchip/gic-v3-its: Silence warning when its_lpi_alloc_chunks gets inlined
genirq: Fix race in register_irq_proc()
Its a bit odd that debugfs_create_bool() takes 'u32 *' as an argument,
when all it needs is a boolean pointer.
It would be better to update this API to make it accept 'bool *'
instead, as that will make it more consistent and often more convenient.
Over that bool takes just a byte.
That required updates to all user sites as well, in the same commit
updating the API. regmap core was also using
debugfs_{read|write}_file_bool(), directly and variable types were
updated for that to be bool as well.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Mark Brown <broonie@kernel.org>
Acked-by: Charles Keepax <ckeepax@opensource.wolfsonmicro.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Using routing realms as part of the classifier is quite useful, it
can be viewed as a tag for one or multiple routing entries (think of
an analogy to net_cls cgroup for processes), set by user space routing
daemons or via iproute2 as an indicator for traffic classifiers and
later on processed in the eBPF program.
Unlike actions, the classifier can inspect device flags and enable
netif_keep_dst() if necessary. tc actions don't have that possibility,
but in case people know what they are doing, it can be used from there
as well (e.g. via devs that must keep dsts by design anyway).
If a realm is set, the handler returns the non-zero realm. User space
can set the full 32bit realm for the dst.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As we need to add further flags to the bpf_prog structure, lets migrate
both bools to a bitfield representation. The size of the base structure
(excluding insns) remains unchanged at 40 bytes.
Add also tags for the kmemchecker, so that it doesn't throw false
positives. Even in case gcc would generate suboptimal code, it's not
being accessed in performance critical paths.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The sync_cmos_clock has one use of struct timespec, which we want to
eventually replace with timespec64 or similar in the kernel. There
is no way this one can overflow, but the conversion to timespec64
is trivial and has no other dependencies.
Acked-by: Richard Cochran <richardcochran@gmail.com>
Acked-by: David S. Miller <davem@davemloft.net>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: John Stultz <john.stultz@linaro.org>
There is exactly one caller of getnstime_raw_and_real in the kernel,
which is the pps_get_ts function. This changes the caller and
the implementation to work on timespec64 types rather than timespec,
to avoid the time_t overflow on 32-bit architectures.
For consistency with the other new functions (ktime_get_seconds,
ktime_get_real_*, ...), I'm renaming the function to
ktime_get_raw_and_real_ts64.
We still need to convert from the internal 64-bit type to 32 bit
types in the caller, but this conversion is now pushed out from
getnstime_raw_and_real to pps_get_ts. A follow-up patch changes
the remaining pps code to completely avoid the conversion.
Acked-by: Richard Cochran <richardcochran@gmail.com>
Acked-by: David S. Miller <davem@davemloft.net>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: John Stultz <john.stultz@linaro.org>
There is only one user of the hardpps function in the kernel, so
it makes sense to atomically change it over to using 64-bit
timestamps for y2038 safety. In the hardpps implementation,
we also need to change the pps_normtime structure, which is
similar to struct timespec and also requires a 64-bit
seconds portion.
This introduces two temporary variables in pps_kc_event() to
do the conversion, they will be removed again in the next step,
which seemed preferable to having a larger patch changing it
all at the same time.
Acked-by: Richard Cochran <richardcochran@gmail.com>
Acked-by: David S. Miller <davem@davemloft.net>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: John Stultz <john.stultz@linaro.org>
To cover the common case of sorting an array of pointers, Daniel
Wagner recently modified the library sort() to use a specific swap
function for size==8, in addition to the size==4 case which was
already handled. Since sizeof(long) is either 4 or 8,
ftrace_swap_ips() is redundant and we can just let sort() pick an
appropriate and fast swap callback.
Link: http://lkml.kernel.org/r/1441834023-13130-1-git-send-email-linux@rasmusvillemoes.dk
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
HV_X64_MSR_VP_RUNTIME msr used by guest to get
"the time the virtual processor consumes running guest code,
and the time the associated logical processor spends running
hypervisor code on behalf of that guest."
Calculation of this time is performed by task_cputime_adjusted()
for vcpu task.
Necessary to support loading of winhv.sys in guest, which in turn is
required to support Windows VMBus.
Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com>
Reviewed-by: Roman Kagan <rkagan@virtuozzo.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Gleb Natapov <gleb@kernel.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The kernel now has kstrdup_const/kfree_const for reusing .rodata
(typically string literals) when possible; there's no reason to
duplicate that logic in the tracing system. Moreover, as the comment
above core_kernel_data states, it may not always return true for
.rodata - that is for example the case on x86_64, where we thus end up
kstrdup'ing all the passed-in strings.
Arguably, testing for .rodata explicitly (as kstrdup_const does) is
also more correct: I don't think one is supposed to be able to change
the name after creating the event_subsystem by passing the address of
a static char (but non-const) array.
Link: http://lkml.kernel.org/r/1441833841-12955-1-git-send-email-linux@rasmusvillemoes.dk
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Per-IRQ directories in procfs are created only when a handler is first
added to the irqdesc, not when the irqdesc is created. In the case of
a shared IRQ, multiple tasks can race to create a directory. This
race condition seems to have been present forever, but is easier to
hit with async probing.
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Link: http://lkml.kernel.org/r/1443266636.2004.2.camel@decadent.org.uk
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Add the tracer options to instances options directory as well. Only add the
options for tracers that are allowed to be enabled by an instance. But note,
that tracer options are global. That is, tracer options enabled in an
instance, also take affect at the top level and in other instances.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Allow instances to have their own options, at least for the core options
(non tracer specific ones). There are a few global options that should not
be added to instances, like enabling of trace_printk, and the sched comm
recording, which do not have a specific trace instance associated to them.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
In preparation for the multi buffer instances to have their own trace_flags,
the check in ftrace_trace_stack() needs to test the trace_array descriptor
flag that is for the current event, not the global_trace descriptor.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
In preparation of having the multi buffer instances having their own trace
option flags, the trace option files needs a way to not only pass in the
flag they represent, but also the trace_array descriptor.
A new field is added to the trace_array descriptor called trace_flags_index,
which is a 32 byte character array representing a bit. This array is simply
filled with the index of the array, where
index_array[n] = n;
Then the address of this array is passed to the file callbacks instead of
the index of the flag index. Then to retrieve both the flag index and the
trace_array descriptor:
data is the passed in argument.
index = *(unsigned char *)data;
data -= index;
/* Now data points to the address of the array in the trace_array */
tr = container_of(data, struct trace_array, trace_flags_index);
Suggested-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
In preparation to make trace options per instance, the global trace_flags
needs to be moved from being a global variable to a field within the trace
instance trace_array structure.
There's still more work to do, as there's some functions that use
trace_flags without passing in a way to get to the current_trace array. For
those, the global_trace is used directly (from trace.c). This includes
setting and clearing the trace_flags. This means that when a new instance is
created, it just gets the trace_flags of the global_trace and will not be
able to modify them. Depending on the functions that have access to the
trace_array, the flags of an instance may not affect parts of its trace,
where the global_trace is used. These will be fixed in future changes.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
The sleep-time and graph-time options are only for the function graph tracer
and are not used by anything else. As tracer options are now visible when
the tracer is not activated, its better to move the function graph specific
tracer options into the function graph tracer.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
My system keeps crashing with below message. vmstat_update() schedules a delayed
work in current cpu and expects the work runs in the cpu.
schedule_delayed_work() is expected to make delayed work run in local cpu. The
problem is timer can be migrated with NO_HZ. __queue_work() queues work in
timer handler, which could run in a different cpu other than where the delayed
work is scheduled. The end result is the delayed work runs in different cpu.
The patch makes __queue_delayed_work records local cpu earlier. Where the timer
runs doesn't change where the work runs with the change.
[ 28.010131] ------------[ cut here ]------------
[ 28.010609] kernel BUG at ../mm/vmstat.c:1392!
[ 28.011099] invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC KASAN
[ 28.011860] Modules linked in:
[ 28.012245] CPU: 0 PID: 289 Comm: kworker/0:3 Tainted: G W4.3.0-rc3+ #634
[ 28.013065] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140709_153802- 04/01/2014
[ 28.014160] Workqueue: events vmstat_update
[ 28.014571] task: ffff880117682580 ti: ffff8800ba428000 task.ti: ffff8800ba428000
[ 28.015445] RIP: 0010:[<ffffffff8115f921>] [<ffffffff8115f921>]vmstat_update+0x31/0x80
[ 28.016282] RSP: 0018:ffff8800ba42fd80 EFLAGS: 00010297
[ 28.016812] RAX: 0000000000000000 RBX: ffff88011a858dc0 RCX:0000000000000000
[ 28.017585] RDX: ffff880117682580 RSI: ffffffff81f14d8c RDI:ffffffff81f4df8d
[ 28.018366] RBP: ffff8800ba42fd90 R08: 0000000000000001 R09:0000000000000000
[ 28.019169] R10: 0000000000000000 R11: 0000000000000121 R12:ffff8800baa9f640
[ 28.019947] R13: ffff88011a81e340 R14: ffff88011a823700 R15:0000000000000000
[ 28.020071] FS: 0000000000000000(0000) GS:ffff88011a800000(0000)knlGS:0000000000000000
[ 28.020071] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 28.020071] CR2: 00007ff6144b01d0 CR3: 00000000b8e93000 CR4:00000000000006f0
[ 28.020071] Stack:
[ 28.020071] ffff88011a858dc0 ffff8800baa9f640 ffff8800ba42fe00ffffffff8106bd88
[ 28.020071] ffffffff8106bd0b 0000000000000096 0000000000000000ffffffff82f9b1e8
[ 28.020071] ffffffff829f0b10 0000000000000000 ffffffff81f18460ffff88011a81e340
[ 28.020071] Call Trace:
[ 28.020071] [<ffffffff8106bd88>] process_one_work+0x1c8/0x540
[ 28.020071] [<ffffffff8106bd0b>] ? process_one_work+0x14b/0x540
[ 28.020071] [<ffffffff8106c214>] worker_thread+0x114/0x460
[ 28.020071] [<ffffffff8106c100>] ? process_one_work+0x540/0x540
[ 28.020071] [<ffffffff81071bf8>] kthread+0xf8/0x110
[ 28.020071] [<ffffffff81071b00>] ?kthread_create_on_node+0x200/0x200
[ 28.020071] [<ffffffff81a6522f>] ret_from_fork+0x3f/0x70
[ 28.020071] [<ffffffff81071b00>] ?kthread_create_on_node+0x200/0x200
Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: stable@vger.kernel.org # v2.6.31+
Pull RCU fixes from Ingo Molnar:
"Two RCU fixes:
- work around bug with recent GCC versions.
- fix false positive lockdep splat"
* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
rcu: Suppress lockdep false positive for rcp->exp_funnel_mutex
rcu: Change _wait_rcu_gp() to work around GCC bug 67055
In the effort to move the global trace_flags to the tracing instances, the
direct access to trace_flags must be removed from trace_printk.c
Instead, add a new trace_printk_enabled boolean that is set by a new access
function trace_printk_control(), that will enable or disable trace_printk.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Add a enum that denotes the last bit of the trace_flags and have a
BUILD_BUG_ON(last_bit > 32).
If we add more bits than we have in trace_flags, the kernel wont build.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
There are options that are unique to a specific tracer (like function and
function graph). Currently, these options are only visible in the options
directory when the tracer is enabled.
This has been a pain, especially for something like the func_stack_trace
option that if used inappropriately, could bring the system to a crawl. But
the only way to see it, is to enable the function tracer.
For example, if one had done:
# cd /sys/kernel/tracing
# echo __schedule > set_ftrace_filter
# echo 1 > options/func_stack_trace
# echo function > current_tracer
The __schedule call will be traced and a stack trace will also be recorded
there. Now when you were done, you may do...
# echo nop > current_tracer
# echo > set_ftrace_filter
But you forgot to disable the func_stack_trace. The only way to disable it
is to re-enable function tracing first. If you do not add a filter to
set_ftrace_filter and just do:
# echo function > current_tracer
Now you would be performing a stack trace on *every* function! On some
systems, that causes a live lock. Others may take a few minutes to fix your
mistake.
Having the func_stack_trace option visible allows you to check it and
disable it before enabling the funtion tracer.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Only create the stacktrace trace option when CONFIG_STACKTRACE is
configured.
Cleaned up the ftrace_trace_stack() function call a little to allow better
encapsulation of the stacktrace trace flag.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
When the function tracer is not compiled in, do not create the option files
for it.
Fix up both the sched_wakeup and irqsoff tracers to handle the change.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Some drivers might use the per-cpu interrupts and still might be built as a
module. Export request_percpu_irq an free_percpu_irq to these user, which
also make it consistent with enable/disable_percpu_irq that were exported.
Reported-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The documentation of request_percpu_irq is confusing and suggest that the
interrupt is not enabled at all, while it is actually enabled on the local
CPU.
Clarify that.
Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use a cute little macro trick to keep the names of the trace flags file
guaranteed to match the corresponding masks.
The macro TRACE_FLAGS is defined as a serious of enum names followed by
the string name of the file that matches it. For example:
#define TRACE_FLAGS \
C(PRINT_PARENT, "print-parent"), \
C(SYM_OFFSET, "sym-offset"), \
C(SYM_ADDR, "sym-addr"), \
C(VERBOSE, "verbose"),
Now we can define the following:
#undef C
#define C(a, b) TRACE_ITER_##a##_BIT
enum trace_iterator_bits { TRACE_FLAGS };
The above creates:
enum trace_iterator_bits {
TRACE_ITER_PRINT_PARENT_BIT,
TRACE_ITER_SYM_OFFSET_BIT,
TRACE_ITER_SYM_ADDR_BIT,
TRACE_ITER_VERBOSE_BIT,
};
Then we can redefine C as:
#undef C
#define C(a, b) TRACE_ITER_##a = (1 << TRACE_ITER_##a##_BIT)
enum trace_iterator_flags { TRACE_FLAGS };
Which creates:
enum trace_iterator_flags {
TRACE_ITER_PRINT_PARENT = (1 << TRACE_ITER_PRINT_PARENT_BIT),
TRACE_ITER_SYM_OFFSET = (1 << TRACE_ITER_SYM_OFFSET_BIT),
TRACE_ITER_SYM_ADDR = (1 << TRACE_ITER_SYM_ADDR_BIT),
TRACE_ITER_VERBOSE = (1 << TRACE_ITER_VERBOSE_BIT),
};
Then finally we can create the list of file names:
#undef C
#define C(a, b) b
static const char *trace_options[] = {
TRACE_FLAGS
NULL
};
Which creates:
static const char *trace_options[] = {
"print-parent",
"sym-offset",
"sym-addr",
"verbose",
NULL
};
The importance of this is that the strings match the bit index.
trace_options[TRACE_ITER_SYM_ADDR_BIT] == "sym-addr"
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Using enums with FLAG_BIT and then defining a FLAG = (1 << FLAG_BIT), is a
bit more robust as we require that there are no bits out of order or skipped
to match the file names that represent the bits.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
There was a time where the function tracing would disable interrupts unless
specifically told not to, where it would only disable preemption. With the
new lockless code, the function tracing never disalbes interrupts and just
uses disabling of preemption. Remove the option "ftrace_preempt" as it does
nothing anyway.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
In order to facilitate making all tracer options visible even when the
tracer is not active, we need to get rid of duplicate options. Any option
that is shared between multiple tracers really should be a main option.
As the wakeup and irqsoff tracers both use the "display-graph" option, and
use it exactly the same way, move that option from the tracer options to the
main options and consolidate them.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
seq_print_user_ip() is used in only one location in one file. Turn it into a
static function. We could inject its code into the caller, but that would
make the code a bit too complex. Keep the code separate.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>