xiaomi-sm8450/android_kernel_xiaomi_sm8450

Author	SHA1	Message	Date
Adrian Hunter	a9e57009da	perf record: Fix documentation 'event_sources' -> 'event_source' Change '/sys/bus/event_sources' to the correct path which is '/sys/bus/event_source'. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Link: http://lkml.kernel.org/r/1474641528-18776-2-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2016-09-27 15:00:29 -03:00
Mathieu Poirier	dd60fba732	perf tools: Add infrastructure for PMU specific configuration This patch adds PMU driver specific configuration to the parser infrastructure by preceding any term with the '@' letter. As such doing something like: perf record -e some_event/@cfg1,@cfg2=config/ ... will see 'cfg1' and 'cfg2=config' being added to the list of evsel config terms. Token 'cfg1' and 'cfg2=config' are not processed in user space and are meant to be interpreted by the PMU driver. First the lexer/parser are supplemented with the required definitions to recognise the driver specific configuration. From there they are simply added to the list of event terms. The bulk of the work is done in function "parse_events_add_pmu()" where driver config event terms are added to a new list of driver config terms, which in turn spliced with the event's new driver configuration list. Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1473179837-3293-4-git-send-email-mathieu.poirier@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2016-09-13 17:09:11 -03:00
Masami Hiramatsu	428aff82e9	perf probe: Ignore vmlinux buildid if offline kernel is given Ignore the buildid of running kernel when both of --definition and --vmlinux is given because that kernel should be off-line. This also skips post-processing of kprobe event for relocating symbol and checking blacklist, because it can not be done on off-line kernel. E.g. without this fix perf shows an error as below ---- $ perf probe --vmlinux=./vmlinux-arm --definition do_sys_open ./vmlinux-arm with build id 7a1f76dd56e9c4da707cd3d6333f50748141434b not found, continuing without symbols Failed to find symbol do_sys_open in kernel Error: Failed to add events. ---- with this fix, we can get the definition ---- $ perf probe --vmlinux=./vmlinux-arm --definition do_sys_open p:probe/do_sys_open do_sys_open+0 ---- Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/147214228193.23638.12581984840822162131.stgit@devbox Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2016-09-01 09:44:14 -03:00
Masami Hiramatsu	1c20b1d154	perf probe: Show trace event definition Add --definition/-D option for showing the trace-event definition in stdout. This can be useful in debugging or combined with a shell script. e.g. ---- # perf probe --definition 'do_sys_open $params' p:probe/do_sys_open _text+2261728 dfd=%di:s32 filename=%si:u64 flags=%dx:s32 mode=%cx:u16 ---- Suggested-and-Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/147214226712.23638.2240534040014013658.stgit@devbox Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2016-09-01 09:44:13 -03:00
Milian Wolff	893c5c798b	perf config: Show default report configuration in example and docs Signed-off-by: Milian Wolff <milian.wolff@kdab.com> LPU-Reference: 20160830134106.21240-2-milian.wolff@kdab.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2016-09-01 09:44:13 -03:00
Masami Hiramatsu	bdca79c2bf	ftrace: kprobe: uprobe: Show u8/u16/u32/u64 types in decimal Change kprobe/uprobe-tracer to show the arguments type-casted with u8/u16/u32/u64 in decimal digits instead of hexadecimal. To minimize compatibility issue, the arguments without type casting are typed by x64 (or x32 for 32bit arch) by default. Note: all arguments set by old perf probe without types are shown in decimal by default. Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Acked-by: Steven Rostedt <rostedt@goodmis.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Hemant Kumar <hemant@linux.vnet.ibm.com> Cc: Naohiro Aota <naohiro.aota@hgst.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/147151076135.12957.14684546093034343894.stgit@devbox Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2016-08-23 17:06:38 -03:00
Masami Hiramatsu	9254378725	perf probe: Support hexadecimal casting Support hexadecimal unsigned integer casting by 'x'. This allows user to explicitly specify the output format of the probe arguments as hexadecimal. Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Hemant Kumar <hemant@linux.vnet.ibm.com> Cc: Naohiro Aota <naohiro.aota@hgst.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/147151072679.12957.4458656416765710753.stgit@devbox Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2016-08-23 17:06:37 -03:00
Masami Hiramatsu	17ce3dc7e5	ftrace: kprobe: uprobe: Add x8/x16/x32/x64 for hexadecimal types Add x8/x16/x32/x64 for hexadecimal type casting to kprobe/uprobe event tracer. These type casts can be used for integer arguments for explicitly showing them in hexadecimal digits in formatted text. Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Acked-by: Steven Rostedt <rostedt@goodmis.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Hemant Kumar <hemant@linux.vnet.ibm.com> Cc: Naohiro Aota <naohiro.aota@hgst.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/147151067029.12957.11591314629326414783.stgit@devbox Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2016-08-23 15:38:09 -03:00
Arnaldo Carvalho de Melo	fa1f456592	perf report: Allow configuring the default sort order in ~/.perfconfig Allows changing the default sort order from "comm,dso,symbol" to some other default, for instance "sym,dso" may be more fitting for kernel developers. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-pm1h5puxua8nsxksd68fjm8r@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2016-08-23 15:37:33 -03:00
Naohiro Aota	19f00b0117	perf probe: Support signedness casting The 'perf probe' tool detects a variable's type and use the detected type to add a new probe. Then, kprobes prints its variable in hexadecimal format if the variable is unsigned and prints in decimal if it is signed. We sometimes want to see unsigned variable in decimal format (i.e. sector_t or size_t). In that case, we need to investigate the variable's size manually to specify just signedness. This patch add signedness casting support. By specifying "s" or "u" as a type, perf-probe will investigate variable size as usual and use the specified signedness. E.g. without this: $ perf probe -a 'submit_bio bio->bi_iter.bi_sector' Added new event: probe:submit_bio (on submit_bio with bi_sector=bio->bi_iter.bi_sector) You can now use it in all perf tools, such as: perf record -e probe:submit_bio -aR sleep 1 $ cat trace_pipe|head dbench-9692 [003] d..1 971.096633: submit_bio: (submit_bio+0x0/0x140) bi_sector=0x3a3d00 dbench-9692 [003] d..1 971.096685: submit_bio: (submit_bio+0x0/0x140) bi_sector=0x1a3d80 dbench-9692 [003] d..1 971.096687: submit_bio: (submit_bio+0x0/0x140) bi_sector=0x3a3d80 ... // need to investigate the variable size $ perf probe -a 'submit_bio bio->bi_iter.bi_sector:s64' Added new event: probe:submit_bio (on submit_bio with bi_sector=bio->bi_iter.bi_sector:s64) You can now use it in all perf tools, such as: perf record -e probe:submit_bio -aR sleep 1 With this: // just use "s" to cast its signedness $ perf probe -v -a 'submit_bio bio->bi_iter.bi_sector:s' Added new event: probe:submit_bio (on submit_bio with bi_sector=bio->bi_iter.bi_sector:s) You can now use it in all perf tools, such as: perf record -e probe:submit_bio -aR sleep 1 $ cat trace_pipe|head dbench-9689 [001] d..1 1212.391237: submit_bio: (submit_bio+0x0/0x140) bi_sector=128 dbench-9689 [001] d..1 1212.391252: submit_bio: (submit_bio+0x0/0x140) bi_sector=131072 dbench-9697 [006] d..1 1212.398611: submit_bio: (submit_bio+0x0/0x140) bi_sector=30208 This commit also update perf-probe.txt to describe "types". Most parts are based on existing documentation: Documentation/trace/kprobetrace.txt Committer note: Testing using 'perf trace': # perf probe -a 'submit_bio bio->bi_iter.bi_sector' Added new event: probe:submit_bio (on submit_bio with bi_sector=bio->bi_iter.bi_sector) You can now use it in all perf tools, such as: perf record -e probe:submit_bio -aR sleep 1 # trace --no-syscalls --ev probe:submit_bio 0.000 probe:submit_bio:(ffffffffac3aee00) bi_sector=0xc133c0) 3181.861 probe:submit_bio:(ffffffffac3aee00) bi_sector=0x6cffb8) 3181.881 probe:submit_bio:(ffffffffac3aee00) bi_sector=0x6cffc0) 3184.488 probe:submit_bio:(ffffffffac3aee00) bi_sector=0x6cffc8) <SNIP> 4717.927 probe:submit_bio:(ffffffffac3aee00) bi_sector=0x4dc7a88) 4717.970 probe:submit_bio:(ffffffffac3aee00) bi_sector=0x4dc7880) ^C[root@jouet ~]# Now, using this new feature: [root@jouet ~]# perf probe -a 'submit_bio bio->bi_iter.bi_sector:s' Added new event: probe:submit_bio (on submit_bio with bi_sector=bio->bi_iter.bi_sector:s) You can now use it in all perf tools, such as: perf record -e probe:submit_bio -aR sleep 1 [root@jouet ~]# trace --no-syscalls --ev probe:submit_bio 0.000 probe:submit_bio:(ffffffffac3aee00) bi_sector=7145704) 0.017 probe:submit_bio:(ffffffffac3aee00) bi_sector=7145712) 0.019 probe:submit_bio:(ffffffffac3aee00) bi_sector=7145720) 2.567 probe:submit_bio:(ffffffffac3aee00) bi_sector=7145728) 5631.919 probe:submit_bio:(ffffffffac3aee00) bi_sector=0) 5631.941 probe:submit_bio:(ffffffffac3aee00) bi_sector=8) 5631.945 probe:submit_bio:(ffffffffac3aee00) bi_sector=16) 5631.948 probe:submit_bio:(ffffffffac3aee00) bi_sector=24) ^C# With callchains: # trace --no-syscalls --ev probe:submit_bio/max-stack=10/ 0.000 probe:submit_bio:(ffffffffac3aee00) bi_sector=50662544) submit_bio+0xa8200001 ([kernel.kallsyms]) submit_bh+0xa8200013 ([kernel.kallsyms]) jbd2_journal_commit_transaction+0xa8200691 ([kernel.kallsyms]) kjournald2+0xa82000ca ([kernel.kallsyms]) kthread+0xa82000d8 ([kernel.kallsyms]) ret_from_fork+0xa820001f ([kernel.kallsyms]) 0.023 probe:submit_bio:(ffffffffac3aee00) bi_sector=50662552) submit_bio+0xa8200001 ([kernel.kallsyms]) submit_bh+0xa8200013 ([kernel.kallsyms]) jbd2_journal_commit_transaction+0xa8200691 ([kernel.kallsyms]) kjournald2+0xa82000ca ([kernel.kallsyms]) kthread+0xa82000d8 ([kernel.kallsyms]) ret_from_fork+0xa820001f ([kernel.kallsyms]) 0.027 probe:submit_bio:(ffffffffac3aee00) bi_sector=50662560) submit_bio+0xa8200001 ([kernel.kallsyms]) submit_bh+0xa8200013 ([kernel.kallsyms]) jbd2_journal_commit_transaction+0xa8200691 ([kernel.kallsyms]) kjournald2+0xa82000ca ([kernel.kallsyms]) kthread+0xa82000d8 ([kernel.kallsyms]) ret_from_fork+0xa820001f ([kernel.kallsyms]) 2.593 probe:submit_bio:(ffffffffac3aee00) bi_sector=50662568) submit_bio+0xa8200001 ([kernel.kallsyms]) submit_bh+0xa8200013 ([kernel.kallsyms]) journal_submit_commit_record+0xa82001ac ([kernel.kallsyms]) jbd2_journal_commit_transaction+0xa82012e8 ([kernel.kallsyms]) kjournald2+0xa82000ca ([kernel.kallsyms]) kthread+0xa82000d8 ([kernel.kallsyms]) ret_from_fork+0xa820001f ([kernel.kallsyms]) ^C# Signed-off-by: Naohiro Aota <naohiro.aota@hgst.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Hemant Kumar <hemant@linux.vnet.ibm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1470710408-23515-1-git-send-email-naohiro.aota@hgst.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2016-08-09 10:52:22 -03:00
Brendan Gregg	bcdc09af3e	perf script: Add 'bpf-output' field to usage message This adds the 'bpf-output' field to the perf script usage message, and docs. Signed-off-by: Brendan Gregg <bgregg@netflix.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1470192469-11910-4-git-send-email-bgregg@netflix.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2016-08-09 10:46:43 -03:00
Jiri Olsa	b6f35ed774	perf record: Add --sample-cpu option Adding --sample-cpu option to be able to explicitly enable CPU sample type. Currently it's only enable implicitly in case the target is cpu related. It will be useful for following c2c record tool. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1470074555-24889-8-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2016-08-02 16:33:29 -03:00
Wang Nan	4ea648aec0	perf record: Add --tail-synthesize option When working with overwritable ring buffer there's a inconvenience problem: if perf dumps data after a long period after it starts, non-sample events may lost, which makes following 'perf report' unable to identify proc name and mmap layout. For example: # perf record -m 4 -e raw_syscalls:* -g --overwrite --switch-output \ dd if=/dev/zero of=/dev/null send SIGUSR2 after dd runs long enough. The resuling perf.data lost correct comm and mmap events: # perf script -i perf.data.2016061522374354 perf 24478 [004] 2581325.601789: raw_syscalls:sys_exit: NR 0 = 512 ^^^^ Should be 'dd' 27b2e8 syscall_slow_exit_work+0xfe2000e3 (/lib/modules/4.6.0-rc3+/build/vmlinux) 203cc7 do_syscall_64+0xfe200117 (/lib/modules/4.6.0-rc3+/build/vmlinux) b18d83 return_from_SYSCALL_64+0xfe200000 (/lib/modules/4.6.0-rc3+/build/vmlinux) 7f47c417edf0 [unknown] ([unknown]) ^^^^^^^^^^^^ Fail to unwind This patch provides a '--tail-synthesize' option, allows perf to collect system status when finalizing output file. In resuling output file, the non-sample events reflect system status when dumping data. After this patch: # perf record -m 4 -e raw_syscalls:* -g --overwrite --switch-output --tail-synthesize \ dd if=/dev/zero of=/dev/null # perf script -i perf.data.2016061600544998 dd 27364 [004] 2583244.994464: raw_syscalls:sys_enter: NR 1 (1, ... ^^ Correct comm 203a18 syscall_trace_enter_phase2+0xfe2001a8 ([kernel.kallsyms]) 203aa5 syscall_trace_enter+0xfe200055 ([kernel.kallsyms]) 203caa do_syscall_64+0xfe2000fa ([kernel.kallsyms]) b18d83 return_from_SYSCALL_64+0xfe200000 ([kernel.kallsyms]) d8e50 __GI___libc_write+0xffff01d9639f4010 (/tmp/oxygen_root-w00229757/lib64/libc-2.18.so) ^^^^^ Correct unwind This option doesn't aim to solve this problem completely. If a process terminates before SIGUSR2, we still lost its COMM and MMAP events. 
For example, we can't unwind correctly from the final perf.data we get from the previous example, because when perf collects the final output file (when we press C-c), 'dd' has been terminated so its '/proc/<pid>/mmap' becomes empty. However, this is a cheaper choice. To completely solve this problem we need to continously output non-sample events. To satisify the requirement of daemonization, we need to merge them periodically. It is possible but requires much more code and cycles. Automatically select --tail-synthesize when --overwrite is provided. Signed-off-by: Wang Nan <wangnan0@huawei.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nilay Vaish <nilayvaish@gmail.com> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1468485287-33422-16-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2016-07-15 17:27:52 -03:00
Wang Nan	626a6b784e	perf tools: Enable overwrite settings This patch allows following config terms and option: Globally setting events to overwrite; # perf record --overwrite ... Set specific events to be overwrite or no-overwrite. # perf record --event cycles/overwrite/ ... # perf record --event cycles/no-overwrite/ ... Add missing config terms and update the config term array size because the longest string length has changed. For overwritable events, it automatically selects attr.write_backward since perf requires it to be backward for reading. Test result: # perf record --overwrite -e syscalls:enter_nanosleep usleep 1 [ perf record: Woken up 2 times to write data ] [ perf record: Captured and wrote 0.011 MB perf.data (1 samples) ] # perf evlist -v syscalls:sys_enter_nanosleep: type: 2, size: 112, config: 0x134, { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|CPU|PERIOD|RAW, disabled: 1, inherit: 1, mmap: 1, comm: 1, enable_on_exec: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, write_backward: 1 # Tip: use 'perf evlist --trace-fields' to show fields for tracepoint events Signed-off-by: Wang Nan <wangnan0@huawei.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nilay Vaish <nilayvaish@gmail.com> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1468485287-33422-14-git-send-email-wangnan0@huawei.com Signed-off-by: He Kuang <hekuang@huawei.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2016-07-15 17:27:51 -03:00
Masami Hiramatsu	7e9fca51fb	perf probe: Support a special SDT probe format Support a special SDT probe format which can omit the '%' prefix only if the SDT group name starts with "sdt_". So, for example both of "%sdt_libc:setjump" and "sdt_libc:setjump" are acceptable for perf probe --add. E.g. without this: # perf probe -a sdt_libc:setjmp Semantic error :There is non-digit char in line number. ... With this: # perf probe -a sdt_libc:setjmp Added new event: sdt_libc:setjmp (on %setjmp in /usr/lib64/libc-2.20.so) You can now use it in all perf tools, such as: perf record -e sdt_libc:setjmp -aR sleep 1 Suggested-by: Brendan Gregg <brendan.d.gregg@gmail.com> Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Hemant Kumar <hemant@linux.vnet.ibm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/146831794674.17065.13359473252168740430.stgit@devbox Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2016-07-13 23:09:09 -03:00
Masami Hiramatsu	36a009fe07	perf probe: Accept %sdt and %cached event name To improve usability, support %[PROVIDER:]SDTEVENT format to add new probes on SDT and cached events. e.g. ---- # perf probe -x /lib/libc-2.17.so %lll_lock_wait_private Added new event: sdt_libc:lll_lock_wait_private (on %lll_lock_wait_private in /usr/lib/libc-2.17.so) You can now use it in all perf tools, such as: perf record -e sdt_libc:lll_lock_wait_private -aR sleep 1 # perf probe -l | more sdt_libc:lll_lock_wait_private (on __lll_lock_wait_private+21 in /usr/lib/libc-2.17.so) ---- Note that this is not only for SDT events, but also normal events with event-name. e.g. define "myevent" on cache (-n doesn't add the real probe) ---- # perf probe -x ./perf --cache -n --add 'myevent=dso__load $params' ---- Reuse the "myevent" from cache as below. ---- # perf probe -x ./perf %myevent ---- Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Hemant Kumar <hemant@linux.vnet.ibm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/146831788372.17065.3645054540325909346.stgit@devbox Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2016-07-13 23:09:05 -03:00
Arnaldo Carvalho de Melo	175b968b81	perf report: Introduce --stdio-color to setup the color output mode selection 'perf report --stdio' will colorize entries with most hits and possibly some other aspects of its output, but those colors gets suppressed if we redirect the output to a non-tty, allow keeping the colors by adding a new option, --stdio-color, now this use case will also output escape sequences for colors: $ perf annotate --stdio-color | more Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-3iuawqjldu4i8gziot7e3d5n@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2016-07-12 00:00:39 -03:00
Arnaldo Carvalho de Melo	53fe4ba1da	perf annotate: Introduce --stdio-color to setup the color output mode selection 'perf annotate --stdio' will colorize entries with most hits and possibly some other aspects of its output, but those colors gets suppressed if we redirect the output to a non-tty, allow keeping the colors by adding a new option, --stdio-color, now this use case will also output escape sequences for colors: $ perf annotate --stdio-color | more Based-on-a-patch-by: Peter Zijlstra <peterz@infradead.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-sjrnixani5pg6qez640gaxhf@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2016-07-12 00:00:39 -03:00
Chris Phlipot	3d0376113e	perf tools: Update android build documentation Update the android build documentation according to recent android build fixes. The instructions for step 1a and step 2 were updated to work with NDK version 11(oldest supported version) and NDK version 12(current version). Signed-off-by: Chris Phlipot <cphlipot0@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1467349955-1135-5-git-send-email-cphlipot0@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2016-07-04 20:27:27 -03:00
Masami Hiramatsu	6430a94ead	perf buildid-cache: Scan and import user SDT events to probe cache perf buildid-cache --add <binary> scans given binary and add the SDT events to probe cache. "sdt_" prefix is appended for all SDT providers to avoid event-name clash with other pre-defined events. It is possible to use the cached SDT events as other cached events, via perf probe --add "sdt_<provider>:<event>=<event>". e.g. ---- # perf buildid-cache --add /lib/libc-2.17.so # perf probe --cache --list | head -n 5 /usr/lib/libc-2.17.so (a6fb821bdf53660eb2c29f778757aef294d3d392): sdt_libc:setjmp=setjmp sdt_libc:longjmp=longjmp sdt_libc:longjmp_target=longjmp_target sdt_libc:memory_heap_new=memory_heap_new # perf probe -x /usr/lib/libc-2.17.so \ -a sdt_libc:memory_heap_new=memory_heap_new Added new event: sdt_libc:memory_heap_new (on memory_heap_new in /usr/lib/libc-2.17.so) You can now use it in all perf tools, such as: perf record -e sdt_libc:memory_heap_new -aR sleep 1 # perf probe -l sdt_libc:memory_heap_new (on new_heap+183 in /usr/lib/libc-2.17.so) ---- Note that SDT event entries in probe-cache file is somewhat different from normal cached events. Normal one starts with "#", but SDTs are starting with "%". Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Hemant Kumar <hemant@linux.vnet.ibm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/146736025058.27797.13043265488541434502.stgit@devbox Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2016-07-04 19:39:00 -03:00
Masami Hiramatsu	8d993d9690	perf probe: Add group name support Allow user to set group name for adding new event. Note that user must ensure that the group name doesn't conflict with existing group name carefully. E.g. Existing group name can conflict with other events. Especially, using the group name reserved for kernel modules can hide kernel embedded events when loading modules. Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Hemant Kumar <hemant@linux.vnet.ibm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/146736024091.27797.9471545190066268995.stgit@devbox Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2016-07-04 19:39:00 -03:00
Masami Hiramatsu	4a0f65c102	perf probe: Remove caches when --cache is given 'perf probe --del' removes caches when '--cache' is given. Note that the delete pattern is not the same as for normal events. If you cached probes with event name, --del "eventname" works as expected. However, if you skipped it, the cached probes doesn't have actual event name. In that case --del "probe-desc" is required (wildcard is acceptable). For example a cache entry has the probe-desc "vfs_read $params", you can remove it with --del 'vfs_read*'. ----- # perf probe --cache --list /[kernel.kallsyms] (1466a0a250b5d0070c6d0f03c5fed30b237970a1): vfs_read $params /usr/lib64/libc-2.17.so (c31ffe7942bfd77b2fca8f9bd5709d387a86d3bc): getaddrinfo $params # perf probe --cache --del vfs_read\* Removed cached event: probe:vfs_read # perf probe --cache --list /[kernel.kallsyms] (1466a0a250b5d0070c6d0f03c5fed30b237970a1): /usr/lib64/libc-2.17.so (c31ffe7942bfd77b2fca8f9bd5709d387a86d3bc): getaddrinfo $params ----- Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Hemant Kumar <hemant@linux.vnet.ibm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/146736021651.27797.10250879847070772920.stgit@devbox Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2016-07-01 11:34:57 -03:00
Masami Hiramatsu	1f3736c9c8	perf probe: Show all cached probes perf probe --list shows all cached probes when --cache is given. Each caches are shown with on which binary that probed. E.g.: ----- # perf probe --cache vfs_read \$params # perf probe --cache -x /lib64/libc-2.17.so getaddrinfo \$params # perf probe --cache --list [kernel.kallsyms] (1466a0a250b5d0070c6d0f03c5fed30b237970a1): vfs_read $params /usr/lib64/libc-2.17.so (c31ffe7942bfd77b2fca8f9bd5709d387a86d3bc): getaddrinfo $params ----- Note that $params requires debuginfo. Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Hemant Kumar <hemant@linux.vnet.ibm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/146736020674.27797.13488316780383460180.stgit@devbox Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2016-07-01 11:34:57 -03:00
Jiri Olsa	7fa9b8fba0	perf test: Add -F/--dont-fork option Adding -F/--dont-fork option to bypass forking for each test. It's useful for debugging test. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Nilay Vaish <nilayvaish@gmail.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1467113345-12669-1-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2016-06-30 18:27:45 -03:00
Andi Kleen	d4897e1935	perf tools: Add documentation for perf.data on disk format Add some documentation for the on disk format of perf.data. This is not documenting the actual perf events -- which are documented in perf_event.h -- but just the additional headers that perf record adds around them when writing the data to disk. Signed-off-by: Andi Kleen <ak@linux.intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/1466800885-12974-1-git-send-email-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2016-06-29 10:07:23 -03:00
Wang Nan	9e1a7ea19f	perf data ctf: Add '--all' option for 'perf data convert' After this patch, 'perf data convert' convert comm events to output CTF stream. Result: # perf record -a sleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.378 MB perf.data (73 samples) ] # perf data convert --to-ctf ./out.ctf [ perf data convert: Converted 'perf.data' into CTF data './out.ctf' ] [ perf data convert: Converted and wrote 0.003 MB (73 samples) ] # babeltrace --clock-seconds ./out.ctf/ [10627.402515791] (+?.?????????) cycles:ppp: { cpu_id = 0 }, { perf_ip = 0xFFFFFFFF81065AF4, perf_tid = 0, perf_pid = 0, perf_period = 1 } [10627.402518972] (+0.000003181) cycles:ppp: { cpu_id = 0 }, { perf_ip = 0xFFFFFFFF81065AF4, perf_tid = 0, perf_pid = 0, perf_period = 1 } ... // only sample event is converted # perf data convert --all --to-ctf ./out.ctf [ perf data convert: Converted 'perf.data' into CTF data './out.ctf' ] [ perf data convert: Converted and wrote 0.023 MB (73 samples, 384 non-samples) ] # babeltrace --clock-seconds ./out.ctf/ [ 0.000000000] (+?.?????????) perf_comm: { cpu_id = 0 }, { pid = 1, tid = 1, comm = "init" } [ 0.000000000] (+0.000000000) perf_comm: { cpu_id = 0 }, { pid = 2, tid = 2, comm = "kthreadd" } [ 0.000000000] (+0.000000000) perf_comm: { cpu_id = 0 }, { pid = 3, tid = 3, comm = "ksoftirqd/0" } ... // comm events are converted [10627.402515791] (+10627.402515791) cycles:ppp: { cpu_id = 0 }, { perf_ip = 0xFFFFFFFF81065AF4, perf_tid = 0, perf_pid = 0, perf_period = 1 } [10627.402518972] (+0.000003181) cycles:ppp: { cpu_id = 0 }, { perf_ip = 0xFFFFFFFF81065AF4, perf_tid = 0, perf_pid = 0, perf_period = 1 } ... 
// samples are also converted Signed-off-by: Wang Nan <wangnan0@huawei.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1466767332-114472-7-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2016-06-28 10:54:57 -03:00
Adrian Hunter	e216708d98	perf script: Add callindent option Based on patches from Andi Kleen. When printing PT instruction traces with perf script it is rather useful to see some indentation for the call tree. This patch adds a new callindent field to perf script that prints spaces for the function call stack depth. We already have code to track the function call stack for PT, that we can reuse with minor modifications. The resulting output is not quite as nice as ftrace yet, but a lot better than what was there before. Note there are some corner cases when the thread stack gets code confused and prints incorrect indentation. Even with that it is fairly useful. When displaying kernel code traces it is recommended to run as root, as otherwise perf doesn't understand the kernel addresses properly, and may not reset the call stack correctly on kernel boundaries. Example output: sudo perf-with-kcore record eg2 -a -e intel_pt// -- sleep 1 sudo perf-with-kcore script eg2 --ns -F callindent,time,comm,pid,sym,ip,addr,flags,cpu --itrace=cre | less ... 
swapper 0 [000] 5830.389116586: call irq_exit ffffffff8104d620 smp_call_function_single_interrupt+0x30 => ffffffff8107e720 irq_exit swapper 0 [000] 5830.389116586: call idle_cpu ffffffff8107e769 irq_exit+0x49 => ffffffff810a3970 idle_cpu swapper 0 [000] 5830.389116586: return idle_cpu ffffffff810a39b7 idle_cpu+0x47 => ffffffff8107e76e irq_exit swapper 0 [000] 5830.389116586: call tick_nohz_irq_exit ffffffff8107e7bd irq_exit+0x9d => ffffffff810f2fc0 tick_nohz_irq_exit swapper 0 [000] 5830.389116919: call __tick_nohz_idle_enter ffffffff810f2fe0 tick_nohz_irq_exit+0x20 => ffffffff810f28d0 __tick_nohz_idle_enter swapper 0 [000] 5830.389116919: call ktime_get ffffffff810f28f1 __tick_nohz_idle_enter+0x21 => ffffffff810e9ec0 ktime_get swapper 0 [000] 5830.389116919: call read_tsc ffffffff810e9ef6 ktime_get+0x36 => ffffffff81035070 read_tsc swapper 0 [000] 5830.389116919: return read_tsc ffffffff81035084 read_tsc+0x14 => ffffffff810e9efc ktime_get swapper 0 [000] 5830.389116919: return ktime_get ffffffff810e9f46 ktime_get+0x86 => ffffffff810f28f6 __tick_nohz_idle_enter swapper 0 [000] 5830.389116919: call sched_clock_idle_sleep_event ffffffff810f290b __tick_nohz_idle_enter+0x3b => ffffffff810a7380 sched_clock_idle_sleep_event swapper 0 [000] 5830.389116919: call sched_clock_cpu ffffffff810a738b sched_clock_idle_sleep_event+0xb => ffffffff810a72e0 sched_clock_cpu swapper 0 [000] 5830.389116919: call sched_clock ffffffff810a734d sched_clock_cpu+0x6d => ffffffff81035750 sched_clock swapper 0 [000] 5830.389116919: call native_sched_clock ffffffff81035754 sched_clock+0x4 => ffffffff81035640 native_sched_clock swapper 0 [000] 5830.389116919: return native_sched_clock ffffffff8103568c native_sched_clock+0x4c => ffffffff81035759 sched_clock swapper 0 [000] 5830.389116919: return sched_clock ffffffff8103575c sched_clock+0xc => ffffffff810a7352 sched_clock_cpu swapper 0 [000] 5830.389116919: return sched_clock_cpu ffffffff810a7356 sched_clock_cpu+0x76 => ffffffff810a7390 
sched_clock_idle_sleep_event swapper 0 [000] 5830.389116919: return sched_clock_idle_sleep_event ffffffff810a7391 sched_clock_idle_sleep_event+0x11 => ffffffff810f2910 __tick_nohz_idle_enter ... Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Andi Kleen <ak@linux.intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1466689258-28493-4-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2016-06-23 17:04:26 -03:00
Adrian Hunter	055cd33d93	perf script: Print sample flags more nicely The flags field is synthesized and may have a value when Instruction Trace decoding. The flags are "bcrosyiABEx" which stand for branch, call, return, conditional, system, asynchronous, interrupt, transaction abort, trace begin, trace end, and in transaction, respectively. Change the display so that known combinations of flags are printed more nicely e.g.: "call" for "bc", "return" for "br", "jcc" for "bo", "jmp" for "b", "int" for "bci", "iret" for "bri", "syscall" for "bcs", "sysret" for "brs", "async" for "by", "hw int" for "bcyi", "tx abrt" for "bA", "tr strt" for "bB", "tr end" for "bE". However the "x" flag will be displayed separately in those cases e.g. "jcc (x)" for a condition branch within a transaction. Example: perf record -e intel_pt//u ls perf script --ns -F comm,cpu,pid,tid,time,ip,addr,sym,dso,symoff,flags ... ls 3689/3689 [001] 2062.020965237: jcc 7f06a958847a _dl_sysdep_start+0xfa (/lib/x86_64-linux-gnu/ld-2.19.so) => 7f06a9588450 _dl_sysdep_start+0xd0 (/lib/x86_64-linux-gnu/ld-2.19.so) ls 3689/3689 [001] 2062.020965237: jmp 7f06a9588461 _dl_sysdep_start+0xe1 (/lib/x86_64-linux-gnu/ld-2.19.so) => 7f06a95885a0 _dl_sysdep_start+0x220 (/lib/x86_64-linux-gnu/ld-2.19.so) ls 3689/3689 [001] 2062.020965237: jmp 7f06a95885a4 _dl_sysdep_start+0x224 (/lib/x86_64-linux-gnu/ld-2.19.so) => 7f06a9588470 _dl_sysdep_start+0xf0 (/lib/x86_64-linux-gnu/ld-2.19.so) ls 3689/3689 [001] 2062.020965904: call 7f06a95884c3 _dl_sysdep_start+0x143 (/lib/x86_64-linux-gnu/ld-2.19.so) => 7f06a9589140 brk+0x0 (/lib/x86_64-linux-gnu/ld-2.19.so) ls 3689/3689 [001] 2062.020965904: syscall 7f06a958914a brk+0xa (/lib/x86_64-linux-gnu/ld-2.19.so) => 0 [unknown] ([unknown]) ls 3689/3689 [001] 2062.020966237: tr strt 0 [unknown] ([unknown]) => 7f06a958914c brk+0xc (/lib/x86_64-linux-gnu/ld-2.19.so) ls 3689/3689 [001] 2062.020966237: return 7f06a9589165 brk+0x25 (/lib/x86_64-linux-gnu/ld-2.19.so) => 7f06a95884c8 
_dl_sysdep_start+0x148 (/lib/x86_64-linux-gnu/ld-2.19.so) ls 3689/3689 [001] 2062.020966237: jcc 7f06a95884d7 _dl_sysdep_start+0x157 (/lib/x86_64-linux-gnu/ld-2.19.so) => 7f06a95885f0 _dl_sysdep_start+0x270 (/lib/x86_64-linux-gnu/ld-2.19.so) ls 3689/3689 [001] 2062.020966237: call 7f06a95885f0 _dl_sysdep_start+0x270 (/lib/x86_64-linux-gnu/ld-2.19.so) => 7f06a958ac50 strlen+0x0 (/lib/x86_64-linux-gnu/ld-2.19.so) ls 3689/3689 [001] 2062.020966237: jcc 7f06a958ac6e strlen+0x1e (/lib/x86_64-linux-gnu/ld-2.19.so) => 7f06a958ac60 strlen+0x10 (/lib/x86_64-linux-gnu/ld-2.19.so) ... Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Andi Kleen <ak@linux.intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1466689258-28493-2-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2016-06-23 16:36:59 -03:00
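The flag-to-name mapping described in the commit above can be sketched as a small lookup table. This is an illustration in Python only, not perf's implementation (perf is written in C); the names `FLAG_NAMES` and `describe_flags` are hypothetical, and the fallback of printing the raw letters for unknown combinations is an assumption.

```python
# Known flag combinations and their display names, as listed in the commit message.
FLAG_NAMES = {
    "bc": "call",
    "br": "return",
    "bo": "jcc",
    "b": "jmp",
    "bci": "int",
    "bri": "iret",
    "bcs": "syscall",
    "brs": "sysret",
    "by": "async",
    "bcyi": "hw int",
    "bA": "tx abrt",
    "bB": "tr strt",
    "bE": "tr end",
}


def describe_flags(flags: str) -> str:
    """Return a readable name for a sample flags string (hypothetical helper)."""
    in_transaction = "x" in flags
    base = flags.replace("x", "")  # "x" is displayed separately
    name = FLAG_NAMES.get(base)
    if name is None:
        # Assumption: unknown combinations fall back to the raw letters.
        return flags
    return f"{name} (x)" if in_transaction else name


print(describe_flags("bc"))   # a call
print(describe_flags("box"))  # a conditional branch inside a transaction
```

Keeping the "x" flag out of the table avoids doubling the number of entries: every known combination can occur with or without it.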
Wang Nan	0aab21363f	perf record: Add --dry-run option to check cmdline options With '--dry-run', 'perf record' doesn't do real recording. Combined with the llvm.dump-obj option, --dry-run can be used to help compile BPF objects for embedded platforms. Signed-off-by: Wang Nan <wangnan0@huawei.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1466064161-48553-3-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2016-06-21 13:18:35 -03:00
Adrian Hunter	cbb0bba9f3	perf script: Fix documentation of '-f' when it should be '-F' The documentation for perf script mixes up '-f' and '-F'. Fix it. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/None Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2016-06-21 13:18:33 -03:00
Masami Hiramatsu	2fd457a345	perf probe: Add --cache option to cache the probe definitions Add --cache option to cache the probe definitions. This just saves the result of the dwarf analysis to probe cache. Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Hemant Kumar <hemant@linux.vnet.ibm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20160615032840.31330.44412.stgit@devbox Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2016-06-15 14:34:42 -03:00
Jiri Olsa	b0d745b3c3	perf mem: Add --ldlat option Adding --ldlat option to specify desired latency for loads event. Specify 50 as loads event latency: $ perf mem record -e ldlat-loads -v --ldlat 50 true calling: record -W -d -e cpu/mem-loads,ldlat=50/P true Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1465928361-2442-2-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2016-06-15 10:35:27 -03:00
Andi Kleen	44b1e60ab5	perf stat: Basic support for TopDown in perf stat Add basic plumbing for TopDown in perf stat. TopDown is intended to replace the frontend cycles idle/backend cycles idle metrics in standard perf stat output. These metrics are not reliable in many workloads, due to out of order effects. This implements a new --topdown mode in perf stat (similar to --transaction) that measures the pipeline bottlenecks using standardized formulas. The measurement can all be done with 5 counters (one fixed counter). The result is four metrics: FrontendBound, BackendBound, BadSpeculation, Retiring that describe the CPU pipeline behavior on a high level. The full top down methodology has many hierarchical metrics. This implementation only supports level 1 which can be collected without multiplexing. A full implementation of top down on top of perf is available in pmu-tools toplev. (http://github.com/andikleen/pmu-tools) The current version works on Intel Core CPUs starting with Sandy Bridge, and Atom CPUs starting with Silvermont. In principle the generic metrics should be also implementable on other out of order CPUs. TopDown level 1 uses a set of abstracted metrics which are generic to out of order CPU cores (although some CPUs may not implement all of them): topdown-total-slots Available slots in the pipeline topdown-slots-issued Slots issued into the pipeline topdown-slots-retired Slots successfully retired topdown-fetch-bubbles Pipeline gaps in the frontend topdown-recovery-bubbles Pipeline gaps during recovery from misspeculation These metrics then allow computing four useful metrics: FrontendBound, BackendBound, Retiring, BadSpeculation. Add a new --topdown option to enable events. When --topdown is specified set up events for all topdown events supported by the kernel. Add topdown-* as a special case to the event parser, as is needed for all events containing -. The actual code to compute the metrics is in follow-on patches. 
v2: Use standard sysctl read function. v3: Move x86 specific code to arch/ v4: Enable --metric-only implicitly for topdown. v5: Add --single-thread option to not force per core mode v6: Fix output order of topdown metrics v7: Allow combining with -d v8: Remove --single-thread again v9: Rename functions, adding arch_ and topdown_. v10: Expand man page and describe TopDown better Paste intro into commit description. Print error when malloc fails. Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/1464119559-17203-1-git-send-email-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2016-06-06 17:04:15 -03:00
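The commit above defers the metric computation to follow-on patches, so the following is only a sketch of how the five abstracted counters could combine into the four level 1 metrics. The formulas are an assumption based on the commonly documented TopDown level 1 definitions, not text from this commit, and the function name `topdown_level1` is hypothetical.

```python
def topdown_level1(total_slots, slots_issued, slots_retired,
                   fetch_bubbles, recovery_bubbles):
    """Compute level 1 TopDown fractions from raw counter values.

    Assumed formulas (not taken from the commit message itself):
      FrontendBound  = fetch_bubbles / total_slots
      BadSpeculation = (slots_issued - slots_retired + recovery_bubbles) / total_slots
      Retiring       = slots_retired / total_slots
      BackendBound   = 1 - (the three fractions above)
    """
    if total_slots <= 0:
        raise ValueError("total_slots must be positive")
    frontend = fetch_bubbles / total_slots
    bad_spec = (slots_issued - slots_retired + recovery_bubbles) / total_slots
    retiring = slots_retired / total_slots
    # Clamp at zero so small counter skew cannot produce a negative fraction.
    backend = max(0.0, 1.0 - frontend - bad_spec - retiring)
    return {
        "FrontendBound": frontend,
        "BadSpeculation": bad_spec,
        "Retiring": retiring,
        "BackendBound": backend,
    }


metrics = topdown_level1(total_slots=1000, slots_issued=600, slots_retired=500,
                         fetch_bubbles=200, recovery_bubbles=50)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

BackendBound is derived as the remainder, which is why the four fractions sum to one whenever the counters are consistent.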
Andi Kleen	508be0dfe6	perf report: Add srcline_from/to branch sort keys Add "srcline_from" and "srcline_to" branch sort keys that allow to show the source lines of a branch. That makes it much easier to track down where particular branches happen in the program, for example to examine branch mispredictions, or to associate it with cycle counts: % perf record -b -e cycles:p ./tcall % perf report --sort srcline_from,srcline_to,mispredict ... 15.10% tcall.c:18 tcall.c:10 N 14.83% tcall.c:11 tcall.c:5 N 14.12% tcall.c:7 tcall.c:12 N 14.04% tcall.c:12 tcall.c:5 N 12.42% tcall.c:17 tcall.c:18 N 12.39% tcall.c:7 tcall.c:13 N 12.27% tcall.c:13 tcall.c:17 N ... % perf report --sort srcline_from,srcline_to,cycles ... 17.12% tcall.c:18 tcall.c:11 1 17.01% tcall.c:12 tcall.c:6 1 16.98% tcall.c:11 tcall.c:6 1 15.91% tcall.c:17 tcall.c:18 1 6.38% tcall.c:7 tcall.c:17 7 4.80% tcall.c:7 tcall.c:12 8 4.21% tcall.c:7 tcall.c:17 8 2.67% tcall.c:7 tcall.c:12 7 2.62% tcall.c:7 tcall.c:12 10 2.10% tcall.c:7 tcall.c:17 9 1.58% tcall.c:7 tcall.c:12 6 1.44% tcall.c:7 tcall.c:12 5 1.38% tcall.c:7 tcall.c:12 9 1.06% tcall.c:7 tcall.c:17 13 1.05% tcall.c:7 tcall.c:12 4 1.01% tcall.c:7 tcall.c:17 6 Open issues: - Some kernel symbols get misresolved. Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Link: http://lkml.kernel.org/r/1463775308-32748-1-git-send-email-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2016-05-23 11:25:16 -03:00
Arnaldo Carvalho de Melo	fe176085a4	perf tools: Fix usage of max_stack sysctl We cannot limit processing stacks from the current value of the sysctl, as we may be processing perf.data files, possibly from other machines. Instead use the old PERF_MAX_STACK_DEPTH, the sysctl default, that can be overridden using --max-stack or equivalent. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: Wang Nan <wangnan0@huawei.com> Cc: Zefan Li <lizefan@huawei.com> Fixes: `4cb93446c5` ("perf tools: Set the maximum allowed stack from /proc/sys/kernel/perf_event_max_stack") Link: http://lkml.kernel.org/n/tip-eqeutsr7n7wy0c36z24ytvii@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2016-05-20 11:43:56 -03:00
Wang Nan	0c1d46a879	perf record: Disable buildid cache options by default in switch output mode The cost of buildid cache processing is high: reading all events in output perf.data, opening each elf file to read buildids then copying them into ~/.debug directory. In switch output mode, this heavy work blocks perf from receiving perf events for too long. Enable no-buildid and no-buildid-cache by default if --switch-output is provided. Still allow the user to use --no-no-buildid to explicitly enable buildid in this case. Signed-off-by: Wang Nan <wangnan0@huawei.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1461178794-40467-6-git-send-email-wangnan0@huawei.com Signed-off-by: He Kuang <hekuang@huawei.com> [ Updated man page ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2016-04-28 09:58:59 -03:00
Wang Nan	eca857ab38	perf record: Force enable --timestamp-filename when --switch-output is provided Without this patch, the last output doesn't have timestamp appended if --timestamp-filename is not explicitly provided. For example: # perf record -a --switch-output & [1] 11224 # kill -s SIGUSR2 11224 [ perf record: dump data: Woken up 1 times ] # [ perf record: Dump perf.data.2015122622372823 ] # fg perf record -a --switch-output ^C[ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.027 MB perf.data (540 samples) ] # ls -l total 836 -rw------- 1 root root 33256 Dec 26 22:37 perf.data <---- Odd -rw------- 1 root root 817156 Dec 26 22:37 perf.data.2015122622372823 Signed-off-by: Wang Nan <wangnan0@huawei.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1461178794-40467-5-git-send-email-wangnan0@huawei.com Signed-off-by: He Kuang <hekuang@huawei.com> [ Updated man page, that also got an entry for --timestamp-filename ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2016-04-28 09:58:59 -03:00
Wang Nan	3c1cb7e372	perf record: Split output into multiple files via '--switch-output' Allow 'perf record' to split its output into multiple files. For example: # ~/perf record -a --timestamp-filename --switch-output & [1] 10763 # kill -s SIGUSR2 10763 [ perf record: dump data: Woken up 1 times ] # [ perf record: Dump perf.data.2015122622314468 ] # kill -s SIGUSR2 10763 [ perf record: dump data: Woken up 1 times ] # [ perf record: Dump perf.data.2015122622314762 ] # kill -s SIGUSR2 10763 [ perf record: dump data: Woken up 1 times ] #[ perf record: Dump perf.data.2015122622315171 ] # fg perf record -a --timestamp-filename --switch-output ^C[ perf record: Woken up 1 times to write data ] [ perf record: Dump perf.data.2015122622315513 ] [ perf record: Captured and wrote 0.014 MB perf.data.<timestamp> (296 samples) ] # ls -l total 920 -rw------- 1 root root 797692 Dec 26 22:31 perf.data.2015122622314468 -rw------- 1 root root 59960 Dec 26 22:31 perf.data.2015122622314762 -rw------- 1 root root 59912 Dec 26 22:31 perf.data.2015122622315171 -rw------- 1 root root 19220 Dec 26 22:31 perf.data.2015122622315513 Signed-off-by: Wang Nan <wangnan0@huawei.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1461178794-40467-4-git-send-email-wangnan0@huawei.com Signed-off-by: He Kuang <hekuang@huawei.com> [ Added man page entry, used the re-synthesize patch in this series as a fixup ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2016-04-28 09:58:59 -03:00
Arnaldo Carvalho de Melo	4cb93446c5	perf tools: Set the maximum allowed stack from /proc/sys/kernel/perf_event_max_stack There is an upper limit to what tooling considers a valid callchain, and it was tied to the hardcoded value in the kernel, PERF_MAX_STACK_DEPTH (127), now that this can be tuned via a sysctl, make it read it and use that as the upper limit, falling back to PERF_MAX_STACK_DEPTH for kernels where this sysctl isn't present. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-yjqsd30nnkogvj5oyx9ghir9@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2016-04-27 10:29:07 -03:00
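The read-with-fallback behaviour described in the commit above can be sketched as follows. This is an illustration in Python, not the perf source (which is C); the sysctl path and the value 127 come from the commit message, while the function name `read_max_stack` and its parameter are hypothetical.

```python
from pathlib import Path

# Hardcoded kernel default named in the commit message.
PERF_MAX_STACK_DEPTH = 127


def read_max_stack(sysctl_path="/proc/sys/kernel/perf_event_max_stack"):
    """Return the sysctl value, or PERF_MAX_STACK_DEPTH if it is unavailable."""
    try:
        return int(Path(sysctl_path).read_text().strip())
    except (OSError, ValueError):
        # Kernels without this sysctl (or unreadable content) use the old default.
        return PERF_MAX_STACK_DEPTH


print(read_max_stack())
```

Taking the path as a parameter keeps the sketch testable on machines where the sysctl file does not exist.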
Arnaldo Carvalho de Melo	f3e459d16a	perf trace: Bump --mmap-pages when --call-graph is used by the root user To reduce the chances we'll overflow the mmap buffer, manual fine tuning trumps this. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-wxygbxmp1v9mng1ea28wet02@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2016-04-15 17:52:34 -03:00
Arnaldo Carvalho de Melo	0561499326	perf trace: Make --{min,max}-stack imply "--call-graph dwarf" If one uses: # perf trace --min-stack 16 Then it implicitly means that callgraphs should be enabled, and the best option in terms of widespread availability is "dwarf". Further work is needed to choose a better alternative, LBR, in capable systems. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-xtjmnpkyk42npekxz3kynzmx@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2016-04-15 16:41:19 -03:00
Arnaldo Carvalho de Melo	5cf9c84e21	perf trace: Introduce --min-stack filter Counterpart to --max-stack, to help focusing on deeply nested calls. Can be combined with --duration, etc. E.g.: System wide syscall tracing looking for call stacks longer than 66: # trace --mmap-pages 32768 --filter-pid 2711 --call-graph dwarf,16384 --min-stack 66 Or more compactly: # trace -m 32768 --filt 2711 --call dwarf,16384 --min-st 66 363.027 ( 0.002 ms): gnome-shell/2287 poll(ufds: 0x7ffc5ea24230, nfds: 1, timeout_msecs: 4294967295 ) = 1 [0xf6fdd] (/usr/lib64/libc-2.22.so) _xcb_conn_wait+0x92 (/usr/lib64/libxcb.so.1.1.0) _xcb_out_send+0x4d (/usr/lib64/libxcb.so.1.1.0) xcb_writev+0x45 (/usr/lib64/libxcb.so.1.1.0) _XSend+0x19e (/usr/lib64/libX11.so.6.3.0) _XReply+0x82 (/usr/lib64/libX11.so.6.3.0) XSync+0x4d (/usr/lib64/libX11.so.6.3.0) dri3_bind_tex_image+0x42 (/usr/lib64/libGL.so.1.2.0) _cogl_winsys_texture_pixmap_x11_update+0x117 (/usr/lib64/libcogl.so.20.4.1) _cogl_texture_pixmap_x11_update+0x67 (/usr/lib64/libcogl.so.20.4.1) _cogl_texture_pixmap_x11_pre_paint+0x13 (/usr/lib64/libcogl.so.20.4.1) _cogl_pipeline_layer_pre_paint+0x5e (/usr/lib64/libcogl.so.20.4.1) _cogl_rectangles_validate_layer_cb+0x1b (/usr/lib64/libcogl.so.20.4.1) cogl_pipeline_foreach_layer+0xbe (/usr/lib64/libcogl.so.20.4.1) _cogl_framebuffer_draw_multitextured_rectangles+0x77 (/usr/lib64/libcogl.so.20.4.1) cogl_framebuffer_draw_multitextured_rectangle+0x51 (/usr/lib64/libcogl.so.20.4.1) paint_clipped_rectangle+0xb6 (/usr/lib64/libmutter.so.0.0.0) meta_shaped_texture_paint+0x3e3 (/usr/lib64/libmutter.so.0.0.0) _g_closure_invoke_va+0xb2 (/usr/lib64/libgobject-2.0.so.0.4600.2) g_signal_emit_valist+0xc0d (/usr/lib64/libgobject-2.0.so.0.4600.2) g_signal_emit+0x8f (/usr/lib64/libgobject-2.0.so.0.4600.2) clutter_actor_continue_paint+0x2bb (/usr/lib64/libclutter-1.0.so.0.2400.2) clutter_actor_paint.part.41+0x47b (/usr/lib64/libclutter-1.0.so.0.2400.2) clutter_actor_real_paint+0x20 
(/usr/lib64/libclutter-1.0.so.0.2400.2) _g_closure_invoke_va+0xb2 (/usr/lib64/libgobject-2.0.so.0.4600.2) g_signal_emit_valist+0xc0d (/usr/lib64/libgobject-2.0.so.0.4600.2) g_signal_emit+0x8f (/usr/lib64/libgobject-2.0.so.0.4600.2) clutter_actor_continue_paint+0x2bb (/usr/lib64/libclutter-1.0.so.0.2400.2) clutter_actor_paint.part.41+0x47b (/usr/lib64/libclutter-1.0.so.0.2400.2) clutter_actor_real_paint+0x20 (/usr/lib64/libclutter-1.0.so.0.2400.2) meta_window_actor_paint+0x14b (/usr/lib64/libmutter.so.0.0.0) _g_closure_invoke_va+0xb2 (/usr/lib64/libgobject-2.0.so.0.4600.2) g_signal_emit_valist+0xc0d (/usr/lib64/libgobject-2.0.so.0.4600.2) g_signal_emit+0x8f (/usr/lib64/libgobject-2.0.so.0.4600.2) clutter_actor_continue_paint+0x2bb (/usr/lib64/libclutter-1.0.so.0.2400.2) clutter_actor_paint.part.41+0x47b (/usr/lib64/libclutter-1.0.so.0.2400.2) clutter_actor_real_paint+0x20 (/usr/lib64/libclutter-1.0.so.0.2400.2) meta_window_group_paint+0x19f (/usr/lib64/libmutter.so.0.0.0) _g_closure_invoke_va+0xb2 (/usr/lib64/libgobject-2.0.so.0.4600.2) g_signal_emit_valist+0xc0d (/usr/lib64/libgobject-2.0.so.0.4600.2) g_signal_emit+0x8f (/usr/lib64/libgobject-2.0.so.0.4600.2) clutter_actor_continue_paint+0x2bb (/usr/lib64/libclutter-1.0.so.0.2400.2) clutter_actor_paint.part.41+0x47b (/usr/lib64/libclutter-1.0.so.0.2400.2) [0x3d970] (/usr/lib64/gnome-shell/libgnome-shell.so) _g_closure_invoke_va+0xb2 (/usr/lib64/libgobject-2.0.so.0.4600.2) g_signal_emit_valist+0xc0d (/usr/lib64/libgobject-2.0.so.0.4600.2) g_signal_emit+0x8f (/usr/lib64/libgobject-2.0.so.0.4600.2) clutter_actor_continue_paint+0x2bb (/usr/lib64/libclutter-1.0.so.0.2400.2) clutter_actor_paint.part.41+0x47b (/usr/lib64/libclutter-1.0.so.0.2400.2) clutter_stage_paint+0x3a (/usr/lib64/libclutter-1.0.so.0.2400.2) meta_stage_paint+0x45 (/usr/lib64/libmutter.so.0.0.0) _g_closure_invoke_va+0x164 (/usr/lib64/libgobject-2.0.so.0.4600.2) g_signal_emit_valist+0xc0d (/usr/lib64/libgobject-2.0.so.0.4600.2) g_signal_emit+0x8f 
(/usr/lib64/libgobject-2.0.so.0.4600.2) clutter_actor_continue_paint+0x2bb (/usr/lib64/libclutter-1.0.so.0.2400.2) clutter_actor_paint.part.41+0x47b (/usr/lib64/libclutter-1.0.so.0.2400.2) _clutter_stage_do_paint+0x17b (/usr/lib64/libclutter-1.0.so.0.2400.2) clutter_stage_cogl_redraw+0x496 (/usr/lib64/libclutter-1.0.so.0.2400.2) _clutter_stage_do_update+0x117 (/usr/lib64/libclutter-1.0.so.0.2400.2) clutter_clock_dispatch+0x169 (/usr/lib64/libclutter-1.0.so.0.2400.2) g_main_context_dispatch+0x15a (/usr/lib64/libglib-2.0.so.0.4600.2) g_main_context_iterate.isra.29+0x1e0 (/usr/lib64/libglib-2.0.so.0.4600.2) g_main_loop_run+0xc2 (/usr/lib64/libglib-2.0.so.0.4600.2) meta_run+0x2c (/usr/lib64/libmutter.so.0.0.0) main+0x3f7 (/usr/bin/gnome-shell) __libc_start_main+0xf0 (/usr/lib64/libc-2.22.so) [0x2909] (/usr/bin/gnome-shell) 363.038 ( 0.006 ms): gnome-shell/2287 writev(fd: 5<socket:[32540]>, vec: 0x7ffc5ea243a0, vlen: 3 ) = 4 __GI___writev+0x2d (/usr/lib64/libc-2.22.so) _xcb_conn_wait+0x359 (/usr/lib64/libxcb.so.1.1.0) _xcb_out_send+0x4d (/usr/lib64/libxcb.so.1.1.0) xcb_writev+0x45 (/usr/lib64/libxcb.so.1.1.0) _XSend+0x19e (/usr/lib64/libX11.so.6.3.0) _XReply+0x82 (/usr/lib64/libX11.so.6.3.0) XSync+0x4d (/usr/lib64/libX11.so.6.3.0) dri3_bind_tex_image+0x42 (/usr/lib64/libGL.so.1.2.0) _cogl_winsys_texture_pixmap_x11_update+0x117 (/usr/lib64/libcogl.so.20.4.1) _cogl_texture_pixmap_x11_update+0x67 (/usr/lib64/libcogl.so.20.4.1) _cogl_texture_pixmap_x11_pre_paint+0x13 (/usr/lib64/libcogl.so.20.4.1) _cogl_pipeline_layer_pre_paint+0x5e (/usr/lib64/libcogl.so.20.4.1) _cogl_rectangles_validate_layer_cb+0x1b (/usr/lib64/libcogl.so.20.4.1) cogl_pipeline_foreach_layer+0xbe (/usr/lib64/libcogl.so.20.4.1) _cogl_framebuffer_draw_multitextured_rectangles+0x77 (/usr/lib64/libcogl.so.20.4.1) cogl_framebuffer_draw_multitextured_rectangle+0x51 (/usr/lib64/libcogl.so.20.4.1) paint_clipped_rectangle+0xb6 (/usr/lib64/libmutter.so.0.0.0) meta_shaped_texture_paint+0x3e3 
(/usr/lib64/libmutter.so.0.0.0) _g_closure_invoke_va+0xb2 (/usr/lib64/libgobject-2.0.so.0.4600.2) g_signal_emit_valist+0xc0d (/usr/lib64/libgobject-2.0.so.0.4600.2) g_signal_emit+0x8f (/usr/lib64/libgobject-2.0.so.0.4600.2) clutter_actor_continue_paint+0x2bb (/usr/lib64/libclutter-1.0.so.0.2400.2) clutter_actor_paint.part.41+0x47b (/usr/lib64/libclutter-1.0.so.0.2400.2) clutter_actor_real_paint+0x20 (/usr/lib64/libclutter-1.0.so.0.2400.2) _g_closure_invoke_va+0xb2 (/usr/lib64/libgobject-2.0.so.0.4600.2) g_signal_emit_valist+0xc0d (/usr/lib64/libgobject-2.0.so.0.4600.2) g_signal_emit+0x8f (/usr/lib64/libgobject-2.0.so.0.4600.2) clutter_actor_continue_paint+0x2bb (/usr/lib64/libclutter-1.0.so.0.2400.2) clutter_actor_paint.part.41+0x47b (/usr/lib64/libclutter-1.0.so.0.2400.2) clutter_actor_real_paint+0x20 (/usr/lib64/libclutter-1.0.so.0.2400.2) meta_window_actor_paint+0x14b (/usr/lib64/libmutter.so.0.0.0) _g_closure_invoke_va+0xb2 (/usr/lib64/libgobject-2.0.so.0.4600.2) g_signal_emit_valist+0xc0d (/usr/lib64/libgobject-2.0.so.0.4600.2) g_signal_emit+0x8f (/usr/lib64/libgobject-2.0.so.0.4600.2) clutter_actor_continue_paint+0x2bb (/usr/lib64/libclutter-1.0.so.0.2400.2) clutter_actor_paint.part.41+0x47b (/usr/lib64/libclutter-1.0.so.0.2400.2) clutter_actor_real_paint+0x20 (/usr/lib64/libclutter-1.0.so.0.2400.2) meta_window_group_paint+0x19f (/usr/lib64/libmutter.so.0.0.0) _g_closure_invoke_va+0xb2 (/usr/lib64/libgobject-2.0.so.0.4600.2) g_signal_emit_valist+0xc0d (/usr/lib64/libgobject-2.0.so.0.4600.2) g_signal_emit+0x8f (/usr/lib64/libgobject-2.0.so.0.4600.2) clutter_actor_continue_paint+0x2bb (/usr/lib64/libclutter-1.0.so.0.2400.2) clutter_actor_paint.part.41+0x47b (/usr/lib64/libclutter-1.0.so.0.2400.2) [0x3d970] (/usr/lib64/gnome-shell/libgnome-shell.so) _g_closure_invoke_va+0xb2 (/usr/lib64/libgobject-2.0.so.0.4600.2) g_signal_emit_valist+0xc0d (/usr/lib64/libgobject-2.0.so.0.4600.2) g_signal_emit+0x8f (/usr/lib64/libgobject-2.0.so.0.4600.2) 
clutter_actor_continue_paint+0x2bb (/usr/lib64/libclutter-1.0.so.0.2400.2) clutter_actor_paint.part.41+0x47b (/usr/lib64/libclutter-1.0.so.0.2400.2) clutter_stage_paint+0x3a (/usr/lib64/libclutter-1.0.so.0.2400.2) meta_stage_paint+0x45 (/usr/lib64/libmutter.so.0.0.0) _g_closure_invoke_va+0x164 (/usr/lib64/libgobject-2.0.so.0.4600.2) g_signal_emit_valist+0xc0d (/usr/lib64/libgobject-2.0.so.0.4600.2) g_signal_emit+0x8f (/usr/lib64/libgobject-2.0.so.0.4600.2) clutter_actor_continue_paint+0x2bb (/usr/lib64/libclutter-1.0.so.0.2400.2) clutter_actor_paint.part.41+0x47b (/usr/lib64/libclutter-1.0.so.0.2400.2) _clutter_stage_do_paint+0x17b (/usr/lib64/libclutter-1.0.so.0.2400.2) clutter_stage_cogl_redraw+0x496 (/usr/lib64/libclutter-1.0.so.0.2400.2) _clutter_stage_do_update+0x117 (/usr/lib64/libclutter-1.0.so.0.2400.2) clutter_clock_dispatch+0x169 (/usr/lib64/libclutter-1.0.so.0.2400.2) g_main_context_dispatch+0x15a (/usr/lib64/libglib-2.0.so.0.4600.2) g_main_context_iterate.isra.29+0x1e0 (/usr/lib64/libglib-2.0.so.0.4600.2) g_main_loop_run+0xc2 (/usr/lib64/libglib-2.0.so.0.4600.2) meta_run+0x2c (/usr/lib64/libmutter.so.0.0.0) main+0x3f7 (/usr/bin/gnome-shell) __libc_start_main+0xf0 (/usr/lib64/libc-2.22.so) [0x2909] (/usr/bin/gnome-shell) 363.086 ( 0.042 ms): gnome-shell/2287 poll(ufds: 0x7ffc5ea24250, nfds: 1, timeout_msecs: 4294967295 ) = 1 [0xf6fdd] (/usr/lib64/libc-2.22.so) _xcb_conn_wait+0x92 (/usr/lib64/libxcb.so.1.1.0) wait_for_reply+0xb7 (/usr/lib64/libxcb.so.1.1.0) xcb_wait_for_reply+0x61 (/usr/lib64/libxcb.so.1.1.0) _XReply+0x127 (/usr/lib64/libX11.so.6.3.0) XSync+0x4d (/usr/lib64/libX11.so.6.3.0) dri3_bind_tex_image+0x42 (/usr/lib64/libGL.so.1.2.0) _cogl_winsys_texture_pixmap_x11_update+0x117 (/usr/lib64/libcogl.so.20.4.1) _cogl_texture_pixmap_x11_update+0x67 (/usr/lib64/libcogl.so.20.4.1) _cogl_texture_pixmap_x11_pre_paint+0x13 (/usr/lib64/libcogl.so.20.4.1) _cogl_pipeline_layer_pre_paint+0x5e (/usr/lib64/libcogl.so.20.4.1) 
_cogl_rectangles_validate_layer_cb+0x1b (/usr/lib64/libcogl.so.20.4.1) cogl_pipeline_foreach_layer+0xbe (/usr/lib64/libcogl.so.20.4.1) _cogl_framebuffer_draw_multitextured_rectangles+0x77 (/usr/lib64/libcogl.so.20.4.1) cogl_framebuffer_draw_multitextured_rectangle+0x51 (/usr/lib64/libcogl.so.20.4.1) paint_clipped_rectangle+0xb6 (/usr/lib64/libmutter.so.0.0.0) meta_shaped_texture_paint+0x3e3 (/usr/lib64/libmutter.so.0.0.0) _g_closure_invoke_va+0xb2 (/usr/lib64/libgobject-2.0.so.0.4600.2) g_signal_emit_valist+0xc0d (/usr/lib64/libgobject-2.0.so.0.4600.2) g_signal_emit+0x8f (/usr/lib64/libgobject-2.0.so.0.4600.2) clutter_actor_continue_paint+0x2bb (/usr/lib64/libclutter-1.0.so.0.2400.2) clutter_actor_paint.part.41+0x47b (/usr/lib64/libclutter-1.0.so.0.2400.2) clutter_actor_real_paint+0x20 (/usr/lib64/libclutter-1.0.so.0.2400.2) _g_closure_invoke_va+0xb2 (/usr/lib64/libgobject-2.0.so.0.4600.2) g_signal_emit_valist+0xc0d (/usr/lib64/libgobject-2.0.so.0.4600.2) g_signal_emit+0x8f (/usr/lib64/libgobject-2.0.so.0.4600.2) clutter_actor_continue_paint+0x2bb (/usr/lib64/libclutter-1.0.so.0.2400.2) clutter_actor_paint.part.41+0x47b (/usr/lib64/libclutter-1.0.so.0.2400.2) clutter_actor_real_paint+0x20 (/usr/lib64/libclutter-1.0.so.0.2400.2) meta_window_actor_paint+0x14b (/usr/lib64/libmutter.so.0.0.0) _g_closure_invoke_va+0xb2 (/usr/lib64/libgobject-2.0.so.0.4600.2) g_signal_emit_valist+0xc0d (/usr/lib64/libgobject-2.0.so.0.4600.2) g_signal_emit+0x8f (/usr/lib64/libgobject-2.0.so.0.4600.2) clutter_actor_continue_paint+0x2bb (/usr/lib64/libclutter-1.0.so.0.2400.2) clutter_actor_paint.part.41+0x47b (/usr/lib64/libclutter-1.0.so.0.2400.2) clutter_actor_real_paint+0x20 (/usr/lib64/libclutter-1.0.so.0.2400.2) meta_window_group_paint+0x19f (/usr/lib64/libmutter.so.0.0.0) _g_closure_invoke_va+0xb2 (/usr/lib64/libgobject-2.0.so.0.4600.2) g_signal_emit_valist+0xc0d (/usr/lib64/libgobject-2.0.so.0.4600.2) g_signal_emit+0x8f (/usr/lib64/libgobject-2.0.so.0.4600.2) 
clutter_actor_continue_paint+0x2bb (/usr/lib64/libclutter-1.0.so.0.2400.2) clutter_actor_paint.part.41+0x47b (/usr/lib64/libclutter-1.0.so.0.2400.2) [0x3d970] (/usr/lib64/gnome-shell/libgnome-shell.so) _g_closure_invoke_va+0xb2 (/usr/lib64/libgobject-2.0.so.0.4600.2) g_signal_emit_valist+0xc0d (/usr/lib64/libgobject-2.0.so.0.4600.2) g_signal_emit+0x8f (/usr/lib64/libgobject-2.0.so.0.4600.2) clutter_actor_continue_paint+0x2bb (/usr/lib64/libclutter-1.0.so.0.2400.2) clutter_actor_paint.part.41+0x47b (/usr/lib64/libclutter-1.0.so.0.2400.2) clutter_stage_paint+0x3a (/usr/lib64/libclutter-1.0.so.0.2400.2) meta_stage_paint+0x45 (/usr/lib64/libmutter.so.0.0.0) _g_closure_invoke_va+0x164 (/usr/lib64/libgobject-2.0.so.0.4600.2) g_signal_emit_valist+0xc0d (/usr/lib64/libgobject-2.0.so.0.4600.2) g_signal_emit+0x8f (/usr/lib64/libgobject-2.0.so.0.4600.2) clutter_actor_continue_paint+0x2bb (/usr/lib64/libclutter-1.0.so.0.2400.2) clutter_actor_paint.part.41+0x47b (/usr/lib64/libclutter-1.0.so.0.2400.2) _clutter_stage_do_paint+0x17b (/usr/lib64/libclutter-1.0.so.0.2400.2) clutter_stage_cogl_redraw+0x496 (/usr/lib64/libclutter-1.0.so.0.2400.2) _clutter_stage_do_update+0x117 (/usr/lib64/libclutter-1.0.so.0.2400.2) clutter_clock_dispatch+0x169 (/usr/lib64/libclutter-1.0.so.0.2400.2) g_main_context_dispatch+0x15a (/usr/lib64/libglib-2.0.so.0.4600.2) g_main_context_iterate.isra.29+0x1e0 (/usr/lib64/libglib-2.0.so.0.4600.2) g_main_loop_run+0xc2 (/usr/lib64/libglib-2.0.so.0.4600.2) meta_run+0x2c (/usr/lib64/libmutter.so.0.0.0) main+0x3f7 (/usr/bin/gnome-shell) __libc_start_main+0xf0 (/usr/lib64/libc-2.22.so) [0x2909] (/usr/bin/gnome-shell) Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-jncuxju9fibq2rl6olhqwjw6@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2016-04-15 13:14:20 -03:00
Arnaldo Carvalho de Melo	c6d4a494a2	perf trace: Add --max-stack knob Similar to the one in the other tools (report, script, top). Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-lh7kk5a5t3erwxw31ah0cgar@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2016-04-14 19:46:58 -03:00
Arnaldo Carvalho de Melo	6125cc8dac	perf script: Add --max-stack knob Works just like with 'perf report'. In some cases we may want to have more than 127 entries, the default maximum. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-mqkz2p5ok2978gztb0vsnocc@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2016-04-14 19:46:57 -03:00
Jiri Olsa	73643bb6a2	perf sched map: Display only given cpus Introducing --cpus option that will display only given cpus. Could be used together with color-cpus option. $ perf sched map --cpus 0,1 A0 309999.786924 secs A0 => rcu_sched:7 . 309999.786930 secs B0 . 309999.786931 secs B0 => rcuos/2:25 B0 A0 309999.786947 secs Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1460467771-26532-9-git-send-email-jolsa@kernel.org [ Added entry to man page ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2016-04-13 10:11:52 -03:00
Jiri Olsa	cf294f24f8	perf sched map: Color given cpus Adding --color-cpus option to display selected cpus with background color (red by default). It helps on navigating through the perf sched map output. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1460467771-26532-8-git-send-email-jolsa@kernel.org [ Added entry to man page ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2016-04-13 10:11:51 -03:00
Jiri Olsa	a151a37a76	perf sched map: Color given pids Adding --color-pids option to display selected pids in color (blue by default). It helps on navigating through the 'perf sched map' output. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1460467771-26532-7-git-send-email-jolsa@kernel.org [ Added entry to man page ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2016-04-13 10:11:51 -03:00
Jiri Olsa	99623c628f	perf sched: Add compact display option Add compact map display that does not output the whole cpu matrix, only cpus that got event. $ perf sched map --compact A0 1082427.094098 secs A0 => perf:19404 (CPU 2) A0 . 1082427.094127 secs . => swapper:0 (CPU 1) A0 . B0 1082427.094174 secs B0 => rcuos/2:25 (CPU 3) A0 . . 1082427.094177 secs C0 . . 1082427.094187 secs C0 => migration/2:21 C0 A0 . 1082427.094193 secs . A0 . 1082427.094195 secs D0 A0 . 1082427.094402 secs D0 => rngd:968 . A0 . 1082427.094406 secs . E0 . 1082427.095221 secs E0 => kworker/1:1:5333 . E0 *F0 1082427.095227 secs F0 => xterm:3342 It helps to display sane output for small thread loads on big cpu servers. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1460467771-26532-4-git-send-email-jolsa@kernel.org [ Add entry in 'perf sched' man page ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2016-04-13 10:11:51 -03:00
Arnaldo Carvalho de Melo	44621819dd	perf trace: Exclude the kernel part of the callchain leading to a syscall The kernel parts are not that useful: # trace -m 512 -e nanosleep --call dwarf usleep 1 0.065 ( 0.065 ms): usleep/18732 nanosleep(rqtp: 0x7ffc4ee4e200) = 0 syscall_slow_exit_work ([kernel.kallsyms]) do_syscall_64 ([kernel.kallsyms]) return_from_SYSCALL_64 ([kernel.kallsyms]) __nanosleep (/usr/lib64/libc-2.22.so) usleep (/usr/lib64/libc-2.22.so) main (/usr/bin/usleep) __libc_start_main (/usr/lib64/libc-2.22.so) _start (/usr/bin/usleep) # So lets just use perf_event_attr.exclude_callchain_kernel to avoid collecting it in the ring buffer: # trace -m 512 -e nanosleep --call dwarf usleep 1 0.063 ( 0.063 ms): usleep/19212 nanosleep(rqtp: 0x7ffc3df10fb0) = 0 __nanosleep (/usr/lib64/libc-2.22.so) usleep (/usr/lib64/libc-2.22.so) main (/usr/bin/usleep) __libc_start_main (/usr/lib64/libc-2.22.so) _start (/usr/bin/usleep) # Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-qctu3gqhpim0dfbcp9d86c91@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2016-04-11 22:18:19 -03:00
Milian Wolff	566a08859f	perf trace: Add support for printing call chains on sys_exit events. Now, one can print the call chain for every encountered sys_exit event, e.g.: $ perf trace -e nanosleep --call-graph dwarf path/to/ex_sleep 1005.757 (1000.090 ms): ex_sleep/13167 nanosleep(...) = 0 syscall_slow_exit_work ([kernel.kallsyms]) syscall_return_slowpath ([kernel.kallsyms]) int_ret_from_sys_call ([kernel.kallsyms]) __nanosleep (/usr/lib/libc-2.23.so) [unknown] (/usr/lib/libQt5Core.so.5.6.0) QThread::sleep (/usr/lib/libQt5Core.so.5.6.0) main (path/to/ex_sleep) __libc_start_main (/usr/lib/libc-2.23.so) _start (path/to/ex_sleep) Note that it is advised to increase the number of mmap pages to prevent event losses when using this new feature. Often, adding `-m 10M` to the `perf trace` invocation is enough. This feature is also available in strace when built with libunwind via `strace -k`. Performance wise, this solution is much better: $ time find path/to/linux &> /dev/null real 0m0.051s user 0m0.013s sys 0m0.037s $ time perf trace -m 800M --call-graph dwarf find path/to/linux &> /dev/null real 0m2.624s user 0m1.203s sys 0m1.333s $ time strace -k find path/to/linux &> /dev/null real 0m35.398s user 0m10.403s sys 0m23.173s Note that it is currently not possible to configure the print output. Adding such a feature, similar to what is available in `perf script` via its `--fields` knob can be added later on. Signed-off-by: Milian Wolff <milian.wolff@kdab.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> LPU-Reference: 1460115255-17648-1-git-send-email-milian.wolff@kdab.com [ Split from a larger patch, do not print the IP, left align, remove dup call symbol__init(), added man page entry ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2016-04-11 22:18:16 -03:00
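
As an illustration of the compact map display described in the "perf sched: Add compact display option" entry above, here is a minimal sketch of the idea: a column is allocated only when a CPU first reports an event, instead of printing the whole CPU matrix. The function name `render_compact` and the `(cpu, label)` event tuples are hypothetical and are not taken from perf's source; the real tool also prints timestamps, task names and a `*` marker, which this sketch leaves out.

```python
def render_compact(events):
    """Render one map row per event, with columns only for CPUs seen so far.

    `events` is an iterable of (cpu, label) pairs, where `label` is the
    short task id now running on that CPU ("." stands for idle).
    This is an illustrative model, not perf's actual implementation.
    """
    columns = {}   # cpu -> column index, assigned in order of first appearance
    current = {}   # cpu -> label of the task currently running there
    rows = []
    for cpu, label in events:
        # Allocate a new column the first time this CPU gets an event.
        columns.setdefault(cpu, len(columns))
        current[cpu] = label
        # Build the row from the latest known state of every seen CPU.
        row = [""] * len(columns)
        for seen_cpu, col in columns.items():
            row[col] = current[seen_cpu]
        rows.append(" ".join(row))
    return rows


if __name__ == "__main__":
    # CPU numbers follow the sample output in the commit message.
    for line in render_compact([(2, "A0"), (1, "."), (3, "B0"), (3, ".")]):
        print(line)
```

With the four events above, the rows grow from one column to three as CPUs 2, 1 and 3 appear, which mirrors the left-hand part of the first rows of the `perf sched map --compact` sample in the commit message.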

727 Commits