The per-browser screen refresh routine (ui_browser->refresh()) should
return the first row that should be cleaned after the rows just printed,
in case not all rows available on the screen gets filled.
When moving the extra title lines logic from the hists browser to the
generic ui_browser class, one piece of that logic remained in the hists
browser and then when going back from the annotate browser to the hists
browser in a case where fewer lines were displayed in the hists browser,
for instance when filtering the entries per substring, one line of the
annotate browser would remain on the screen, fix that.
Example of the screen artifact:
================================================================================
Samples: 73K of event 'cycles:ppp', 4000 Hz, Event count (approx.): 45172901394
Overhead Shared O Symbol
0.30% [kernel] [k] __indirect_thunk_start
0.09% [kernel] [k] __x86_indirect_thunk_r10
│ lfence
================================================================================
Here from 'perf top' the view was zoomed with '/thunk' to functions
having that substring, then the first was annotated and from the
annotate browser ESC was pressed, then the first lines were overwritten,
but the 'lfence' line remained due to the off by one bug fixed in this
cset.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Fixes: ef9ff6017e ("perf ui browser: Move the extra title lines from the hists browser")
Link: https://lkml.kernel.org/n/tip-odryfso74eaarm0z3e4v9owx@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Now that we can have extra title lines we should use ui_browser->rows
and not ->height when drawing lines, as it will use ui_browser__gotorc()
and that will take the extra title lines into account, which was causing
an off by one at the end of the vertical line drawn by
__ui_browser__vline(), fix it.
The visual effect was that the last line, with status messages, was
being overwritten by the vertical line, looking like:
Press 'h' for help on│key bindings
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Fixes: ef9ff6017e ("perf ui browser: Move the extra title lines from the hists browser")
Link: https://lkml.kernel.org/n/tip-08y1ln3xjn76zvizz1i1dsvn@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This will be useful for the annotate browser as well, that wants to have
extra title lines, i.e. the current ui_browser unconditionally reserves
the first line for a browser title and the last one for status messages.
But some browsers, like the buckets one (hists browser) needs extra
lines to show headers, allowing it to be shown or not, press 'H' in
'perf top' or 'perf report' to see this feature.
So move that logic to the core ui_browser used by the hists_browser
('perf top' and 'perf report' main interface) so that it can be used by
the annotate browser too.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Martin Liška <mliska@suse.cz>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=196935
Link: https://lkml.kernel.org/n/tip-r38xm3ut37ulbg1o5tn5iise@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
For instance:
entry_SYSCALL_64 /lib/modules/4.16.0-rc5-00086-gdf09348f78dc/build/vmlinux
5.50 │ → callq do_syscall_64
14.56 │ mov 0x58(%rsp),%rcx
7.44 │ mov 0x80(%rsp),%r11
0.32 │ cmp %rcx,%r11
│ → jne swapgs_restore_regs_and_return_to_usermode
0.32 │ shl $0x10,%rcx
0.32 │ sar $0x10,%rcx
3.24 │ cmp %rcx,%r11
│ → jne swapgs_restore_regs_and_return_to_usermode
2.27 │ cmpq $0x33,0x88(%rsp)
1.29 │ → jne swapgs_restore_regs_and_return_to_usermode
│ mov 0x30(%rsp),%r11
8.74 │ cmp %r11,0x90(%rsp)
│ → jne swapgs_restore_regs_and_return_to_usermode
0.32 │ test $0x10100,%r11
│ → jne swapgs_restore_regs_and_return_to_usermode
0.32 │ cmpq $0x2b,0xa0(%rsp)
0.65 │ → jne swapgs_restore_regs_and_return_to_usermode
It'll behave just like a "call" instruction, i.e. press enter or right
arrow over one such line and the browser will navigate to the annotated
disassembly of that function, which when exited, via left arrow or esc,
will come back to the calling function.
Now to support jump to an offset on a different function...
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-78o508mqvr8inhj63ddtw7mo@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Because they all really check if we can access data structures/visual
constructs where a "jump" instruction targets code in the same function,
i.e. things like:
__pthread_mutex_lock /usr/lib64/libpthread-2.26.so
1.95 │ mov __pthread_force_elision,%ecx
│ ┌──test %ecx,%ecx
0.07 │ ├──je 60
│ │ test $0x300,%esi
│ │↓ jne 60
│ │ or $0x100,%esi
│ │ mov %esi,0x10(%rdi)
│ 42:│ mov %esi,%edx
│ │ lea 0x16(%r8),%rsi
│ │ mov %r8,%rdi
│ │ and $0x80,%edx
│ │ add $0x8,%rsp
│ │→ jmpq __lll_lock_elision
│ │ nop
0.29 │ 60:└─→and $0x80,%esi
0.07 │ mov $0x1,%edi
0.29 │ xor %eax,%eax
2.53 │ lock cmpxchg %edi,(%r8)
And not things like that "jmpq __lll_lock_elision", that instead should behave
like a "call" instruction and "jump" to the disassembly of "___lll_lock_elision".
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-3cwx39u3h66dfw9xjrlt7ca2@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Just like we have in the histograms browser used as the main screen for
'perf top --tui' and 'perf report --tui', to print the current
annotation to a file with a named composed by the symbol name and the
".annotation" suffix.
Here is one example of pressing 'A' on 'perf top' to live annotate a
kernel function and then press 'P' to dump that annotation, the
resulting file:
# cat _raw_spin_lock_irqsave.annotation
_raw_spin_lock_irqsave() /proc/kcore
Event: cycles:ppp
7.14 nop
21.43 push %rbx
7.14 pushfq
pop %rax
nop
mov %rax,%rbx
cli
nop
xor %eax,%eax
mov $0x1,%edx
64.29 lock cmpxchg %edx,(%rdi)
test %eax,%eax
↓ jne 2b
mov %rbx,%rax
pop %rbx
← retq
2b: mov %eax,%esi
→ callq queued_spin_lock_slowpath
mov %rbx,%rax
pop %rbx
← retq
#
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-zzmnrwugb5vtk7bvg0rbx150@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When recently using 'perf report --stat' it was not clear to me from the
output whether a particular statistics field (LOST_SAMPLES) was not
present, or just zero:
fomalhaut:~> perf report --stat
Aggregated stats:
TOTAL events: 495984
MMAP events: 85
COMM events: 3389
EXIT events: 1605
THROTTLE events: 2
UNTHROTTLE events: 2
FORK events: 3377
SAMPLE events: 472629
MMAP2 events: 14753
FINISHED_ROUND events: 139
THREAD_MAP events: 1
CPU_MAP events: 1
TIME_CONV events: 1
I had to check the output several times to ascertain that I'm not
misreading the output, that the field didn't change and that I didn't
misremember the name. In fact I had to look into the perf source to make
sure that zero fields are indeed not shown.
With the patch applied:
fomalhaut:~> perf report --stat
Aggregated stats:
TOTAL events: 495984
MMAP events: 85
LOST events: 0
COMM events: 3389
EXIT events: 1605
THROTTLE events: 2
UNTHROTTLE events: 2
FORK events: 3377
READ events: 0
SAMPLE events: 472629
MMAP2 events: 14753
AUX events: 0
ITRACE_START events: 0
LOST_SAMPLES events: 0
SWITCH events: 0
SWITCH_CPU_WIDE events: 0
NAMESPACES events: 0
ATTR events: 0
EVENT_TYPE events: 0
TRACING_DATA events: 0
BUILD_ID events: 0
FINISHED_ROUND events: 139
ID_INDEX events: 0
AUXTRACE_INFO events: 0
AUXTRACE events: 0
AUXTRACE_ERROR events: 0
THREAD_MAP events: 1
CPU_MAP events: 1
STAT_CONFIG events: 0
STAT events: 0
STAT_ROUND events: 0
EVENT_UPDATE events: 0
TIME_CONV events: 1
FEATURE events: 0
It's pretty clear at a glance that LOST_SAMPLES is present but zero.
The original output can still be gotten via:
fomalhaut:~> perf report --stat | grep -vw 0
Aggregated stats:
TOTAL events: 495984
MMAP events: 85
COMM events: 3389
EXIT events: 1605
THROTTLE events: 2
UNTHROTTLE events: 2
FORK events: 3377
SAMPLE events: 472629
MMAP2 events: 14753
FINISHED_ROUND events: 139
THREAD_MAP events: 1
CPU_MAP events: 1
TIME_CONV events: 1
So I don't think there's any real loss in functionality.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20180307152430.7e5h7e657b7bgd7q@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This first happened with a gcc function, _cpp_lex_token, that has the
usual jumps:
│1159e6c: ↓ jne 115aa32 <_cpp_lex_token@@Base+0xf92>
I.e. jumps to a label inside that function (_cpp_lex_token), and those
works, but also this kind:
│1159e8b: ↓ jne c469be <cpp_named_operator2name@@Base+0xa72>
I.e. jumps to another function, outside _cpp_lex_token, which are not
being correctly handled generating as a side effect references to
ab->offset[] entries that are set to NULL, so to make this code more
robust, check that here.
A proper fix for will be put in place, looking at the function name
right after the '<' token and probably treating this like a 'call'
instruction.
For now just don't draw the arrow.
Reported-by: Ingo Molnar <mingo@kernel.org>
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Link: https://lkml.kernel.org/n/tip-5tzvb875ep2sel03aeefgmud@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When we use perf report interactive annotate view, we can see
the position of jump arrow is not correct. For example,
1. perf record -b ...
2. perf report
3. In interactive mode, select Annotate 'function'
Percent│ IPC Cycle
│ if (flag)
1.37 │0.4┌── 1 ↓ je 82
│ │ x += x / y + y / x;
0.00 │0.4│ 1310 movsd (%rsp),%xmm0
0.00 │0.4│ 565 movsd 0x8(%rsp),%xmm4
│0.4│ movsd 0x8(%rsp),%xmm1
│0.4│ movsd (%rsp),%xmm3
│0.4│ divsd %xmm4,%xmm0
0.00 │0.4│ 579 divsd %xmm3,%xmm1
│0.4│ movsd (%rsp),%xmm2
│0.4│ addsd %xmm1,%xmm0
│0.4│ addsd %xmm2,%xmm0
0.00 │0.4│ movsd %xmm0,(%rsp)
│ │ volatile double x = 1212121212, y = 121212;
│ │
│ │ s_randseed = time(0);
│ │ srand(s_randseed);
│ │
│ │ for (i = 0; i < 2000000000; i++) {
1.37 │0.4└─→ 82: sub $0x1,%ebx
28.21 │0.48 17 ↑ jne 38
The jump arrow in above example is not correct. It should add the
width of IPC and Cycle.
With this patch, the result is:
Percent│ IPC Cycle
│ if (flag)
1.37 │0.48 1 ┌──je 82
│ │ x += x / y + y / x;
0.00 │0.48 1310 │ movsd (%rsp),%xmm0
0.00 │0.48 565 │ movsd 0x8(%rsp),%xmm4
│0.48 │ movsd 0x8(%rsp),%xmm1
│0.48 │ movsd (%rsp),%xmm3
│0.48 │ divsd %xmm4,%xmm0
0.00 │0.48 579 │ divsd %xmm3,%xmm1
│0.48 │ movsd (%rsp),%xmm2
│0.48 │ addsd %xmm1,%xmm0
│0.48 │ addsd %xmm2,%xmm0
0.00 │0.48 │ movsd %xmm0,(%rsp)
│ │ volatile double x = 1212121212, y = 121212;
│ │
│ │ s_randseed = time(0);
│ │ srand(s_randseed);
│ │
│ │ for (i = 0; i < 2000000000; i++) {
1.37 │0.48 82:└─→sub $0x1,%ebx
28.21 │0.48 17 ↑ jne 38
Committer notes:
Please note that only from LBRv5 (according to Jiri) onwards, i.e. >=
Skylake is that we'll have the cycles counts in each branch record
entry, so to see the Cycles and IPC columns, and be able to test this
patch, one need a capable hardware.
While applying this I first tested it on a Broadwell class machine and
couldn't get those columns, will add code to the annotate browser to
warn the user about that, i.e. you have branch records, but no cycles,
use a more recent hardware to get the cycles and IPC columns.
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1517223473-14750-1-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>