Adding the check for maximum allowed frequency rate defined in following
file:
/proc/sys/kernel/perf_event_max_sample_rate
When we cross the maximum value we fail and display detailed error
message with advise.
$ perf record -F 3000 ls
Maximum frequency rate (2000) reached.
Please use -F freq option with lower value or consider
tweaking /proc/sys/kernel/perf_event_max_sample_rate.
In case user does not specify the frequency and the default value cross
the maximum, we display warning and set the frequency value to the
current maximum.
$ perf record ls
Lowering default frequency rate to 2000.
Please consider tweaking /proc/sys/kernel/perf_event_max_sample_rate.
Same messages are used for 'perf top'.
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1383660887-1734-4-git-send-email-jolsa@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The perf_evlist__mmap_read used 'union perf_event' as a placeholder for
event crossing the mmap boundary.
This is ok for sample shorter than ~PATH_MAX. However we could grow up
to the maximum sample size which is 16 bits max.
I hit this overflow issue when using 'perf top -G dwarf' which produces
sample with the size around 8192 bytes. We could configure any valid
sample size here using: '-G dwarf,size'.
Using array with sample max size instead for the event placeholder. Also
adding another safe check for the dynamic size of the user stack.
TODO: The 'struct perf_mmap' is quite big now, maybe we could use some
lazy allocation for event_copy size.
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: David Ahern <dsahern@gmail.com>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1380721599-24285-1-git-send-email-jolsa@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
I am getting segfaults *after* the time sorting of perf samples where
the event type is off the charts:
(gdb) bt
\#0 0x0807b1b2 in hists__inc_nr_events (hists=0x80a99c4, type=1163281902) at util/hist.c:1225
\#1 0x08070795 in perf_session_deliver_event (session=0x80a9b90, event=0xf7a6aff8, sample=0xffffc318, tool=0xffffc520,
file_offset=0) at util/session.c:884
\#2 0x0806f9b9 in flush_sample_queue (s=0x80a9b90, tool=0xffffc520) at util/session.c:555
\#3 0x0806fc53 in process_finished_round (tool=0xffffc520, event=0x0, session=0x80a9b90) at util/session.c:645
This is bizarre because the event has already been processed once --
before it was added to the samples queue -- and the event was found to
be sane at that time.
There seem to be 2 causes:
1. perf_evlist__mmap_read updates the read location even though there
are outstanding references to events sitting in the mmap buffers via the
ordered samples queue.
2. There is a single evlist->event_copy for all evlist entries.
event_copy is used to handle an event wrapping at the mmap buffer
boundary.
This patch addresses the second problem - making event_copy local to
each perf_mmap. With this change my highly repeatable use case no longer
fails.
The first problem is much more complicated and will be the subject of a
future patch.
Signed-off-by: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1360098762-61827-1-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Just like was done for parse_events__set_leader.
Also we need to have the list_entry set_leader method in evlist.c so that we
don't grow another dep in the python binding:
# ~acme/git/linux/tools/perf/python/twatch.py
Traceback (most recent call last):
File "/home/acme/git/linux/tools/perf/python/twatch.py", line 16, in <module>
import perf
ImportError: /home/acme/git/build/perf/python/perf.so: undefined symbol: parse_events__set_leader
And also remove a pr_debug from evsel.c so that we avoid this one too:
# ~acme/git/linux/tools/perf/python/twatch.py
Traceback (most recent call last):
File "/home/acme/git/linux/tools/perf/python/twatch.py", line 16, in <module>
import perf
ImportError: /home/acme/git/build/perf/python/perf.so: undefined symbol: eprintf
Acked-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Namhyung Kim <namhyung@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-0hk9dazg9pora9jylkqngovm@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This patch adds a functionality that allows to create event groups
based on the way they are specified on the command line. Adding
functionality to the '{}' group syntax introduced in earlier patch.
The current '--group/-g' option behaviour remains intact. If you
specify it for record/stat/top command, all the specified events
become members of a single group with the first event as a group
leader.
With the new '{}' group syntax you can create group like:
# perf record -e '{cycles,faults}' ls
resulting in single event group containing 'cycles' and 'faults'
events, with cycles event as group leader.
All groups are created with regards to threads and cpus. Thus
recording an event group within a 2 threads on server with
4 CPUs will create 8 separate groups.
Examples (first event in brackets is group leader):
# 1 group (cpu-clock,task-clock)
perf record --group -e cpu-clock,task-clock ls
perf record -e '{cpu-clock,task-clock}' ls
# 2 groups (cpu-clock,task-clock) (minor-faults,major-faults)
perf record -e '{cpu-clock,task-clock},{minor-faults,major-faults}' ls
# 1 group (cpu-clock,task-clock,minor-faults,major-faults)
perf record --group -e cpu-clock,task-clock -e minor-faults,major-faults ls
perf record -e '{cpu-clock,task-clock,minor-faults,major-faults}' ls
# 2 groups (cpu-clock,task-clock) (minor-faults,major-faults)
perf record -e '{cpu-clock,task-clock} -e '{minor-faults,major-faults}' \
-e instructions ls
# 1 group
# (cpu-clock,task-clock,minor-faults,major-faults,instructions)
perf record --group -e cpu-clock,task-clock \
-e minor-faults,major-faults -e instructions ls perf record -e
'{cpu-clock,task-clock,minor-faults,major-faults,instructions}' ls
It's possible to use standard event modifier for a group, which spans
over all events in the group and updates each event modifier settings,
for example:
# perf record -r '{faults:k,cache-references}:p'
resulting in ':kp' modifier being used for 'faults' and ':p' modifier
being used for 'cache-references' event.
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ulrich Drepper <drepper@gmail.com>
Link: http://lkml.kernel.org/n/tip-ho42u0wcr8mn1otkalqi13qp@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The pevent thing is per perf.data file, so I made it stop being static
and become a perf_session member, so tools processing perf.data files
use perf_session and _there_ we read the trace events description into
session->pevent and then change everywhere to stop using that single
global pevent variable and use the per session one.
Note that it _doesn't_ fall backs to trace__event_id, as we're not
interested at all in what is present in the
/sys/kernel/debug/tracing/events in the workstation doing the analysis,
just in what is in the perf.data file.
This patch also introduces perf_session__set_tracepoints_handlers that
is the perf perf.data/session way to associate handlers to tracepoint
events by resolving their IDs using the events descriptions stored in a
perf.data file. Make 'perf sched' use it.
Reported-by: Dmitry Antipov <dmitry.antipov@linaro.org>
Tested-by: Dmitry Antipov <dmitry.antipov@linaro.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linaro-dev@lists.linaro.org
Cc: patches@linaro.org
Link: http://lkml.kernel.org/r/20120625232016.GA28525@infradead.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When no event is specified the tools use perf_evlist__add_default(), that will
call event_attr_init to initialize the KVM exclusion bits.
When the change was made to the tools so that by default guest samples would be
excluded, the changes were made just to the parsing routines and to
perf_evlist__add_default(), not to perf_evlist__add_attrs, that is used so far
just by perf stat to add multiple events, according to the level of detail
specified.
Recently the tools were changed to reconstruct the event name from all the
details in perf_event_attr, not just from .type and .config, but taking into
account all the feature bits (.exclude_{guest,host,user,kernel,etc},
.precise_ip, etc).
That is when we noticed that the default for perf stat wasn't the one for the
rest of the tools, i.e. the .exclude_guest bit wasn't being set.
I.e. the default, that doesn't call event_attr_init was showing the :HG
modifier:
$ perf stat usleep 1
Performance counter stats for 'usleep 1':
0.942119 task-clock # 0.454 CPUs utilized
1 context-switches # 0.001 M/sec
0 CPU-migrations # 0.000 K/sec
126 page-faults # 0.134 M/sec
693,193 cycles:HG # 0.736 GHz [40.11%]
407,461 stalled-cycles-frontend:HG # 58.78% frontend cycles idle [72.29%]
365,403 stalled-cycles-backend:HG # 52.71% backend cycles idle
465,982 instructions:HG # 0.67 insns per cycle
# 0.87 stalled cycles per insn
89,760 branches:HG # 95.275 M/sec
6,178 branch-misses:HG # 6.88% of all branches
0.002077228 seconds time elapsed
While if one explicitely specifies the same events, which will make the parsing code
to be called and thus event_attr_init is called:
$ perf stat -e task-clock,context-switches,migrations,page-faults,cycles,stalled-cycles-frontend,stalled-cycles-backend,instructions,branches,branch-misses usleep 1
Performance counter stats for 'usleep 1':
1.040349 task-clock # 0.500 CPUs utilized
2 context-switches # 0.002 M/sec
0 CPU-migrations # 0.000 K/sec
127 page-faults # 0.122 M/sec
587,966 cycles # 0.565 GHz [13.18%]
459,167 stalled-cycles-frontend # 78.09% frontend cycles idle
390,249 stalled-cycles-backend # 66.37% backend cycles idle
504,006 instructions # 0.86 insns per cycle
# 0.91 stalled cycles per insn
96,455 branches # 92.714 M/sec
6,522 branch-misses # 6.76% of all branches [96.12%]
0.002078681 seconds time elapsed
Fix it by introducing a perf_evlist__add_default_attrs method that will call
evlist_attr_init in all the perf_event_attr entries before adding the events.
Reported-by: Ingo Molnar <mingo@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-4eysr236r0pgiyum9epwxw7s@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The __perf_evsel__open routing was grouping just the threads for that
specific events per cpu when we want to group all threads in all events
to the first fd opened on that cpu.
So pass the xyarray with the first event, where the other events will be
able to get that first per cpu fd.
At some point top and record will switch to using perf_evlist__open that
takes care of this detail and probably will also handle the fallback
from hw to soft counters, etc.
Reported-by: Deng-Cheng Zhu <dczhu@mips.com>
Tested-by: Deng-Cheng Zhu <dczhu@mips.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-ebm34rh098i9y9v4cytfdp0x@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>