
Add support for "perf record" and "perf report" to WA #971

Open

wants to merge 2 commits into master

Conversation

markbellamy
No description provided.

@setrofim setrofim left a comment


Class-level description should also be updated to describe the difference between the two modes and when one should be used vs the other.

@@ -77,6 +77,17 @@ class PerfInstrument(Instrument):
description="""Provides labels for pref output. If specified, the number of
labels must match the number of ``optionstring``\ s.
"""),
Parameter('perf_mode', kind=str, default='stat', allowed_values=['stat', 'record'],
Contributor:

Parameter can be renamed to just "mode", given that it is for perf instrument, calling it perf_mode is redundant.

Contributor:

Parameter can be renamed to just "mode"

... or (sub)command, given that it's how perf itself calls these?

$ perf help

 usage: perf [--version] [--help] [OPTIONS] COMMAND [ARGS]

 The most commonly used perf commands are:
   [...]

 See 'perf help COMMAND' for more information on a specific command.

Author:

Thanks, changed in latest version.

"""),
Parameter('report_optionstring', kind=list_or_string, default='',
global_alias='report_perf_options',
description="""Specifies options to be used for the 'perf report' command used
Contributor:

Please format consistently with the other parameters.

Author:

Thanks, changed in latest version.

count = temp.strip('%')
context.add_metric(metric, count, '%')

def update_output(self, context):
Contributor:

Should go above internal methods.

Author:

Thanks, changed in latest version.

context.add_metric(metric, count, '%')

def update_output(self, context):
self.logger.info('Extracting reports from target...')
Contributor:

Given the logging inside the _stat and _output methods, this is redundant. Please keep info-level logging to a minimum. Should be converted to debug.

Author:

Thanks, changed in latest version.

@@ -134,5 +146,38 @@ def update_output(self, context):
metric = '{}_{}'.format(label, match.group(3))
context.add_metric(metric, count, classifiers=classifiers)

def _update_output_record(self, context):
outdir = os.path.join(context.output_directory, 'perf')
self.collector.get_trace(outdir)
Contributor:

Given that the above two lines happen in both _stat and record cases, they should be moved out into update_output.

Author:

Thanks, changed in latest version.


self.logger.info('Processing perf report reports.')
for host_file in os.listdir(outdir):
label = host_file.split('.rpt')[0]
Contributor:

use os.path.splitext instead.
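The suggestion above, sketched with a hypothetical file name (`dhrystone.rpt` is illustrative, not from the PR):

```python
import os.path

# os.path.splitext splits off the final extension, which is more robust
# than str.split('.rpt') for deriving the label from a report file name.
host_file = 'dhrystone.rpt'  # hypothetical example file name
label, ext = os.path.splitext(host_file)
print(label)  # -> dhrystone
```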

Author:

Thanks, changed in latest version.

event_type = words[5]
event_type = event_type.strip("'")
if words[0] != '#' and '%' in words[0] and len(words) == 5:
metric = 'perf/{}/{}/{}'.format(event_type, words[1], words[2].strip("[]"))
Contributor:

Consider using classifiers rather than packing everything into the metric name.

Author:

I've added the classifiers but have also left everything packed in the metric name. It seems that some of the output processors, e.g. csv, default to classifier output off, and that would result in a lot of meaningless/duplicate entries in the results file under the default options, which doesn't seem like friendly behaviour.

I am very open to people's thoughts on what the right behaviour is here, though.

Author:

I had a thought today: change the metric to /, which should be a unique key carrying the most critical info, then use classifiers for the other entries.
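The scheme being discussed could look like the following hedged sketch; this is not the PR's code, and `build_metric` and the classifier keys `command`/`shared_object` are hypothetical names:

```python
# Hypothetical helper: keep a short metric name and attach the remaining
# 'perf report' fields as classifiers instead of packing them into the name.
def build_metric(words, event_type):
    # words is one parsed sample line, e.g.
    # ['41.86%', 'swapper', '[kernel.kallsyms]', '[k]', 'intel_idle']
    count = float(words[0].strip('%'))
    metric = 'perf/{}'.format(event_type)
    classifiers = {
        'command': words[1],
        'shared_object': words[2].strip('[]'),
    }
    return metric, count, classifiers
```

The result would then be passed to `context.add_metric(metric, count, '%', classifiers=classifiers)`.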

if words[0] != '#' and '%' in words[0] and len(words) == 5:
metric = 'perf/{}/{}/{}'.format(event_type, words[1], words[2].strip("[]"))
temp = words[0]
count = temp.strip('%')
Contributor:

temp not necessary here. Also, a metric value should be a number, not string.

count = numeric(words[0].strip('%'))

(note: numeric must be imported from wa.utils.types.)
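For illustration, a rough plain-Python stand-in for `numeric` (the real helper lives in `wa.utils.types`; this sketch only approximates its int-then-float conversion):

```python
# Approximate stand-in for wa.utils.types.numeric: prefer int, fall back
# to float, so metric values come out as numbers rather than strings.
def numeric(value):
    try:
        return int(value)
    except ValueError:
        return float(value)

count = numeric('41.86%'.strip('%'))
print(count)  # -> 41.86
```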

Author:

Thanks, changed in latest version. (I initially thought about doing this, but expected it to be written out as a string again, so it didn't seem necessary.)

self.collector.get_trace(outdir)

self.logger.info('Processing perf report reports.')
for host_file in os.listdir(outdir):
Contributor:

This is currently attempting to process every file in the perf directory, since we are now pulling the raw trace data to this directory this fails trying to parse binary data. Should this be updated to only process .rpt files?
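The fix being asked for can be sketched as a simple filter (`report_files` is a hypothetical helper name):

```python
import os

# Restrict processing to the generated '.rpt' report files so the raw
# binary perf trace data pulled into the same directory is skipped.
def report_files(outdir):
    return sorted(f for f in os.listdir(outdir) if f.endswith('.rpt'))
```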

Author:

Thanks, changed in latest version.

@ptosi (Contributor) left a comment:

As stated in the corresponding devlib PR (ARM-software/devlib#382):

this "review" is posted as comments and should only be seen as such.

These include comments about details but also general comments about the architecture of this instrument and the way its eventual growth seems to be shaped. As an end user, I would personally prefer the perf instrument to be as generic and flexible as possible (in terms of running perf commands, not only stat or record nor having to manually patch/inject redundant code for each subcommand).

However, I believe that this should not influence this PR. Instead, it could be merged for now and another (heavier) PR could be opened later on to refactor the perf instrument. Therefore, I leave it to the author of the PR and the maintainers to decide which of my comments add value to this PR and which ones may be ignored.


label = host_file.split('.rpt')[0]
host_file_path = os.path.join(outdir, host_file)
context.add_artifact(label, host_file_path, 'raw')
with open(host_file_path) as fh:
Contributor:

This parser is extremely specific. I understand we don't want to spend the effort writing a perfectly generic parser for perf report that works in all cases but this implementation is definitely too narrow. As an example, running on my host machine:

  • Following sudo perf record -a sleep 1:

    $ sudo perf report 2>/dev/null | head -n 20
    # To display the perf.data header info, please use --header/--header-only options.
    #
    #
    # Total Lost Samples: 0
    #
    # Samples: 1K of event 'cycles:ppp'
    # Event count (approx.): 218123588
    #
    # Overhead  Command          Shared Object               Symbol                                           
    # ........  ...............  ..........................  ............................
    #
        41.86%  swapper          [kernel.kallsyms]           [k] intel_idle
         2.06%  perf             [kernel.kallsyms]           [k] smp_call_function_single
         0.92%  Chrome_ChildIOT  libpthread-2.27.so          [.] __pthread_mutex_lock
         0.69%  swapper          [kernel.kallsyms]           [k] calc_global_load
         0.52%  swapper          [kernel.kallsyms]           [k] ioread32
         0.49%  CompositorTileW  libc-2.27.so                [.] __memset_avx2_erms
         0.41%  IPDL Background  [kernel.kallsyms]           [k] cpuacct_charge
         0.41%  IPDL Background  [vdso]                      [.] __vdso_clock_gettime
         0.41%  IPDL Background  libxul.so                   [.] 0x0000000000879a17
    
    $ sudo perf report --sort=dso 2>/dev/null | head -n 20
    # To display the perf.data header info, please use --header/--header-only options.
    #
    #
    # Total Lost Samples: 0
    #
    # Samples: 1K of event 'cycles:ppp'
    # Event count (approx.): 218123588
    #
    # Overhead  Shared Object
    # ........  ..........................
    #
        67.43%  [kernel.kallsyms]
        10.79%  libxul.so
        10.40%  slack
         2.29%  libpthread-2.27.so
         1.67%  libc-2.27.so
         1.47%  firefox
         1.15%  libglib-2.0.so.0.5600.3
         0.84%  nouveau_dri.so
         0.60%  [vdso]
    
    $ sudo perf report --sort=cpu 2>/dev/null | head -n 20
    # To display the perf.data header info, please use --header/--header-only options.
    #
    #
    # Total Lost Samples: 0
    #
    # Samples: 1K of event 'cycles:ppp'
    # Event count (approx.): 218123588
    #
    # Overhead  CPU
    # ........  ...
    #
        15.64%  006
         9.37%  003
         8.75%  011
         8.75%  000
         7.14%  009
         7.07%  007
         5.44%  004
         5.29%  015
         5.02%  012
    
  • Following sudo perf record -g sleep 1:

    $ sudo perf report -g 2>/dev/null | head -n 30
    # To display the perf.data header info, please use --header/--header-only options.
    #
    #
    # Total Lost Samples: 0
    #
    # Samples: 14  of event 'cycles:ppp'
    # Event count (approx.): 1311183
    #
    # Children      Self  Command  Shared Object      Symbol
    # ........  ........  .......  .................  ..................................
    #
        32.80%     0.00%  sleep    [kernel.kallsyms]  [k] entry_SYSCALL_64_after_hwframe
                |
                ---entry_SYSCALL_64_after_hwframe
                   do_syscall_64
                   |
                   |--22.57%--sys_exit_group
                   |          do_group_exit
                   |          do_exit
                   |          mmput
                   |          exit_mmap
                   |          tlb_finish_mmu
                   |          arch_tlb_finish_mmu
                   |          tlb_flush_mmu_free
                   |          free_pages_and_swap_cache
                   |          release_pages
                   |
                    --10.23%--sys_execve
                              do_execveat_common.isra.34
                              search_binary_handler
    

Obviously, I don't expect this parser to handle all outputs of perf report (e.g. --call-graph, --children, --header, ...) but handling the extremely useful --sort would be a nice start; I believe that parsing the table(s) of samples into pandas.DataFrame(s) would make the following logic easier to work with. I actually have implemented such a parser (currently internal) which I could integrate here.
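A minimal sketch of what parsing the flat (non call-graph) tables could look like, assuming only the simple "overhead / key" layout shown above; this is neither the PR's parser nor the internal one mentioned, and `parse_report_table` is a hypothetical name:

```python
# Parse the flat sample table of 'perf report' (e.g. with --sort=dso):
# skip '#'-prefixed header lines and read "<overhead>%  <key...>" rows.
def parse_report_table(text):
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith('#'):
            continue
        words = line.split()
        if words[0].endswith('%'):
            rows.append((float(words[0].rstrip('%')), ' '.join(words[1:])))
    return rows
```

Rows with multiple columns (Command, Shared Object, Symbol) come back as one joined key here; splitting them per --sort field would be the next step.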

Author:

This is just a first implementation to parse the basic output. As you say, handling all the output options is difficult. If I update this to handle new report formats I'll likely see about committing them, but I am not planning to write a more generic/complete parser. If you have one that could be integrated here, that would be very good.

# Extract data for 'perf record'
elif self.perf_mode == 'record':
self._update_output_record(context)

Contributor:

Suggested change:
# else: fail silently

You could use a dictionary of functions (a method-local dictionary of instance methods, a class-attribute dictionary of class methods, ... your call), accessed with self.perf_mode as a key: if the key is not present, a KeyError (IIRC) is raised. Furthermore, update_output becomes "subcommand-agnostic". Depending on the implementation, this would even allow assert-based unit testing that compares the allowed_values of perf_mode against the keys() of this dictionary, expressing programmatically the currently implicit link between the supported perf subcommands and the need for an implemented _update_output_<...> method. Finally, it's more pythonic than this switch-like use of if-elif-elif-... .

(I know that self.perf_mode has been sanitized but yet, I believe there's value in this change).
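The dispatch-dictionary idea above, as a stripped-down hypothetical sketch (`PerfSketch` and the placeholder method bodies stand in for the real PerfInstrument and its update logic):

```python
class PerfSketch:
    """Illustrates dispatching on perf_mode via a dict of methods."""

    def __init__(self, mode):
        self.perf_mode = mode

    def update_output(self, context):
        # Subcommand-agnostic dispatch; an unsupported mode raises
        # KeyError instead of failing silently.
        handlers = {
            'stat': self._update_output_stat,
            'record': self._update_output_record,
        }
        return handlers[self.perf_mode](context)

    def _update_output_stat(self, context):
        return 'stat handled'    # placeholder for the real logic

    def _update_output_record(self, context):
        return 'record handled'  # placeholder for the real logic
```

A unit test could then assert that the mode parameter's allowed_values match the dictionary's keys.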

@markbellamy (Author):

> As an end user, I would personally prefer the perf instrument to be as generic and flexible as possible (in terms of running perf commands, not only stat or record nor having to manually patch/inject redundant code for each subcommand).

I agree. I was surprised that perf support was limited to just perf stat, and I think a much more generic implementation is the right direction. This implementation is just a "quick" one that adds 'perf record' support to the current implementation, as I wanted to use it. It seemed right to commit it back, as it might be useful to others as a stop-gap until wider perf support is added.

@ptosi ptosi mentioned this pull request May 15, 2019