| 1 | perf-report(1) |
| 2 | ============== |
| 3 | |
| 4 | NAME |
| 5 | ---- |
| 6 | perf-report - Read perf.data (created by perf record) and display the profile |
| 7 | |
| 8 | SYNOPSIS |
| 9 | -------- |
| 10 | [verse] |
| 11 | 'perf report' [-i <file> | --input=file] |
| 12 | |
| 13 | DESCRIPTION |
| 14 | ----------- |
| 15 | This command displays the performance counter profile information recorded |
| 16 | via perf record. |
| 17 | |
| 18 | OPTIONS |
| 19 | ------- |
| 20 | -i:: |
| 21 | --input=:: |
| 22 | Input file name. (default: perf.data unless stdin is a fifo) |
| 23 | |
| 24 | -v:: |
| 25 | --verbose:: |
| 26 | Be more verbose. (show symbol address, etc.) |
| 27 | |
| 28 | -d:: |
| 29 | --dsos=:: |
| 30 | Only consider symbols in these dsos. CSV that understands |
| 31 | file://filename entries. |
| 32 | -n:: |
| 33 | --show-nr-samples:: |
| 34 | Show the number of samples for each symbol. |
| 35 | |
| 36 | --showcpuutilization:: |
| 37 | Show sample percentage for different cpu modes. |
| 38 | |
| 39 | -T:: |
| 40 | --threads:: |
| 41 | Show per-thread event counters. |
| 42 | -c:: |
| 43 | --comms=:: |
| 44 | Only consider symbols in these comms. CSV that understands |
| 45 | file://filename entries. |
| 46 | -S:: |
| 47 | --symbols=:: |
| 48 | Only consider these symbols. CSV that understands |
| 49 | file://filename entries. |
| 50 | |
| 51 | --symbol-filter=:: |
| 52 | Only show symbols that match (partially) with this filter. |
| 53 | |
| 54 | -U:: |
| 55 | --hide-unresolved:: |
| 56 | Only display entries resolved to a symbol. |
| 57 | |
| 58 | -s:: |
| 59 | --sort=:: |
| 60 | Sort histogram entries by the given key(s). Multiple keys can be specified |
| 61 | in CSV format. The following sort keys are available: |
| 62 | pid, comm, dso, symbol, parent, cpu, srcline, weight, local_weight, transaction. |
| 63 | |
| 64 | Each key has the following meaning: |
| 65 | |
| 66 | - comm: command (name) of the task which can be read via /proc/<pid>/comm |
| 67 | - pid: command and tid of the task |
| 68 | - dso: name of library or module executed at the time of sample |
| 69 | - symbol: name of function executed at the time of sample |
| 70 | - parent: name of function matched to the parent regex filter. Unmatched |
| 71 | entries are displayed as "[other]". |
| 72 | - cpu: CPU number the task ran on at the time of sample |
| 73 | - srcline: filename and line number executed at the time of sample. The |
| 74 | DWARF debugging info must be provided. |
| 75 | - weight: Event specific weight, e.g. memory latency or transaction |
| 76 | abort cost. This is the global weight. |
| 77 | - local_weight: Local weight version of the weight above. |
| 78 | - transaction: Transaction abort flags. |
| 79 | |
| 80 | By default, comm, dso and symbol keys are used. |
| 81 | (i.e. --sort comm,dso,symbol) |
| 82 | |
| 83 | If the --branch-stack option is used, the following sort keys are also |
| 84 | available: |
| 85 | dso_from, dso_to, symbol_from, symbol_to, mispredict, in_tx, abort. |
| 86 | |
| 87 | - dso_from: name of library or module branched from |
| 88 | - dso_to: name of library or module branched to |
| 89 | - symbol_from: name of function branched from |
| 90 | - symbol_to: name of function branched to |
| 91 | - mispredict: "N" for predicted branch, "Y" for mispredicted branch |
| 92 | - in_tx: branch in TSX transaction |
| 93 | - abort: TSX transaction abort. |
| 94 | |
| 95 | In that case the default sort keys change to comm, dso_from, symbol_from, dso_to |
| 96 | and symbol_to, see '--branch-stack'. |
| 97 | |
| 98 | -p:: |
| 99 | --parent=<regex>:: |
| 100 | A regex filter to identify the parent. The parent is a caller of this |
| 101 | function and is searched for through the callchain, so callchain |
| 102 | information must have been recorded. The pattern is in the extended regex format and |
| 103 | defaults to "\^sys_|^do_page_fault", see '--sort parent'. |
| 104 | |
| 105 | -x:: |
| 106 | --exclude-other:: |
| 107 | Only display entries with parent-match. |
| 108 | |
| 109 | -w:: |
| 110 | --column-widths=<width[,width...]>:: |
| 111 | Force each column width to the provided list, for large terminal |
| 112 | readability. |
| 113 | |
| 114 | -t:: |
| 115 | --field-separator=:: |
| 116 | Use a special separator character and don't pad with spaces, replacing |
| 117 | all occurrences of this separator in symbol names (and other output) |
| 118 | with a '.' character, which is therefore the only invalid separator. |
| 119 | |
| 120 | -D:: |
| 121 | --dump-raw-trace:: |
| 122 | Dump raw trace in ASCII. |
| 123 | |
| 124 | -g [type,min[,limit],order[,key]]:: |
| 125 | --call-graph:: |
| 126 | Display call chains using type, min percent threshold, optional print |
| 127 | limit and order. |
| 128 | type can be either: |
| 129 | - flat: single column, linear exposure of call chains. |
| 130 | - graph: use a graph tree, displaying absolute overhead rates. |
| 131 | - fractal: like graph, but displays relative rates. Each branch of |
| 132 | the tree is considered as a new profiled object. + |
| 133 | |
| 134 | order can be either: |
| 135 | - callee: callee based call graph. |
| 136 | - caller: inverted caller based call graph. |
| 137 | |
| 138 | key can be: |
| 139 | - function: compare on functions |
| 140 | - address: compare on individual code addresses |
| 141 | |
| 142 | Default: fractal,0.5,callee,function. |
| 143 | |
| 144 | -G:: |
| 145 | --inverted:: |
| 146 | Alias for the inverted, caller-based call graph. |
| 147 | |
| 148 | --ignore-callees=<regex>:: |
| 149 | Ignore callees of the function(s) matching the given regex. |
| 150 | This has the effect of collecting the callers of each such |
| 151 | function into one place in the call-graph tree. |
| 152 | |
| 153 | --pretty=<key>:: |
| 154 | Pretty printing style. key: normal, raw |
| 155 | |
| 156 | --stdio:: Use the stdio interface. |
| 157 | |
| 158 | --tui:: Use the TUI interface, that is integrated with annotate and allows |
| 159 | zooming into DSOs or threads, among other features. Use of --tui |
| 160 | requires a tty; if one is not present, as when piping to other |
| 161 | commands, the stdio interface is used. |
| 162 | |
| 163 | --gtk:: Use the GTK2 interface. |
| 164 | |
| 165 | -k:: |
| 166 | --vmlinux=<file>:: |
| 167 | vmlinux pathname |
| 168 | |
| 169 | --kallsyms=<file>:: |
| 170 | kallsyms pathname |
| 171 | |
| 172 | -m:: |
| 173 | --modules:: |
| 174 | Load module symbols. WARNING: This should only be used with -k and |
| 175 | a LIVE kernel. |
| 176 | |
| 177 | -f:: |
| 178 | --force:: |
| 179 | Don't complain, do it. |
| 180 | |
| 181 | --symfs=<directory>:: |
| 182 | Look for files with symbols relative to this directory. |
| 183 | |
| 184 | -C:: |
| 185 | --cpu:: Only report samples for the list of CPUs provided. Multiple CPUs can |
| 186 | be provided as a comma-separated list with no space: 0,1. Ranges of |
| 187 | CPUs are specified with -: 0-2. Default is to report samples on all |
| 188 | CPUs. |
| 189 | |
| 190 | -M:: |
| 191 | --disassembler-style=:: Set disassembler style for objdump. |
| 192 | |
| 193 | --source:: |
| 194 | Interleave source code with assembly code. Enabled by default, |
| 195 | disable with --no-source. |
| 196 | |
| 197 | --asm-raw:: |
| 198 | Show raw instruction encoding of assembly instructions. |
| 199 | |
| 200 | --show-total-period:: Show a column with the sum of periods. |
| 201 | |
| 202 | -I:: |
| 203 | --show-info:: |
| 204 | Display extended information about the perf.data file. This adds |
| 205 | information which may be very large and thus may clutter the display. |
| 206 | It currently includes: CPU and NUMA topology of the host system. |
| 207 | |
| 208 | -b:: |
| 209 | --branch-stack:: |
| 210 | Use the addresses of sampled taken branches instead of the instruction |
| 211 | address to build the histograms. To generate meaningful output, the |
| 212 | perf.data file must have been obtained using perf record -b or |
| 213 | perf record --branch-filter xxx where xxx is a branch filter option. |
| 214 | perf report is able to auto-detect whether a perf.data file contains |
| 215 | branch stacks and it will automatically switch to the branch view mode, |
| 216 | unless --no-branch-stack is used. |
| 217 | |
| 218 | --objdump=<path>:: |
| 219 | Path to objdump binary. |
| 220 | |
| 221 | --group:: |
| 222 | Show event group information together. |
| 223 | |
| 224 | --demangle:: |
| 225 | Demangle symbol names to human readable form. It's enabled by default, |
| 226 | disable with --no-demangle. |
| 227 | |
| 228 | --percent-limit:: |
| 229 | Do not show entries which have an overhead under the given percentage. |
| 230 | (Default: 0). |
| 231 | |
| 232 | SEE ALSO |
| 233 | -------- |
| 234 | linkperf:perf-stat[1], linkperf:perf-annotate[1] |
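
The options above can be combined on one command line. As a rough illustration only, the following Python sketch assembles an argument list for 'perf report' from a few of the documented options (-i, --sort, --cpu, --percent-limit, --stdio). The helper name `build_report_argv` and the `KNOWN_SORT_KEYS` set are hypothetical and not part of perf; the sketch only builds the list and does not run perf.

```python
from typing import List, Optional, Sequence

# Sort keys described under --sort above (non-branch mode only).
KNOWN_SORT_KEYS = {
    "pid", "comm", "dso", "symbol", "parent", "cpu",
    "srcline", "weight", "local_weight", "transaction",
}


def build_report_argv(
    input_file: Optional[str] = None,
    sort_keys: Sequence[str] = (),
    cpus: Optional[str] = None,
    percent_limit: Optional[float] = None,
) -> List[str]:
    """Hypothetical helper: assemble a 'perf report' command line."""
    # Reject keys that the --sort description does not list.
    unknown = [k for k in sort_keys if k not in KNOWN_SORT_KEYS]
    if unknown:
        raise ValueError(f"unknown sort key(s): {', '.join(unknown)}")

    # --stdio keeps the output plain text, e.g. when piping.
    argv = ["perf", "report", "--stdio"]
    if input_file is not None:
        argv += ["-i", input_file]
    if sort_keys:
        # Multiple keys are passed in CSV format.
        argv += ["--sort", ",".join(sort_keys)]
    if cpus is not None:
        # Comma-separated list with no spaces, ranges written as 0-2.
        argv += ["--cpu", cpus]
    if percent_limit is not None:
        argv += ["--percent-limit", str(percent_limit)]
    return argv
```

On a machine with perf installed, the resulting list could be handed to `subprocess.run`; leaving every argument at its default yields just `perf report --stdio`, which reads perf.data as described under -i.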