The example below shows how to profile CPU usage system-wide with perf call graphs.
# uname -rs
Linux 3.10
# perf record -g -a
<Run your test case, then press Ctrl-C to stop recording>
# perf report -g
<snip>
Warning:
Processed 1361987 events and lost 79 chunks!
Check IO/CPU overload!
# ========
# captured on: Tue Feb 28 15:00:56 2017
# hostname : <name>
# os release : 3.10
# perf version : 4.9.rc8.ge8ccd65
# arch : mips64
<snip>
#
# Samples: 1M of event 'cycles'
# Event count (approx.): 466296876065
#
# Overhead Command Shared Object Symbol
# ........ ............... ......................... ......................................................................................................................................
#
    39.30%  swapper  [kernel.kallsyms]  [k] __r4k_wait
            |
            :
            |
            --- __r4k_wait
                cpu_startup_entry
                |
                |--66.22%-- start_kernel
                 --33.78%-- [...]
    26.63%  MyProgram  [kernel.kallsyms]  [k] check_leaf.isra.6
            |
            --- check_leaf.isra.6
                fib_table_lookup
                |
                |--99.99%-- fib4_rule_action
                |          fib_rules_lookup
                |          __fib_lookup
                |          |
                |          |--99.93%-- ip_route_input_noref
                |          |          |
                |          |          |--99.99%-- ip_rcv_finish
                |          |          |          __netif_receive_skb_core
                |          |          |          netif_receive_skb
                |          |          |          |
                |          |          |          |--100.00%-- MY_DeliverPacket
                |          |          |          |          MY_ProcessIPv4UnicastPacket
                |          |          |          |          MY_RouteIncomingIPv4Packet
                |          |          |          |          MY_RoutePacket
                |          |          |          |          __netif_receive_skb_core
                |          |          |          |          netif_receive_skb
                |          |          |          |          |
                |          |          |          |          |--99.98%-- mygreRcv
                |          |          |          |          |          ip_local_deliver_finish
                |          |          |          |          |          __netif_receive_skb_core
                |          |          |          |          |          netif_receive_skb
                |          |          |          |          |          MY_DeliverPacket
                |          |          |          |          |          MY_ProcessIPv4UnicastPacket
                |          |          |          |          |          MY_RouteIncomingIPv4Packet
                |          |          |          |          |          MY_RoutePacket
                |          |          |          |          |          __netif_receive_skb_core
                |          |          |          |          |          netif_receive_skb
                |          |          |          |          |          mynam_lro_receive_skb
<snip>
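
The "lost 79 chunks" warning in the report above usually means perf could not drain its ring buffer as fast as samples arrived. Two common mitigations are a lower sampling frequency (-F) and a larger ring buffer (-m); bounding the run with "-- sleep N" also avoids stopping the recording by hand. The sketch below only builds and prints such a command line. The helper name perf_record_cmd and the values 99 Hz and 30 seconds are illustrative assumptions, not part of perf or of the original session.

```shell
#!/bin/sh
# Sketch: build a bounded, system-wide perf recording command.
# perf_record_cmd is a hypothetical helper, not part of perf itself.
perf_record_cmd() {
    freq="${1:-99}"      # -F: samples per second (lower = less overhead)
    seconds="${2:-30}"   # how long to record before perf exits
    pages="${3:-}"       # -m: optional ring buffer size in pages (power of two)
    cmd="perf record -F $freq -a -g"
    if [ -n "$pages" ]; then
        cmd="$cmd -m $pages"
    fi
    # "-- sleep N" makes perf stop on its own after N seconds
    printf '%s\n' "$cmd -- sleep $seconds"
}

# Print the command for inspection; run it as root once it looks right.
perf_record_cmd 99 30
```

If the warning persists, passing a third argument such as 512 adds "-m 512" to the printed command. Whether that is enough depends on the load of the system being profiled.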