This series adds thread-stack and synthesized callchain support for Arm CoreSight. It derives from an older series [1] but has been heavily rewritten.
CS ETM previously kept last-branch state in a per-trace-queue buffer. That effectively makes the state per CPU, while the call/return history belongs to a thread. This series moves branch tracking to the common thread-stack code.
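To illustrate why the keying matters, here is a minimal sketch. It is a toy model, not the perf code: struct toy_tracker and toy_branch() are hypothetical names, and a plain depth counter stands in for the real call/return history.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_CPUS	4
#define TOY_MAX_THREADS	4

/* Hypothetical toy state: the same call depth tracked two ways. */
struct toy_tracker {
	int depth_by_cpu[TOY_MAX_CPUS];		/* old model: per trace queue (CPU) */
	int depth_by_thread[TOY_MAX_THREADS];	/* new model: per thread */
};

/* Account one branch: delta is +1 for a call, -1 for a return. */
static void toy_branch(struct toy_tracker *t, size_t cpu, size_t thread,
		       int delta)
{
	assert(cpu < TOY_MAX_CPUS && thread < TOY_MAX_THREADS);
	t->depth_by_cpu[cpu] += delta;
	t->depth_by_thread[thread] += delta;
}
```

If a thread makes two calls on CPU 0, migrates, and then returns once on CPU 1, the per-thread counter ends at depth 1, while the per-CPU counters end at 2 and -1, neither of which describes the thread.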
The series records CoreSight branches with thread_stack__event(), uses thread_stack__br_sample() for last branch entries, and flushes thread stacks after decoder resets.
A decoder reset between AUX trace buffers is treated as a global trace discontinuity, so all thread stacks are flushed. This avoids carrying stale call/return history across the discontinuity.
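The flush policy can be sketched as follows. This is a simplified model under stated assumptions, not the code in this series: struct toy_session, toy_push() and toy_decoder_reset() are hypothetical names, and a fixed-size table stands in for perf's thread lookup.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_THREADS	4
#define TOY_STACK_MAX	8

/* Toy per-thread call stack holding expected return addresses. */
struct toy_stack {
	uint64_t ret_addr[TOY_STACK_MAX];
	size_t cnt;
};

/* Toy session: one stack per thread, indexed by a small thread id. */
struct toy_session {
	struct toy_stack stacks[TOY_MAX_THREADS];
};

/* Record a call made by 'thread' that is expected to return to 'ret_addr'. */
static void toy_push(struct toy_session *s, size_t thread, uint64_t ret_addr)
{
	struct toy_stack *ts;

	assert(thread < TOY_MAX_THREADS);
	ts = &s->stacks[thread];
	assert(ts->cnt < TOY_STACK_MAX);
	ts->ret_addr[ts->cnt++] = ret_addr;
}

/*
 * A decoder reset says nothing about which threads lost trace data,
 * so the toy model drops the history of every thread.
 */
static void toy_decoder_reset(struct toy_session *s)
{
	for (size_t i = 0; i < TOY_MAX_THREADS; i++)
		s->stacks[i].cnt = 0;
}
```

Flushing only the thread that was running at the reset would leave other threads with history that may no longer match the trace that follows.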
One limitation remains for instructions emulated by the kernel. In that case the exception return address may not match the return address stored in the thread stack, because the address after the exception return can be one instruction ahead. The stack can still recover when a later return matches an upper caller. Emulated instructions are not a common target for performance callchain analysis, and supporting them would require extending the common thread-stack path to accept both the real target address and an adjusted address for stack matching, so this series leaves that extra complexity out.
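The mismatch and the later recovery can be shown with a toy model. This is only a sketch of return-address matching, assuming a stack that is searched from the top for a matching entry; struct toy_stack, toy_call() and toy_return() are hypothetical names and the addresses are made up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_STACK_MAX	16

/* Toy model only: a stack of expected return addresses. */
struct toy_stack {
	uint64_t ret_addr[TOY_STACK_MAX];
	size_t cnt;
};

/* Record a call (or exception entry) expected to return to 'ret_addr'. */
static void toy_call(struct toy_stack *ts, uint64_t ret_addr)
{
	assert(ts->cnt < TOY_STACK_MAX);
	ts->ret_addr[ts->cnt++] = ret_addr;
}

/*
 * Record a return to 'to_ip'. Search from the top of the stack for a
 * matching return address and pop down to it, inclusive. Returns the
 * number of entries popped, or 0 if nothing matched.
 */
static size_t toy_return(struct toy_stack *ts, uint64_t to_ip)
{
	size_t i = ts->cnt;

	while (i--) {
		if (ts->ret_addr[i] == to_ip) {
			size_t popped = ts->cnt - i;

			ts->cnt = i;
			return popped;
		}
	}
	return 0;
}
```

With entries for 0x1000 (an upper caller) and 0x2000 (the exception entry), a return to 0x2004 matches nothing and leaves a stale entry behind. A later return to 0x1000 pops both entries, which is the recovery described above.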
The series has been tested on an Orion6 board:
perf test 150 -vvv
150: Check Arm CoreSight synthesized callchain:
--- start ---
test child forked, pid 13528
Test callchain push: PASS
Test callchain pop: PASS
---- end(0) ----
150: Check Arm CoreSight synthesized callchain : Ok
perf script --itrace=g16i10il64
callchain_test 17468 [005] 1031003.229943: 10 instructions:
        aaaac32507c4 main+0x8 (/home/kernel/leoy/test_cs_callchain/callchain_test)
        ffff90bd225c __libc_start_call_main+0x7c (/usr/lib/aarch64-linux-gnu/libc.so.6)
        ffff90bd233c call_init+0x9c (inlined)
        ffff90bd233c __libc_start_main_impl+0x9c (inlined)
        aaaac3250670 _start+0x30 (/home/kernel/leoy/test_cs_callchain/callchain_test)

callchain_test 17468 [005] 1031003.229943: 10 instructions:
        aaaac3250774 do_svc+0xc (/home/kernel/leoy/test_cs_callchain/callchain_test)
        aaaac3250798 print+0xc (/home/kernel/leoy/test_cs_callchain/callchain_test)
        aaaac32507b0 foo+0xc (/home/kernel/leoy/test_cs_callchain/callchain_test)
        aaaac32507c8 main+0xc (/home/kernel/leoy/test_cs_callchain/callchain_test)
        ffff90bd225c __libc_start_call_main+0x7c (/usr/lib/aarch64-linux-gnu/libc.so.6)
        ffff90bd233c call_init+0x9c (inlined)
        ffff90bd233c __libc_start_main_impl+0x9c (inlined)
        aaaac3250670 _start+0x30 (/home/kernel/leoy/test_cs_callchain/callchain_test)

callchain_test 17468 [005] 1031003.229944: 10 instructions:
        ffff800080010c20 vectors+0x420 ([kernel.kallsyms])
        aaaac3250784 do_svc+0x1c (/home/kernel/leoy/test_cs_callchain/callchain_test)
        aaaac3250798 print+0xc (/home/kernel/leoy/test_cs_callchain/callchain_test)
        aaaac32507b0 foo+0xc (/home/kernel/leoy/test_cs_callchain/callchain_test)
        aaaac32507c8 main+0xc (/home/kernel/leoy/test_cs_callchain/callchain_test)
        ffff90bd225c __libc_start_call_main+0x7c (/usr/lib/aarch64-linux-gnu/libc.so.6)
        ffff90bd233c call_init+0x9c (inlined)
        ffff90bd233c __libc_start_main_impl+0x9c (inlined)
        aaaac3250670 _start+0x30 (/home/kernel/leoy/test_cs_callchain/callchain_test)
Note that the test fails on the Juno board because of many discontinuity packets (mainly NO_SYNC elements). These are likely caused by FIFO overflow on the trace path.
[1] https://lore.kernel.org/linux-arm-kernel/20200220052701.7754-1-leo.yan@linar...
Signed-off-by: Leo Yan <leo.yan@arm.com>
---
Leo Yan (8):
      perf cs-etm: Decode ETE exception packets
      perf cs-etm: Refactor instruction size handling
      perf cs-etm: Use thread-stack for last branch entries
      perf cs-etm: Flush thread stacks after decoder reset
      perf cs-etm: Support call indentation
      perf cs-etm: Filter synthesized branch samples
      perf cs-etm: Synthesize callchains for instruction samples
      perf test: Add Arm CoreSight callchain test

 .../tests/shell/test_arm_coresight_callchain.sh | 235 ++++++++++++++++
 tools/perf/util/cs-etm.c                        | 309 ++++++++++++---------
 2 files changed, 408 insertions(+), 136 deletions(-)
---
base-commit: bd2a5be1fe731bc7548205dd148db75f1d588da2
change-id: 20260521-b4-arm_cs_callchain_support_v1-2c2a70719bcc

Best regards,