This series adds thread-stack and synthesized callchain support for Arm CoreSight. It is derived from an older series [1], but has been heavily rewritten.
CS ETM previously kept last-branch state in a per-trace-queue buffer. That effectively makes the state per CPU, while the call/return history belongs to a thread. This series moves branch tracking to the common thread-stack code.
The series records CoreSight branches with thread_stack__event(), uses thread_stack__br_sample() for last branch entries, and flushes thread stacks after decoder resets.
A decoder reset between AUX trace buffers is treated as a global trace discontinuity, so all thread stacks are flushed. This avoids carrying stale call/return history across the discontinuity.
One limitation remains for instructions emulated by the kernel. In that case the exception return address may not match the return address stored in the thread stack, because the address after the exception return can be one instruction ahead. The stack can still recover when a later return matches an upper caller. Emulated instructions are not a common target for performance callchain analysis, and supporting them would require extending the common thread-stack path to accept both the real target address and an adjusted address for stack matching, so this series leaves that extra complexity out.
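The recovery behaviour described above can be illustrated with a minimal, self-contained sketch of return matching on a call stack. This is not perf's thread-stack implementation: struct demo_stack, demo_push() and demo_pop() are hypothetical names, and the addresses used with them are made up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_STACK_MAX 16

/* Hypothetical, simplified call stack; not perf's struct thread_stack. */
struct demo_stack {
	uint64_t ret_addr[DEMO_STACK_MAX];
	size_t cnt;
};

/* A call pushes the address the callee is expected to return to. */
static void demo_push(struct demo_stack *s, uint64_t ret_addr)
{
	assert(s != NULL);

	if (s->cnt < DEMO_STACK_MAX)
		s->ret_addr[s->cnt++] = ret_addr;
}

/*
 * A return searches from the top of the stack for an entry whose stored
 * return address equals the branch target and pops down to it.
 * Returns the number of entries popped, or 0 when nothing matches and
 * the stack is left untouched.
 */
static size_t demo_pop(struct demo_stack *s, uint64_t target)
{
	size_t i;

	assert(s != NULL);

	for (i = s->cnt; i > 0; i--) {
		if (s->ret_addr[i - 1] == target) {
			size_t popped = s->cnt - (i - 1);

			s->cnt = i - 1;
			return popped;
		}
	}

	return 0;
}
```

With return addresses 0x1000, 0x2000 and 0x3000 pushed, a return to 0x3004 (one instruction past the stored address, as after an emulated instruction) matches nothing and leaves the stack untouched, while a later return to 0x2000 pops two entries and resynchronizes the stack.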
The series has been tested on an Orion6 board:
perf test 150 -vvv
150: Check Arm CoreSight synthesized callchain:
--- start ---
test child forked, pid 13528
Test callchain push: PASS
Test callchain pop: PASS
---- end(0) ----
150: Check Arm CoreSight synthesized callchain : Ok
perf script --itrace=g16i10il64
callchain_test 17468 [005] 1031003.229943: 10 instructions:
	aaaac32507c4 main+0x8 (/home/kernel/leoy/test_cs_callchain/callchain_test)
	ffff90bd225c __libc_start_call_main+0x7c (/usr/lib/aarch64-linux-gnu/libc.so.6)
	ffff90bd233c call_init+0x9c (inlined)
	ffff90bd233c __libc_start_main_impl+0x9c (inlined)
	aaaac3250670 _start+0x30 (/home/kernel/leoy/test_cs_callchain/callchain_test)

callchain_test 17468 [005] 1031003.229943: 10 instructions:
	aaaac3250774 do_svc+0xc (/home/kernel/leoy/test_cs_callchain/callchain_test)
	aaaac3250798 print+0xc (/home/kernel/leoy/test_cs_callchain/callchain_test)
	aaaac32507b0 foo+0xc (/home/kernel/leoy/test_cs_callchain/callchain_test)
	aaaac32507c8 main+0xc (/home/kernel/leoy/test_cs_callchain/callchain_test)
	ffff90bd225c __libc_start_call_main+0x7c (/usr/lib/aarch64-linux-gnu/libc.so.6)
	ffff90bd233c call_init+0x9c (inlined)
	ffff90bd233c __libc_start_main_impl+0x9c (inlined)
	aaaac3250670 _start+0x30 (/home/kernel/leoy/test_cs_callchain/callchain_test)

callchain_test 17468 [005] 1031003.229944: 10 instructions:
	ffff800080010c20 vectors+0x420 ([kernel.kallsyms])
	aaaac3250784 do_svc+0x1c (/home/kernel/leoy/test_cs_callchain/callchain_test)
	aaaac3250798 print+0xc (/home/kernel/leoy/test_cs_callchain/callchain_test)
	aaaac32507b0 foo+0xc (/home/kernel/leoy/test_cs_callchain/callchain_test)
	aaaac32507c8 main+0xc (/home/kernel/leoy/test_cs_callchain/callchain_test)
	ffff90bd225c __libc_start_call_main+0x7c (/usr/lib/aarch64-linux-gnu/libc.so.6)
	ffff90bd233c call_init+0x9c (inlined)
	ffff90bd233c __libc_start_main_impl+0x9c (inlined)
	aaaac3250670 _start+0x30 (/home/kernel/leoy/test_cs_callchain/callchain_test)
Note that the test fails on the Juno board because of many discontinuity packets (mainly NO_SYNC elements). This is likely caused by FIFO overflow on the trace path.
[1] https://lore.kernel.org/linux-arm-kernel/20200220052701.7754-1-leo.yan@linar...
Signed-off-by: Leo Yan <leo.yan@arm.com>
---
Leo Yan (8):
      perf cs-etm: Decode ETE exception packets
      perf cs-etm: Refactor instruction size handling
      perf cs-etm: Use thread-stack for last branch entries
      perf cs-etm: Flush thread stacks after decoder reset
      perf cs-etm: Support call indentation
      perf cs-etm: Filter synthesized branch samples
      perf cs-etm: Synthesize callchains for instruction samples
      perf test: Add Arm CoreSight callchain test

 .../tests/shell/test_arm_coresight_callchain.sh | 235 ++++++++++++++++
 tools/perf/util/cs-etm.c                        | 309 ++++++++++++---------
 2 files changed, 408 insertions(+), 136 deletions(-)
---
base-commit: bd2a5be1fe731bc7548205dd148db75f1d588da2
change-id: 20260521-b4-arm_cs_callchain_support_v1-2c2a70719bcc
Best regards,
ETE shares the same packet format as ETMv4, but exception decoding handled ETMv4 packets only. As a result, ETE exception packets were not classified.
Recognize the ETE magic for exception number decoding.
Signed-off-by: Leo Yan <leo.yan@arm.com>
---
 tools/perf/util/cs-etm.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
index 6ec48de29441012f3d827d50616349c6c0d1f037..ab79d08f5a6095448470e2c3ec85ff3db2fb5634 100644
--- a/tools/perf/util/cs-etm.c
+++ b/tools/perf/util/cs-etm.c
@@ -2138,7 +2138,7 @@ static bool cs_etm__is_syscall(struct cs_etm_queue *etmq,
 	 * HVC cases; need to check if it's SVC instruction based on
 	 * packet address.
 	 */
-	if (magic == __perf_cs_etmv4_magic) {
+	if (magic == __perf_cs_etmv4_magic || magic == __perf_cs_ete_magic) {
 		if (packet->exception_number == CS_ETMV4_EXC_CALL &&
 		    cs_etm__is_svc_instr(etmq, trace_chan_id, prev_packet,
 					 prev_packet->end_addr))
@@ -2161,7 +2161,7 @@ static bool cs_etm__is_async_exception(struct cs_etm_traceid_queue *tidq,
 	    packet->exception_number == CS_ETMV3_EXC_FIQ)
 		return true;
 
-	if (magic == __perf_cs_etmv4_magic)
+	if (magic == __perf_cs_etmv4_magic || magic == __perf_cs_ete_magic)
 		if (packet->exception_number == CS_ETMV4_EXC_RESET ||
 		    packet->exception_number == CS_ETMV4_EXC_DEBUG_HALT ||
 		    packet->exception_number == CS_ETMV4_EXC_SYSTEM_ERROR ||
@@ -2192,7 +2192,7 @@ static bool cs_etm__is_sync_exception(struct cs_etm_queue *etmq,
 	    packet->exception_number == CS_ETMV3_EXC_GENERIC)
 		return true;
 
-	if (magic == __perf_cs_etmv4_magic) {
+	if (magic == __perf_cs_etmv4_magic || magic == __perf_cs_ete_magic) {
 		if (packet->exception_number == CS_ETMV4_EXC_TRAP ||
 		    packet->exception_number == CS_ETMV4_EXC_ALIGNMENT ||
 		    packet->exception_number == CS_ETMV4_EXC_INST_FAULT ||
From: Leo Yan <leo.yan@linaro.org>
This patch introduces a new function cs_etm__instr_size() to calculate the instruction size based on ISA type and instruction address.
Trace data can run to megabytes and is most likely A64/A32 on many platforms, so cs_etm__instr_addr() keeps a single ISA type check for A64/A32 and uses an optimized calculation (addr + offset * 4) for them. Only other ISAs walk the instructions one by one via cs_etm__instr_size().
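The address calculation can be sketched outside of perf as follows. This is an illustration under assumptions: the demo_* names and the flat little-endian byte buffer standing in for cs_etm__mem_access() are hypothetical; only the T32 size check mirrors the expression used in cs_etm__t32_instr_size().

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical ISA tags for this sketch; perf uses enum cs_etm_isa. */
enum demo_isa { DEMO_ISA_A64, DEMO_ISA_A32, DEMO_ISA_T32 };

/*
 * T32 encodings are 16-bit or 32-bit. The top five bits of the first
 * halfword (the high byte in little-endian memory) decide which:
 * values 0xE8 and above select a 32-bit encoding.
 */
static int demo_t32_instr_size(const uint8_t *mem, uint64_t base,
			       uint64_t addr)
{
	const uint8_t *instr = mem + (addr - base);

	return ((instr[1] & 0xF8) >= 0xE8) ? 4 : 2;
}

/* Address of the instruction 'offset' instructions after 'start'. */
static uint64_t demo_instr_addr(enum demo_isa isa, const uint8_t *mem,
				uint64_t base, uint64_t start,
				uint64_t offset)
{
	uint64_t addr = start;

	/* Fast path: fixed 4-byte instructions, no memory access needed */
	if (isa == DEMO_ISA_A64 || isa == DEMO_ISA_A32)
		return addr + offset * 4;

	/* Slow path: variable-length T32, walk one instruction at a time */
	while (offset) {
		addr += demo_t32_instr_size(mem, base, addr);
		offset--;
	}

	return addr;
}
```

The fast path needs no memory reads at all, which is why keeping it as a single check in front of the loop matters for large traces; the T32 walk has to read the trace target's memory once per instruction.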
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Leo Yan <leo.yan@arm.com>
---
 tools/perf/util/cs-etm.c | 44 +++++++++++++++++++++++---------------------
 1 file changed, 23 insertions(+), 21 deletions(-)

diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
index ab79d08f5a6095448470e2c3ec85ff3db2fb5634..5bff8811d61e423463b7bd4e20d599d5b5307a1a 100644
--- a/tools/perf/util/cs-etm.c
+++ b/tools/perf/util/cs-etm.c
@@ -1347,6 +1347,17 @@ static inline int cs_etm__t32_instr_size(struct cs_etm_queue *etmq,
 	return ((instrBytes[1] & 0xF8) >= 0xE8) ? 4 : 2;
 }
 
+static inline int cs_etm__instr_size(struct cs_etm_queue *etmq,
+				     u8 trace_chan_id,
+				     enum cs_etm_isa isa, u64 addr)
+{
+	if (isa == CS_ETM_ISA_T32)
+		return cs_etm__t32_instr_size(etmq, trace_chan_id, addr);
+
+	/* Otherwise, 4-byte instruction size for A32/A64 */
+	return 4;
+}
+
 static inline u64 cs_etm__first_executed_instr(struct cs_etm_packet *packet)
 {
 	/*
@@ -1375,19 +1386,18 @@ static inline u64 cs_etm__instr_addr(struct cs_etm_queue *etmq,
 				     const struct cs_etm_packet *packet,
 				     u64 offset)
 {
-	if (packet->isa == CS_ETM_ISA_T32) {
-		u64 addr = packet->start_addr;
+	u64 addr = packet->start_addr;
 
-		while (offset) {
-			addr += cs_etm__t32_instr_size(etmq,
-						       trace_chan_id, addr);
-			offset--;
-		}
-		return addr;
-	}
+	/* 4-byte instruction size for A32/A64 */
+	if (packet->isa == CS_ETM_ISA_A64 || packet->isa == CS_ETM_ISA_A32)
+		return addr + offset * 4;
 
-	/* Assume a 4 byte instruction size (A32/A64) */
-	return packet->start_addr + offset * 4;
+	while (offset) {
+		addr += cs_etm__instr_size(etmq, trace_chan_id,
+					   packet->isa, addr);
+		offset--;
+	}
+	return addr;
 }
 
 static void cs_etm__update_last_branch_rb(struct cs_etm_queue *etmq,
@@ -1540,16 +1550,8 @@ static void cs_etm__copy_insn(struct cs_etm_queue *etmq,
 		return;
 	}
 
-	/*
-	 * T32 instruction size might be 32-bit or 16-bit, decide by calling
-	 * cs_etm__t32_instr_size().
-	 */
-	if (packet->isa == CS_ETM_ISA_T32)
-		sample->insn_len = cs_etm__t32_instr_size(etmq, trace_chan_id,
-							  sample->ip);
-	/* Otherwise, A64 and A32 instruction size are always 32-bit. */
-	else
-		sample->insn_len = 4;
+	sample->insn_len = cs_etm__instr_size(etmq, trace_chan_id,
+					      packet->isa, sample->ip);
 
 	cs_etm__mem_access(etmq, trace_chan_id, sample->ip, sample->insn_len,
 			   (void *)sample->insn, 0);
CS ETM maintains its own circular array for last branch entries, with local helpers to update, copy and reset the branch stack. This duplicates logic already provided by the common code.
Record branches with thread_stack__event() and synthesize the branch stack with thread_stack__br_sample(). This removes the local last_branch_rb buffer and position tracking. Keep the buffer number updated via thread_stack__set_trace_nr(), which is used when exporting samples to Python scripts.
The output should remain the same, except that be->flags.predicted is no longer set. Since CoreSight trace does not provide branch prediction information, clearing the flag avoids confusion.
Signed-off-by: Leo Yan <leo.yan@arm.com>
---
 tools/perf/util/cs-etm.c | 152 +++++++++++++----------------------------------
 1 file changed, 41 insertions(+), 111 deletions(-)

diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
index 5bff8811d61e423463b7bd4e20d599d5b5307a1a..398ab3b7a429d402cc8e5f6cccb35c0b7c253732 100644
--- a/tools/perf/util/cs-etm.c
+++ b/tools/perf/util/cs-etm.c
@@ -83,14 +83,13 @@ struct cs_etm_auxtrace {
 struct cs_etm_traceid_queue {
 	u8 trace_chan_id;
 	u64 period_instructions;
-	size_t last_branch_pos;
 	union perf_event *event_buf;
 	struct thread *thread;
 	struct thread *prev_packet_thread;
 	ocsd_ex_level prev_packet_el;
 	ocsd_ex_level el;
+	unsigned int br_stack_sz;
 	struct branch_stack *last_branch;
-	struct branch_stack *last_branch_rb;
 	struct cs_etm_packet *prev_packet;
 	struct cs_etm_packet *packet;
 	struct cs_etm_packet_queue packet_queue;
@@ -635,9 +634,8 @@ static int cs_etm__init_traceid_queue(struct cs_etm_queue *etmq,
 		tidq->last_branch = zalloc(sz);
 		if (!tidq->last_branch)
 			goto out_free;
-		tidq->last_branch_rb = zalloc(sz);
-		if (!tidq->last_branch_rb)
-			goto out_free;
+
+		tidq->br_stack_sz = etm->synth_opts.last_branch_sz;
 	}
 
 	tidq->event_buf = malloc(PERF_SAMPLE_MAX_SIZE);
@@ -647,7 +645,6 @@ static int cs_etm__init_traceid_queue(struct cs_etm_queue *etmq,
 	return 0;
 
 out_free:
-	zfree(&tidq->last_branch_rb);
 	zfree(&tidq->last_branch);
 	zfree(&tidq->prev_packet);
 	zfree(&tidq->packet);
@@ -941,7 +938,6 @@ static void cs_etm__free_traceid_queues(struct cs_etm_queue *etmq)
 		thread__zput(tidq->prev_packet_thread);
 		zfree(&tidq->event_buf);
 		zfree(&tidq->last_branch);
-		zfree(&tidq->last_branch_rb);
 		zfree(&tidq->prev_packet);
 		zfree(&tidq->packet);
 		zfree(&tidq);
@@ -1281,57 +1277,6 @@ static int cs_etm__queue_first_cs_timestamp(struct cs_etm_auxtrace *etm,
 	return ret;
 }
 
-static inline
-void cs_etm__copy_last_branch_rb(struct cs_etm_queue *etmq,
-				 struct cs_etm_traceid_queue *tidq)
-{
-	struct branch_stack *bs_src = tidq->last_branch_rb;
-	struct branch_stack *bs_dst = tidq->last_branch;
-	size_t nr = 0;
-
-	/*
-	 * Set the number of records before early exit: ->nr is used to
-	 * determine how many branches to copy from ->entries.
-	 */
-	bs_dst->nr = bs_src->nr;
-
-	/*
-	 * Early exit when there is nothing to copy.
-	 */
-	if (!bs_src->nr)
-		return;
-
-	/*
-	 * As bs_src->entries is a circular buffer, we need to copy from it in
-	 * two steps. First, copy the branches from the most recently inserted
-	 * branch ->last_branch_pos until the end of bs_src->entries buffer.
-	 */
-	nr = etmq->etm->synth_opts.last_branch_sz - tidq->last_branch_pos;
-	memcpy(&bs_dst->entries[0],
-	       &bs_src->entries[tidq->last_branch_pos],
-	       sizeof(struct branch_entry) * nr);
-
-	/*
-	 * If we wrapped around at least once, the branches from the beginning
-	 * of the bs_src->entries buffer and until the ->last_branch_pos element
-	 * are older valid branches: copy them over. The total number of
-	 * branches copied over will be equal to the number of branches asked by
-	 * the user in last_branch_sz.
-	 */
-	if (bs_src->nr >= etmq->etm->synth_opts.last_branch_sz) {
-		memcpy(&bs_dst->entries[nr],
-		       &bs_src->entries[0],
-		       sizeof(struct branch_entry) * tidq->last_branch_pos);
-	}
-}
-
-static inline
-void cs_etm__reset_last_branch_rb(struct cs_etm_traceid_queue *tidq)
-{
-	tidq->last_branch_pos = 0;
-	tidq->last_branch_rb->nr = 0;
-}
-
 static inline int cs_etm__t32_instr_size(struct cs_etm_queue *etmq,
 					 u8 trace_chan_id, u64 addr)
 {
@@ -1400,38 +1345,6 @@ static inline u64 cs_etm__instr_addr(struct cs_etm_queue *etmq,
 	return addr;
 }
 
-static void cs_etm__update_last_branch_rb(struct cs_etm_queue *etmq,
-					  struct cs_etm_traceid_queue *tidq)
-{
-	struct branch_stack *bs = tidq->last_branch_rb;
-	struct branch_entry *be;
-
-	/*
-	 * The branches are recorded in a circular buffer in reverse
-	 * chronological order: we start recording from the last element of the
-	 * buffer down. After writing the first element of the stack, move the
-	 * insert position back to the end of the buffer.
-	 */
-	if (!tidq->last_branch_pos)
-		tidq->last_branch_pos = etmq->etm->synth_opts.last_branch_sz;
-
-	tidq->last_branch_pos -= 1;
-
-	be = &bs->entries[tidq->last_branch_pos];
-	be->from = cs_etm__last_executed_instr(tidq->prev_packet);
-	be->to = cs_etm__first_executed_instr(tidq->packet);
-	/* No support for mispredict */
-	be->flags.mispred = 0;
-	be->flags.predicted = 1;
-
-	/*
-	 * Increment bs->nr until reaching the number of last branches asked by
-	 * the user on the command line.
-	 */
-	if (bs->nr < etmq->etm->synth_opts.last_branch_sz)
-		bs->nr += 1;
-}
-
 static int cs_etm__inject_event(struct cs_etm_auxtrace *etm, union perf_event *event,
 				struct perf_sample *sample, u64 type)
 {
@@ -1579,6 +1492,37 @@ static inline u64 cs_etm__resolve_sample_time(struct cs_etm_queue *etmq,
 	return etm->latest_kernel_timestamp;
 }
 
+static void cs_etm__add_stack_event(struct cs_etm_queue *etmq,
+				    struct cs_etm_traceid_queue *tidq)
+{
+	u64 from, to;
+	int size;
+
+	if (!tidq->prev_packet->last_instr_taken_branch)
+		return;
+
+	if (tidq->prev_packet->sample_type != CS_ETM_RANGE ||
+	    tidq->packet->sample_type != CS_ETM_RANGE)
+		return;
+
+	if (etmq->etm->synth_opts.last_branch) {
+		from = cs_etm__last_executed_instr(tidq->prev_packet);
+		to = cs_etm__first_executed_instr(tidq->packet);
+
+		size = cs_etm__instr_size(etmq, tidq->trace_chan_id,
+					  tidq->prev_packet->isa, from);
+
+		/* Enable callchain so thread stack entry can be allocated */
+		thread_stack__event(tidq->thread, tidq->prev_packet->cpu,
+				    tidq->prev_packet->flags, from, to, size,
+				    etmq->buffer->buffer_nr + 1, true,
+				    tidq->br_stack_sz, 0);
+	} else {
+		thread_stack__set_trace_nr(tidq->thread, tidq->prev_packet->cpu,
+					   etmq->buffer->buffer_nr + 1);
+	}
+}
+
 static int cs_etm__synth_instruction_sample(struct cs_etm_queue *etmq,
 					    struct cs_etm_traceid_queue *tidq,
 					    u64 addr, u64 period)
@@ -1608,8 +1552,12 @@ static int cs_etm__synth_instruction_sample(struct cs_etm_queue *etmq,
 
 	cs_etm__copy_insn(etmq, tidq->trace_chan_id, tidq->packet, &sample);
 
-	if (etm->synth_opts.last_branch)
+	if (etm->synth_opts.last_branch) {
+		thread_stack__br_sample(tidq->thread, tidq->packet->cpu,
+					tidq->last_branch,
+					tidq->br_stack_sz);
 		sample.branch_stack = tidq->last_branch;
+	}
 
 	if (etm->synth_opts.inject) {
 		ret = cs_etm__inject_event(etm, event, &sample,
@@ -1798,14 +1746,7 @@ static int cs_etm__sample(struct cs_etm_queue *etmq,
 
 	tidq->period_instructions += tidq->packet->instr_count;
 
-	/*
-	 * Record a branch when the last instruction in
-	 * PREV_PACKET is a branch.
-	 */
-	if (etm->synth_opts.last_branch &&
-	    tidq->prev_packet->sample_type == CS_ETM_RANGE &&
-	    tidq->prev_packet->last_instr_taken_branch)
-		cs_etm__update_last_branch_rb(etmq, tidq);
+	cs_etm__add_stack_event(etmq, tidq);
 
 	if (etm->synth_opts.instructions &&
 	    tidq->period_instructions >= etm->instructions_sample_period) {
@@ -1864,10 +1805,6 @@ static int cs_etm__sample(struct cs_etm_queue *etmq,
 		u64 offset = etm->instructions_sample_period - instrs_prev;
 		u64 addr;
 
-		/* Prepare last branches for instruction sample */
-		if (etm->synth_opts.last_branch)
-			cs_etm__copy_last_branch_rb(etmq, tidq);
-
 		while (tidq->period_instructions >=
 				etm->instructions_sample_period) {
 			/*
@@ -1947,10 +1884,6 @@ static int cs_etm__flush(struct cs_etm_queue *etmq,
 	    etmq->etm->synth_opts.instructions &&
 	    tidq->prev_packet->sample_type == CS_ETM_RANGE) {
 		u64 addr;
-
-		/* Prepare last branches for instruction sample */
-		cs_etm__copy_last_branch_rb(etmq, tidq);
-
 		/*
 		 * Generate a last branch event for the branches left in the
 		 * circular buffer at the end of the trace.
@@ -1982,7 +1915,7 @@ static int cs_etm__flush(struct cs_etm_queue *etmq,
 
 	/* Reset last branches after flush the trace */
 	if (etm->synth_opts.last_branch)
-		cs_etm__reset_last_branch_rb(tidq);
+		thread_stack__flush(tidq->thread);
 
 	return err;
 }
@@ -2006,9 +1939,6 @@ static int cs_etm__end_block(struct cs_etm_queue *etmq,
 	    tidq->prev_packet->sample_type == CS_ETM_RANGE) {
 		u64 addr;
 
-		/* Prepare last branches for instruction sample */
-		cs_etm__copy_last_branch_rb(etmq, tidq);
-
 		/*
 		 * Use the address of the end of the last reported execution
 		 * range.
Perf resets the CoreSight decoder when moving to a new AUX trace buffer, which causes a global trace discontinuity.
For callchain synthesis, keeping thread-stack state after decoder reset can leave stale call/return history attached to threads that are decoded later, producing incorrect synthesized callchains.
Flush all host thread stacks after a decoder reset. When virtualization is present, flush the guest thread stacks as well.
Signed-off-by: Leo Yan <leo.yan@arm.com>
---
 tools/perf/util/cs-etm.c | 37 +++++++++++++++++++++++++++++++++++++
 1 file changed, 37 insertions(+)

diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
index 398ab3b7a429d402cc8e5f6cccb35c0b7c253732..ea2424175558ddc0a6f20a9de6c30f377facdc52 100644
--- a/tools/perf/util/cs-etm.c
+++ b/tools/perf/util/cs-etm.c
@@ -1956,6 +1956,37 @@ static int cs_etm__end_block(struct cs_etm_queue *etmq,
 
 	return 0;
 }
+
+static int cs_etm__flush_stack_cb(struct thread *thread,
+				  void *data __maybe_unused)
+{
+	thread_stack__flush(thread);
+	return 0;
+}
+
+static void cs_etm__flush_machine_stack(struct cs_etm_queue *etmq, pid_t pid)
+{
+	struct machine *machine;
+
+	machine = machines__find(&etmq->etm->session->machines, pid);
+	if (machine)
+		machine__for_each_thread(machine, cs_etm__flush_stack_cb, NULL);
+}
+
+static void cs_etm__flush_all_stack(struct cs_etm_queue *etmq)
+{
+	enum cs_etm_pid_fmt pid_fmt = cs_etm__get_pid_fmt(etmq);
+
+	if (!etmq->etm->synth_opts.last_branch)
+		return;
+
+	cs_etm__flush_machine_stack(etmq, HOST_KERNEL_ID);
+
+	/* Clear the guest stack if virtualization is supported */
+	if (pid_fmt == CS_ETM_PIDFMT_CTXTID2)
+		cs_etm__flush_machine_stack(etmq, DEFAULT_GUEST_KERNEL_ID);
+}
+
 /*
  * cs_etm__get_data_block: Fetch a block from the auxtrace_buffer queue
  * if need be.
@@ -1978,6 +2009,12 @@ static int cs_etm__get_data_block(struct cs_etm_queue *etmq)
 		ret = cs_etm_decoder__reset(etmq->decoder);
 		if (ret)
 			return ret;
+
+		/*
+		 * Since the decoder is reset, this causes a global trace
+		 * discontinuity. Flush all thread stacks.
+		 */
+		cs_etm__flush_all_stack(etmq);
 	}
 
 	return etmq->buf_len;
From: Leo Yan <leo.yan@linaro.org>
Support the "callindent" field, which reflects the call stack depth.
The branch stack is used by both call indentation and the last branch record, which are separate features. Use a new flag "use_thread_stack" to track whether the branch stack needs to be recorded.
Before:
perf script -F +callindent
callchain_test 9187 [002] 599611.826599: 1 branches: main ffff83312258 __libc_start_call_main+0x78 (/usr/lib/aarch64-linux-gnu/libc.so.6)
callchain_test 9187 [002] 599611.826599: 1 branches: foo aaaae3ed07c4 main+0x8 (/home/kernel/leoy/test_cs_callchain/callchain_test)
callchain_test 9187 [002] 599611.826599: 1 branches: print aaaae3ed07ac foo+0x8 (/home/kernel/leoy/test_cs_callchain/callchain_test)
callchain_test 9187 [002] 599611.826599: 1 branches: do_svc aaaae3ed0794 print+0x8 (/home/kernel/leoy/test_cs_callchain/callchain_test)
callchain_test 9187 [002] 599611.826599: 1 branches: aaaae3ed077c do_svc+0x14 (/home/kernel/leoy/test_cs_callchain/callchain_test)
callchain_test 9187 [002] 599611.826599: 1 branches: vectors aaaae3ed0780 do_svc+0x18 (/home/kernel/leoy/test_cs_callchain/callchain_test)
callchain_test 9187 [002] 599611.826599: 1 branches: ffff800080010c00 vectors+0x400 ([kernel.kallsyms])
callchain_test 9187 [002] 599611.826600: 1 branches: ffff800080010c24 vectors+0x424 ([kernel.kallsyms])
callchain_test 9187 [002] 599611.826600: 1 branches: ffff8000800114dc el0t_64_sync+0xd4 ([kernel.kallsyms])
callchain_test 9187 [002] 599611.826600: 1 branches: ffff8000800114f8 el0t_64_sync+0xf0 ([kernel.kallsyms])
callchain_test 9187 [002] 599611.826600: 1 branches: ffff800080011528 el0t_64_sync+0x120 ([kernel.kallsyms])
callchain_test 9187 [002] 599611.826600: 1 branches: ffff800080011538 el0t_64_sync+0x130 ([kernel.kallsyms])
callchain_test 9187 [002] 599611.826601: 1 branches: ffff800080011568 el0t_64_sync+0x160 ([kernel.kallsyms])
callchain_test 9187 [002] 599611.826601: 1 branches: el0t_64_sync_handler ffff80008001159c el0t_64_sync+0x194 ([kernel.kallsyms])
callchain_test 9187 [002] 599611.826601: 1 branches: ffff800081829110 el0t_64_sync_handler+0x18 ([kernel.kallsyms])
callchain_test 9187 [002] 599611.826601: 1 branches: el0t_64_sync_handler ffff800081829140 el0t_64_sync_handler+0x48 ([kernel.kallsyms])
callchain_test 9187 [002] 599611.826601: 1 branches: el0_svc ffff800081829194 el0t_64_sync_handler+0x9c ([kernel.kallsyms])
After:
callchain_test 9187 [002] 599611.826599: 1 branches: main ffff83312258 __libc_start_call_main+0x78 (/usr/lib/aarch64-linux-gnu/libc.so.6)
callchain_test 9187 [002] 599611.826599: 1 branches: foo aaaae3ed07c4 main+0x8 (/home/kernel/leoy/test_cs_callchain/callchain_test)
callchain_test 9187 [002] 599611.826599: 1 branches: print aaaae3ed07ac foo+0x8 (/home/kernel/leoy/test_cs_callchain/callchain_test)
callchain_test 9187 [002] 599611.826599: 1 branches: do_svc aaaae3ed0794 print+0x8 (/home/kernel/leoy/test_cs_callchain/callchain_test)
callchain_test 9187 [002] 599611.826599: 1 branches: aaaae3ed077c do_svc+0x14 (/home/kernel/leoy/test_cs_callchain/callchain_test)
callchain_test 9187 [002] 599611.826599: 1 branches: vectors aaaae3ed0780 do_svc+0x18 (/home/kernel/leoy/test_cs_callchain/callchain_test)
callchain_test 9187 [002] 599611.826599: 1 branches: ffff800080010c00 vectors+0x400 ([kernel.kallsyms])
callchain_test 9187 [002] 599611.826600: 1 branches: ffff800080010c24 vectors+0x424 ([kernel.kallsyms])
callchain_test 9187 [002] 599611.826600: 1 branches: ffff8000800114dc el0t_64_sync+0xd4 ([kernel.kallsyms])
callchain_test 9187 [002] 599611.826600: 1 branches: ffff8000800114f8 el0t_64_sync+0xf0 ([kernel.kallsyms])
callchain_test 9187 [002] 599611.826600: 1 branches: ffff800080011528 el0t_64_sync+0x120 ([kernel.kallsyms])
callchain_test 9187 [002] 599611.826600: 1 branches: ffff800080011538 el0t_64_sync+0x130 ([kernel.kallsyms])
callchain_test 9187 [002] 599611.826601: 1 branches: ffff800080011568 el0t_64_sync+0x160 ([kernel.kallsyms])
callchain_test 9187 [002] 599611.826601: 1 branches: el0t_64_sync_handler ffff80008001159c el0t_64_sync+0x194 ([kernel.kallsyms])
callchain_test 9187 [002] 599611.826601: 1 branches: ffff800081829110 el0t_64_sync_handler+0x18 ([kernel.kallsyms])
callchain_test 9187 [002] 599611.826601: 1 branches: el0t_64_sync_handler ffff800081829140 el0t_64_sync_handler+0x48 ([kernel.kallsyms])
callchain_test 9187 [002] 599611.826601: 1 branches: el0_svc ffff800081829194 el0t_64_sync_handler+0x9c ([kernel.kallsyms])
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Leo Yan <leo.yan@arm.com>
---
 tools/perf/util/cs-etm.c | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
index ea2424175558ddc0a6f20a9de6c30f377facdc52..b31d0dd46a45dc365edd7c2f9e9b2eb077ca23db 100644
--- a/tools/perf/util/cs-etm.c
+++ b/tools/perf/util/cs-etm.c
@@ -66,6 +66,7 @@ struct cs_etm_auxtrace {
 	bool snapshot_mode;
 	bool data_queued;
 	bool has_virtual_ts; /* Virtual/Kernel timestamps in the trace. */
+	bool use_thread_stack;
 
 	int num_cpu;
 	u64 latest_kernel_timestamp;
@@ -626,7 +627,7 @@ static int cs_etm__init_traceid_queue(struct cs_etm_queue *etmq,
 	if (!tidq->prev_packet)
 		goto out_free;
 
-	if (etm->synth_opts.last_branch) {
+	if (etm->use_thread_stack) {
 		size_t sz = sizeof(struct branch_stack);
 
 		sz += etm->synth_opts.last_branch_sz *
@@ -1505,7 +1506,7 @@ static void cs_etm__add_stack_event(struct cs_etm_queue *etmq,
 	    tidq->packet->sample_type != CS_ETM_RANGE)
 		return;
 
-	if (etmq->etm->synth_opts.last_branch) {
+	if (etmq->etm->use_thread_stack) {
 		from = cs_etm__last_executed_instr(tidq->prev_packet);
 		to = cs_etm__first_executed_instr(tidq->packet);
 
@@ -1914,7 +1915,7 @@ static int cs_etm__flush(struct cs_etm_queue *etmq,
 	cs_etm__packet_swap(etm, tidq);
 
 	/* Reset last branches after flush the trace */
-	if (etm->synth_opts.last_branch)
+	if (etm->use_thread_stack)
 		thread_stack__flush(tidq->thread);
 
 	return err;
@@ -1977,7 +1978,7 @@ static void cs_etm__flush_all_stack(struct cs_etm_queue *etmq)
 {
 	enum cs_etm_pid_fmt pid_fmt = cs_etm__get_pid_fmt(etmq);
 
-	if (!etmq->etm->synth_opts.last_branch)
+	if (!etmq->etm->use_thread_stack)
 		return;
 
 	cs_etm__flush_machine_stack(etmq, HOST_KERNEL_ID);
@@ -3438,6 +3439,7 @@ int cs_etm__process_auxtrace_info_full(union perf_event *event,
 		itrace_synth_opts__set_default(&etm->synth_opts,
 				session->itrace_synth_opts->default_no_sample);
 		etm->synth_opts.callchain = false;
+		etm->synth_opts.thread_stack = session->itrace_synth_opts->thread_stack;
 	}
 
 	etm->session = session;
@@ -3489,6 +3491,10 @@ int cs_etm__process_auxtrace_info_full(union perf_event *event,
 		etm->tc.cap_user_time_zero = tc->cap_user_time_zero;
 		etm->tc.cap_user_time_short = tc->cap_user_time_short;
 	}
+
+	etm->use_thread_stack = etm->synth_opts.thread_stack ||
+				etm->synth_opts.last_branch;
+
 	err = cs_etm__synth_events(etm, session);
 	if (err)
 		goto err_free_queues;
From: Leo Yan <leo.yan@linaro.org>
CS ETM currently emits branch samples for every decoded branch when branch synthesis is enabled. This delivers redundant info when users request only call or return branches.
Add a branch filter derived from the itrace "calls" and "returns" options. When no filter is set, keep the existing behavior and emit all branch samples. When call or return filtering is requested, only synthesize branch samples whose flags match the selected branch types, including trace begin and end markers.
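The filter logic can be sketched in isolation as below. This is a minimal illustration, not the perf code: the DEMO_FLAG_* bit values and the demo_* helpers are hypothetical stand-ins for perf's PERF_IP_FLAG_* definitions and the check added to cs_etm__synth_branch_sample().

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative flag bits, local to this sketch (not perf's values). */
#define DEMO_FLAG_CALL		(1u << 0)
#define DEMO_FLAG_RETURN	(1u << 1)
#define DEMO_FLAG_TRACE_BEGIN	(1u << 2)
#define DEMO_FLAG_TRACE_END	(1u << 3)
#define DEMO_FLAG_CONDITIONAL	(1u << 4)

/* Build the filter mask from the "calls" and "returns" options. */
static uint32_t demo_build_filter(bool calls, bool returns)
{
	uint32_t filter = 0;

	if (calls)
		filter |= DEMO_FLAG_CALL |
			  DEMO_FLAG_TRACE_BEGIN |
			  DEMO_FLAG_TRACE_END;

	if (returns)
		filter |= DEMO_FLAG_RETURN |
			  DEMO_FLAG_TRACE_BEGIN |
			  DEMO_FLAG_TRACE_END;

	return filter;
}

/*
 * An empty filter keeps every branch; otherwise a branch is kept only
 * if at least one of its flags is selected by the filter.
 */
static bool demo_emit_branch(uint32_t filter, uint32_t flags)
{
	return !filter || (filter & flags) != 0;
}
```

Treating a zero mask as "no filtering" keeps the default behaviour unchanged without a separate enable flag, and including the trace begin/end bits in both masks means discontinuity markers survive whichever option is selected.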
Before:
perf script -F +callindent
callchain_test 9187 [002] 599611.826599: 1 branches: main ffff83312258 __libc_start_call_main+0x78 (/usr/lib/aarch64-linux-gnu/libc.so.6)
callchain_test 9187 [002] 599611.826599: 1 branches: foo aaaae3ed07c4 main+0x8 (/home/kernel/leoy/test_cs_callchain/callchain_test)
callchain_test 9187 [002] 599611.826599: 1 branches: print aaaae3ed07ac foo+0x8 (/home/kernel/leoy/test_cs_callchain/callchain_test)
callchain_test 9187 [002] 599611.826599: 1 branches: do_svc aaaae3ed0794 print+0x8 (/home/kernel/leoy/test_cs_callchain/callchain_test)
callchain_test 9187 [002] 599611.826599: 1 branches: aaaae3ed077c do_svc+0x14 (/home/kernel/leoy/test_cs_callchain/callchain_test)
callchain_test 9187 [002] 599611.826599: 1 branches: vectors aaaae3ed0780 do_svc+0x18 (/home/kernel/leoy/test_cs_callchain/callchain_test)
callchain_test 9187 [002] 599611.826599: 1 branches: ffff800080010c00 vectors+0x400 ([kernel.kallsyms])
callchain_test 9187 [002] 599611.826600: 1 branches: ffff800080010c24 vectors+0x424 ([kernel.kallsyms])
callchain_test 9187 [002] 599611.826600: 1 branches: ffff8000800114dc el0t_64_sync+0xd4 ([kernel.kallsyms])
callchain_test 9187 [002] 599611.826600: 1 branches: ffff8000800114f8 el0t_64_sync+0xf0 ([kernel.kallsyms])
callchain_test 9187 [002] 599611.826600: 1 branches: ffff800080011528 el0t_64_sync+0x120 ([kernel.kallsyms])
callchain_test 9187 [002] 599611.826600: 1 branches: ffff800080011538 el0t_64_sync+0x130 ([kernel.kallsyms])
callchain_test 9187 [002] 599611.826601: 1 branches: ffff800080011568 el0t_64_sync+0x160 ([kernel.kallsyms])
callchain_test 9187 [002] 599611.826601: 1 branches: el0t_64_sync_handler ffff80008001159c el0t_64_sync+0x194 ([kernel.kallsyms])
callchain_test 9187 [002] 599611.826601: 1 branches: ffff800081829110 el0t_64_sync_handler+0x18 ([kernel.kallsyms])
callchain_test 9187 [002] 599611.826601: 1 branches: el0t_64_sync_handler ffff800081829140 el0t_64_sync_handler+0x48 ([kernel.kallsyms])
callchain_test 9187 [002] 599611.826601: 1 branches: el0_svc ffff800081829194 el0t_64_sync_handler+0x9c ([kernel.kallsyms])
After:
callchain_test 9187 [002] 599611.826599: 1 branches: main ffff83312258 __libc_start_call_main+0x78 (/usr/lib/aarch64-linux-gnu/libc.so.6)
callchain_test 9187 [002] 599611.826599: 1 branches: foo aaaae3ed07c4 main+0x8 (/home/kernel/leoy/test_cs_callchain/callchain_test)
callchain_test 9187 [002] 599611.826599: 1 branches: print aaaae3ed07ac foo+0x8 (/home/kernel/leoy/test_cs_callchain/callchain_test)
callchain_test 9187 [002] 599611.826599: 1 branches: do_svc aaaae3ed0794 print+0x8 (/home/kernel/leoy/test_cs_callchain/callchain_test)
callchain_test 9187 [002] 599611.826599: 1 branches: vectors aaaae3ed0780 do_svc+0x18 (/home/kernel/leoy/test_cs_callchain/callchain_test)
callchain_test 9187 [002] 599611.826601: 1 branches: el0t_64_sync_handler ffff80008001159c el0t_64_sync+0x194 ([kernel.kallsyms])
callchain_test 9187 [002] 599611.826601: 1 branches: el0_svc ffff800081829194 el0t_64_sync_handler+0x9c ([kernel.kallsyms])
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Leo Yan <leo.yan@arm.com>
---
 tools/perf/util/cs-etm.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
index b31d0dd46a45dc365edd7c2f9e9b2eb077ca23db..8d98e772ecb307381b5ed1b4bbc4056e8779b261 100644
--- a/tools/perf/util/cs-etm.c
+++ b/tools/perf/util/cs-etm.c
@@ -71,6 +71,7 @@ struct cs_etm_auxtrace {
 	int num_cpu;
 	u64 latest_kernel_timestamp;
 	u32 auxtrace_type;
+	u32 branches_filter;
 	u64 branches_sample_type;
 	u64 branches_id;
 	u64 instructions_sample_type;
@@ -1596,6 +1597,10 @@ static int cs_etm__synth_branch_sample(struct cs_etm_queue *etmq,
 	} dummy_bs;
 	u64 ip;
 
+	if (etm->branches_filter &&
+	    !(etm->branches_filter & tidq->prev_packet->flags))
+		return 0;
+
 	ip = cs_etm__last_executed_instr(tidq->prev_packet);
 
 	event->sample.header.type = PERF_RECORD_SAMPLE;
@@ -3442,6 +3447,16 @@ int cs_etm__process_auxtrace_info_full(union perf_event *event,
 		etm->synth_opts.thread_stack = session->itrace_synth_opts->thread_stack;
 	}
 
+	if (etm->synth_opts.calls)
+		etm->branches_filter |= PERF_IP_FLAG_CALL |
+					PERF_IP_FLAG_TRACE_BEGIN |
+					PERF_IP_FLAG_TRACE_END;
+
+	if (etm->synth_opts.returns)
+		etm->branches_filter |= PERF_IP_FLAG_RETURN |
+					PERF_IP_FLAG_TRACE_BEGIN |
+					PERF_IP_FLAG_TRACE_END;
+
 	etm->session = session;
 
 	etm->num_cpu = num_cpu;
From: Leo Yan <leo.yan@linaro.org>
CS ETM already records branches into the thread stack, but instruction samples do not carry synthesized callchains, so the itrace 'g' option produces no callchain output.
Allocate a callchain buffer per queue and use thread_stack__sample() when synthesizing instruction samples. Advertise PERF_SAMPLE_CALLCHAIN on the synthetic instruction event.
Allocate the callchain stack with one more entry than requested, as the first entry is reserved for storing context information.
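The sizing rule can be sketched standalone as below. struct demo_callchain and the demo_* helpers are hypothetical simplifications of struct ip_callchain and of the allocation done in this patch, and DEMO_CONTEXT_USER is assumed to mirror PERF_CONTEXT_USER.

```c
#include <stdint.h>
#include <stdlib.h>

/* Simplified stand-in for struct ip_callchain: an entry count, then the entries. */
struct demo_callchain {
	uint64_t nr;
	uint64_t ips[];
};

/* Context marker for user space; assumed to match PERF_CONTEXT_USER. */
#define DEMO_CONTEXT_USER ((uint64_t)-512)

/* Bytes needed for 'depth' addresses plus the one reserved context entry. */
static size_t demo_callchain_bytes(size_t depth)
{
	return sizeof(struct demo_callchain) + (depth + 1) * sizeof(uint64_t);
}

/* Allocate a zeroed buffer; the caller releases it with free(). */
static struct demo_callchain *demo_callchain_alloc(size_t depth)
{
	return calloc(1, demo_callchain_bytes(depth));
}

/* Store the context marker first, then at most 'depth' addresses. */
static void demo_callchain_fill(struct demo_callchain *chain, size_t depth,
				const uint64_t *addrs, size_t n)
{
	size_t i;

	if (n > depth)
		n = depth;

	chain->ips[0] = DEMO_CONTEXT_USER;
	for (i = 0; i < n; i++)
		chain->ips[i + 1] = addrs[i];
	chain->nr = n + 1;
}
```

Without the extra slot, a request for N frames would leave room for only N - 1 addresses once the context marker is stored.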
After:
perf script --itrace=g16l64i100
callchain_test 9187 [002] 599611.826599: 1 instructions:
	aaaae3ed0774 do_svc+0xc (/home/kernel/leoy/test_cs_callchain/callchain_test)
	aaaae3ed0798 print+0xc (/home/kernel/leoy/test_cs_callchain/callchain_test)
	aaaae3ed07b0 foo+0xc (/home/kernel/leoy/test_cs_callchain/callchain_test)
	aaaae3ed07c8 main+0xc (/home/kernel/leoy/test_cs_callchain/callchain_test)
	ffff8331225c __libc_start_call_main+0x7c (/usr/lib/aarch64-linux-gnu/libc.so.6)
	ffff8331233c call_init+0x9c (inlined)
	ffff8331233c __libc_start_main_impl+0x9c (inlined)
	aaaae3ed0670 _start+0x30 (/home/kernel/leoy/test_cs_callchain/callchain_test)
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Leo Yan <leo.yan@arm.com>
---
 tools/perf/util/cs-etm.c | 49 +++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 48 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
index 8d98e772ecb307381b5ed1b4bbc4056e8779b261..90e0beb910156093d8bd0f320bb0210aca95dd26 100644
--- a/tools/perf/util/cs-etm.c
+++ b/tools/perf/util/cs-etm.c
@@ -17,6 +17,7 @@
 #include <stdlib.h>

 #include "auxtrace.h"
+#include "callchain.h"
 #include "color.h"
 #include "cs-etm.h"
 #include "cs-etm-decoder/cs-etm-decoder.h"
@@ -85,6 +86,7 @@ struct cs_etm_auxtrace {
 struct cs_etm_traceid_queue {
 	u8 trace_chan_id;
 	u64 period_instructions;
+	u64 kernel_start;
 	union perf_event *event_buf;
 	struct thread *thread;
 	struct thread *prev_packet_thread;
@@ -92,6 +94,7 @@ struct cs_etm_traceid_queue {
 	ocsd_ex_level el;
 	unsigned int br_stack_sz;
 	struct branch_stack *last_branch;
+	struct ip_callchain *callchain;
 	struct cs_etm_packet *prev_packet;
 	struct cs_etm_packet *packet;
 	struct cs_etm_packet_queue packet_queue;
@@ -640,6 +643,16 @@ static int cs_etm__init_traceid_queue(struct cs_etm_queue *etmq,
 		tidq->br_stack_sz = etm->synth_opts.last_branch_sz;
 	}

+	if (etm->synth_opts.callchain) {
+		size_t sz = sizeof(struct ip_callchain);
+
+		/* Add 1 to callchain_sz for callchain context */
+		sz += (etm->synth_opts.callchain_sz + 1) * sizeof(u64);
+		tidq->callchain = zalloc(sz);
+		if (!tidq->callchain)
+			goto out_free;
+	}
+
 	tidq->event_buf = malloc(PERF_SAMPLE_MAX_SIZE);
 	if (!tidq->event_buf)
 		goto out_free;
@@ -647,6 +660,7 @@ static int cs_etm__init_traceid_queue(struct cs_etm_queue *etmq,
 	return 0;

 out_free:
+	zfree(&tidq->callchain);
 	zfree(&tidq->last_branch);
 	zfree(&tidq->prev_packet);
 	zfree(&tidq->packet);
@@ -939,6 +953,7 @@ static void cs_etm__free_traceid_queues(struct cs_etm_queue *etmq)
 		thread__zput(tidq->thread);
 		thread__zput(tidq->prev_packet_thread);
 		zfree(&tidq->event_buf);
+		zfree(&tidq->callchain);
 		zfree(&tidq->last_branch);
 		zfree(&tidq->prev_packet);
 		zfree(&tidq->packet);
@@ -1431,6 +1446,7 @@ static void cs_etm__set_thread(struct cs_etm_queue *etmq,
 		tidq->thread = machine__idle_thread(machine);

 	tidq->el = el;
+	tidq->kernel_start = machine__kernel_start(machine);
 }

 int cs_etm__etmq_set_tid_el(struct cs_etm_queue *etmq, pid_t tid,
@@ -1561,6 +1577,25 @@ static int cs_etm__synth_instruction_sample(struct cs_etm_queue *etmq,
 		sample.branch_stack = tidq->last_branch;
 	}

+	if (etm->synth_opts.callchain) {
+		if (tidq->kernel_start)
+			thread_stack__sample(tidq->thread, tidq->packet->cpu,
+					     tidq->callchain,
+					     etm->synth_opts.callchain_sz + 1,
+					     sample.ip, tidq->kernel_start);
+		else
+			/*
+			 * Clear the callchain when the kernel start address is
+			 * not available yet. The empty callchain can then be
+			 * consumed by cs_etm__inject_event().
+			 */
+			memset(tidq->callchain, 0,
+			       sizeof(struct ip_callchain) +
+			       (etm->synth_opts.callchain_sz + 1) * sizeof(u64));
+
+		sample.callchain = tidq->callchain;
+	}
+
 	if (etm->synth_opts.inject) {
 		ret = cs_etm__inject_event(etm, event, &sample,
 					   etm->instructions_sample_type);
@@ -1724,6 +1759,9 @@ static int cs_etm__synth_events(struct cs_etm_auxtrace *etm,
 		attr.branch_sample_type |= PERF_SAMPLE_BRANCH_HW_INDEX;
 	}

+	if (etm->synth_opts.callchain)
+		attr.sample_type |= PERF_SAMPLE_CALLCHAIN;
+
 	if (etm->synth_opts.instructions) {
 		attr.config = PERF_COUNT_HW_INSTRUCTIONS;
 		attr.sample_period = etm->synth_opts.period;
@@ -3457,6 +3495,14 @@ int cs_etm__process_auxtrace_info_full(union perf_event *event,
 					PERF_IP_FLAG_TRACE_BEGIN |
 					PERF_IP_FLAG_TRACE_END;

+	if (etm->synth_opts.callchain && !symbol_conf.use_callchain) {
+		symbol_conf.use_callchain = true;
+		if (callchain_register_param(&callchain_param) < 0) {
+			symbol_conf.use_callchain = false;
+			etm->synth_opts.callchain = false;
+		}
+	}
+
 	etm->session = session;
 	etm->num_cpu = num_cpu;

@@ -3508,7 +3554,8 @@ int cs_etm__process_auxtrace_info_full(union perf_event *event,
 	}

 	etm->use_thread_stack = etm->synth_opts.thread_stack ||
-				etm->synth_opts.last_branch;
+				etm->synth_opts.last_branch ||
+				etm->synth_opts.callchain;

 	err = cs_etm__synth_events(etm, session);
 	if (err)
Add a shell test for synthesized callchains from Arm CoreSight trace. The test runs only on arm64 systems where the cs_etm event and gcc are available.
Build a small test program that issues a syscall, record it with CoreSight trace, and decode the trace with itrace callchain synthesis enabled. Verify the synthesized callchains on both the push (call) and pop (return) paths.
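The push and pop behaviour that the test exercises can be pictured with a toy return-address stack. This is only an illustrative model with hypothetical names (demo_stack, demo_push, demo_pop); it is not the thread-stack implementation in perf.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_STACK_MAX 64

/* Toy call stack: one saved return address per active call. */
struct demo_stack {
	uint64_t ret_addr[DEMO_STACK_MAX];
	size_t cnt;
};

/* Call: remember where the callee returns to. Returns 0, or -1 when full. */
static int demo_push(struct demo_stack *st, uint64_t ret_addr)
{
	if (st->cnt >= DEMO_STACK_MAX)
		return -1;

	st->ret_addr[st->cnt++] = ret_addr;
	return 0;
}

/*
 * Return: drop every frame down to and including the one whose saved
 * return address equals 'to_addr'. If nothing matches, the stack is
 * left unchanged so a later return can still match an upper caller.
 */
static void demo_pop(struct demo_stack *st, uint64_t to_addr)
{
	size_t i;

	for (i = st->cnt; i > 0; i--) {
		if (st->ret_addr[i - 1] == to_addr) {
			st->cnt = i - 1;
			return;
		}
	}
}
```

In this model, a callchain is the sample address followed by the saved return addresses read from the top of the stack downwards, which is why the expected frames in the test grow by one entry per call and shrink again after each return.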
After:
perf test 150 -vvv
150: Check Arm CoreSight synthesized callchain:
--- start ---
test child forked, pid 13528
Test callchain push: PASS
Test callchain pop: PASS
---- end(0) ----
150: Check Arm CoreSight synthesized callchain : Ok
Assisted-by: Codex:GPT-5.5
Signed-off-by: Leo Yan <leo.yan@arm.com>
---
 .../tests/shell/test_arm_coresight_callchain.sh | 235 +++++++++++++++++++++
 1 file changed, 235 insertions(+)
diff --git a/tools/perf/tests/shell/test_arm_coresight_callchain.sh b/tools/perf/tests/shell/test_arm_coresight_callchain.sh
new file mode 100755
index 0000000000000000000000000000000000000000..0e5a5d1129ae7d34f8e0c5942fb62d27db3e862d
--- /dev/null
+++ b/tools/perf/tests/shell/test_arm_coresight_callchain.sh
@@ -0,0 +1,235 @@
+#!/bin/bash
+# Check Arm CoreSight synthesized callchain (exclusive)
+# SPDX-License-Identifier: GPL-2.0
+
+glb_err=1
+
+if ! tmpdir=$(mktemp -d /tmp/perf-cs-callchain-test.XXXXXX); then
+	echo "mktemp failed"
+	exit 1
+fi
+
+cleanup_files()
+{
+	rm -rf "$tmpdir"
+}
+
+trap cleanup_files EXIT
+trap 'cleanup_files; exit $glb_err' TERM INT
+
+skip_if_system_is_not_ready()
+{
+	[ "$(uname -m)" = "aarch64" ] || {
+		echo "Skip: arm64 only test" >&2
+		return 2
+	}
+
+	perf list | grep -q 'cs_etm//' || {
+		echo "Skip: cs_etm event is not available" >&2
+		return 2
+	}
+
+	command -v gcc >/dev/null 2>&1 || {
+		echo "Skip: gcc is not available" >&2
+		return 2
+	}
+
+	return 0
+}
+
+build_test_program()
+{
+	local src=$1
+	local bin=$2
+
+	gcc -g -O0 -o "$bin" "$src"
+}
+
+record_trace()
+{
+	local bin=$1
+	local data=$2
+	local script=$3
+
+	perf record -m ,32M -o "$data" --per-thread -e cs_etm// -- "$bin" >/dev/null 2>&1 &&
+	perf script --itrace=g16i10il64 -i "$data" > "$script"
+}
+
+check_regex()
+{
+	local name=$1
+	local regex=$2
+	local script=$3
+
+	if grep -Pzo "$regex" "$script" >/dev/null; then
+		echo "Test $name: PASS"
+		return 0
+	else
+		echo "Test $name: FAIL"
+		return 1
+	fi
+}
+
+run_test()
+{
+	local name=$1
+	local src=$tmpdir/$name.S
+	local bin=$tmpdir/$name
+	local data=$tmpdir/perf.$name.data
+	local script=$tmpdir/perf.$name.script
+	local regex
+
+	"${name}_src" "$src"
+
+	if ! build_test_program "$src" "$bin"; then
+		echo "$name: build failed"
+		return
+	fi
+
+	if ! record_trace "$bin" "$data" "$script"; then
+		echo "$name: perf record/script failed"
+		return
+	fi
+
+	regex=$("${name}_push_regex")
+	check_regex "${name} push" "$regex" "$script" || return
+
+	regex=$("${name}_pop_regex")
+	check_regex "${name} pop" "$regex" "$script" || return
+
+	glb_err=0
+}
+
+callchain_src()
+{
+	cat > "$1" <<'EOF'
+/* callchain.S */
+	.text
+
+	.global do_svc
+	.type do_svc, %function
+do_svc:
+	stp	x29, x30, [sp, #-16]!
+	mov	x29, sp
+
+	mov	x0, #1
+	adr	x1, msg
+	mov	x2, #23
+	mov	x8, #64
+
+	nop
+	nop		// Pad nops for 9 insns before svc
+
+	b	1f
+1:
+	svc	#0
+
+	nop
+	nop
+	nop
+	nop
+	nop
+	nop
+	nop
+	nop		// Pad nops for 10 insns after svc
+
+	ldp	x29, x30, [sp], #16
+	ret
+	.size do_svc, .-do_svc
+
+	.global foo
+	.type foo, %function
+foo:
+	stp	x29, x30, [sp, #-16]!
+	mov	x29, sp
+	nop
+	nop
+	nop
+	nop
+	nop
+	nop
+	nop		// Pad nops for 9 insns before call
+
+	bl	do_svc
+
+	nop
+	nop
+	nop
+	nop
+	nop
+	nop
+	nop
+	nop		// Pad nops for 10 insns after call
+
+	ldp	x29, x30, [sp], #16
+	ret
+	.size foo, .-foo
+
+	.global main
+	.type main, %function
+main:
+	stp	x29, x30, [sp, #-16]!
+	mov	x29, sp
+
+	bl	foo
+
+	mov	w0, #0
+	ldp	x29, x30, [sp], #16
+	ret
+	.size main, .-main
+
+	.section .rodata
+msg:
+	.asciz "hello from svc syscall\n"
+EOF
+}
+
+callchain_push_regex()
+{
+	printf '%s' \
+'callchain[[:space:]]+[0-9]+ \[[0-9]+\][[:space:]]+10 instructions:[[:space:]]*\n'\
+'[[:space:]]+[[:xdigit:]]+ foo\+0x[[:xdigit:]]+ \(.*/callchain\)\n'\
+'[[:space:]]+[[:xdigit:]]+ main\+0xc \(.*/callchain\)\n'\
+'([[:space:]]+[[:xdigit:]]+ .*\n)*'\
+'\n'\
+'callchain[[:space:]]+[0-9]+ \[[0-9]+\][[:space:]]+10 instructions:[[:space:]]*\n'\
+'[[:space:]]+[[:xdigit:]]+ do_svc\+0x[[:xdigit:]]+ \(.*/callchain\)\n'\
+'[[:space:]]+[[:xdigit:]]+ foo\+0x28 \(.*/callchain\)\n'\
+'[[:space:]]+[[:xdigit:]]+ main\+0xc \(.*/callchain\)\n'\
+'([[:space:]]+[[:xdigit:]]+ .*\n)*'\
+'\n'\
+'callchain[[:space:]]+[0-9]+ \[[0-9]+\][[:space:]]+10 instructions:[[:space:]]*\n'\
+'[[:space:]]+[[:xdigit:]]+ (vectors|el.*_64_sync|tramp_vectors)\+0x[[:xdigit:]]+ \(\[kernel.kallsyms\]\)\n'\
+'[[:space:]]+[[:xdigit:]]+ do_svc\+0x28 \(.*/callchain\)\n'\
+'[[:space:]]+[[:xdigit:]]+ foo\+0x28 \(.*/callchain\)\n'\
+'[[:space:]]+[[:xdigit:]]+ main\+0xc \(.*/callchain\)\n'\
+'([[:space:]]+[[:xdigit:]]+ .*\n)*'
+}
+
+callchain_pop_regex()
+{
+	printf '%s' \
+'callchain[[:space:]]+[0-9]+ \[[0-9]+\][[:space:]]+10 instructions:[[:space:]]*\n'\
+'[[:space:]]+[[:xdigit:]]+ (ret_to_user|tramp_exit)\+0x[[:xdigit:]]+ \(\[kernel.kallsyms\]\)\n'\
+'[[:space:]]+[[:xdigit:]]+ do_svc\+0x28 \(.*/callchain\)\n'\
+'[[:space:]]+[[:xdigit:]]+ foo\+0x28 \(.*/callchain\)\n'\
+'[[:space:]]+[[:xdigit:]]+ main\+0xc \(.*/callchain\)\n'\
+'([[:space:]]+[[:xdigit:]]+ .*\n)*'\
+'\n'\
+'callchain[[:space:]]+[0-9]+ \[[0-9]+\][[:space:]]+10 instructions:[[:space:]]*\n'\
+'[[:space:]]+[[:xdigit:]]+ do_svc\+0x[[:xdigit:]]+ \(.*/callchain\)\n'\
+'[[:space:]]+[[:xdigit:]]+ foo\+0x28 \(.*/callchain\)\n'\
+'[[:space:]]+[[:xdigit:]]+ main\+0xc \(.*/callchain\)\n'\
+'([[:space:]]+[[:xdigit:]]+ .*\n)*'\
+'\n'\
+'callchain[[:space:]]+[0-9]+ \[[0-9]+\][[:space:]]+10 instructions:[[:space:]]*\n'\
+'[[:space:]]+[[:xdigit:]]+ foo\+0x[[:xdigit:]]+ \(.*/callchain\)\n'\
+'[[:space:]]+[[:xdigit:]]+ main\+0xc \(.*/callchain\)\n'\
+'([[:space:]]+[[:xdigit:]]+ .*\n)*'
+}
+
+skip_if_system_is_not_ready || exit 2
+
+run_test "callchain"
+
+exit $glb_err