perf cs-etm module converts decoder elements to packets and then we have more context crossing packets to generate synthenize samples, finally perf tool can faciliate samples for statistics and report the results.
This patch series is to address several issues found related with packets handling and samples generation when worked firstly on branch sample flags support for Arm CoreSight trace data, so this patch series also is dependency for another patch series for sample flags.
The first two patches are mainly to fix issues in cs_etm__flush(): Patch 0001 corrects packets swapping in cs_etm__flush() and this can fix the wrong branch sample caused by the missed packets swapping; patch 0002 is to fix the wrong samples generation with stale packets at the end of every trace buffer.
Patch 0003 is used to support NO_SYNC packet, otherwise the trace decoding cannot reflect the tracing discontinuity caused by NO_SYNC packet.
Patch 0004/0005 has been published in the patch series 'perf cs-etm: Add support for sample flags' before but at this time I move them into this patch series due these two patches are more relative with packets handling. Patch 0004 is used to generate branch sample for exception packets; and patch 0005 is to track the exception number.
This patch series is applied on the acme's perf core branch [1] with the latest commit f1d23afaf677 ("perf bpf: Reduce the hardcoded .max_entries for pid_maps") and has one prerequisite from Rob's patch 'perf: Support for Arm A32/T32 instruction sets in CoreSight trace' [2].
With applying the dependency patch, this patch series has been tested for branch samples dumping with below command on Juno board:
# perf script -F,-time,+ip,+sym,+dso,+addr,+symoff -k vmlinux
[1] https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/log/?h=perf/c... [2] http://archive.armlinux.org.uk/lurker/message/20181109.091126.9d69489d.en.ht...
Leo Yan (5): perf cs-etm: Correct packets swapping in cs_etm__flush() perf cs-etm: Avoid stale branch samples when flush packet perf cs-etm: Support for NO_SYNC packet perf cs-etm: Generate branch sample for exception packet perf cs-etm: Track exception number
tools/perf/util/cs-etm-decoder/cs-etm-decoder.c | 91 ++++++++++++++++++++++--- tools/perf/util/cs-etm-decoder/cs-etm-decoder.h | 11 +-- tools/perf/util/cs-etm.c | 65 +++++++++++++++--- 3 files changed, 146 insertions(+), 21 deletions(-)
The structure cs_etm_queue uses 'prev_packet' to point to previous packet, this can be used to combine with new coming packet to generate samples.
In function cs_etm__flush() it swaps packets only when the flag 'etm->synth_opts.last_branch' is true, this means that it will not swap packets if without option '--itrace=il' to generate last branch entries; thus for this case the 'prev_packet' doesn't point to the correct previous packet and the stale packet still will be used to generate sequential sample. Thus if dump trace with 'perf script' command we can see the incorrect flow with the stale packet's address info.
This patch corrects packets swapping in cs_etm__flush(); except using the flag 'etm->synth_opts.last_branch' it also checks the another flag 'etm->sample_branches', if any flag is true then it swaps packets so can save correct content to 'prev_packet'. Finally this can fix the wrong program flow dumping issue.
Signed-off-by: Leo Yan leo.yan@linaro.org --- tools/perf/util/cs-etm.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c index 48ad217..fe18d7b 100644 --- a/tools/perf/util/cs-etm.c +++ b/tools/perf/util/cs-etm.c @@ -997,7 +997,7 @@ static int cs_etm__flush(struct cs_etm_queue *etmq) }
swap_packet: - if (etmq->etm->synth_opts.last_branch) { + if (etm->sample_branches || etmq->etm->synth_opts.last_branch) { /* * Swap PACKET with PREV_PACKET: PACKET becomes PREV_PACKET for * the next incoming packet.
On Sun, Nov 11, 2018 at 12:59:39PM +0800, Leo Yan wrote:
The structure cs_etm_queue uses 'prev_packet' to point to previous packet, this can be used to combine with new coming packet to generate samples.
In function cs_etm__flush() it swaps packets only when the flag 'etm->synth_opts.last_branch' is true, this means that it will not swap packets if without option '--itrace=il' to generate last branch entries; thus for this case the 'prev_packet' doesn't point to the correct previous packet and the stale packet still will be used to generate sequential sample. Thus if dump trace with 'perf script' command we can see the incorrect flow with the stale packet's address info.
This patch corrects packets swapping in cs_etm__flush(); except using the flag 'etm->synth_opts.last_branch' it also checks the another flag 'etm->sample_branches', if any flag is true then it swaps packets so can save correct content to 'prev_packet'. Finally this can fix the wrong program flow dumping issue.
Signed-off-by: Leo Yan leo.yan@linaro.org
tools/perf/util/cs-etm.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c index 48ad217..fe18d7b 100644 --- a/tools/perf/util/cs-etm.c +++ b/tools/perf/util/cs-etm.c @@ -997,7 +997,7 @@ static int cs_etm__flush(struct cs_etm_queue *etmq) } swap_packet:
- if (etmq->etm->synth_opts.last_branch) {
- if (etm->sample_branches || etmq->etm->synth_opts.last_branch) {
This seems like the right thing to do, if only to be consistent with that is done in cs_etm__sample().
Reviewed-by: Mathieu Poirier mathieu.poirier@linaro.org
/* * Swap PACKET with PREV_PACKET: PACKET becomes PREV_PACKET for * the next incoming packet.
-- 2.7.4
At the end of trace buffer handling, function cs_etm__flush() is invoked to flush any remaining branch stack entries. As a side effect, it also generates branch sample, because the 'etmq->packet' doesn't contains any new coming packet but point to one stale packet after packets swapping, so it wrongly makes synthesize branch samples with stale packet info.
We could review below detailed flow which causes issue:
Packet1: start_addr=0xffff000008b1fbf0 end_addr=0xffff000008b1fbfc Packet2: start_addr=0xffff000008b1fb5c end_addr=0xffff000008b1fb6c
step 1: cs_etm__sample(): sample: ip=(0xffff000008b1fbfc-4) addr=0xffff000008b1fb5c
step 2: flush packet in cs_etm__run_decoder(): cs_etm__run_decoder() `-> err = cs_etm__flush(etmq, false); sample: ip=(0xffff000008b1fb6c-4) addr=0xffff000008b1fbf0
Packet1 and packet2 are two continuous packets, when packet2 is the new coming packet, cs_etm__sample() generates branch sample for these two packets and use [packet1::end_addr - 4 => packet2::start_addr] as branch jump flow, thus we can see the first generated branch sample in step 1. At the end of cs_etm__sample() it swaps packets so 'etm->prev_packet'= packet2 and 'etm->packet'=packet1, so far it's okay for branch sample.
If packet2 is the last one packet in trace buffer, even there have no any new coming packet, cs_etm__run_decoder() invokes cs_etm__flush() to flush branch stack entries as expected, but it also generates branch samples by taking 'etm->packet' as a new coming packet, thus the branch jump flow is as [packet2::end_addr - 4 => packet1::start_addr]; this is the second sample which is generated in step 2. So actually the second sample is a stale sample and we should not generate it.
This patch is to add new argument 'new_packet' for cs_etm__flush(), we can pass 'true' for this argument if there have a new packet, otherwise it will pass 'false' for the purpose of only flushing branch stack entries and avoid to generate sample for stale packet.
Signed-off-by: Leo Yan leo.yan@linaro.org --- tools/perf/util/cs-etm.c | 20 +++++++++++++++++--- 1 file changed, 17 insertions(+), 3 deletions(-)
diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c index fe18d7b..f4fa877 100644 --- a/tools/perf/util/cs-etm.c +++ b/tools/perf/util/cs-etm.c @@ -955,7 +955,7 @@ static int cs_etm__sample(struct cs_etm_queue *etmq) return 0; }
-static int cs_etm__flush(struct cs_etm_queue *etmq) +static int cs_etm__flush(struct cs_etm_queue *etmq, bool new_packet) { int err = 0; struct cs_etm_auxtrace *etm = etmq->etm; @@ -989,6 +989,20 @@ static int cs_etm__flush(struct cs_etm_queue *etmq)
}
+ /* + * If 'new_packet' is false, this time call has no a new packet + * coming and 'etmq->packet' contains the stale packet which is + * set at the previous time with packets swapping. In this case + * this function is invoked only for flushing branch stack at + * the end of buffer handling. + * + * Simply to say, branch samples should be generated when every + * time receive one new packet; otherwise, directly bail out to + * avoid generate branch sample with stale packet. + */ + if (!new_packet) + return 0; + if (etm->sample_branches && etmq->prev_packet->sample_type == CS_ETM_RANGE) { err = cs_etm__synth_branch_sample(etmq); @@ -1075,7 +1089,7 @@ static int cs_etm__run_decoder(struct cs_etm_queue *etmq) * Discontinuity in trace, flush * previous branch stack */ - cs_etm__flush(etmq); + cs_etm__flush(etmq, true); break; case CS_ETM_EMPTY: /* @@ -1092,7 +1106,7 @@ static int cs_etm__run_decoder(struct cs_etm_queue *etmq)
if (err == 0) /* Flush any remaining branch stack entries */ - err = cs_etm__flush(etmq); + err = cs_etm__flush(etmq, false); }
return err;
On Sun, Nov 11, 2018 at 12:59:40PM +0800, Leo Yan wrote:
At the end of trace buffer handling, function cs_etm__flush() is invoked to flush any remaining branch stack entries. As a side effect, it also generates branch sample, because the 'etmq->packet' doesn't contains any new coming packet but point to one stale packet after packets swapping, so it wrongly makes synthesize branch samples with stale packet info.
We could review below detailed flow which causes issue:
Packet1: start_addr=0xffff000008b1fbf0 end_addr=0xffff000008b1fbfc Packet2: start_addr=0xffff000008b1fb5c end_addr=0xffff000008b1fb6c
step 1: cs_etm__sample(): sample: ip=(0xffff000008b1fbfc-4) addr=0xffff000008b1fb5c
step 2: flush packet in cs_etm__run_decoder(): cs_etm__run_decoder() `-> err = cs_etm__flush(etmq, false); sample: ip=(0xffff000008b1fb6c-4) addr=0xffff000008b1fbf0
Packet1 and packet2 are two continuous packets, when packet2 is the new coming packet, cs_etm__sample() generates branch sample for these two packets and use [packet1::end_addr - 4 => packet2::start_addr] as branch jump flow, thus we can see the first generated branch sample in step 1. At the end of cs_etm__sample() it swaps packets so 'etm->prev_packet'= packet2 and 'etm->packet'=packet1, so far it's okay for branch sample.
If packet2 is the last one packet in trace buffer, even there have no any new coming packet, cs_etm__run_decoder() invokes cs_etm__flush() to flush branch stack entries as expected, but it also generates branch samples by taking 'etm->packet' as a new coming packet, thus the branch jump flow is as [packet2::end_addr - 4 => packet1::start_addr]; this is the second sample which is generated in step 2. So actually the second sample is a stale sample and we should not generate it.
This patch is to add new argument 'new_packet' for cs_etm__flush(), we can pass 'true' for this argument if there have a new packet, otherwise it will pass 'false' for the purpose of only flushing branch stack entries and avoid to generate sample for stale packet.
Very good explanation, thanks for taking the time to write this.
Signed-off-by: Leo Yan leo.yan@linaro.org
tools/perf/util/cs-etm.c | 20 +++++++++++++++++--- 1 file changed, 17 insertions(+), 3 deletions(-)
diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c index fe18d7b..f4fa877 100644 --- a/tools/perf/util/cs-etm.c +++ b/tools/perf/util/cs-etm.c @@ -955,7 +955,7 @@ static int cs_etm__sample(struct cs_etm_queue *etmq) return 0; } -static int cs_etm__flush(struct cs_etm_queue *etmq) +static int cs_etm__flush(struct cs_etm_queue *etmq, bool new_packet) { int err = 0; struct cs_etm_auxtrace *etm = etmq->etm; @@ -989,6 +989,20 @@ static int cs_etm__flush(struct cs_etm_queue *etmq) }
- /*
* If 'new_packet' is false, this time call has no a new packet
* coming and 'etmq->packet' contains the stale packet which is
* set at the previous time with packets swapping. In this case
* this function is invoked only for flushing branch stack at
* the end of buffer handling.
*
* Simply to say, branch samples should be generated when every
* time receive one new packet; otherwise, directly bail out to
* avoid generate branch sample with stale packet.
*/
- if (!new_packet)
return 0;
- if (etm->sample_branches && etmq->prev_packet->sample_type == CS_ETM_RANGE) { err = cs_etm__synth_branch_sample(etmq);
@@ -1075,7 +1089,7 @@ static int cs_etm__run_decoder(struct cs_etm_queue *etmq) * Discontinuity in trace, flush * previous branch stack */
cs_etm__flush(etmq);
cs_etm__flush(etmq, true); break; case CS_ETM_EMPTY: /*
@@ -1092,7 +1106,7 @@ static int cs_etm__run_decoder(struct cs_etm_queue *etmq) if (err == 0) /* Flush any remaining branch stack entries */
err = cs_etm__flush(etmq);
err = cs_etm__flush(etmq, false);
I understand what you're doing and it will yield the correct results. What I'm not sure about is if we wouldn't be better off splitting cs_etm__flush() in order to reduce the complexity of the main decoding loop. That is rename cs_etm__flush() to something like cs_etm__trace_on() and spin off a new cs_etm__end_block().
It does introduce a little bit of code duplication but I think we'd win in terms of readability and flexibility.
Thanks, Mathieu
} return err; -- 2.7.4
On Fri, Nov 16, 2018 at 04:05:11PM -0700, Mathieu Poirier wrote:
[...]
-static int cs_etm__flush(struct cs_etm_queue *etmq) +static int cs_etm__flush(struct cs_etm_queue *etmq, bool new_packet) { int err = 0; struct cs_etm_auxtrace *etm = etmq->etm; @@ -989,6 +989,20 @@ static int cs_etm__flush(struct cs_etm_queue *etmq) }
- /*
* If 'new_packet' is false, this time call has no a new packet
* coming and 'etmq->packet' contains the stale packet which is
* set at the previous time with packets swapping. In this case
* this function is invoked only for flushing branch stack at
* the end of buffer handling.
*
* Simply to say, branch samples should be generated when every
* time receive one new packet; otherwise, directly bail out to
* avoid generate branch sample with stale packet.
*/
- if (!new_packet)
return 0;
- if (etm->sample_branches && etmq->prev_packet->sample_type == CS_ETM_RANGE) { err = cs_etm__synth_branch_sample(etmq);
@@ -1075,7 +1089,7 @@ static int cs_etm__run_decoder(struct cs_etm_queue *etmq) * Discontinuity in trace, flush * previous branch stack */
cs_etm__flush(etmq);
cs_etm__flush(etmq, true); break; case CS_ETM_EMPTY: /*
@@ -1092,7 +1106,7 @@ static int cs_etm__run_decoder(struct cs_etm_queue *etmq) if (err == 0) /* Flush any remaining branch stack entries */
err = cs_etm__flush(etmq);
err = cs_etm__flush(etmq, false);
I understand what you're doing and it will yield the correct results. What I'm not sure about is if we wouldn't be better off splitting cs_etm__flush() in order to reduce the complexity of the main decoding loop. That is rename cs_etm__flush() to something like cs_etm__trace_on() and spin off a new cs_etm__end_block().
It does introduce a little bit of code duplication but I think we'd win in terms of readability and flexibility.
Thanks for reviewing, Mathieu.
I agree with your suggestion to split cs_etm__flush() into two functions, will spin this patch with the suggestion in next series for reviewing.
Thanks, Leo Yan
On Fri, Nov 16, 2018 at 04:05:11PM -0700, Mathieu Poirier wrote:
[...]
-static int cs_etm__flush(struct cs_etm_queue *etmq) +static int cs_etm__flush(struct cs_etm_queue *etmq, bool new_packet) { int err = 0; struct cs_etm_auxtrace *etm = etmq->etm; @@ -989,6 +989,20 @@ static int cs_etm__flush(struct cs_etm_queue *etmq) }
- /*
* If 'new_packet' is false, this time call has no a new packet
* coming and 'etmq->packet' contains the stale packet which is
* set at the previous time with packets swapping. In this case
* this function is invoked only for flushing branch stack at
* the end of buffer handling.
*
* Simply to say, branch samples should be generated when every
* time receive one new packet; otherwise, directly bail out to
* avoid generate branch sample with stale packet.
*/
- if (!new_packet)
return 0;
- if (etm->sample_branches && etmq->prev_packet->sample_type == CS_ETM_RANGE) { err = cs_etm__synth_branch_sample(etmq);
@@ -1075,7 +1089,7 @@ static int cs_etm__run_decoder(struct cs_etm_queue *etmq) * Discontinuity in trace, flush * previous branch stack */
cs_etm__flush(etmq);
cs_etm__flush(etmq, true); break; case CS_ETM_EMPTY: /*
@@ -1092,7 +1106,7 @@ static int cs_etm__run_decoder(struct cs_etm_queue *etmq) if (err == 0) /* Flush any remaining branch stack entries */
err = cs_etm__flush(etmq);
err = cs_etm__flush(etmq, false);
I understand what you're doing and it will yield the correct results. What I'm not sure about is if we wouldn't be better off splitting cs_etm__flush() in order to reduce the complexity of the main decoding loop. That is rename cs_etm__flush() to something like cs_etm__trace_on() and spin off a new cs_etm__end_block().
It does introduce a little bit of code duplication but I think we'd win in terms of readability and flexibility.
Sorry for long delay, Mathieu.
Agree with the idea of splitting cs_etm__flush() into two functions. Will spin patch for new version.
Thanks, Leo Yan
} return err; -- 2.7.4
As described in OpenCSD (CoreSight decoder lib), in the decoding stream it includes one trace element with type OCSD_GEN_TRC_ELEM_NO_SYNC; the element indicates 'either at start of decode, or after overflow / bad packet', we should take it as a signal for the tracing off and this will cause tracing discontinuity. From the trace dump with 'perf script', sometimes the element OCSD_GEN_TRC_ELEM_NO_SYNC collaborates with element OCSD_GEN_TRC_ELEM_TRACE_ON to show the tracing flow have been turned off and on, in this case the cs-etm code has handled TRACE_ON packet well so we observe the tracing discontinuity; but in another case it only inserts the element OCSD_GEN_TRC_ELEM_NO_SYNC into instructions packets, we miss to handle the case if has only standalone NO_SYNC element and users cannot receive the info for tracing discontinuity.
This patch introduces new type CS_ETM_TRACE_OFF to generate packet for receiving element OCSD_GEN_TRC_ELEM_NO_SYNC from decoder; when generate sample, CS_ETM_TRACE_OFF packet has almost the same behaviour with CS_ETM_TRACE_ON packet: both of them invokes cs_etm__flush() to generate samples for the previous instructions packet, and in cs_etm__sample() it also needs to generate samples if TRACE_OFF packet is followed by one sequential instructions packet. This patch also converts the address to 0 for TRACE_OFF packet, this is same with TRACE_ON packet as well.
Signed-off-by: Leo Yan leo.yan@linaro.org --- tools/perf/util/cs-etm-decoder/cs-etm-decoder.c | 10 ++++++++++ tools/perf/util/cs-etm-decoder/cs-etm-decoder.h | 7 ++++--- tools/perf/util/cs-etm.c | 15 +++++++++++---- 3 files changed, 25 insertions(+), 7 deletions(-)
diff --git a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c index 5efb616..9d52727 100644 --- a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c +++ b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c @@ -369,6 +369,14 @@ cs_etm_decoder__buffer_range(struct cs_etm_decoder *decoder, }
static ocsd_datapath_resp_t +cs_etm_decoder__buffer_trace_off(struct cs_etm_decoder *decoder, + const uint8_t trace_chan_id) +{ + return cs_etm_decoder__buffer_packet(decoder, trace_chan_id, + CS_ETM_TRACE_OFF); +} + +static ocsd_datapath_resp_t cs_etm_decoder__buffer_trace_on(struct cs_etm_decoder *decoder, const uint8_t trace_chan_id) { @@ -389,6 +397,8 @@ static ocsd_datapath_resp_t cs_etm_decoder__gen_trace_elem_printer( case OCSD_GEN_TRC_ELEM_UNKNOWN: break; case OCSD_GEN_TRC_ELEM_NO_SYNC: + resp = cs_etm_decoder__buffer_trace_off(decoder, + trace_chan_id); decoder->trace_on = false; break; case OCSD_GEN_TRC_ELEM_TRACE_ON: diff --git a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h index 9351bd1..a38c97c 100644 --- a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h +++ b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h @@ -23,9 +23,10 @@ struct cs_etm_buffer { };
enum cs_etm_sample_type { - CS_ETM_EMPTY = 0, - CS_ETM_RANGE = 1 << 0, - CS_ETM_TRACE_ON = 1 << 1, + CS_ETM_EMPTY = 0, + CS_ETM_RANGE = 1 << 0, + CS_ETM_TRACE_ON = 1 << 1, + CS_ETM_TRACE_OFF = 1 << 2, };
enum cs_etm_isa { diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c index f4fa877..2a0cef9 100644 --- a/tools/perf/util/cs-etm.c +++ b/tools/perf/util/cs-etm.c @@ -517,8 +517,9 @@ static inline int cs_etm__t32_instr_size(struct cs_etm_queue *etmq,
static inline u64 cs_etm__first_executed_instr(struct cs_etm_packet *packet) { - /* Returns 0 for the CS_ETM_TRACE_ON packet */ - if (packet->sample_type == CS_ETM_TRACE_ON) + /* Returns 0 for TRACE_ON and TRACE_OFF packets */ + if (packet->sample_type == CS_ETM_TRACE_ON || + packet->sample_type == CS_ETM_TRACE_OFF) return 0;
return packet->start_addr; @@ -527,8 +528,9 @@ static inline u64 cs_etm__first_executed_instr(struct cs_etm_packet *packet) static inline u64 cs_etm__last_executed_instr(const struct cs_etm_packet *packet) { - /* Returns 0 for the CS_ETM_TRACE_ON packet */ - if (packet->sample_type == CS_ETM_TRACE_ON) + /* Returns 0 for TRACE_ON and TRACE_OFF packets */ + if (packet->sample_type == CS_ETM_TRACE_ON || + packet->sample_type == CS_ETM_TRACE_OFF) return 0;
return packet->end_addr - packet->last_instr_size; @@ -930,6 +932,10 @@ static int cs_etm__sample(struct cs_etm_queue *etmq) if (etmq->prev_packet->sample_type == CS_ETM_TRACE_ON) generate_sample = true;
+ /* Generate sample for tracing off packet */ + if (etmq->prev_packet->sample_type == CS_ETM_TRACE_OFF) + generate_sample = true; + /* Generate sample for branch taken packet */ if (etmq->prev_packet->sample_type == CS_ETM_RANGE && etmq->prev_packet->last_instr_taken_branch) @@ -1085,6 +1091,7 @@ static int cs_etm__run_decoder(struct cs_etm_queue *etmq) cs_etm__sample(etmq); break; case CS_ETM_TRACE_ON: + case CS_ETM_TRACE_OFF: /* * Discontinuity in trace, flush * previous branch stack
On Sun, Nov 11, 2018 at 12:59:41PM +0800, Leo Yan wrote:
As described in OpenCSD (CoreSight decoder lib), in the decoding stream it includes one trace element with type OCSD_GEN_TRC_ELEM_NO_SYNC; the element indicates 'either at start of decode, or after overflow / bad packet', we should take it as a signal for the tracing off and this will cause tracing discontinuity. From the trace dump with 'perf script', sometimes the element OCSD_GEN_TRC_ELEM_NO_SYNC collaborates with element OCSD_GEN_TRC_ELEM_TRACE_ON to show the tracing flow have been turned off and on, in this case the cs-etm code has handled TRACE_ON packet well so we observe the tracing discontinuity; but in another case it only inserts the element OCSD_GEN_TRC_ELEM_NO_SYNC into instructions packets, we miss to handle the case if has only standalone NO_SYNC element and users cannot receive the info for tracing discontinuity.
This patch introduces new type CS_ETM_TRACE_OFF to generate packet for receiving element OCSD_GEN_TRC_ELEM_NO_SYNC from decoder; when generate sample, CS_ETM_TRACE_OFF packet has almost the same behaviour with
Here you have used the word "almost", leading me to beleive there are cases where the handling of TRACE_ON and NO_SYNC packets differ, but the implementation enacts the same behavior for both.
Mike or Robert, can you confirm that TRACE_ON and NO_SYNC packets should be treated the same way?
Leo, if handling of the TRACE_ON and NO_SYNC packets is the same then we should treat them the same way in cs_etm_decoder__gen_trace_elem_printer(), i.e simply call cs_etm_decoder__buffer_trace_on(). That way packet handling in cs-etm.c doesn't change. Otherwise see my comments below.
CS_ETM_TRACE_ON packet: both of them invokes cs_etm__flush() to generate samples for the previous instructions packet, and in cs_etm__sample() it also needs to generate samples if TRACE_OFF packet is followed by one sequential instructions packet. This patch also converts the address to 0 for TRACE_OFF packet, this is same with TRACE_ON packet as well.
Signed-off-by: Leo Yan leo.yan@linaro.org
tools/perf/util/cs-etm-decoder/cs-etm-decoder.c | 10 ++++++++++ tools/perf/util/cs-etm-decoder/cs-etm-decoder.h | 7 ++++--- tools/perf/util/cs-etm.c | 15 +++++++++++---- 3 files changed, 25 insertions(+), 7 deletions(-)
diff --git a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c index 5efb616..9d52727 100644 --- a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c +++ b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c @@ -369,6 +369,14 @@ cs_etm_decoder__buffer_range(struct cs_etm_decoder *decoder, } static ocsd_datapath_resp_t +cs_etm_decoder__buffer_trace_off(struct cs_etm_decoder *decoder,
const uint8_t trace_chan_id)
+{
- return cs_etm_decoder__buffer_packet(decoder, trace_chan_id,
CS_ETM_TRACE_OFF);
+}
+static ocsd_datapath_resp_t cs_etm_decoder__buffer_trace_on(struct cs_etm_decoder *decoder, const uint8_t trace_chan_id) { @@ -389,6 +397,8 @@ static ocsd_datapath_resp_t cs_etm_decoder__gen_trace_elem_printer( case OCSD_GEN_TRC_ELEM_UNKNOWN: break; case OCSD_GEN_TRC_ELEM_NO_SYNC:
resp = cs_etm_decoder__buffer_trace_off(decoder,
trace_chan_id);
If we want to handle NO_SYNC element types, why introduce a "trace_off" function? Wouldn't it make more sense to call it cs_etm_decoder__buffer_no_sync() ?
decoder->trace_on = false; break;
case OCSD_GEN_TRC_ELEM_TRACE_ON: diff --git a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h index 9351bd1..a38c97c 100644 --- a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h +++ b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h @@ -23,9 +23,10 @@ struct cs_etm_buffer { }; enum cs_etm_sample_type {
- CS_ETM_EMPTY = 0,
- CS_ETM_RANGE = 1 << 0,
- CS_ETM_TRACE_ON = 1 << 1,
- CS_ETM_EMPTY = 0,
- CS_ETM_RANGE = 1 << 0,
- CS_ETM_TRACE_ON = 1 << 1,
- CS_ETM_TRACE_OFF = 1 << 2,
CS_ETM_NO_SYNC, see above comment. And please don't use indentation.
}; enum cs_etm_isa { diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c index f4fa877..2a0cef9 100644 --- a/tools/perf/util/cs-etm.c +++ b/tools/perf/util/cs-etm.c @@ -517,8 +517,9 @@ static inline int cs_etm__t32_instr_size(struct cs_etm_queue *etmq, static inline u64 cs_etm__first_executed_instr(struct cs_etm_packet *packet) {
- /* Returns 0 for the CS_ETM_TRACE_ON packet */
- if (packet->sample_type == CS_ETM_TRACE_ON)
- /* Returns 0 for TRACE_ON and TRACE_OFF packets */
- if (packet->sample_type == CS_ETM_TRACE_ON ||
return 0;packet->sample_type == CS_ETM_TRACE_OFF)
return packet->start_addr; @@ -527,8 +528,9 @@ static inline u64 cs_etm__first_executed_instr(struct cs_etm_packet *packet) static inline u64 cs_etm__last_executed_instr(const struct cs_etm_packet *packet) {
- /* Returns 0 for the CS_ETM_TRACE_ON packet */
- if (packet->sample_type == CS_ETM_TRACE_ON)
- /* Returns 0 for TRACE_ON and TRACE_OFF packets */
- if (packet->sample_type == CS_ETM_TRACE_ON ||
return 0;packet->sample_type == CS_ETM_TRACE_OFF)
return packet->end_addr - packet->last_instr_size; @@ -930,6 +932,10 @@ static int cs_etm__sample(struct cs_etm_queue *etmq) if (etmq->prev_packet->sample_type == CS_ETM_TRACE_ON) generate_sample = true;
/* Generate sample for tracing off packet */
if (etmq->prev_packet->sample_type == CS_ETM_TRACE_OFF)
generate_sample = true;
- /* Generate sample for branch taken packet */ if (etmq->prev_packet->sample_type == CS_ETM_RANGE && etmq->prev_packet->last_instr_taken_branch)
@@ -1085,6 +1091,7 @@ static int cs_etm__run_decoder(struct cs_etm_queue *etmq) cs_etm__sample(etmq); break; case CS_ETM_TRACE_ON:
case CS_ETM_TRACE_OFF: /* * Discontinuity in trace, flush * previous branch stack
-- 2.7.4
On Mon, Nov 19, 2018 at 11:27:59AM -0700, Mathieu Poirier wrote:
On Sun, Nov 11, 2018 at 12:59:41PM +0800, Leo Yan wrote:
As described in OpenCSD (CoreSight decoder lib), in the decoding stream it includes one trace element with type OCSD_GEN_TRC_ELEM_NO_SYNC; the element indicates 'either at start of decode, or after overflow / bad packet', we should take it as a signal for the tracing off and this will cause tracing discontinuity. From the trace dump with 'perf script', sometimes the element OCSD_GEN_TRC_ELEM_NO_SYNC collaborates with element OCSD_GEN_TRC_ELEM_TRACE_ON to show the tracing flow have been turned off and on, in this case the cs-etm code has handled TRACE_ON packet well so we observe the tracing discontinuity; but in another case it only inserts the element OCSD_GEN_TRC_ELEM_NO_SYNC into instructions packets, we miss to handle the case if has only standalone NO_SYNC element and users cannot receive the info for tracing discontinuity.
This patch introduces new type CS_ETM_TRACE_OFF to generate packet for receiving element OCSD_GEN_TRC_ELEM_NO_SYNC from decoder; when generate sample, CS_ETM_TRACE_OFF packet has almost the same behaviour with
Here you have used the word "almost", leading me to beleive there are cases where the handling of TRACE_ON and NO_SYNC packets differ, but the implementation enacts the same behavior for both.
Mike or Robert, can you confirm that TRACE_ON and NO_SYNC packets should be treated the same way?
I also would like to get suggestions/comments from Mike and Rob.
For NO_SYNC packet, Mike before has given some explination for it:
"looking at the decoder flow, when the decoder is reset, then it is returned to an unsynchronised state and a NO_SYNC will be output. With perf, it can concatenate multiple trace buffers into a single section of the perf.data file. To ensure that the decode process can see the start of a new buffer, the drivers will insert a barrier packet between different buffers so that the decoder can spot the boundaries. When the decoder hits a barrier packet it will automatically reset so that it correctly decodes the next trace buffer. This could be what you are seeing here."
So I think there still have some difference between TRACE_ON and NO_SYNC packets, TRACE_ON packet indicates the start of trace and it's also possible caused by tracing discontinuity; NO_SYNC packets usually caused by an unsynchronised state. But both of them presents discontinuity in perf trace data.
I prefer to use two different packet types to present them, this can let the code to be more readable and easier to be maintained later.
If you agree with this, I also will rephrase the commit log to avoid confusion that these two packets are the same thing.
Leo, if handling of the TRACE_ON and NO_SYNC packets is the same then we should treat them the same way in cs_etm_decoder__gen_trace_elem_printer(), i.e simply call cs_etm_decoder__buffer_trace_on(). That way packet handling in cs-etm.c doesn't change. Otherwise see my comments below.
CS_ETM_TRACE_ON packet: both of them invokes cs_etm__flush() to generate samples for the previous instructions packet, and in cs_etm__sample() it also needs to generate samples if TRACE_OFF packet is followed by one sequential instructions packet. This patch also converts the address to 0 for TRACE_OFF packet, this is same with TRACE_ON packet as well.
Signed-off-by: Leo Yan leo.yan@linaro.org
tools/perf/util/cs-etm-decoder/cs-etm-decoder.c | 10 ++++++++++ tools/perf/util/cs-etm-decoder/cs-etm-decoder.h | 7 ++++--- tools/perf/util/cs-etm.c | 15 +++++++++++---- 3 files changed, 25 insertions(+), 7 deletions(-)
diff --git a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c index 5efb616..9d52727 100644 --- a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c +++ b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c @@ -369,6 +369,14 @@ cs_etm_decoder__buffer_range(struct cs_etm_decoder *decoder, } static ocsd_datapath_resp_t +cs_etm_decoder__buffer_trace_off(struct cs_etm_decoder *decoder,
const uint8_t trace_chan_id)
+{
- return cs_etm_decoder__buffer_packet(decoder, trace_chan_id,
CS_ETM_TRACE_OFF);
+}
+static ocsd_datapath_resp_t cs_etm_decoder__buffer_trace_on(struct cs_etm_decoder *decoder, const uint8_t trace_chan_id) { @@ -389,6 +397,8 @@ static ocsd_datapath_resp_t cs_etm_decoder__gen_trace_elem_printer( case OCSD_GEN_TRC_ELEM_UNKNOWN: break; case OCSD_GEN_TRC_ELEM_NO_SYNC:
resp = cs_etm_decoder__buffer_trace_off(decoder,
trace_chan_id);
If we want to handle NO_SYNC element types, why introduce a "trace_off" function? Wouldn't it make more sense to call it cs_etm_decoder__buffer_no_sync() ?
Will change to cs_etm_decoder__buffer_no_sync().
decoder->trace_on = false; break;
case OCSD_GEN_TRC_ELEM_TRACE_ON: diff --git a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h index 9351bd1..a38c97c 100644 --- a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h +++ b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h @@ -23,9 +23,10 @@ struct cs_etm_buffer { }; enum cs_etm_sample_type {
- CS_ETM_EMPTY = 0,
- CS_ETM_RANGE = 1 << 0,
- CS_ETM_TRACE_ON = 1 << 1,
- CS_ETM_EMPTY = 0,
- CS_ETM_RANGE = 1 << 0,
- CS_ETM_TRACE_ON = 1 << 1,
- CS_ETM_TRACE_OFF = 1 << 2,
CS_ETM_NO_SYNC, see above comment. And please don't use indentation.
Will do this.
}; enum cs_etm_isa { diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c index f4fa877..2a0cef9 100644 --- a/tools/perf/util/cs-etm.c +++ b/tools/perf/util/cs-etm.c @@ -517,8 +517,9 @@ static inline int cs_etm__t32_instr_size(struct cs_etm_queue *etmq, static inline u64 cs_etm__first_executed_instr(struct cs_etm_packet *packet) {
- /* Returns 0 for the CS_ETM_TRACE_ON packet */
- if (packet->sample_type == CS_ETM_TRACE_ON)
- /* Returns 0 for TRACE_ON and TRACE_OFF packets */
- if (packet->sample_type == CS_ETM_TRACE_ON ||
return 0;packet->sample_type == CS_ETM_TRACE_OFF)
return packet->start_addr; @@ -527,8 +528,9 @@ static inline u64 cs_etm__first_executed_instr(struct cs_etm_packet *packet) static inline u64 cs_etm__last_executed_instr(const struct cs_etm_packet *packet) {
- /* Returns 0 for the CS_ETM_TRACE_ON packet */
- if (packet->sample_type == CS_ETM_TRACE_ON)
- /* Returns 0 for TRACE_ON and TRACE_OFF packets */
- if (packet->sample_type == CS_ETM_TRACE_ON ||
return 0;packet->sample_type == CS_ETM_TRACE_OFF)
return packet->end_addr - packet->last_instr_size; @@ -930,6 +932,10 @@ static int cs_etm__sample(struct cs_etm_queue *etmq) if (etmq->prev_packet->sample_type == CS_ETM_TRACE_ON) generate_sample = true;
/* Generate sample for tracing off packet */
if (etmq->prev_packet->sample_type == CS_ETM_TRACE_OFF)
generate_sample = true;
- /* Generate sample for branch taken packet */ if (etmq->prev_packet->sample_type == CS_ETM_RANGE && etmq->prev_packet->last_instr_taken_branch)
@@ -1085,6 +1091,7 @@ static int cs_etm__run_decoder(struct cs_etm_queue *etmq) cs_etm__sample(etmq); break; case CS_ETM_TRACE_ON:
case CS_ETM_TRACE_OFF: /* * Discontinuity in trace, flush * previous branch stack
-- 2.7.4
On Tue, 4 Dec 2018 at 20:31, leo.yan@linaro.org wrote:
On Mon, Nov 19, 2018 at 11:27:59AM -0700, Mathieu Poirier wrote:
On Sun, Nov 11, 2018 at 12:59:41PM +0800, Leo Yan wrote:
As described in OpenCSD (CoreSight decoder lib), in the decoding stream it includes one trace element with type OCSD_GEN_TRC_ELEM_NO_SYNC; the element indicates 'either at start of decode, or after overflow / bad packet', we should take it as a signal for the tracing off and this will cause tracing discontinuity. From the trace dump with 'perf script', sometimes the element OCSD_GEN_TRC_ELEM_NO_SYNC collaborates with element OCSD_GEN_TRC_ELEM_TRACE_ON to show the tracing flow have been turned off and on, in this case the cs-etm code has handled TRACE_ON packet well so we observe the tracing discontinuity; but in another case it only inserts the element OCSD_GEN_TRC_ELEM_NO_SYNC into instructions packets, we miss to handle the case if has only standalone NO_SYNC element and users cannot receive the info for tracing discontinuity.
This patch introduces new type CS_ETM_TRACE_OFF to generate packet for receiving element OCSD_GEN_TRC_ELEM_NO_SYNC from decoder; when generate sample, CS_ETM_TRACE_OFF packet has almost the same behaviour with
Here you have used the word "almost", leading me to beleive there are cases where the handling of TRACE_ON and NO_SYNC packets differ, but the implementation enacts the same behavior for both.
Mike or Robert, can you confirm that TRACE_ON and NO_SYNC packets should be treated the same way?
I also would like to get suggestions/comments from Mike and Rob.
For NO_SYNC packet, Mike before has given some explination for it:
"looking at the decoder flow, when the decoder is reset, then it is returned to an unsynchronised state and a NO_SYNC will be output. With perf, it can concatenate multiple trace buffers into a single section of the perf.data file. To ensure that the decode process can see the start of a new buffer, the drivers will insert a barrier packet between different buffers so that the decoder can spot the boundaries. When the decoder hits a barrier packet it will automatically reset so that it correctly decodes the next trace buffer. This could be what you are seeing here."
So I think there still have some difference between TRACE_ON and NO_SYNC packets, TRACE_ON packet indicates the start of trace and it's also possible caused by tracing discontinuity; NO_SYNC packets usually caused by an unsynchronised state. But both of them presents discontinuity in perf trace data.
I prefer to use two different packet types to present them, this can let the code to be more readable and easier to be maintained later.
Absolutely. Make sure to get to the bottom of the story with Mike and/or Robert before sending your next patchset.
If you agree with this, I also will rephrase the commit log to avoid confusion that these two packets are the same thing.
Yes please.
Leo, if handling of the TRACE_ON and NO_SYNC packets is the same then we should treat them the same way in cs_etm_decoder__gen_trace_elem_printer(), i.e simply call cs_etm_decoder__buffer_trace_on(). That way packet handling in cs-etm.c doesn't change. Otherwise see my comments below.
CS_ETM_TRACE_ON packet: both of them invokes cs_etm__flush() to generate samples for the previous instructions packet, and in cs_etm__sample() it also needs to generate samples if TRACE_OFF packet is followed by one sequential instructions packet. This patch also converts the address to 0 for TRACE_OFF packet, this is same with TRACE_ON packet as well.
Signed-off-by: Leo Yan leo.yan@linaro.org
tools/perf/util/cs-etm-decoder/cs-etm-decoder.c | 10 ++++++++++ tools/perf/util/cs-etm-decoder/cs-etm-decoder.h | 7 ++++--- tools/perf/util/cs-etm.c | 15 +++++++++++---- 3 files changed, 25 insertions(+), 7 deletions(-)
diff --git a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c index 5efb616..9d52727 100644 --- a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c +++ b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c @@ -369,6 +369,14 @@ cs_etm_decoder__buffer_range(struct cs_etm_decoder *decoder, }
static ocsd_datapath_resp_t +cs_etm_decoder__buffer_trace_off(struct cs_etm_decoder *decoder,
const uint8_t trace_chan_id)
+{
- return cs_etm_decoder__buffer_packet(decoder, trace_chan_id,
CS_ETM_TRACE_OFF);
+}
+static ocsd_datapath_resp_t cs_etm_decoder__buffer_trace_on(struct cs_etm_decoder *decoder, const uint8_t trace_chan_id) { @@ -389,6 +397,8 @@ static ocsd_datapath_resp_t cs_etm_decoder__gen_trace_elem_printer( case OCSD_GEN_TRC_ELEM_UNKNOWN: break; case OCSD_GEN_TRC_ELEM_NO_SYNC:
resp = cs_etm_decoder__buffer_trace_off(decoder,
trace_chan_id);
If we want to handle NO_SYNC element types, why introduce a "trace_off" function? Wouldn't it make more sense to call it cs_etm_decoder__buffer_no_sync() ?
Will change to cs_etm_decoder__buffer_no_sync().
decoder->trace_on = false; break; case OCSD_GEN_TRC_ELEM_TRACE_ON:
diff --git a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h index 9351bd1..a38c97c 100644 --- a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h +++ b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h @@ -23,9 +23,10 @@ struct cs_etm_buffer { };
enum cs_etm_sample_type {
- CS_ETM_EMPTY = 0,
- CS_ETM_RANGE = 1 << 0,
- CS_ETM_TRACE_ON = 1 << 1,
- CS_ETM_EMPTY = 0,
- CS_ETM_RANGE = 1 << 0,
- CS_ETM_TRACE_ON = 1 << 1,
- CS_ETM_TRACE_OFF = 1 << 2,
CS_ETM_NO_SYNC, see above comment. And please don't use indentation.
Will do this.
};
enum cs_etm_isa { diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c index f4fa877..2a0cef9 100644 --- a/tools/perf/util/cs-etm.c +++ b/tools/perf/util/cs-etm.c @@ -517,8 +517,9 @@ static inline int cs_etm__t32_instr_size(struct cs_etm_queue *etmq,
static inline u64 cs_etm__first_executed_instr(struct cs_etm_packet *packet) {
- /* Returns 0 for the CS_ETM_TRACE_ON packet */
- if (packet->sample_type == CS_ETM_TRACE_ON)
/* Returns 0 for TRACE_ON and TRACE_OFF packets */
if (packet->sample_type == CS_ETM_TRACE_ON ||
packet->sample_type == CS_ETM_TRACE_OFF) return 0;
return packet->start_addr;
@@ -527,8 +528,9 @@ static inline u64 cs_etm__first_executed_instr(struct cs_etm_packet *packet) static inline u64 cs_etm__last_executed_instr(const struct cs_etm_packet *packet) {
- /* Returns 0 for the CS_ETM_TRACE_ON packet */
- if (packet->sample_type == CS_ETM_TRACE_ON)
/* Returns 0 for TRACE_ON and TRACE_OFF packets */
if (packet->sample_type == CS_ETM_TRACE_ON ||
packet->sample_type == CS_ETM_TRACE_OFF) return 0;
return packet->end_addr - packet->last_instr_size;
@@ -930,6 +932,10 @@ static int cs_etm__sample(struct cs_etm_queue *etmq) if (etmq->prev_packet->sample_type == CS_ETM_TRACE_ON) generate_sample = true;
/* Generate sample for tracing off packet */
if (etmq->prev_packet->sample_type == CS_ETM_TRACE_OFF)
generate_sample = true;
/* Generate sample for branch taken packet */ if (etmq->prev_packet->sample_type == CS_ETM_RANGE && etmq->prev_packet->last_instr_taken_branch)
@@ -1085,6 +1091,7 @@ static int cs_etm__run_decoder(struct cs_etm_queue *etmq) cs_etm__sample(etmq); break; case CS_ETM_TRACE_ON:
case CS_ETM_TRACE_OFF: /* * Discontinuity in trace, flush * previous branch stack
-- 2.7.4
On Wed, Dec 05, 2018 at 10:48:45AM -0700, Mathieu Poirier wrote:
[...]
So I think there still have some difference between TRACE_ON and NO_SYNC packets, TRACE_ON packet indicates the start of trace and it's also possible caused by tracing discontinuity; NO_SYNC packets usually caused by an unsynchronised state. But both of them presents discontinuity in perf trace data.
I prefer to use two different packet types to present them, this can let the code to be more readable and easier to be maintained later.
Absolutely. Make sure to get to the bottom of the story with Mike and/or Robert before sending your next patchset.
Sure, will sync with Mike/Rob.
Thanks, Leo Yan
The exception packet appears as one element with 'elem_type' == OCSD_GEN_TRC_ELEM_EXCEPTION or OCSD_GEN_TRC_ELEM_EXCEPTION_RET, which present for exception entry and exit respectively. The decoder set packet fields 'packet->exc' and 'packet->exc_ret' to indicate the exception packets; but exception packets don't have dedicated sample type and shares the same sample type CS_ETM_RANGE with normal instruction packets.
As result, the exception packets are taken as normal instruction packets and this introduces confusion to mix different packet types. Furthermore, these instruction range packets will be processed for branch sample only when 'packet->last_instr_taken_branch' is true, otherwise they will be omitted, this can introduce mess for exception and exception returning due we don't have complete address range info for context switching.
To process exception packets properly, this patch introduce two new sample type: CS_ETM_EXCEPTION and CS_ETM_EXCEPTION_RET; for these two kind packets, they will be handled by cs_etm__exception(). The func cs_etm__exception() forces to set previous CS_ETM_RANGE packet flag 'prev_packet->last_instr_taken_branch' to true, this matches well with the program flow when the exception is trapped from user space to kernel space, no matter if the most recent flow has branch taken or not; this is also safe for returning to user space after exception handling.
After exception packets have their own sample type, the packet fields 'packet->exc' and 'packet->exc_ret' aren't needed anymore, so remove them.
Signed-off-by: Leo Yan leo.yan@linaro.org --- tools/perf/util/cs-etm-decoder/cs-etm-decoder.c | 26 +++++++++++++++++------ tools/perf/util/cs-etm-decoder/cs-etm-decoder.h | 4 ++-- tools/perf/util/cs-etm.c | 28 +++++++++++++++++++++++++ 3 files changed, 50 insertions(+), 8 deletions(-)
diff --git a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c index 9d52727..b8cb7a3e 100644 --- a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c +++ b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c @@ -269,8 +269,6 @@ static void cs_etm_decoder__clear_buffer(struct cs_etm_decoder *decoder) decoder->packet_buffer[i].instr_count = 0; decoder->packet_buffer[i].last_instr_taken_branch = false; decoder->packet_buffer[i].last_instr_size = 0; - decoder->packet_buffer[i].exc = false; - decoder->packet_buffer[i].exc_ret = false; decoder->packet_buffer[i].cpu = INT_MIN; } } @@ -298,8 +296,6 @@ cs_etm_decoder__buffer_packet(struct cs_etm_decoder *decoder,
decoder->packet_buffer[et].sample_type = sample_type; decoder->packet_buffer[et].isa = CS_ETM_ISA_UNKNOWN; - decoder->packet_buffer[et].exc = false; - decoder->packet_buffer[et].exc_ret = false; decoder->packet_buffer[et].cpu = *((int *)inode->priv); decoder->packet_buffer[et].start_addr = CS_ETM_INVAL_ADDR; decoder->packet_buffer[et].end_addr = CS_ETM_INVAL_ADDR; @@ -384,6 +380,22 @@ cs_etm_decoder__buffer_trace_on(struct cs_etm_decoder *decoder, CS_ETM_TRACE_ON); }
+static ocsd_datapath_resp_t +cs_etm_decoder__buffer_exception(struct cs_etm_decoder *decoder, + const uint8_t trace_chan_id) +{ + return cs_etm_decoder__buffer_packet(decoder, trace_chan_id, + CS_ETM_EXCEPTION); +} + +static ocsd_datapath_resp_t +cs_etm_decoder__buffer_exception_ret(struct cs_etm_decoder *decoder, + const uint8_t trace_chan_id) +{ + return cs_etm_decoder__buffer_packet(decoder, trace_chan_id, + CS_ETM_EXCEPTION_RET); +} + static ocsd_datapath_resp_t cs_etm_decoder__gen_trace_elem_printer( const void *context, const ocsd_trc_index_t indx __maybe_unused, @@ -411,10 +423,12 @@ static ocsd_datapath_resp_t cs_etm_decoder__gen_trace_elem_printer( trace_chan_id); break; case OCSD_GEN_TRC_ELEM_EXCEPTION: - decoder->packet_buffer[decoder->tail].exc = true; + resp = cs_etm_decoder__buffer_exception(decoder, + trace_chan_id); break; case OCSD_GEN_TRC_ELEM_EXCEPTION_RET: - decoder->packet_buffer[decoder->tail].exc_ret = true; + resp = cs_etm_decoder__buffer_exception_ret(decoder, + trace_chan_id); break; case OCSD_GEN_TRC_ELEM_PE_CONTEXT: case OCSD_GEN_TRC_ELEM_EO_TRACE: diff --git a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h index a38c97c..0d1c18d 100644 --- a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h +++ b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h @@ -27,6 +27,8 @@ enum cs_etm_sample_type { CS_ETM_RANGE = 1 << 0, CS_ETM_TRACE_ON = 1 << 1, CS_ETM_TRACE_OFF = 1 << 2, + CS_ETM_EXCEPTION = 1 << 3, + CS_ETM_EXCEPTION_RET = 1 << 4, };
enum cs_etm_isa { @@ -44,8 +46,6 @@ struct cs_etm_packet { u32 instr_count; u8 last_instr_taken_branch; u8 last_instr_size; - u8 exc; - u8 exc_ret; int cpu; };
diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c index 2a0cef9..455f132 100644 --- a/tools/perf/util/cs-etm.c +++ b/tools/perf/util/cs-etm.c @@ -961,6 +961,25 @@ static int cs_etm__sample(struct cs_etm_queue *etmq) return 0; }
+static int cs_etm__exception(struct cs_etm_queue *etmq) +{ + /* + * When the exception packet is inserted, whether the last instruction + * in previous range packet is taken branch or not, we need to force + * to set 'prev_packet->last_instr_taken_branch' to true. This ensures + * to generate branch sample for the instruction range before the + * exception is trapped to kernel or before the exception returning. + * + * The exception packet includes the dummy address values, so don't + * swap PACKET with PREV_PACKET. This keeps PREV_PACKET to be useful + * for generating instruction and branch samples. + */ + if (etmq->prev_packet->sample_type == CS_ETM_RANGE) + etmq->prev_packet->last_instr_taken_branch = true; + + return 0; +} + static int cs_etm__flush(struct cs_etm_queue *etmq, bool new_packet) { int err = 0; @@ -1090,6 +1109,15 @@ static int cs_etm__run_decoder(struct cs_etm_queue *etmq) */ cs_etm__sample(etmq); break; + case CS_ETM_EXCEPTION: + case CS_ETM_EXCEPTION_RET: + /* + * If the exception packet is coming, + * make sure the previous instruction + * range packet to be handled properly. + */ + cs_etm__exception(etmq); + break; case CS_ETM_TRACE_ON: case CS_ETM_TRACE_OFF: /*
When an exception packet comes, it contains the info for exception number; the exception number indicates the exception types, so from it we can know if the exception is taken for interrupt, system call or other traps, etc. But because the exception return packet cannot delivery exception number correctly by decoder thus when prepare sample flags we cannot know what's type for exception return.
This patch adds a new 'exc_num' array in decoder structure to record exception number per CPU, the exception number is recorded in the array when the exception packet comes and this exception number can be used by exception return packet. If detect there have discontinuous trace with TRACE_ON or TRACE_OFF packet, the exception number is set to invalid value.
Signed-off-by: Leo Yan leo.yan@linaro.org --- tools/perf/util/cs-etm-decoder/cs-etm-decoder.c | 67 ++++++++++++++++++++++--- 1 file changed, 59 insertions(+), 8 deletions(-)
diff --git a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c index b8cb7a3e..d1a6cbc 100644 --- a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c +++ b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c @@ -43,6 +43,7 @@ struct cs_etm_decoder { u32 packet_count; u32 head; u32 tail; + u32 *exc_num; struct cs_etm_packet packet_buffer[MAX_BUFFER]; };
@@ -368,24 +369,64 @@ static ocsd_datapath_resp_t cs_etm_decoder__buffer_trace_off(struct cs_etm_decoder *decoder, const uint8_t trace_chan_id) { - return cs_etm_decoder__buffer_packet(decoder, trace_chan_id, - CS_ETM_TRACE_OFF); + int ret; + struct cs_etm_packet *packet; + + ret = cs_etm_decoder__buffer_packet(decoder, trace_chan_id, + CS_ETM_TRACE_OFF); + if (ret != OCSD_RESP_CONT && ret != OCSD_RESP_WAIT) + return ret; + + packet = &decoder->packet_buffer[decoder->tail]; + + /* Clear execption number for discontinuous trace */ + decoder->exc_num[packet->cpu] = UINT32_MAX; + + return ret; }
static ocsd_datapath_resp_t cs_etm_decoder__buffer_trace_on(struct cs_etm_decoder *decoder, const uint8_t trace_chan_id) { - return cs_etm_decoder__buffer_packet(decoder, trace_chan_id, - CS_ETM_TRACE_ON); + int ret; + struct cs_etm_packet *packet; + + ret = cs_etm_decoder__buffer_packet(decoder, trace_chan_id, + CS_ETM_TRACE_ON); + if (ret != OCSD_RESP_CONT && ret != OCSD_RESP_WAIT) + return ret; + + packet = &decoder->packet_buffer[decoder->tail]; + + /* Clear execption number for discontinuous trace */ + decoder->exc_num[packet->cpu] = UINT32_MAX; + + return ret; }
static ocsd_datapath_resp_t cs_etm_decoder__buffer_exception(struct cs_etm_decoder *decoder, + const ocsd_generic_trace_elem *elem, const uint8_t trace_chan_id) { - return cs_etm_decoder__buffer_packet(decoder, trace_chan_id, - CS_ETM_EXCEPTION); + int ret; + struct cs_etm_packet *packet; + + ret = cs_etm_decoder__buffer_packet(decoder, trace_chan_id, + CS_ETM_EXCEPTION); + if (ret != OCSD_RESP_CONT && ret != OCSD_RESP_WAIT) + return ret; + + packet = &decoder->packet_buffer[decoder->tail]; + + /* + * Exception number is recorded per CPU and later can be used + * for exception return instruction analysis. + */ + decoder->exc_num[packet->cpu] = elem->exception_number; + + return ret; }
static ocsd_datapath_resp_t @@ -423,7 +464,7 @@ static ocsd_datapath_resp_t cs_etm_decoder__gen_trace_elem_printer( trace_chan_id); break; case OCSD_GEN_TRC_ELEM_EXCEPTION: - resp = cs_etm_decoder__buffer_exception(decoder, + resp = cs_etm_decoder__buffer_exception(decoder, elem, trace_chan_id); break; case OCSD_GEN_TRC_ELEM_EXCEPTION_RET: @@ -511,6 +552,10 @@ cs_etm_decoder__new(int num_cpu, struct cs_etm_decoder_params *d_params, if (!decoder) return NULL;
+ decoder->exc_num = zalloc(sizeof(*decoder->exc_num) * num_cpu); + if (!decoder->exc_num) + goto err_free_decoder; + decoder->data = d_params->data; decoder->prev_return = OCSD_RESP_CONT; cs_etm_decoder__clear_buffer(decoder); @@ -531,7 +576,7 @@ cs_etm_decoder__new(int num_cpu, struct cs_etm_decoder_params *d_params, decoder->dcd_tree = ocsd_create_dcd_tree(format, flags);
if (decoder->dcd_tree == 0) - goto err_free_decoder; + goto err_free_decoder_exc_num;
/* init library print logging support */ ret = cs_etm_decoder__init_def_logger_printing(d_params, decoder); @@ -542,6 +587,9 @@ cs_etm_decoder__new(int num_cpu, struct cs_etm_decoder_params *d_params, cs_etm_decoder__init_raw_frame_logging(d_params, decoder);
for (i = 0; i < num_cpu; i++) { + /* init expcetion number to an invalid value */ + decoder->exc_num[i] = UINT32_MAX; + ret = cs_etm_decoder__create_etm_decoder(d_params, &t_params[i], decoder); @@ -553,6 +601,8 @@ cs_etm_decoder__new(int num_cpu, struct cs_etm_decoder_params *d_params,
err_free_decoder_tree: ocsd_destroy_dcd_tree(decoder->dcd_tree); +err_free_decoder_exc_num: + free(decoder->exc_num); err_free_decoder: free(decoder); return NULL; @@ -613,5 +663,6 @@ void cs_etm_decoder__free(struct cs_etm_decoder *decoder)
ocsd_destroy_dcd_tree(decoder->dcd_tree); decoder->dcd_tree = NULL; + free(decoder->exc_num); free(decoder); }
On Sun, Nov 11, 2018 at 12:59:43PM +0800, Leo Yan wrote:
When an exception packet comes, it contains the info for exception number; the exception number indicates the exception types, so from it we can know if the exception is taken for interrupt, system call or other traps, etc. But because the exception return packet cannot delivery exception number correctly by decoder thus when prepare sample flags we cannot know what's type for exception return.
This patch adds a new 'exc_num' array in decoder structure to record exception number per CPU, the exception number is recorded in the array when the exception packet comes and this exception number can be used by exception return packet. If detect there have discontinuous trace with TRACE_ON or TRACE_OFF packet, the exception number is set to invalid value.
Signed-off-by: Leo Yan leo.yan@linaro.org
tools/perf/util/cs-etm-decoder/cs-etm-decoder.c | 67 ++++++++++++++++++++++--- 1 file changed, 59 insertions(+), 8 deletions(-)
diff --git a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c index b8cb7a3e..d1a6cbc 100644 --- a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c +++ b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c @@ -43,6 +43,7 @@ struct cs_etm_decoder { u32 packet_count; u32 head; u32 tail;
- u32 *exc_num; struct cs_etm_packet packet_buffer[MAX_BUFFER];
}; @@ -368,24 +369,64 @@ static ocsd_datapath_resp_t cs_etm_decoder__buffer_trace_off(struct cs_etm_decoder *decoder, const uint8_t trace_chan_id) {
- return cs_etm_decoder__buffer_packet(decoder, trace_chan_id,
CS_ETM_TRACE_OFF);
- int ret;
- struct cs_etm_packet *packet;
- ret = cs_etm_decoder__buffer_packet(decoder, trace_chan_id,
CS_ETM_TRACE_OFF);
- if (ret != OCSD_RESP_CONT && ret != OCSD_RESP_WAIT)
return ret;
- packet = &decoder->packet_buffer[decoder->tail];
- /* Clear execption number for discontinuous trace */
- decoder->exc_num[packet->cpu] = UINT32_MAX;
- return ret;
} static ocsd_datapath_resp_t cs_etm_decoder__buffer_trace_on(struct cs_etm_decoder *decoder, const uint8_t trace_chan_id) {
- return cs_etm_decoder__buffer_packet(decoder, trace_chan_id,
CS_ETM_TRACE_ON);
- int ret;
- struct cs_etm_packet *packet;
- ret = cs_etm_decoder__buffer_packet(decoder, trace_chan_id,
CS_ETM_TRACE_ON);
- if (ret != OCSD_RESP_CONT && ret != OCSD_RESP_WAIT)
return ret;
- packet = &decoder->packet_buffer[decoder->tail];
- /* Clear execption number for discontinuous trace */
- decoder->exc_num[packet->cpu] = UINT32_MAX;
- return ret;
} static ocsd_datapath_resp_t cs_etm_decoder__buffer_exception(struct cs_etm_decoder *decoder,
const ocsd_generic_trace_elem *elem, const uint8_t trace_chan_id)
{
- return cs_etm_decoder__buffer_packet(decoder, trace_chan_id,
CS_ETM_EXCEPTION);
- int ret;
- struct cs_etm_packet *packet;
- ret = cs_etm_decoder__buffer_packet(decoder, trace_chan_id,
CS_ETM_EXCEPTION);
- if (ret != OCSD_RESP_CONT && ret != OCSD_RESP_WAIT)
return ret;
- packet = &decoder->packet_buffer[decoder->tail];
- /*
* Exception number is recorded per CPU and later can be used
* for exception return instruction analysis.
*/
- decoder->exc_num[packet->cpu] = elem->exception_number;
Am I missing something or the information about the exception number that is recorded here isn't used anywhere? If you want to use this in perf report/script, the exception number will have to be added to the cs_etm_packet struct.
I am done with the revision of this set.
Thanks, Mathieu
- return ret;
} static ocsd_datapath_resp_t @@ -423,7 +464,7 @@ static ocsd_datapath_resp_t cs_etm_decoder__gen_trace_elem_printer( trace_chan_id); break; case OCSD_GEN_TRC_ELEM_EXCEPTION:
resp = cs_etm_decoder__buffer_exception(decoder,
break; case OCSD_GEN_TRC_ELEM_EXCEPTION_RET:resp = cs_etm_decoder__buffer_exception(decoder, elem, trace_chan_id);
@@ -511,6 +552,10 @@ cs_etm_decoder__new(int num_cpu, struct cs_etm_decoder_params *d_params, if (!decoder) return NULL;
- decoder->exc_num = zalloc(sizeof(*decoder->exc_num) * num_cpu);
- if (!decoder->exc_num)
goto err_free_decoder;
- decoder->data = d_params->data; decoder->prev_return = OCSD_RESP_CONT; cs_etm_decoder__clear_buffer(decoder);
@@ -531,7 +576,7 @@ cs_etm_decoder__new(int num_cpu, struct cs_etm_decoder_params *d_params, decoder->dcd_tree = ocsd_create_dcd_tree(format, flags); if (decoder->dcd_tree == 0)
goto err_free_decoder;
goto err_free_decoder_exc_num;
/* init library print logging support */ ret = cs_etm_decoder__init_def_logger_printing(d_params, decoder); @@ -542,6 +587,9 @@ cs_etm_decoder__new(int num_cpu, struct cs_etm_decoder_params *d_params, cs_etm_decoder__init_raw_frame_logging(d_params, decoder); for (i = 0; i < num_cpu; i++) {
/* init expcetion number to an invalid value */
decoder->exc_num[i] = UINT32_MAX;
- ret = cs_etm_decoder__create_etm_decoder(d_params, &t_params[i], decoder);
@@ -553,6 +601,8 @@ cs_etm_decoder__new(int num_cpu, struct cs_etm_decoder_params *d_params, err_free_decoder_tree: ocsd_destroy_dcd_tree(decoder->dcd_tree); +err_free_decoder_exc_num:
- free(decoder->exc_num);
err_free_decoder: free(decoder); return NULL; @@ -613,5 +663,6 @@ void cs_etm_decoder__free(struct cs_etm_decoder *decoder) ocsd_destroy_dcd_tree(decoder->dcd_tree); decoder->dcd_tree = NULL;
- free(decoder->exc_num); free(decoder);
}
2.7.4
On Mon, Nov 19, 2018 at 01:47:49PM -0700, Mathieu Poirier wrote:
On Sun, Nov 11, 2018 at 12:59:43PM +0800, Leo Yan wrote:
When an exception packet comes, it contains the info for exception number; the exception number indicates the exception types, so from it we can know if the exception is taken for interrupt, system call or other traps, etc. But because the exception return packet cannot delivery exception number correctly by decoder thus when prepare sample flags we cannot know what's type for exception return.
This patch adds a new 'exc_num' array in decoder structure to record exception number per CPU, the exception number is recorded in the array when the exception packet comes and this exception number can be used by exception return packet. If detect there have discontinuous trace with TRACE_ON or TRACE_OFF packet, the exception number is set to invalid value.
Signed-off-by: Leo Yan leo.yan@linaro.org
tools/perf/util/cs-etm-decoder/cs-etm-decoder.c | 67 ++++++++++++++++++++++--- 1 file changed, 59 insertions(+), 8 deletions(-)
diff --git a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c index b8cb7a3e..d1a6cbc 100644 --- a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c +++ b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c @@ -43,6 +43,7 @@ struct cs_etm_decoder { u32 packet_count; u32 head; u32 tail;
- u32 *exc_num; struct cs_etm_packet packet_buffer[MAX_BUFFER];
}; @@ -368,24 +369,64 @@ static ocsd_datapath_resp_t cs_etm_decoder__buffer_trace_off(struct cs_etm_decoder *decoder, const uint8_t trace_chan_id) {
- return cs_etm_decoder__buffer_packet(decoder, trace_chan_id,
CS_ETM_TRACE_OFF);
- int ret;
- struct cs_etm_packet *packet;
- ret = cs_etm_decoder__buffer_packet(decoder, trace_chan_id,
CS_ETM_TRACE_OFF);
- if (ret != OCSD_RESP_CONT && ret != OCSD_RESP_WAIT)
return ret;
- packet = &decoder->packet_buffer[decoder->tail];
- /* Clear execption number for discontinuous trace */
- decoder->exc_num[packet->cpu] = UINT32_MAX;
- return ret;
} static ocsd_datapath_resp_t cs_etm_decoder__buffer_trace_on(struct cs_etm_decoder *decoder, const uint8_t trace_chan_id) {
- return cs_etm_decoder__buffer_packet(decoder, trace_chan_id,
CS_ETM_TRACE_ON);
- int ret;
- struct cs_etm_packet *packet;
- ret = cs_etm_decoder__buffer_packet(decoder, trace_chan_id,
CS_ETM_TRACE_ON);
- if (ret != OCSD_RESP_CONT && ret != OCSD_RESP_WAIT)
return ret;
- packet = &decoder->packet_buffer[decoder->tail];
- /* Clear execption number for discontinuous trace */
- decoder->exc_num[packet->cpu] = UINT32_MAX;
- return ret;
} static ocsd_datapath_resp_t cs_etm_decoder__buffer_exception(struct cs_etm_decoder *decoder,
const ocsd_generic_trace_elem *elem, const uint8_t trace_chan_id)
{
- return cs_etm_decoder__buffer_packet(decoder, trace_chan_id,
CS_ETM_EXCEPTION);
- int ret;
- struct cs_etm_packet *packet;
- ret = cs_etm_decoder__buffer_packet(decoder, trace_chan_id,
CS_ETM_EXCEPTION);
- if (ret != OCSD_RESP_CONT && ret != OCSD_RESP_WAIT)
return ret;
- packet = &decoder->packet_buffer[decoder->tail];
- /*
* Exception number is recorded per CPU and later can be used
* for exception return instruction analysis.
*/
- decoder->exc_num[packet->cpu] = elem->exception_number;
Am I missing something or the information about the exception number that is recorded here isn't used anywhere?
The exception number will be used to set branch flag patch [1]. According to exception number we can know it's for system call, interrupt or other traps.
[1] http://archive.armlinux.org.uk/lurker/message/20181111.050755.d1c1b257.en.ht...
If you want to use this in perf report/script, the exception number will have to be added to the cs_etm_packet struct.
Actually before has discussed this with Mike but found it's hard to save the exception number in cs_etm_packet struct. The reason is the exception packet contains the correct exception number, but the exception return packet doesn't contain exception number. Thus this patch uses cs_etm_decoder struct to save exception number per CPU context when receive exception packet, and later the saved exception number will be used by exception return packet.
Please see related discussion at the end of page [2].
[2] https://lists.linaro.org/pipermail/coresight/2018-October/001832.html
I am done with the revision of this set.
Thanks a lot for reviewing.
[...]
Thanks, Leo Yan
On Tue, 4 Dec 2018 at 20:49, leo.yan@linaro.org wrote:
On Mon, Nov 19, 2018 at 01:47:49PM -0700, Mathieu Poirier wrote:
On Sun, Nov 11, 2018 at 12:59:43PM +0800, Leo Yan wrote:
When an exception packet comes, it contains the info for exception number; the exception number indicates the exception types, so from it we can know if the exception is taken for interrupt, system call or other traps, etc. But because the exception return packet cannot delivery exception number correctly by decoder thus when prepare sample flags we cannot know what's type for exception return.
This patch adds a new 'exc_num' array in decoder structure to record exception number per CPU, the exception number is recorded in the array when the exception packet comes and this exception number can be used by exception return packet. If detect there have discontinuous trace with TRACE_ON or TRACE_OFF packet, the exception number is set to invalid value.
Signed-off-by: Leo Yan leo.yan@linaro.org
tools/perf/util/cs-etm-decoder/cs-etm-decoder.c | 67 ++++++++++++++++++++++--- 1 file changed, 59 insertions(+), 8 deletions(-)
diff --git a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c index b8cb7a3e..d1a6cbc 100644 --- a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c +++ b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c @@ -43,6 +43,7 @@ struct cs_etm_decoder { u32 packet_count; u32 head; u32 tail;
- u32 *exc_num; struct cs_etm_packet packet_buffer[MAX_BUFFER];
};
@@ -368,24 +369,64 @@ static ocsd_datapath_resp_t cs_etm_decoder__buffer_trace_off(struct cs_etm_decoder *decoder, const uint8_t trace_chan_id) {
- return cs_etm_decoder__buffer_packet(decoder, trace_chan_id,
CS_ETM_TRACE_OFF);
- int ret;
- struct cs_etm_packet *packet;
- ret = cs_etm_decoder__buffer_packet(decoder, trace_chan_id,
CS_ETM_TRACE_OFF);
- if (ret != OCSD_RESP_CONT && ret != OCSD_RESP_WAIT)
return ret;
- packet = &decoder->packet_buffer[decoder->tail];
- /* Clear execption number for discontinuous trace */
- decoder->exc_num[packet->cpu] = UINT32_MAX;
- return ret;
}
static ocsd_datapath_resp_t cs_etm_decoder__buffer_trace_on(struct cs_etm_decoder *decoder, const uint8_t trace_chan_id) {
- return cs_etm_decoder__buffer_packet(decoder, trace_chan_id,
CS_ETM_TRACE_ON);
- int ret;
- struct cs_etm_packet *packet;
- ret = cs_etm_decoder__buffer_packet(decoder, trace_chan_id,
CS_ETM_TRACE_ON);
- if (ret != OCSD_RESP_CONT && ret != OCSD_RESP_WAIT)
return ret;
- packet = &decoder->packet_buffer[decoder->tail];
- /* Clear execption number for discontinuous trace */
- decoder->exc_num[packet->cpu] = UINT32_MAX;
- return ret;
}
static ocsd_datapath_resp_t cs_etm_decoder__buffer_exception(struct cs_etm_decoder *decoder,
const ocsd_generic_trace_elem *elem, const uint8_t trace_chan_id)
{
- return cs_etm_decoder__buffer_packet(decoder, trace_chan_id,
CS_ETM_EXCEPTION);
- int ret;
- struct cs_etm_packet *packet;
- ret = cs_etm_decoder__buffer_packet(decoder, trace_chan_id,
CS_ETM_EXCEPTION);
- if (ret != OCSD_RESP_CONT && ret != OCSD_RESP_WAIT)
return ret;
- packet = &decoder->packet_buffer[decoder->tail];
- /*
- Exception number is recorded per CPU and later can be used
- for exception return instruction analysis.
- */
- decoder->exc_num[packet->cpu] = elem->exception_number;
Am I missing something or the information about the exception number that is recorded here isn't used anywhere?
The exception number will be used to set branch flag patch [1].
Right, I realised that when I started reviewing that set. The rule of thumb here is to introduce code in the same patchset it is used so that we avoid adding needless code to the kernel.
According to exception number we can know it's for system call, interrupt or other traps.
[1] http://archive.armlinux.org.uk/lurker/message/20181111.050755.d1c1b257.en.ht...
If you want to use this in perf report/script, the exception number will have to be added to the cs_etm_packet struct.
Actually before has discussed this with Mike but found it's hard to save the exception number in cs_etm_packet struct. The reason is the exception packet contains the correct exception number, but the exception return packet doesn't contain exception number. Thus this patch uses cs_etm_decoder struct to save exception number per CPU context when receive exception packet, and later the saved exception number will be used by exception return packet.
Please see related discussion at the end of page [2].
I find Mike's point about the possibility of seeing exception returns without having a prior exception due to various factors very interesting. I will make sure to keep an eye out for that in the next revision.
[2] https://lists.linaro.org/pipermail/coresight/2018-October/001832.html
I am done with the revision of this set.
Thanks a lot for reviewing.
[...]
Thanks, Leo Yan
On Wed, Dec 05, 2018 at 11:03:29AM -0700, Mathieu Poirier wrote:
[...]
static ocsd_datapath_resp_t cs_etm_decoder__buffer_exception(struct cs_etm_decoder *decoder,
const ocsd_generic_trace_elem *elem, const uint8_t trace_chan_id)
{
- return cs_etm_decoder__buffer_packet(decoder, trace_chan_id,
CS_ETM_EXCEPTION);
- int ret;
- struct cs_etm_packet *packet;
- ret = cs_etm_decoder__buffer_packet(decoder, trace_chan_id,
CS_ETM_EXCEPTION);
- if (ret != OCSD_RESP_CONT && ret != OCSD_RESP_WAIT)
return ret;
- packet = &decoder->packet_buffer[decoder->tail];
- /*
- Exception number is recorded per CPU and later can be used
- for exception return instruction analysis.
- */
- decoder->exc_num[packet->cpu] = elem->exception_number;
Am I missing something or the information about the exception number that is recorded here isn't used anywhere?
The exception number will be used to set branch flag patch [1].
Right, I realised that when I started reviewing that set. The rule of thumb here is to introduce code in the same patchset it is used so that we avoid adding needless code to the kernel.
Will move this patch into sample flag series.
Thanks, Leo Yan