From: Mathieu Poirier <mathieu.poirier(a)linaro.org>
Add handling of ITRACE events in order to add the tid/pid of the
executing process to the perf tools machine infrastructure. This
information is later retrieved when a contextID packet is found in the
trace stream.
Signed-off-by: Mathieu Poirier <mathieu.poirier(a)linaro.org>
Tested-by: Leo Yan <leo.yan(a)linaro.org>
Cc: Alexander Shishkin <alexander.shishkin(a)linux.intel.com>
Cc: Jiri Olsa <jolsa(a)redhat.com>
Cc: Namhyung Kim <namhyung(a)kernel.org>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Suzuki Poulose <suzuki.poulose(a)arm.com>
Cc: coresight(a)lists.linaro.org
Cc: linux-arm-kernel(a)lists.infradead.org
Link: http://lkml.kernel.org/r/20190524173508.29044-5-mathieu.poirier@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme(a)redhat.com>
---
tools/perf/util/cs-etm.c | 26 ++++++++++++++++++++++++++
1 file changed, 26 insertions(+)
diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
index de488b43f440..0742c50fce46 100644
--- a/tools/perf/util/cs-etm.c
+++ b/tools/perf/util/cs-etm.c
@@ -1657,6 +1657,29 @@ static int cs_etm__process_timeless_queues(struct cs_etm_auxtrace *etm,
 	return 0;
 }
 
+static int cs_etm__process_itrace_start(struct cs_etm_auxtrace *etm,
+					union perf_event *event)
+{
+	struct thread *th;
+
+	if (etm->timeless_decoding)
+		return 0;
+
+	/*
+	 * Add the tid/pid to the log so that we can get a match when
+	 * we get a contextID from the decoder.
+	 */
+	th = machine__findnew_thread(etm->machine,
+				     event->itrace_start.pid,
+				     event->itrace_start.tid);
+	if (!th)
+		return -ENOMEM;
+
+	thread__put(th);
+
+	return 0;
+}
+
 static int cs_etm__process_event(struct perf_session *session,
 				 union perf_event *event,
 				 struct perf_sample *sample,
@@ -1694,6 +1717,9 @@ static int cs_etm__process_event(struct perf_session *session,
 		return cs_etm__process_timeless_queues(etm,
 						       event->fork.tid);
 
+	if (event->header.type == PERF_RECORD_ITRACE_START)
+		return cs_etm__process_itrace_start(etm, event);
+
 	return 0;
 }
 
--
2.20.1
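
As a rough illustration of the idea behind the patch above (this is not
the perf implementation), the sketch below records tid/pid pairs when an
ITRACE_START-like event arrives and looks a tid up later, as would happen
when a contextID is decoded. The names thread_table, table_add() and
table_find_pid() are hypothetical, the fixed-size table is an assumption
made to keep the example short, and it assumes the contextID carries the
tid of the running thread.

```c
#include <stddef.h>

#define MAX_THREADS 64	/* hypothetical limit, for illustration only */

struct thread_entry {
	int pid;
	int tid;
};

struct thread_table {
	struct thread_entry entries[MAX_THREADS];
	size_t nr;
};

/* Record a tid/pid pair; returns 0 on success, -1 if the table is full. */
static int table_add(struct thread_table *t, int pid, int tid)
{
	size_t i;

	for (i = 0; i < t->nr; i++) {
		if (t->entries[i].tid == tid) {
			t->entries[i].pid = pid;	/* refresh existing entry */
			return 0;
		}
	}

	if (t->nr == MAX_THREADS)
		return -1;

	t->entries[t->nr].pid = pid;
	t->entries[t->nr].tid = tid;
	t->nr++;

	return 0;
}

/* Look up the pid for a tid taken from a contextID; -1 if unknown. */
static int table_find_pid(const struct thread_table *t, int tid)
{
	size_t i;

	for (i = 0; i < t->nr; i++)
		if (t->entries[i].tid == tid)
			return t->entries[i].pid;

	return -1;
}
```

In perf itself this bookkeeping is done by machine__findnew_thread(), as
the diff shows; the table here only stands in for that infrastructure.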
From: Mathieu Poirier <mathieu.poirier(a)linaro.org>
Ask the perf core to generate an event when processes are swapped in/out
of context. That way proper action can be taken by the decoding code
when faced with such an event.
Signed-off-by: Mathieu Poirier <mathieu.poirier(a)linaro.org>
Tested-by: Leo Yan <leo.yan(a)linaro.org>
Cc: Alexander Shishkin <alexander.shishkin(a)linux.intel.com>
Cc: Jiri Olsa <jolsa(a)redhat.com>
Cc: Namhyung Kim <namhyung(a)kernel.org>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Suzuki Poulose <suzuki.poulose(a)arm.com>
Cc: coresight(a)lists.linaro.org
Cc: linux-arm-kernel(a)lists.infradead.org
Link: http://lkml.kernel.org/r/20190524173508.29044-4-mathieu.poirier@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme(a)redhat.com>
---
tools/perf/arch/arm/util/cs-etm.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/tools/perf/arch/arm/util/cs-etm.c b/tools/perf/arch/arm/util/cs-etm.c
index be1e4f20affa..cc7f1cd23b14 100644
--- a/tools/perf/arch/arm/util/cs-etm.c
+++ b/tools/perf/arch/arm/util/cs-etm.c
@@ -257,6 +257,9 @@ static int cs_etm_recording_options(struct auxtrace_record *itr,
 	ptr->evlist = evlist;
 	ptr->snapshot_mode = opts->auxtrace_snapshot_mode;
 
+	if (perf_can_record_switch_events())
+		opts->record_switch_events = true;
+
 	evlist__for_each_entry(evlist, evsel) {
 		if (evsel->attr.type == cs_etm_pmu->type) {
 			if (cs_etm_evsel) {
--
2.20.1
Hello,
I am a graduate student from Virginia Commonwealth University working on
execution and data monitoring of my application running on an STM32F4 board.
I have a runtime monitor on an FPGA that would analyse the trace
information.
But, to transfer the instruction traces from ETM and data traces to the
FPGA, I would need a trace decoder on the FPGA. This trace decoder would
help decode the ETM traces from the STM32 board for the FPGA monitor. I want to
do monitoring at runtime. I came across the OpenCSD github repository. Do
you think this could fit well for our application? Can the OpenCSD be
implemented on an FPGA to decode ETM traces?
Thank you for any inputs on this.
regards,
Smitha
Hi Jeremy,
Please CC the coresight mailing list when asking questions.
On Thu, 6 Jun 2019 at 02:55, Student - Ng Yi Zher Jeremy
<jeremy_ng(a)mymail.sutd.edu.sg> wrote:
>
> Dear Sir,
>
> I have been looking at the documentations for Coresight to understand how I may be able to set parameters and options to tracing units through sysFS.
>
> Looking at the documentations for etm4x and tmc (https://www.kernel.org/doc/Documentation/ABI/testing/sysfs-bus-coresight-de… and https://www.kernel.org/doc/Documentation/ABI/testing/sysfs-bus-coresight-de… respectively), I understand that the special files that I have access to read from registers directly are not writeable. However, in the coresight documentations,
What special files are you referring to?
> (https://static.docs.arm.com/ihi0064/f/etm_v4_4_architecture_specification_I…
> and http://infocenter.arm.com/help/topic/com.arm.doc.ddi0461b/DDI0461B_tmc_r0p1…
> respectively), some of these registers are actually writeable.
> Particularly, TRCCONFIGR in ETM drivers and MODE register in ETF
> drivers are RW accessible. However, when I try to write to these
> addresses directly from /dev/mem (or rather, mmap), I often get bus
> errors (even for those that claim to be readable).
Register TRCCONFIGR has been set as RO because, from sysfs, there was
no use case to make it otherwise. That can be altered if you need to
use some of the functionality in that register. Simply get back to me
with the one you're looking for and we can discuss how it will be made
available.
The ETF's MODE register does not need to be configured by users - the
framework will place the ETF in the correct mode based on its role in
the trace session. If the ETF's "enable_sink" entry is selected, the
ETF is used as a sink and will be configured in circular buffer mode.
If another sink is selected and the ETF is part of the path from a
source to that sink, the framework will configure it in HW FIFO mode.
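
The rule described above can be sketched as follows. This is only an
illustration of the decision, not the driver code: the enum, the
struct etf_role and etf_pick_mode() are hypothetical names, and the
"unused" case is an assumption for an ETF that is neither the sink nor
on the active path.

```c
enum etf_mode {
	ETF_MODE_UNUSED,		/* not part of the trace session */
	ETF_MODE_CIRCULAR_BUFFER,	/* ETF acts as the sink */
	ETF_MODE_HW_FIFO,		/* ETF is a link on the way to another sink */
};

struct etf_role {
	int enable_sink;	/* the ETF's "enable_sink" entry was selected */
	int on_path;		/* the ETF sits between a source and another sink */
};

/* Pick the mode from the ETF's role; users never set it directly. */
static enum etf_mode etf_pick_mode(const struct etf_role *role)
{
	if (role->enable_sink)
		return ETF_MODE_CIRCULAR_BUFFER;

	if (role->on_path)
		return ETF_MODE_HW_FIFO;

	return ETF_MODE_UNUSED;
}
```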
>
> I am using Hikey960 device on AOSP Android version R, Linux kernel 4.9. $ uname -a returns Linux localhost 4.9.176-12953-g7c09ed7b46a4-dirty #13 SMP PREEMPT Tue Jun 4 10:16:26 +08 2019 aarch64
>
> Hikey960 have 2 CPUs with 4 processors each: Cortex-A53 and Cortex-A73. Both have ETM4.0 r4 chips installed (this was derived from TRCIDR1, which yields 0x4100f404 when read).
>
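For reference, the TRCIDR1 value quoted above can be split into its
fields as in the sketch below. It assumes the ETMv4 TRCIDR1 layout
(DESIGNER in bits [31:24], TRCARCHMAJ in [11:8], TRCARCHMIN in [7:4],
REVISION in [3:0]); struct trcidr1_fields and trcidr1_decode() are
hypothetical names used only for this example.

```c
#include <stdint.h>

struct trcidr1_fields {
	unsigned int designer;		/* bits [31:24] */
	unsigned int arch_major;	/* bits [11:8]  */
	unsigned int arch_minor;	/* bits [7:4]   */
	unsigned int revision;		/* bits [3:0]   */
};

/* Extract the identification fields from a raw TRCIDR1 value. */
static struct trcidr1_fields trcidr1_decode(uint32_t val)
{
	struct trcidr1_fields f;

	f.designer   = (val >> 24) & 0xff;
	f.arch_major = (val >> 8) & 0xf;
	f.arch_minor = (val >> 4) & 0xf;
	f.revision   = val & 0xf;

	return f;
}
```

Under that assumption, 0x4100f404 gives designer 0x41, architecture
version 4.0 and revision 4.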
> It will be a great help if you can assist me or point me to any link for reference.
I can't guide you to anything specific without a question.
Thanks,
Mathieu
>
> I look forward to your reply!
>
> Yours Sincerely,
> Jeremy
This patchset adds support for CoreSight CPU-wide trace scenarios. More
specifically it extends the work that was done for per thread scenarios to
handle more than a single trace ID. It also temporally correlates traces
based on the timestamps generated by the tracers so that rendering by the
perf mechanic is ordered.
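
The timestamp correlation can be pictured with the minimal sketch below.
It is not the code from this set: struct traceid_queue here is a
simplified, hypothetical stand-in, and a linear scan is used for
brevity where a real implementation would more likely use a heap. Each
queue holds the ascending timestamps of its pending packets and the
merge always consumes from the queue whose next packet is oldest.

```c
#include <stddef.h>
#include <stdint.h>

struct traceid_queue {
	uint8_t trace_id;
	const uint64_t *timestamps;	/* ascending timestamps of pending packets */
	size_t nr;			/* number of timestamps */
	size_t next;			/* index of the next packet to consume */
};

/* Return the queue whose next packet is oldest, or NULL if all are drained. */
static struct traceid_queue *pick_oldest(struct traceid_queue *queues,
					 size_t nr_queues)
{
	struct traceid_queue *best = NULL;
	size_t i;

	for (i = 0; i < nr_queues; i++) {
		struct traceid_queue *q = &queues[i];

		if (q->next == q->nr)
			continue;	/* nothing left in this queue */

		if (!best ||
		    q->timestamps[q->next] < best->timestamps[best->next])
			best = q;
	}

	return best;
}

/*
 * Consume packets from all queues in timestamp order, storing the trace ID
 * of each consumed packet in out[]. Returns the number of entries written.
 */
static size_t merge_by_timestamp(struct traceid_queue *queues,
				 size_t nr_queues,
				 uint8_t *out, size_t out_len)
{
	struct traceid_queue *q;
	size_t n = 0;

	while (n < out_len && (q = pick_oldest(queues, nr_queues)) != NULL) {
		out[n++] = q->trace_id;
		q->next++;
	}

	return n;
}
```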
Everything is based on Arnaldo's perf/core branch (46d4c9a05285). I will
send another revision when it is rebased to a 5.2 rc candidate.
Before this set:
# root@juno:/home/linaro# perf record -e cs_etm/@20070000.etr/ -C 2,3 sleep 1
failed to mmap with 12 (Cannot allocate memory)
After this set:
# root@juno:/home/linaro# perf record -e cs_etm/@20070000.etr/ -C 2,3 sleep 1
[ perf record: Captured and wrote 1.352 MB perf.data ]
Regards,
Mathieu
Changes for V2:
* Fixed error condition in function cs_etm_set_option() (Leo)
* Fixed changelog spelling error (Leo).
* Moved from calloc() to malloc() in cs_etm__etmq_get_traceid_queue()
* Got rid of CS_ETM_PACKET_QUEUE_NR macro
* Fixed indentation problem in function cs_etm__process_traceid_queue() (Leo).
Mathieu Poirier (17):
perf tools: Configure contextID tracing in CPU-wide mode
perf tools: Configure timestamp generation in CPU-wide mode
perf tools: Configure SWITCH_EVENTS in CPU-wide mode
perf tools: Add handling of itrace start events
perf tools: Add handling of switch-CPU-wide events
perf tools: Refactor error path in cs_etm_decoder__new()
perf tools: Move packet queue out of decoder structure
perf tools: Fix indentation in function
cs_etm__process_decoder_queue()
perf tools: Introduce the concept of trace ID queues
perf tools: Get rid of unused cpu in struct cs_etm_queue
perf tools: Move thread to traceid_queue
perf tools: Move tid/pid to traceid_queue
perf tools: Use traceID aware memory callback API
perf tools: Add support for multiple traceID queues
perf tools: Linking PE contextID with perf thread mechanic
perf tools: Add notion of time to decoding code
perf tools: Add support for CPU-wide trace scenarios
tools/perf/Makefile.config | 3 +
tools/perf/arch/arm/util/cs-etm.c | 186 ++-
.../perf/util/cs-etm-decoder/cs-etm-decoder.c | 269 +++--
.../perf/util/cs-etm-decoder/cs-etm-decoder.h | 39 +-
tools/perf/util/cs-etm.c | 1026 +++++++++++++----
tools/perf/util/cs-etm.h | 103 ++
6 files changed, 1252 insertions(+), 374 deletions(-)
--
2.17.1
Update the documentation to reflect the new naming scheme with
latest changes.
Reported-by: Leo Yan <leo.yan(a)linaro.org>
Cc: Mathieu Poirier <mathieu.poirier(a)linaro.org>
Cc: Jonathan Corbet <corbet(a)lwn.net>
Signed-off-by: Suzuki K Poulose <suzuki.poulose(a)arm.com>
---
Documentation/trace/coresight.txt | 34 +++++++++++++++++++---------------
1 file changed, 19 insertions(+), 15 deletions(-)
diff --git a/Documentation/trace/coresight.txt b/Documentation/trace/coresight.txt
index efbc832..7b427cf 100644
--- a/Documentation/trace/coresight.txt
+++ b/Documentation/trace/coresight.txt
@@ -326,16 +326,20 @@ amount of processor cores), the "cs_etm" PMU will be listed only once.
A Coresight PMU works the same way as any other PMU, i.e the name of the PMU is
listed along with configuration options within forward slashes '/'. Since a
Coresight system will typically have more than one sink, the name of the sink to
-work with needs to be specified as an event option. Names for sink to choose
-from are listed in sysFS under ($SYSFS)/bus/coresight/devices:
+work with needs to be specified as an event option.
+On newer kernels the available sinks are listed in sysFS under:
+($SYSFS)/bus/event_source/devices/cs_etm/sinks/
- root@linaro-nano:~# ls /sys/bus/coresight/devices/
- 20010000.etf 20040000.funnel 20100000.stm 22040000.etm
- 22140000.etm 230c0000.funnel 23240000.etm 20030000.tpiu
- 20070000.etr 20120000.replicator 220c0000.funnel
- 23040000.etm 23140000.etm 23340000.etm
+ root@localhost:/sys/bus/event_source/devices/cs_etm/sinks# ls
+ tmc_etf0 tmc_etr0 tpiu0
- root@linaro-nano:~# perf record -e cs_etm/@20070000.etr/u --per-thread program
+On older kernels, this may need to be found from the list of coresight devices,
+available under ($SYSFS)/bus/coresight/devices/:
+
+ root@localhost:/sys/bus/coresight/devices# ls
+ etm0 etm1 etm2 etm3 etm4 etm5 funnel0 funnel1 funnel2 replicator0 stm0 tmc_etf0 tmc_etr0 tpiu0
+
+ root@linaro-nano:~# perf record -e cs_etm/@tmc_etr0/u --per-thread program
The syntax within the forward slashes '/' is important. The '@' character
tells the parser that a sink is about to be specified and that this is the sink
@@ -352,7 +356,7 @@ perf can be used to record and analyze trace of programs.
Execution can be recorded using 'perf record' with the cs_etm event,
specifying the name of the sink to record to, e.g:
- perf record -e cs_etm/@20070000.etr/u --per-thread
+ perf record -e cs_etm/@tmc_etr0/u --per-thread
The 'perf report' and 'perf script' commands can be used to analyze execution,
synthesizing instruction and branch events from the instruction trace.
@@ -381,7 +385,7 @@ sort example is from the AutoFDO tutorial (https://gcc.gnu.org/wiki/AutoFDO/Tuto
Bubble sorting array of 30000 elements
5910 ms
- $ perf record -e cs_etm/@20070000.etr/u --per-thread taskset -c 2 ./sort
+ $ perf record -e cs_etm/@tmc_etr0/u --per-thread taskset -c 2 ./sort
Bubble sorting array of 30000 elements
12543 ms
[ perf record: Woken up 35 times to write data ]
@@ -405,7 +409,7 @@ than the program flow through the code.
As with any other CoreSight component, specifics about the STM tracer can be
found in sysfs with more information on each entry being found in [1]:
-root@genericarmv8:~# ls /sys/bus/coresight/devices/20100000.stm
+root@genericarmv8:~# ls /sys/bus/coresight/devices/stm0
enable_source hwevent_select port_enable subsystem uevent
hwevent_enable mgmt port_select traceid
root@genericarmv8:~#
@@ -413,14 +417,14 @@ root@genericarmv8:~#
Like any other source a sink needs to be identified and the STM enabled before
being used:
-root@genericarmv8:~# echo 1 > /sys/bus/coresight/devices/20010000.etf/enable_sink
-root@genericarmv8:~# echo 1 > /sys/bus/coresight/devices/20100000.stm/enable_source
+root@genericarmv8:~# echo 1 > /sys/bus/coresight/devices/tmc_etf0/enable_sink
+root@genericarmv8:~# echo 1 > /sys/bus/coresight/devices/stm0/enable_source
From there user space applications can request and use channels using the devfs
interface provided for that purpose by the generic STM API:
-root@genericarmv8:~# ls -l /dev/20100000.stm
-crw------- 1 root root 10, 61 Jan 3 18:11 /dev/20100000.stm
+root@genericarmv8:~# ls -l /dev/stm0
+crw------- 1 root root 10, 61 Jan 3 18:11 /dev/stm0
root@genericarmv8:~#
Details on how to use the generic STM API can be found here [2].
--
2.7.4
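
For a tool that needs to discover sink names programmatically, the two
locations named in the documentation patch above could be probed as in
the sketch below. It assumes sysfs is mounted at /sys; list_sinks() is
a hypothetical helper, not part of perf or the kernel.

```c
#include <dirent.h>
#include <stdio.h>
#include <string.h>

/*
 * Print every entry of dir_path and return how many were found,
 * or -1 if the directory cannot be opened (for instance on a kernel
 * that does not provide it).
 */
static int list_sinks(const char *dir_path)
{
	DIR *dir = opendir(dir_path);
	struct dirent *ent;
	int count = 0;

	if (!dir)
		return -1;

	while ((ent = readdir(dir)) != NULL) {
		/* Skip the "." and ".." pseudo entries. */
		if (!strcmp(ent->d_name, ".") || !strcmp(ent->d_name, ".."))
			continue;

		printf("%s\n", ent->d_name);
		count++;
	}

	closedir(dir);

	return count;
}
```

A caller would try "/sys/bus/event_source/devices/cs_etm/sinks" first
and, if that returns -1, fall back to "/sys/bus/coresight/devices",
keeping in mind that the latter also lists sources and links, not only
sinks.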
CTIs are defined in the device tree and associated with other CoreSight
devices. The core CoreSight code has been modified to enable the registration
of the CTI devices on the same bus as the other CoreSight components,
but as these are not actually trace generation / capture devices, they
are not part of the Coresight path when generating trace.
However, the definition of the standard CoreSight device has been extended
to include a reference to an associated CTI device, and the enable / disable
trace path operations will auto enable/disable any associated CTI devices at
the same time.
Programming is at present via sysfs - a full API is provided to utilise the
hardware capabilities. As CTI devices are unprogrammed by default, the auto
enable described above will have no effect until explicit programming takes
place.
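
The association between a device and its CTI can be pictured with the
simplified model below. It is a sketch only and does not reflect the
driver's data structures: struct cti_dev, struct cs_dev and the helper
names are hypothetical, and the hw_active flag is an assumption used to
model "no effect until programmed".

```c
#include <stddef.h>

struct cti_dev {
	int programmed;	/* set once triggers/channels have been configured */
	int enabled;	/* enable requested by the trace path operations */
	int hw_active;	/* 1 only when enabled and programmed */
};

struct cs_dev {
	int enabled;
	struct cti_dev *cti;	/* optional associated CTI, may be NULL */
};

static void cti_enable(struct cti_dev *cti)
{
	cti->enabled = 1;
	/* An unprogrammed CTI has nothing to drive, so nothing changes. */
	cti->hw_active = cti->programmed;
}

static void cti_disable(struct cti_dev *cti)
{
	cti->enabled = 0;
	cti->hw_active = 0;
}

/* Enabling a device on the trace path also enables its CTI, if any. */
static void cs_dev_enable(struct cs_dev *dev)
{
	dev->enabled = 1;
	if (dev->cti)
		cti_enable(dev->cti);
}

/* Disabling the device disables the associated CTI at the same time. */
static void cs_dev_disable(struct cs_dev *dev)
{
	dev->enabled = 0;
	if (dev->cti)
		cti_disable(dev->cti);
}
```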
A set of device tree bindings specific to the CTI topology has been defined.
Documentation has been updated to describe both the CTI hardware, its use and
programming in sysfs, and the new dts bindings required.
Tested on DB410 board, 5.1-rc5
Changes since v1:
1) Significant restructuring of the source code. Adds cti-sysfs file and
cti device tree file. Patches add per feature rather than per source
file.
2) CPU type power event handling for hotplug moved to CoreSight core,
with generic registration interface provided for all CPU bound CS devices
to use.
3) CTI signal interconnection details in sysfs now generated dynamically
from connection lists in the driver. This fixes an issue with multi-line sysfs
output in previous version.
4) Full device tree bindings for DB410 and Juno provided (to the extent
that CTI information is available).
5) The AMBA driver update for UCI IDs is now upstream so it is no longer included
in this set.
Mike Leach (13):
drivers: coresight: cti: Initial CoreSight CTI Driver
drivers: coresight: cti: Adds sysfs functionality to CTI driver.
drivers: coresight: cti: Add device tree support for v8 arch CTI
drivers: coresight: cti: Add device tree support for impdef CTI.
drivers: coresight: cti: Enable CTI associated with devices.
drivers: coresight: cti: Add connection information to sysfs
drivers: coresight: cti: Add CoreSight cpu power notifications.
devicetree: bindings: Documentation for CTI bindings.
devicetree: bindings: Add header file with CTI trigger signal type
constants.
drivers: dts: Add CTI options for qcom msm8916
drivers: dts: Juno platform - add CTI entries to device tree.
docs: coresight: Update documentation for CoreSight to cover CTI.
docs: sysfs: coresight: Add sysfs documentation for CTI
.../testing/sysfs-bus-coresight-devices-cti | 225 +++
.../bindings/arm/coresight-ect-cti.txt | 203 +++
.../devicetree/bindings/arm/coresight.txt | 7 +
Documentation/trace/coresight.txt | 139 ++
arch/arm64/boot/dts/arm/juno-base.dtsi | 149 +-
arch/arm64/boot/dts/arm/juno-cs-r1r2.dtsi | 31 +-
arch/arm64/boot/dts/arm/juno-r1.dts | 25 +
arch/arm64/boot/dts/arm/juno-r2.dts | 25 +
arch/arm64/boot/dts/arm/juno.dts | 25 +
arch/arm64/boot/dts/qcom/msm8916.dtsi | 102 +-
drivers/hwtracing/coresight/Kconfig | 13 +
drivers/hwtracing/coresight/Makefile | 4 +
.../hwtracing/coresight/coresight-cti-sysfs.c | 1250 +++++++++++++++++
drivers/hwtracing/coresight/coresight-cti.c | 853 +++++++++++
drivers/hwtracing/coresight/coresight-cti.h | 280 ++++
drivers/hwtracing/coresight/coresight-priv.h | 37 +
drivers/hwtracing/coresight/coresight.c | 185 ++-
.../hwtracing/coresight/of_coresight-cti.c | 447 ++++++
include/dt-bindings/arm/coresight-cti-dt.h | 36 +
include/linux/coresight.h | 30 +
20 files changed, 4056 insertions(+), 10 deletions(-)
create mode 100644 Documentation/ABI/testing/sysfs-bus-coresight-devices-cti
create mode 100644 Documentation/devicetree/bindings/arm/coresight-ect-cti.txt
create mode 100644 drivers/hwtracing/coresight/coresight-cti-sysfs.c
create mode 100644 drivers/hwtracing/coresight/coresight-cti.c
create mode 100644 drivers/hwtracing/coresight/coresight-cti.h
create mode 100644 drivers/hwtracing/coresight/of_coresight-cti.c
create mode 100644 include/dt-bindings/arm/coresight-cti-dt.h
--
2.20.1
We have a few places where we call smp_processor_id() from preemptible
contexts during the perf buffer handling. We do this to figure out the
numa node for the allocation in case the event is not CPU bound. Use
numa_node_id() instead in such cases to avoid a splat.
Changes since V2:
- Use NUMA_NO_NODE instead of numa_node_id() for event->cpu == -1. (Robin Murphy)
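
The node selection described above boils down to something like the
sketch below. It is a userspace illustration, not the driver change
itself: perf_buf_node() is a hypothetical name, and cpu_to_node_stub()
is a made-up stand-in for the kernel's cpu_to_node() that pretends
there are two CPUs per node.

```c
#define NUMA_NO_NODE	(-1)	/* same value as the kernel's definition */

/* Hypothetical stand-in for cpu_to_node(): two CPUs per NUMA node. */
static int cpu_to_node_stub(int cpu)
{
	return cpu / 2;
}

/*
 * Choose the NUMA node for a perf buffer allocation. An event that is
 * not bound to a CPU (cpu == -1) gets NUMA_NO_NODE, so there is no need
 * to ask which CPU the caller is currently running on.
 */
static int perf_buf_node(int event_cpu)
{
	if (event_cpu == -1)
		return NUMA_NO_NODE;

	return cpu_to_node_stub(event_cpu);
}
```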
Suzuki K Poulose (4):
coresight: tmc-etr: Do not call smp_processor_id() from preemptible
coresight: tmc-etr: alloc_perf_buf: Do not call smp_processor_id from
preemptible
coresight: tmc-etf: Do not call smp_processor_id from preemptible
coresight: etb10: Do not call smp_processor_id from preemptible
drivers/hwtracing/coresight/coresight-etb10.c | 6 ++----
drivers/hwtracing/coresight/coresight-tmc-etf.c | 6 ++----
drivers/hwtracing/coresight/coresight-tmc-etr.c | 13 ++++---------
3 files changed, 8 insertions(+), 17 deletions(-)
--
2.7.4