OpenCSD 0v004 provides a more scalable and generic API for decoder
creation. This patch fixes the cs-etm-decoder in perf report to use
the new API.
Mike Leach (1):
cs-etm: Update to perf cs-etm decode for new C API
tools/perf/util/cs-etm-decoder/cs-etm-decoder.c | 34 +++++++++++++++++--------
1 file changed, 24 insertions(+), 10 deletions(-)
--
2.7.4
Signed-off-by: Mathieu Poirier <mathieu.poirier(a)linaro.org>
---
HOWTO.md | 30 +++++++++++++++---------------
1 file changed, 15 insertions(+), 15 deletions(-)
diff --git a/HOWTO.md b/HOWTO.md
index ad19e9eb4aea..47e67f734964 100644
--- a/HOWTO.md
+++ b/HOWTO.md
@@ -7,8 +7,8 @@ This HOWTO explains how to use the perf cmd line tools and the openCSD
library to collect and extract program flow traces generated by the
CoreSight IP blocks on a Linux system. The examples have been generated using
an aarch64 Juno-r0 platform. All information is considered accurate and tested
-using library branches `opencsd-0v002` and `opencsd-0v003` (decode library only)
-and the latest perf branch `perf-opencsd-4.7-rc4` (decode library + perf tools)
+using library branches `opencsd-0v002` and `opencsd-0v003` (decode library only)
+and the latest perf branch `perf-opencsd-4.7` (decode library + perf tools)
on the [OpenCSD github repository][1].
@@ -17,8 +17,8 @@ On Target Trace Acquisition - Perf Record
All the enhancement to the Perf tools that support the new `cs_etm` pmu have
not been upstreamed yet. To get the required functionality branch
-`perf-opencsd-4.7-rc4` needs to be downloaded to the target system where
-traces are to be collected. This branch is an upstream v4.7-rc4 kernel
+`perf-opencsd-4.7` needs to be downloaded to the target system where
+traces are to be collected. This branch is an upstream v4.7 kernel
supplemented with modifications to the CoreSight framework and drivers to be
usable by the Perf core. The remaining out of tree patches are being
upstreamed incrementally.
@@ -163,14 +163,14 @@ the host's (which has nothing to do with the target) architecture:
Off Target Perf Tools Compilation
---------------------------------
As stated above not all the pieces of the solution have been upstreamed. To
-get all the components branch `perf-opencsd-4.7-rc4` needs to be
+get all the components branch `perf-opencsd-4.7` needs to be
obtained:
- linaro@t430:~/linaro/coresight$ git clone -b perf-opencsd-4.7-rc4 https://github.com/Linaro/OpenCSD.git perf-opencsd-4.7-rc4
+ linaro@t430:~/linaro/coresight$ git clone -b perf-opencsd-4.7 https://github.com/Linaro/OpenCSD.git perf-opencsd-4.7
...
...
- linaro@t430:~/linaro/coresight$ ls perf-opencsd-4.7-rc4/
+ linaro@t430:~/linaro/coresight$ ls perf-opencsd-4.7/
arch certs CREDITS Documentation firmware include ipc Kconfig lib Makefile net REPORTING-BUGS scripts sound usr
block COPYING crypto drivers fs init Kbuild kernel MAINTAINERS mm README samples security tools virt
@@ -179,12 +179,12 @@ variable telling the build scripts where to find the library is needed. If
the `CSTRACE_PATH` variable is not defined the compilation will still be
successful, but handling of CoreSight trace data won't be supported.
- linaro@t430:~/linaro/coresight$ cd perf-opencsd-4.7-rc4
- linaro@t430:~/linaro/coresight/perf-opencsd-4.7-rc4$ export CSTRACE_PATH=~/linaro/coresight/opencsd-0v003/decoder
- linaro@t430:~/linaro/coresight/perf-opencsd-4.7-rc4$ make -C tools/perf
+ linaro@t430:~/linaro/coresight$ cd perf-opencsd-4.7
+ linaro@t430:~/linaro/coresight/perf-opencsd-4.7$ export CSTRACE_PATH=~/linaro/coresight/opencsd-0v003/decoder
+ linaro@t430:~/linaro/coresight/perf-opencsd-4.7$ make -C tools/perf
...
...
- linaro@t430:~/linaro/coresight/perf-opencsd-4.7-rc4$ ls -l tools/perf/perf
+ linaro@t430:~/linaro/coresight/perf-opencsd-4.7$ ls -l tools/perf/perf
-rwxrwxr-x 1 linaro linaro 6276360 Mar 3 10:05 tools/perf/perf
@@ -224,7 +224,7 @@ to be sure everything is clean.
linaro@t430:~/linaro/coresight/feb24$ rm -rf ~/.debug
linaro@t430:~/linaro/coresight/feb24$ cp -dpR .debug ~/
linaro@t430:~/linaro/coresight/feb24$ export LD_LIBRARY_PATH=~/linaro/coresight/opencsd-0v003/decoder/lib/linux64/dbg/
- linaro@t430:~/linaro/coresight/feb24$ ../perf-opencsd-4.7-rc4/tools/perf/perf report --stdio
+ linaro@t430:~/linaro/coresight/feb24$ ../perf-opencsd-4.7/tools/perf/perf report --stdio
# To display the perf.data header info, please use --header/--header-only options.
#
@@ -268,7 +268,7 @@ to be sure everything is clean.
Additional data can be obtained, which contains a dump of the trace packets received using the command
- mjl@ubuntu-vbox:./perf-opencsd-4.7-rc4/coresight/tools/perf/perf report --stdio --dump
+ mjl@ubuntu-vbox:./perf-opencsd-4.7/coresight/tools/perf/perf report --stdio --dump
resulting a large amount of data, trace looking like:-
@@ -317,10 +317,10 @@ Trace Decoding with Perf Script
Working with perf scripts needs more command line options but yields
interesting results.
- linaro@t430:~/linaro/coresight/feb24$ export EXEC_PATH=/home/linaro/coresight/perf-opencsd-4.7-rc4/tools/perf/
+ linaro@t430:~/linaro/coresight/feb24$ export EXEC_PATH=/home/linaro/coresight/perf-opencsd-4.7/tools/perf/
linaro@t430:~/linaro/coresight/feb24$ export SCRIPT_PATH=$EXEC_PATH/scripts/python/
linaro@t430:~/linaro/coresight/feb24$ export XTOOL_PATH=/your/aarch64/toolchain/path/bin/
- linaro@t430:~/linaro/coresight/feb24$ ../perf-opencsd-4.7-rc4/tools/perf/perf --exec-path=${EXEC_PATH} script --script=python:${SCRIPT_PATH}/cs-trace-disasm.py -- -d ${XTOOL_PATH}/aarch64-linux-gnu-objdump
+ linaro@t430:~/linaro/coresight/feb24$ ../perf-opencsd-4.7/tools/perf/perf --exec-path=${EXEC_PATH} script --script=python:${SCRIPT_PATH}/cs-trace-disasm.py -- -d ${XTOOL_PATH}/aarch64-linux-gnu-objdump
7f89f24d80: 910003e0 mov x0, sp
7f89f24d84: 94000d53 bl 7f89f282d0 <free@plt+0x3790>
--
2.7.4
On 13 July 2016 at 10:35, Al Grant <Al.Grant(a)arm.com> wrote:
> Hi,
>
> When you see the libraries being mapped multiple times, are you just seeing the code and data segments? I see that too, I just ignore the data segments.
>
(Taking the liberty of CC'ing the list as this is probably a topic of interest)
Each time a library is mapped perf gets notified by the mm subsystem.
Part of the notification is a new vm_area_struct that contains the new
start address of the library (vm_area_struct::vm_start). Upon
receiving the notification the new address is communicated to the ETM
drivers which do the required filter configuration. That is all good
and working well.
On ARM64 (because I _assume_ X86 folks didn't see this) we get 3
notifications. For example, notification A will have address
0x7f93a60000 while, subsequently, notifications B and C have address
0x7f93a70000. Note that the latter two are 64K higher than the first
one.
Once the last notification has been received the code in the main
program is executed. That code (in the main program) jumps to library
code mapped at the address it got from the first notification and not
the last one, making the filter configuration all wrong.
As such I have to understand what notifications B and C are for. Based
on the vm_area_struct::vm_flags I'm guessing some sort of accounting
feature, but I'm not sure yet. If I ignore notifications B and C, things
work amazingly well and one can really see the power offered by
CoreSight.
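The workaround described above can be sketched as a small Python simulation. This is not the kernel or driver code: the `Notification` record, the `pick_filter_start` helper and the library name `libfoo.so` are hypothetical, and "keep the first address reported for a library" is only the heuristic mentioned in this message, pending an explanation of what notifications B and C are for.

```python
from typing import Dict, List, NamedTuple


class Notification(NamedTuple):
    """Hypothetical stand-in for one mapping notification seen by perf."""
    label: str
    library: str
    vm_start: int  # mirrors vm_area_struct::vm_start


def pick_filter_start(notifications: List[Notification]) -> Dict[str, int]:
    """Keep only the first vm_start reported for each library."""
    chosen: Dict[str, int] = {}
    for n in notifications:
        # setdefault leaves an existing entry alone, so later
        # notifications for the same library are ignored.
        chosen.setdefault(n.library, n.vm_start)
    return chosen


# Addresses taken from the example above; the library name is made up.
events = [
    Notification("A", "libfoo.so", 0x7F93A60000),
    Notification("B", "libfoo.so", 0x7F93A70000),
    Notification("C", "libfoo.so", 0x7F93A70000),
]

filters = pick_filter_start(events)
offset = events[1].vm_start - events[0].vm_start

print(hex(filters["libfoo.so"]))  # address the filter would be programmed with
print(offset // 1024, "KiB")      # distance between notification A and B/C
```

With this rule the filter stays on the address from notification A, which is the one the main program actually jumps to.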
That's where I'm at now.
Get back to me if you (or anyone else) want more information.
Mathieu
> Al
An issue was noted while running tests in Windows debug mode on the latest
development library.
As noted in the patch commit message, an initialisation issue for the
output packet structure was highlighted by the Windows debug memory
initialisation (0xcdcdcdcd) which did not show up in Linux tests (probably
due to default 0 init).
This led to the discovery of an additional issue with the setting of the
.isa field in the instruction range output packets. Prior to the patch this
was defaulting to AArch32 in Linux and not always being set correctly in the
ETMv4 and PTM decoders. The default for PTM was probably OK in non-Thumb
cases but the AArch64 Juno captures were consistently reporting AArch32 in
the output packets which should have been AArch64.
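To illustrate why the fill pattern matters, here is a minimal Python sketch of a decoder that forgets to set one field of its output packet. Everything in it is an assumption made for illustration: the two-field packet layout, the helper name `decode_without_setting_isa`, and the enum values (in particular that 0 stands for AArch32) are not taken from the library source.

```python
import struct

# Assumed layout, for illustration only: two little-endian 32-bit fields,
# (start_addr, isa). The real output packet structure is larger.
PACKET_FMT = "<II"
PACKET_SIZE = struct.calcsize(PACKET_FMT)

# Assumed enum values; 0 meaning AArch32 is an assumption of this sketch.
ISA_NAMES = {0: "AArch32", 2: "AArch64"}


def decode_without_setting_isa(fill_byte: int) -> int:
    """Simulate a decoder that writes the address but never sets isa."""
    buf = bytearray([fill_byte]) * PACKET_SIZE   # "uninitialised" memory
    struct.pack_into("<I", buf, 0, 0x00400000)   # only the first field is set
    _, isa = struct.unpack(PACKET_FMT, bytes(buf))
    return isa


zero_filled = decode_without_setting_isa(0x00)   # memory that happens to be 0
debug_filled = decode_without_setting_isa(0xCD)  # debug fill pattern

print(ISA_NAMES.get(zero_filled, hex(zero_filled)))
print(ISA_NAMES.get(debug_filled, hex(debug_filled)))
```

Under these assumptions the zero-filled case silently reads as a valid ISA, while the 0xCD fill produces an obviously invalid value, which would be consistent with the problem only being noticed in the Windows debug build.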
The updated release has been tested against the opencsd-perf-4.7-rc4 build
of the perf report/script tools.
As expected the output from perf report is unchanged.
However, the output of perf script, which runs the architecture-based
disassembly, is also unchanged, suggesting that this code does not at
present take note of the ISA supplied by the trace output.
Running against both the unpatched library, with packets marked as AArch32,
and the patched library, with packets marked as AArch64, resulted in the
disassembly correctly being output as AArch64.
I assume that the disassembly routines are obtaining the current core arch
from other information in the perf.data file. We should probably consider
whether this is the best way to go in this case.
Regards
Mike
--
Mike Leach
Principal Engineer, ARM Ltd.
Blackburn Design Centre. UK
Patch updates the handler function in cs-etm-decoder to deal with a new generic packet type introduced in the latest decoder.
Mike Leach (1):
cs-etm: Update to cs-etm-decoder to handle new packet type from
OpenCSD.
tools/perf/util/cs-etm-decoder/cs-etm-decoder.c | 6 +-----
1 file changed, 1 insertion(+), 5 deletions(-)
--
1.9.1
An additional generic element type was added in OpenCSD 0v003 (ETMv3 additions). This broke the perf build because the new type was missing from the case statement handling the enum types.
This patch fixes that. It was generated relative to the perf-opencsd-4.7-rc1 branch.
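The class of failure can be sketched in Python as a handler table that is checked against the enum it is meant to cover. This is only an analogue of the C case statement: the `ElemType` members, the `HANDLERS` table and the `missing_handlers` helper are hypothetical names and do not reflect the OpenCSD enum or the perf handler code.

```python
from enum import Enum, auto


class ElemType(Enum):
    """Hypothetical generic element types; not the OpenCSD enum."""
    TRACE_ON = auto()
    INSTR_RANGE = auto()
    NEW_IN_0V003 = auto()  # placeholder for the newly added type


# Handler table written before the new type existed.
HANDLERS = {
    ElemType.TRACE_ON: lambda: "trace on",
    ElemType.INSTR_RANGE: lambda: "instruction range",
}


def missing_handlers(handlers):
    """Return the enum members that have no handler registered."""
    return [t for t in ElemType if t not in handlers]


before = missing_handlers(HANDLERS)

# The fix: add an entry for the new type, here one that does nothing useful.
HANDLERS[ElemType.NEW_IN_0V003] = lambda: "ignored"
after = missing_handlers(HANDLERS)

print(before)
print(after)
```

A check of this kind fails as soon as the enum grows, much as the perf build did when the library added a value that the case statement did not list.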
Mike Leach (1):
csetm: Update to cs-etm-decoder to handle new packet type.
tools/perf/util/cs-etm-decoder/cs-etm-decoder.c | 1 +
1 file changed, 1 insertion(+)
--
1.9.1
Good day all,
Here's the first draft of the "abstract" I intend to submit for ELC-E
in Berlin. Keeping in mind the 900 character limit (this is already
880), please have a read and get back to me with your thoughts.
[Begin]
The CoreSight framework available in the Linux kernel has recently
been integrated with the Perf core, making HW assisted tracing on ARM
systems accessible to developers working on a wide spectrum of
products. This presentation will start by giving a brief overview of
the CoreSight technology itself before presenting the current
solution, from trace collection in kernel space to off system trace
decoding. To help with the latter part the Open CoreSight Decoding
Library (openCSD) is introduced. OpenCSD is an open source library
assisting in the decoding of collected trace data. We will see how it
is used with the existing perf tools to provide an end-to-end solution
for CoreSight trace decoding. The presentation will conclude with
trace acquisition and decoding scenarios, along with tips on how to
interpret trace information rendered by the perf tools.
[End]
Many thanks,
Mathieu