Hello Mathieu,
Thanks for the config file. It works. I was able to build the OpenCSD kernel (from the perf-opencsd-master branch) and install it on the USB drive (I used ArchLinuxARM-aarch64-latest.tar.gz). I also built the perf tool (make -C tools/perf). Everything boots, but perf has some issues:
[root@alarm home]# ./perf record -vvv -e cs_etm/@20070000.etr/u --per-thread uname
map_groups__set_modules_path_dir: cannot open /lib/modules/4.13.0-rc1-ge565ad6 dir
Problems setting modules path maps, continuing anyway...
------------------------------------------------------------
perf_event_attr:
type 8
size 112
{ sample_period, sample_freq } 1
sample_type IP|TID|IDENTIFIER
read_format ID
disabled 1
exclude_kernel 1
exclude_hv 1
enable_on_exec 1
sample_id_all 1
------------------------------------------------------------
sys_perf_event_open: pid 2242 cpu -1 group_fd -1 flags 0x8 = 4
------------------------------------------------------------
perf_event_attr:
type 1
size 112
config 0x9
{ sample_period, sample_freq } 1
sample_type IP|TID|IDENTIFIER
read_format ID
disabled 1
exclude_kernel 1
exclude_hv 1
mmap 1
comm 1
enable_on_exec 1
task 1
sample_id_all 1
mmap2 1
comm_exec 1
------------------------------------------------------------
sys_perf_event_open: pid 2242 cpu -1 group_fd -1 flags 0x8 = 5
mmap size 528384B
AUX area mmap length 4194304
perf event ring buffer mmapped per thread
failed to mmap AUX area
failed to mmap with 12 (Cannot allocate memory)
I fixed the "map_groups__set_modules_path_dir: cannot open /lib/modules/4.13.0-rc1-ge565ad6 dir" issue by adding the appropriate symbolic link, but I still have an issue with the mmap. Any idea what could be wrong here? The limits I have on my Juno are below:
[root@alarm ~]# ulimit -a
core file size (blocks, -c) unlimited
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 31798
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 31798
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
[root@alarm ~]#
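A quick arithmetic sketch on the figures quoted above, for reference. It assumes a 4 KiB page size and only compares the sizes perf printed with the `max locked memory` value from `ulimit -a`; it does not establish that this limit is the cause of the mmap failure, and a privileged user may be exempt from it.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages on this system


def pages(nbytes, page_size=PAGE_SIZE):
    """Return how many whole pages a byte count spans (rounded up)."""
    return -(-nbytes // page_size)


ring_buffer_bytes = 528384        # "mmap size 528384B"
aux_bytes = 4194304               # "AUX area mmap length 4194304"
memlock_limit_bytes = 64 * 1024   # "max locked memory (kbytes, -l) 64"

print("ring buffer pages:", pages(ring_buffer_bytes))
print("AUX area pages:", pages(aux_bytes))
print("AUX area exceeds memlock limit:", aux_bytes > memlock_limit_bytes)
```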
Regards
Marek
On 2017-08-18 16:54:42, Mathieu Poirier <mathieu.poirier(a)linaro.org> wrote:
> On 18 August 2017 at 04:22, marekzmyslowski
> <marekzmyslowski(a)poczta.onet.pl> wrote:
> > Hello Mathieu,
> >
> > I've decided that I don't need Android for now. Linux is enough.
>
> That is probably a better place to start.
>
> > However, I have another issue. I downloaded the perf-opencsd-master branch, ran the config with ARCH=arm64 and CROSS_COMPILE=aarch64-linux-gnu-, and added support for the Versatile board. Then I compiled the kernel and everything was OK. Next I built the USB drive using the following instructions:
> > https://archlinuxarm.org/platforms/armv8/arm/juno (this works fine; Linux boots on the Juno).
> > Next I copied the Image file and juno.dtb onto the USB drive, but it doesn't boot. It hangs here:
> >
> > initrd: address 0x0
> > initrd: length 0x0
> > PEI 1132 ms
> > DXE 1695 ms
> > BDS 368934875444 ms
> > BDS 368934873448 ms
> > BDS 1535 ms
> > Total Time = 368934871781 ms
> >
> > linux: address 0x80080000
> > linux: length 0x1150200
> > fdt: address 0x9FE00000
> > fdt: length 0x5F54
> >
> > Any idea what I'm doing wrong? Any help will be appreciated (I'm so close to having Juno + CoreSight + perf :) )
>
> I can't help you with booting the board itself. The best I can do is
> advise to use u-boot instead of UEFI and give you my kernel .config
> file (attached). For the rest there is plenty of documentation out
> there.
>
> >
> > Regards
> > Marek
> >
> > On 2017-08-16 23:08:04, Mathieu Poirier <mathieu.poirier(a)linaro.org> wrote:
> >> Hello Marek,
> >>
> >> Please CC the CoreSight mailing list when asking questions as someone
> >> else may also be able to answer.
> >>
> >> First and foremost I advise using the official CoreSight kernel found
> >> on the openCSD site [1] rather than my personal branch [2] - you
> >> never know what you'll get with the latter.
> >>
> >> That being said the CoreSight kernel on the openCSD site is not an
> >> Android kernel - it is simply a mainline kernel supplemented with
> >> patches that haven't made their way to mainline yet. You will have to
> >> either add the android patches to the CoreSight kernel or the other
> >> way around (CoreSight patches on android kernel).
> >>
> >> Android user space is also different and does not include the
> >> perf-tools. You will have to add them manually along with the
> >> dependencies they require. I haven't gone through that process and as
> >> such can't advise more on that portion.
> >>
> >> Get back to me with your questions if the above isn't sufficient.
> >>
> >> Best regards,
> >> Mathieu
> >>
> >> [1]. https://github.com/Linaro/OpenCSD/tree/perf-opencsd-master
> >> [2]. https://git.linaro.org/people/mathieu.poirier/coresight.git/
> >>
> >> On 16 August 2017 at 14:32, marekzmyslowski
> >> <marekzmyslowski(a)poczta.onet.pl> wrote:
> >> > Hello Mathieu,
> >> >
> >> > I'm sorry for bothering you, but I think you may be the person who can help me. I'm trying to install and run Android on a Juno board r0. I tested Android 17.05 from Linaro and it works. Now I'm trying to get perf working with CoreSight, but I'm a little confused. Do I need to build Android from Linaro and the kernel from here https://git.linaro.org/people/mathieu.poirier/coresight.git/ or from here https://github.com/Linaro/OpenCSD/tree/perf-opencsd-4.12?
> >> > Any help with this will be appreciated :)
> >> >
> >> > Regards
> >> > Marek Zmysłowski
> >>
> >
> >
> >
>
Hi,
This patchset adds support for user space decoding of CoreSight traces [1]
of the ARM architecture. Kernel support for configuring CoreSight tracers
and collecting the hardware trace data in the auxtrace section of the
perf.data file is already integrated [2]. The user space implementation
mirrors to a large degree that of the Intel Processor Trace (PT) [3]
implementation, except that the decoder library itself is separate from the
perf tool sources, and is built and maintained as a separate open source
project [4]. Instead, this patch set includes the necessary code and build
settings to interface with the decoder library. This approach was chosen as
the decoder library has uses outside the perf toolset and on non-linux
platforms.
The decoder library interface code in this patch set only supports ETMv4
trace decoding, though the library itself supports a broader range. Future
patches will add support for more versions of the ARM ETM trace encoding.
The trace decoder library used with this patch set is the most recent
version with tag v0.7.3.
This patch set, instead of being based on commits in my private branch, has
been applied to a new copy of the perf-opencsd-master branch of [4] and
pushed to [5] with the same perf-opencsd-master branch name.
Changes since last revision:
Given this is the second time it is sent out to the new audience on
coresight(a)lists.linaro.org, I am resetting the version number to 2 to avoid
confusion with previous mailings with a more restricted audience.
Two additional commits have been added to the patches in this patch set to
be fully compatible with the most recent version of the decoder library. The
previous patch set assumed an older version.
[1] https://lwn.net/Articles/626463
[2] https://github.com/torvalds/linux/tree/master/drivers/hwtracing/coresight
[3] https://lwn.net/Articles/648154
[4] https://github.com/Linaro/OpenCSD
[5] https://github.com/tor-jeremiassen/OpenCSD
Tor Jeremiassen (23):
perf tools: Add initial hooks for decoding coresight traces
perf tools: Add processing of coresight metadata
perf tools: Add coresight trace decoder library interface
perf tools: Add data block processing function
perf tools: Add channel context item to track packet sources
perf tools: Add etmv4i packet printing capability
perf tools: Add decoder new and free
perf tools: Add trace packet print for dump_trace option
perf tools: Add code to process the auxtrace perf event
perf tools: Add function to read data from dsos
perf tools: Add mapping from cpu to cs_etm_queue
perf tools: Add functions to allocate and free queues
perf tools: Add functions to setup and initialize queues
perf tools: Add functions to allocate and free queues
perf tools: Add function to get trace data from aux buffer
perf tools: Add function to run the trace decoder and process samples
perf tools: Add functions to process queues and run the trace decoder
perf tools: Add perf event processing
perf tools: Add processing of queues when events are flushed
perf tools: Add synth_events and supporting functions
perf tools: Add function to clear the decoder packet buffer
perf tools: Add functions for full etmv4i packet decode
MAINTAINERS: Adding entry for CoreSight trace decoding
MAINTAINERS | 3 +-
tools/perf/Makefile.config | 26 +
tools/perf/util/Build | 6 +
tools/perf/util/auxtrace.c | 2 +
tools/perf/util/cs-etm-decoder/Build | 2 +
tools/perf/util/cs-etm-decoder/cs-etm-decoder.c | 581 +++++++++++
tools/perf/util/cs-etm-decoder/cs-etm-decoder.h | 138 +++
tools/perf/util/cs-etm.c | 1180 +++++++++++++++++++++++
tools/perf/util/cs-etm.h | 50 +
9 files changed, 1987 insertions(+), 1 deletion(-)
create mode 100644 tools/perf/util/cs-etm-decoder/Build
create mode 100644 tools/perf/util/cs-etm-decoder/cs-etm-decoder.c
create mode 100644 tools/perf/util/cs-etm-decoder/cs-etm-decoder.h
create mode 100644 tools/perf/util/cs-etm.c
--
2.7.4
The TMC-ETR supports routing the CoreSight trace data to
system memory. It supports two different modes in which the memory
could be used.
1) Contiguous memory - The memory is assumed to be physically
contiguous.
2) Scatter Gather list - The memory can be chunks of 4K pages,
which are specified in a table of pointers which itself could be
multiple 4K size pages.
To avoid the complications of managing the buffer, this series
adds a layer for managing the ETR buffer, which makes the best possible
choice based on what is available. The allocation can be tuned by passing
in flags, existing pages (e.g., perf ring buffer) etc.
Towards supporting ETR Scatter Gather mode, we introduce a generic TMC
scatter-gather table which can be used to manage the data and table pages.
The table can be filled in the format expected by the Scatter-Gather
mode.
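As a rough illustration of the table-of-pointers layout described above, the sketch below estimates how many 4K table pages are needed to describe a buffer. The 4-byte entry size and the reservation of one slot per table page as a link to the next table page are assumptions made for illustration only; the patches define the actual format.

```python
SG_PAGE_SIZE = 4096                            # 4K pages, as described above
ENTRY_SIZE = 4                                 # assumption: 32-bit table entries
ENTRIES_PER_PAGE = SG_PAGE_SIZE // ENTRY_SIZE  # 1024 under that assumption


def table_pages_needed(buffer_bytes):
    """Estimate the table pages for a buffer, one link slot per table page."""
    data_pages = -(-buffer_bytes // SG_PAGE_SIZE)   # round up to whole pages
    usable = ENTRIES_PER_PAGE - 1                   # assumption: last slot is a link
    return max(1, -(-data_pages // usable))


for size in (1 << 20, 4 << 20):
    print(size, "bytes ->", table_pages_needed(size), "table page(s)")
```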
The TMC ETR-SG mechanism doesn't allow starting the trace at non-zero
offset (required by perf). So we make some tricky changes to the table
at run time to allow starting at any "Page aligned" offset and then
wrap around to the beginning of the buffer with very little overhead.
See patches for more description.
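The effect of starting at a page-aligned offset and wrapping around can be pictured as a rotation of the page order. The following is a simplified model of that ordering only, not the table manipulation the patches perform; the function name and the 4K page size are assumptions for illustration.

```python
PAGE_SIZE = 4096  # assumption: 4K pages


def page_write_order(nr_pages, start_offset):
    """Return page indices in the order a circular buffer would be filled."""
    if nr_pages <= 0:
        raise ValueError("buffer must contain at least one page")
    if start_offset % PAGE_SIZE:
        raise ValueError("start offset must be page aligned")
    first = (start_offset // PAGE_SIZE) % nr_pages
    # Walk forward from the first page and wrap to the beginning of the buffer.
    return [(first + i) % nr_pages for i in range(nr_pages)]


print(page_write_order(4, 2 * PAGE_SIZE))
```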
The series also improves the way the ETR is controlled by different modes
(sysfs vs. perf) by keeping mode specific data. This allows access
to the trace data collected in sysfs mode, even when the ETR is
operated in perf mode. Also with the transparent management of the
buffer and scatter-gather mechanism, we can allow the user to
request larger trace buffers for sysfs mode. This is supported
by providing a sysfs file, "buffer_size" which accepts a page aligned
size, which will be used by the ETR when allocating a buffer.
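Since "buffer_size" accepts a page aligned size, a requested size may need rounding before it is written. A minimal sketch, assuming 4 KiB system pages; the helper names are hypothetical, and the full sysfs path of the file is not spelled out here, so the sketch stops at computing the value.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB system pages


def is_page_aligned(nbytes, page_size=PAGE_SIZE):
    """Return True when nbytes is a positive multiple of the page size."""
    return nbytes > 0 and nbytes % page_size == 0


def round_up_to_page(nbytes, page_size=PAGE_SIZE):
    """Round a requested buffer size up to the next page boundary."""
    if nbytes <= 0:
        raise ValueError("buffer size must be positive")
    return -(-nbytes // page_size) * page_size


requested = 3 * 1024 * 1024 + 100   # hypothetical request, not page aligned
print(round_up_to_page(requested))
```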
Finally, it cleans up the etm perf sink callbacks a little bit and
then adds the support for ETR sink. For the ETR, we try our best to
use the perf ring buffer as the target hardware buffer, provided:
1) The ETR is dma coherent (since the pages will be shared with
userspace perf tool).
2) The perf is used in snapshot mode (The ETR cannot be stopped
based on the size of the data written hence we could easily
overwrite the buffer. We may be able to fix this in the future)
3) The ETR supports the Scatter-Gather mode.
If we can't use the perf buffers directly, we fall back to using
software buffering where we have to copy the trace data back
to the perf ring buffer.
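The three conditions above amount to a simple decision. The sketch below models it; the class, field, and function names are hypothetical and do not correspond to identifiers in the driver.

```python
from dataclasses import dataclass


@dataclass
class EtrState:
    """Hypothetical summary of the properties listed above."""
    dma_coherent: bool    # 1) the ETR is dma coherent
    snapshot_mode: bool   # 2) perf is used in snapshot mode
    supports_sg: bool     # 3) the ETR supports Scatter-Gather mode


def buffer_strategy(state):
    """Pick the perf ring buffer only when all three conditions hold."""
    if state.dma_coherent and state.snapshot_mode and state.supports_sg:
        return "perf-ring-buffer"
    # Otherwise trace data is copied back to the perf ring buffer.
    return "software-buffer"


print(buffer_strategy(EtrState(True, True, True)))
print(buffer_strategy(EtrState(True, False, True)))
```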
Suzuki K Poulose (17):
coresight etr: Disallow perf mode temporarily
coresight tmc: Hide trace buffer handling for file read
coresight: Add helper for inserting synchronization packets
coresight: Add generic TMC sg table framework
coresight: Add support for TMC ETR SG unit
coresight: tmc: Make ETR SG table circular
coresight: tmc etr: Add transparent buffer management
coresight: tmc: Add configuration support for trace buffer size
coresight: Convert driver messages to dev_dbg
coresight: etr: Track if the device is coherent
coresight etr: Handle driver mode specific ETR buffers
coresight etr: Relax collection of trace from sysfs mode
coresight etr: Do not clean ETR trace buffer
coresight: etr: Add support for save restore buffers
coresight: etr_buf: Add helper for padding an area of trace data
coresight: perf: Remove reset_buffer call back for sinks
coresight perf: Add ETR backend support for etm-perf
.../ABI/testing/sysfs-bus-coresight-devices-tmc | 8 +
.../coresight/coresight-dynamic-replicator.c | 4 +-
drivers/hwtracing/coresight/coresight-etb10.c | 72 +-
drivers/hwtracing/coresight/coresight-etm-perf.c | 9 +-
drivers/hwtracing/coresight/coresight-etm3x.c | 4 +-
drivers/hwtracing/coresight/coresight-etm4x.c | 4 +-
drivers/hwtracing/coresight/coresight-funnel.c | 4 +-
drivers/hwtracing/coresight/coresight-priv.h | 8 +
drivers/hwtracing/coresight/coresight-replicator.c | 4 +-
drivers/hwtracing/coresight/coresight-stm.c | 4 +-
drivers/hwtracing/coresight/coresight-tmc-etf.c | 109 +-
drivers/hwtracing/coresight/coresight-tmc-etr.c | 1665 ++++++++++++++++++--
drivers/hwtracing/coresight/coresight-tmc.c | 75 +-
drivers/hwtracing/coresight/coresight-tmc.h | 128 +-
drivers/hwtracing/coresight/coresight-tpiu.c | 4 +-
include/linux/coresight.h | 5 +-
16 files changed, 1837 insertions(+), 270 deletions(-)
--
2.13.6
Hi, I’ve recently acquired a ZedBoard with the Zynq-7000 SoC and was interested in finding out if I could use `perf` as described on https://github.com/Linaro/OpenCSD/blob/master/HOWTO.md to grab trace data.
Unfortunately, zynq-7000.dtsi on (recent) Linux kernels does not yet contain the necessary device definitions, and zynq-zed.dts wasn’t even syntactically correct (but it was just the syntax for the include-statement, so easy to fix).
Based on Muhammad Wahab’s patch floating around the interwebs and studying the Zynq manual, I enabled support for some more of the devices (like the tpiu) in the devicetree.
But most crucially (I guess), I can’t identify what “etr” in the HOWTO corresponds to on the Zynq. This means that the sample line from the HOWTO above
$ ./tools/perf/perf record -e cs_etm/@20070000.etr/ --per-thread uname
won’t work.
Does anyone have experience in configuring the devicetree correctly for the Zynq? Should the perf-incantation on the Zynq also use .etr, or is there some other mechanism that perf can use on the Zynq?
Sincerely,
Volker Stolz
Good day all,
The kernel branches on the openCSD repository[1] have been moved to
their new living quarters [2] and the HOWTO.md on [1] modified to
reflect that. All we need to do is remove the kernel branches from
[1], something I'm planning to do by end of business on Monday.
Please get back to me if you need more time so that we can sketch out
a plan.
Thanks,
Mathieu
[1]. https://github.com/Linaro/OpenCSD
[2]. https://github.com/Linaro/perf-opencsd
Good morning Reza,
As highlighted in yesterday's email the best way to get involved in
the CoreSight project is to subscribe to the mailing list [1] and
attend the meeting we hold every two weeks in the #linaro-coresight
channel on freenode. The next meeting is scheduled for October 18th @
4PM (UTC).
As for STM, Chunyan wrote documentation that is available in the
kernel tree [2].
It would be interesting if you guys could come up with a list of items
that you want to see addressed. From there we could post them to the
perf-opencsd wiki (to be officially published imminently) and have
some sort of a tick list for items that are currently being worked on
and those up for grabs.
Best regards,
Mathieu
[1]. https://lists.linaro.org/mailman/listinfo/coresight
[2]. http://elixir.free-electrons.com/linux/latest/source/Documentation/trace/st…
Modifying instructions to point to the new kernel repository on
GitHub so that kernel related branches in the openCSD repository
can be deleted.
Signed-off-by: Mathieu Poirier <mathieu.poirier(a)linaro.org>
---
HOWTO.md | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/HOWTO.md b/HOWTO.md
index 835def5765e6..dd4fe2057548 100644
--- a/HOWTO.md
+++ b/HOWTO.md
@@ -7,8 +7,8 @@ This HOWTO explains how to use the perf cmd line tools and the openCSD
library to collect and extract program flow traces generated by the
CoreSight IP blocks on a Linux system. The examples have been generated using
an aarch64 Juno-r0 platform. All information is considered accurate and tested
-using library version v0.6 and the `perf-opencsd-master` branch on the
-[OpenCSD github repository][1].
+using the latest version of the library and the `master` branch on the
+[perf-opencsd github repository][1].
On Target Trace Acquisition - Perf Record
@@ -280,7 +280,7 @@ As stated above not all the pieces of the solution have been upstreamed. To
get all the components the latest `perf-opencsd-master` needs to be
obtained:
- linaro@t430:~/linaro/coresight$ git clone -b perf-opencsd-master https://github.com/Linaro/OpenCSD.git perf-opencsd-master
+ linaro@t430:~/linaro/coresight$ git clone -b perf-opencsd-master https://github.com/Linaro/perf-opencsd.git perf-opencsd-master
...
...
@@ -586,7 +586,7 @@ Best regards,
*The Linaro CoreSight Team*
--------------------------------------
-[1]: https://github.com/Linaro/OpenCSD "OpenCSD Github"
+[1]: https://github.com/Linaro/perf-opencsd "perf-opencsd Github"
[2]: http://people.linaro.org/~mathieu.poirier/openCSD/uname.v4.user.sept20.tgz
--
2.7.4